s******d Posts: 303 | 1 If I use INFILE with an IF statement, it takes SAS a long time to go through
the 150 GB. I wonder whether there is an efficient way to deal with this.
Thanks so much. |
|
s******d Posts: 303 | 2 The file I need to INFILE is a very large CSV file with 2000 samples; each sample went through 500 exams,
and each exam covers 3 items. The current layout is sample*tests:
sample1, exam1, outcome1, outcome2, outcome3
sample1, exam2, outcome1, outcome2, outcome3
....
....
sample1, exam500, outcome1, outcome2, outcome3
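I now need to read the data from this big CSV file into one file per sample. Each file holds the 500
exams and the corresponding 3 outcomes.
I don't need every sample, only 495 selected ones, each written out as
its own file. The problems are:
1) I don't want to repeat the operation 495 times by hand;
2) the CSV file is huge, and re-reading it once for every sample would take
a very long time.
Is there a way to read the CSV file once and automatically write out the files for the 495 samples? |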
|
l*****k Posts: 587 | 3 Use Perl to read your sample IDs into an array, then
loop through the array and, for each sample, run
system("grep $sample infile > $sample.csv")
I also think a combination of uniq and awk commands can achieve
that in the shell. |
|
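p********a Posts: 5352 | 4 This is not a software problem, it is a logic problem. Read the input once with SAS INFILE and write it out to the 495 files.
The single trailing @ in SAS exists precisely to hold the current record so that you can test a value and decide whether to keep reading. Sigh, there is no point
telling you more. |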
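A minimal sketch of the read-once idea above, assuming the rows for each sample sit together in the CSV (as in the layout shown earlier in the thread). The paths, the variable lengths and the three sample IDs in the IN list are placeholders, not taken from the original posts. It relies on the FILEVAR= option of the FILE statement, which redirects PUT output whenever the named variable changes value:

```sas
/* Sketch only: paths, lengths and the keep-list are assumptions. */
data _null_;
  length sample exam $20 outname $200;
  infile 'c:\data\big.csv' dsd truncover;        /* hypothetical input path */
  input sample exam outcome1 outcome2 outcome3;
  /* subsetting IF: keep only the wanted samples (extend the list as needed) */
  if sample in ('sample1', 'sample7', 'sample42');
  outname = cats('c:\out\', sample, '.csv');     /* one output file per sample */
  file dummy filevar=outname dsd;                /* switches files when outname changes */
  put sample exam outcome1 outcome2 outcome3;
run;
```

If the rows of one sample were scattered through the file, coming back to an already-written file would overwrite it unless MOD is added to the FILE statement.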
|
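F*******1 Posts: 75 | 5 A question about reading data into SAS. I have a sample file (see attachment).
I only need the data in rows 6 through 11. In the DATA step I can use firstobs=6 to set the
starting row. What sets the last row? I tried lastobs=11 and it does not work. Thanks!
If I read in all the data first and then extract what I need, my SAS script is as follows. But the result is wrong: rows 7
through 11 are lost. What is the reason? Thanks!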
data RT1;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile "\\mkscsas01\saswork\20090108_asm_rtmcp_final.csv" delimiter =
',' MISSOVER DSD lrecl=55010 firstobs=6 ;
format Pnode $12. MCPType $12.;
INPUT Pnode $ Zone $ MCPType $ HE1 - HE24;
run; |
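For the first question, a hedged sketch: the INFILE option that marks the last record to read is OBS=, not LASTOBS=. The step below reuses the poster's own statement and only adds obs=11; it does not attempt to explain the lost rows, which depend on the attached file.

```sas
data RT1;
  infile "\\mkscsas01\saswork\20090108_asm_rtmcp_final.csv"
         delimiter=',' missover dsd lrecl=55010
         firstobs=6 obs=11;            /* read records 6 through 11 only */
  format Pnode $12. MCPType $12.;
  input Pnode $ Zone $ MCPType $ HE1-HE24;
run;
```

OBS= is the number of the last record to process, so firstobs=6 obs=11 yields six records, not eleven.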
|
o****o Posts: 8077 | 6 Try the following code. At least on my PC, given the data structure, it
works on the data sets downloaded from the website.
The point is to use a character-string pointer control (@"text") in the DATA step:
/* take final.csv as one of the csv files on the website */
filename final "c:\final.csv";
data test;
infile final delimiter=',' truncover dsd;
input @"MISO Wide,-," type :$10. HE1-HE23;
if ^missing(type) then output;
run; |
|
j******1 Posts: 62 | 7 112.
The contents of the raw data file SIZE are listed below:
72 95
The following program is submitted:
data test;
infile 'size';
input @1 height 2. @4 weight 2;
Which one of the following is the value of the variable weight in the output
data set?
A 2
B 72
C 95
D .(missing numeric value)
The answer is A. I tried this program, and the result was 2.
Then I changed the 2 after weight to 1, and the result was 7.
Weight should read the value at @4, right?
Thanks a lot. |
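A short sketch of what seems to be going on, using in-stream data instead of the file 'size' (the data set names test_col and test_fmt are made up for the comparison). Without a trailing period, "weight 2" is column input, meaning "read weight from column 2", and that overrides the @4 pointer. With the period, "2." is an informat and the two columns starting at column 4 are read:

```sas
data test_col;
  input @1 height 2. @4 weight 2;    /* no period: column input, column 2 */
  put height= weight=;               /* log: height=72 weight=2 */
datalines;
72 95
;

data test_fmt;
  input @1 height 2. @4 weight 2.;   /* period: informat, columns 4-5 */
  put height= weight=;               /* log: height=72 weight=95 */
datalines;
72 95
;
```

The same reasoning explains the 7: "weight 1" reads column 1.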
|
A**P Posts: 260 | 8 I need to read data in the following format:
Date,Open,High,Low,Close,Volume,Adj Close
05Mar2009,47.56,51.95,46.98,,0,50.17
04Mar2009,48.02,48.83,45.02,47.56,0,47.56
To handle the missing value in the first data row correctly, I used the DSD option. The program is as follows:
data index.vix;
infile "Z:\public\vix.csv" dlm=',' dsd firstobs=2;
input Date anydtdte. Open High Low Close Volume AdjClose;
run;
SAS always assigns missing values to the variable Open. Can anyone help? |
|
p********a Posts: 5352 | 9 Yahoo stock market data? Just change the date informat to DATE9. and it will work.
Take a look at my macro:
%macro getdata(tic);
FILENAME myurl URL "http://ichart.finance.yahoo.com/table.csv?s=&tic";
DATA &tic;
INFILE myurl FIRSTOBS=2 missover dsd;
format date yymmdd10.;
INPUT Date: yymmdd10. Open High Low Close Volume Adj_Close ;
if date>=today()-180;
RUN;
%mend getdata; |
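A hedged sketch of the fix applied to the program from the question. The likely cause is that an informat without a colon switches to formatted input, which reads a fixed number of columns and leaves the pointer on the comma, so under DSD the next list-input variable (Open) sees an empty field. The colon modifier keeps Date in list-input mode. The data set name vix and the TRUNCOVER option are additions for illustration:

```sas
data vix;
  infile "Z:\public\vix.csv" dsd firstobs=2 truncover;
  input Date :date9. Open High Low Close Volume AdjClose;  /* colon: read up to the delimiter */
  format Date date9.;
run;
```

With this layout the empty Close field in the 05Mar2009 row should come through as a missing value, while Open keeps 47.56.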
|
h******e Posts: 1791 | 10 DATA toads;
infile 'D:\My Documents\My SAS Files\toadjump.dat';
input toadname $ weight jump1 jump2 jump3;
proc print data=toads;
title 'toad jump';
run;
This code just will not work. It keeps saying “Physical file does not exist, D:\My Documents\
My SAS Files\sasuser.toadjump.dat”, yet the file is clearly right there. Thanks a lot. |
|
j*****t Posts: 83 | 11 Base 123
14.
Ranch 。。。。。
Split。。。。。
Condo.........
Twostory.......
Ranch..........
Split.........
Split..........
data;
Infile
Input Style $@
if style = 'condo' or Style='Ranch';
input .....
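How many observations will the output data set contain?
The answer is 3.
13.
Same data, except that the IF is followed by THEN, and the answer becomes 7.
Why is the answer 7 once THEN is added? Doesn't the program still go through the IF test? |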
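A small sketch of the difference, with shortened made-up rows in place of the exam data. An IF without THEN is a subsetting IF: when the condition is false the iteration stops and nothing is output. IF ... THEN only makes the statement after THEN conditional; the implicit OUTPUT at the end of the step still runs for every record.

```sas
data subset_if;                      /* 3 observations: ranch, condo, ranch */
  infile datalines dsd;
  input style :$8. @;
  if style = 'condo' or style = 'ranch';   /* subsetting IF drops the other rows */
  input sqfeet bedrooms;
datalines;
ranch,1250,2
split,1190,1
condo,1400,2
twostory,1810,4
ranch,1500,3
split,1615,4
split,1305,3
;

data if_then;                        /* 7 observations, one per record */
  infile datalines dsd;
  input style :$8. @;
  if style = 'condo' or style = 'ranch' then
    input sqfeet bedrooms;           /* only this INPUT is conditional */
datalines;
ranch,1250,2
split,1190,1
condo,1400,2
twostory,1810,4
ranch,1500,3
split,1615,4
split,1305,3
;
```

In the second step the split and twostory rows are still written, with sqfeet and bedrooms missing.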
|
d*****g Posts: 4081 | 12 I have this statement that I would like to translate into R:
INFILE LOCATION(**.txt) truncover FIRSTOBS = 2 DLM = ',.';
Could an expert explain it? Thanks! |
|
d*******1 Posts: 175 | 13 I have a data file in txt format that has a lot of columns
and is delimited by "|". I only want to pull a few of the fields from the data file
into a SAS data set.
How could I do that?
Any help would be appreciated. |
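One possible approach, sketched under assumptions: list input cannot jump over fields, but the unwanted ones can be read into a one-byte throwaway variable that is dropped. The path, the header row, the field positions (1, 4 and 6 kept) and all variable names are hypothetical.

```sas
data wanted;
  infile 'c:\data\big.txt' dlm='|' dsd truncover firstobs=2;  /* firstobs=2 assumes a header row */
  length id $12 skip $1 region $20;
  input id skip skip amount skip region;   /* fields 2, 3 and 5 go to the throwaway variable */
  drop skip;
run;
```

Reading the same variable several times in one INPUT statement is allowed; each read simply overwrites the previous value.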
|
|
|
t**s Posts: 156 | 16 Try defining the types of the variables first. |
|
|
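Y**********8 Posts: 67 | 18 Fine! I'll own up to it for now.
I really am still at the beginner stage.
I haven't even finished the ADV tutor.
But your contempt is a bit too blatant.
Such an arrogant moderator.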
&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&
Stung by that, I ran the following program,
DATA models;
INFILE 'd:\MyData\Models.dat' TRUNCOVER;
INPUT Model $ 1-12 Class $ Price Frame $ 28-38;
RUN;
%LET bikeclass = Mountain;
* Use a macro variable to subset;
PROC PRINT DATA = models NOOBS;
WHERE Class = "&bikeclass";
FORMAT Price DOLLAR6.;
TITLE "Current Models of &bikeclass Bicycles";
RUN;
which proves that it really is not a problem with the non-genuine copy of SAS. It is my problem.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
|
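b*******g Posts: 513 | 19 I don't know about the rest, but when you INFILE an external Excel data file in a DATA step, if that
file has very many columns, or not that many columns but every value carries many decimal places, then
unless you set the lrecl= option to a fairly large number, the data cannot be read in correctly. |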
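A hedged illustration of the point above, assuming the Excel sheet has been saved as CSV (INFILE reads text files). The path and the 200 numeric columns are made up. In some SAS releases the default logical record length is only 256 bytes, so longer lines are cut off unless LRECL= is raised:

```sas
data wide;
  infile 'c:\data\wide.csv' dsd truncover firstobs=2
         lrecl=32767;                /* allow records up to 32767 bytes */
  input x1-x200;                     /* hypothetical: 200 numeric columns */
run;
```

The log note reporting the maximum record length is a quick way to check whether lines were being truncated.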
|
w*********a Posts: 3 | 20 Question 15.
data numrecords;
infile 'file-specification';
input @1 patient $15. relative $ 16-26@;
if relative='children' then
input @ 54 diagnosis $15. @;
else if relative= 'parents' then
input @28 doctor $15. clinic $ 44-53 @54 diagnosis $15. @;
input age;
run;
How many raw data records are read during each iteration of the DATA
step during execution?
A.1 B.2 C.3 D.4
Question 39.
A raw data file is listed below:
John McCloskey 35 71
June Rosesette 10 43
TinekeJones 9 37
The following SAS |
|
o******6 Posts: 538 | 21 ☆─────────────────────────────────────☆
cyear (cyear) wrote on (Tue May 13 19:56:31 2008):
I have a .txt file:
A
B
C 25 9
D 14 3
How can I produce a SAS data set like the following:
A B C 25 9
A B D 14 3
Thanks!
☆─────────────────────────────────────☆
cyear (cyear) wrote on (Tue May 13 21:55:30 2008):
Nobody can help? It's urgent...
☆─────────────────────────────────────☆
sir ( 郎 ) wrote on (Tue May 13 22:04:02 2008):
data one;
iNFILE CARDS MISSOVER;
input c3 $1 n1 n2;
cards;
A
B
C 25 9
D 14 3
;
run;
data two;
length c1 $1;length c2$1;
se |
|
y****2 Posts: 34 | 22 data one;
infile "..\data.txt";
input a$;
input b$;
do i =1 to 2;
input c$ e f;
output;
end;
drop i;
run; |
|
y****g Posts: 285 | 23 Certainly, you can read a txt file into SAS. You can use PROC IMPORT or
a DATA step with INFILE ... |
|
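k***i Posts: 2 | 24 Thanks to the two posters above, but I still don't quite follow your approach.
I wrote an IML program and it reports an error: No data set is currently open for input.
Could you take a look? Thanks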
proc iml;
infile '\\pantera.campus.xxx.local\st_my_docs$\xxxx\My Documents\Book1.xls';
read all var {'navps'} into x;
read all var {'DistrTotal'} into y;
n=nrow(x);
do i=1 to n-1;
r[i,1]=(x[i+1,1]-x[i,1]+y[i+1,1])/x[i,1];
end;
print r;
quit; |
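A sketch of one way this could be made to run, not a confirmed fix for the poster's setup. The READ statement in IML works on a SAS data set opened with USE, which is why the error mentions that no data set is open. The code assumes the workbook has already been imported as WORK.BOOK1 (for example with PROC IMPORT) and that navps and DistrTotal are numeric columns; the data set name is hypothetical.

```sas
proc iml;
  use work.book1;                      /* open the data set for reading */
  read all var {navps} into x;
  read all var {DistrTotal} into y;
  close work.book1;
  n = nrow(x);
  r = j(n-1, 1, .);                    /* allocate r before assigning elements */
  do i = 1 to n-1;
    r[i] = (x[i+1] - x[i] + y[i+1]) / x[i];
  end;
  print r;
quit;
```

The original loop would also have failed at r[i,1]= because r was never allocated; the J function takes care of that here.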
|
b******e Posts: 539 | 25 Add one line of code before the INPUT statement:
infile datalines truncover; |
|
s**********e Posts: 63 | 26 data name;
infile "path" dsd;
length location $15;
input location;
run;
|
|
l**********9 Posts: 148 | 27 Why do you use MISSOVER and DSD together? Both deal with missing data, though in different ways:
DSD treats consecutive delimiters as a missing value, while MISSOVER keeps INPUT from moving to the next line when a record is short.
The default delimiter with DSD is ',', so you do not need delimiter = ','. I think just "infile '...' dsd" will be OK.
Using two or more INPUT statements to read the data will be more helpful. Don't
forget to add @ at the end of each INPUT statement that should hold the line. |
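To make the distinction concrete, a tiny made-up example with both kinds of gaps. The first record has an empty middle field, which is DSD's job; the second record is short, which is MISSOVER's job:

```sas
data demo;
  infile datalines dsd missover;
  input a b c;
  put a= b= c=;
datalines;
1,,3
4,5
;
```

With both options the step gives a=1 b=. c=3 and a=4 b=5 c=. in two observations. Dropping MISSOVER would let the second record run on past its end in search of c.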
|
y***t Posts: 644 | 28 SAS Base 220, P127, question 5 (for the table, see the attachment).
What happens when the fourth iteration of the DATA step is complete?
data perm.orders (drop=type);
infile produce;
retain Fruit;
input type $1. @;
if type='F' then input @3 fruit $7.;
if type='V';
input @3 Variety : $16. @20 Price comma5.;
run;
a. All of the values in the program data vector are written to the data set as the third observation.
b. All of the values in the program data vector are written to the data set as the fourth observation.
c. The values for Fruit, |
|
g*******y Posts: 380 | 29 For the later problem, use MISSOVER?
Both are options of the INFILE statement. |
|
d*******1 Posts: 854 | 30 Say your SAS data set is called test.
data test;
a=1;b=1; output;
a=2;b=2; output;
run;
With the code below: SAS -> CSV -> SAS -> add # -> CSV
proc export data=test
outfile='c:\test.csv' replace;
run;
data testx;
infile 'c:\test.csv' truncover;
input raw $1-100;
run;
data testx;
set testx;
if _n_=1 then raw='#'||trim(left(tranwrd(raw,',',',#')));
run;
data _null_;
set testx;
file 'c:\testx.csv' dlm='09'x;
put raw;
run; |
|
o******6 Posts: 538 | 31 ☆─────────────────────────────────────☆
libra (秤子) wrote on (Tue Mar 3 18:01:17 2009):
How can I import all the files in one folder at one time?
Suppose I have hundreds of txt files. It is too much typing to INFILE them
one by one.
☆─────────────────────────────────────☆
ursveronique (ursveronique) wrote on (Tue Mar 3 19:06:51 2009):
Use a MACRO...
☆─────────────────────────────────────☆
libra (秤子) wrote on (Tue Mar 3 19:25:09 2009):
The problem is, there are no rules for the names of those files... |
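One option that does not depend on the file names, sketched under assumptions: on Windows and UNIX hosts the INFILE statement generally accepts a wildcard in the physical path, and the FILENAME= option reports which file the current record came from. The folder and the one-variable INPUT layout are placeholders.

```sas
data all_files;
  length fname source $260;
  infile 'c:\myfolder\*.txt' filename=fname truncover;  /* hypothetical folder */
  input line $char200.;          /* replace with the real INPUT layout */
  source = fname;                /* fname itself is not written to the data set */
run;
```

This reads every matching file as one continuous input stream, so files with header rows need extra handling.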
|
y***q Posts: 99 | 32 Question 13 of SAS BASE 123
A raw data file is listed below:
ranch,1250,2,1,sheppard avenue,"$64,000"
split,1190,1,1,rand street,"$65,850"
condo,1400,2,1.5,market street,"80,050"
twostory,1810,4,3,garris street,"$107,250"
ranch,1500,3,3,kemble avenue,"$86,650"
split,1615,4,3,west drive,"94,450"
split,1305,3,1.5,graham avenue,"$73,650"
data work.condo_ranch;
infile 'e:\SASdata\base13.xls' dsd;
input style $ @;
if style = 'condo' or style = 'ranch' then
input sqfeet bedrooms baths street $ price : dollar10. |
|
e****8 Posts: 200 | 33 I chose C, but I am very unsure.
Item 20 of 63
The following SAS program is submitted:
data WORK.TEMP;
length A B 3 X;
infile RAWDATA;
input A B X;
run;
What is the length of variable A?
A. 3
B. 8
C. WORK.TEMP is not created - X has an invalid length.
D. Unknown. |
|
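y***q Posts: 99 | 34 My exam is in 2 days, and there are a few questions where I don't know which answer to choose. I would sincerely appreciate help, thanks.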
Q30
You're attempting to read a raw data file and you see
the following messages displayed in the SAS Log:
NOTE: Invalid data for Salary in line 4 15-23.
RULE: ----+----1----+----2----+----3----+----4----+----5--
4 120104 F 46#30 11MAY1954 33
Employee_Id=120104 employee_gender=F Salary=. birth_date=-2061 _ERROR_=1 _N_=4
NOTE: 20 records were read from the infile 'c:\employees.dat'.
The minimum record length was 33.
The maximum r |
|
h****9 Posts: 26 | 35 Item 30 of 70
You're attempting to read a raw data file and you see
the following messages displayed in the SAS Log:
NOTE: Invalid data for Salary in line 4 15-23.
RULE: ----+----1----+----2----+----3----+----4----+----5--
4 120104 F 46#30 11MAY1954 33
Employee_Id=120104 employee_gender=F Salary=. birth_date=-2061 _ERROR_=1 _N_=4
NOTE: 20 records were read from the infile 'c:\employees.dat'.
The minimum record length was 33.
The maximum r |
|
h****9 Posts: 26 | 36 Item 59 of 70
Given the contents of the raw data file TYPECOLOR.DAT:
----+----10---+----20---+----30
daisyyellow
The following SAS program is submitted:
data FLOWERS;
infile 'TYPECOLOR.DAT' truncover;
length
Type $ 5
Color $ 11;
input
Type $
Color $;
run;
What are the values of the variables Type and Color?
A. Type=daisy, Color=yellow
B. Type=daisy, Color=w
C. Type=daisy, Color=daisyyellow
D. Type= |
|
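z**k Posts: 378 | 37 Everything you wrote is illegal. Usage like '01Jan1960'd assigns a value to a variable as a literal, which is meant to be
supplied at coding time, so the
format is rather rigid. You can write x=1000, so why would you write x='1,000'n?
Reading the data with datalines or INFILE is much more flexible: 1,000 can be read with the comma8. informat.
As for "1993-09-07", I am not sure; SAS seems to require the year at the end, either mmddyy or ddmmyy |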
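On the open point about "1993-09-07": as far as I know the YYMMDD10. informat reads year-first dates, so both values from the post can be read with informats. A minimal sketch with made-up variable names:

```sas
data demo;
  input amount :comma8. d :yymmdd10.;   /* colon: list input with an informat */
  format amount comma8. d yymmdd10.;
  put amount= d=;
datalines;
1,000 1993-09-07
;
```

Here amount holds the number 1000 and d holds the SAS date value for 7 September 1993.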
|
h****9 Posts: 26 | 38 Many thanks to download for the reply, but in Advanced 130:
101. The following SAS program is submitted:
data temp;
length a 1 b 3 x;
infile 'file reference';
input a b x;
run;
What is the result?
A.The data set TEMP is created, but variable X is not created.
B.The data set TEMP is created and variable X has a length of 8.
C.The data set TEMP is not created because variable A has an invalid length.
D.The data set TEMP is not created because variables A and B have invalid
lengths.
Answer: C
If answer C is correct here, then why isn't the answer to question 20 also C? |
|
J********i Posts: 50662 | 39 In SAS,
data outlier;
infile '';
input ****;
if **** OR ****;
run;
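that should do it, I think.
[The text below is reposted from the Quant board]
From: liujx80 (xuxu), Board: Quant
Subject: bond price data clearn
Posted at: BBS 未名空间站 (Fri Feb 12 13:33:07 2010, US Eastern)
If you get data on a large number of bonds from dealers every day, is there a statistical method that can detect outliers automatically?
Any abnormal quote counts as an outlier. |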
|
D******6 Posts: 6211 | 40 I am using SAS to read a Chinese-language database with character fields.
The source file looks like this:
id name address
1, 张三,北京市东城区
2,李四,北京市西城区
...
...
The code I use is as follows:
DATA name;
infile 'G:\name.csv' DLM = ',' DSD MISSOVER;
input id $ name $ address $;
The result read into SAS looks like this:
id name address
1 张三 北京市东
2 李四 北京市西
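The problem is that if address is too long, or any character field is longer than 8, it is not read in completely. After reading into SAS
only 4 Chinese characters remain, that is, a length of 8 bytes. Isn't a variable read with a trailing $ supposed to be read at whatever
length it has? Or is there some limit somewhere that I have not lifted?
Thanks for any pointers! |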
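A hedged sketch of the usual remedy. With list input, a character variable that has no length declared gets a length of 8 bytes, and each Chinese character takes two or more bytes depending on the session encoding. Declaring the lengths before INPUT lifts that limit; the lengths below are guesses and should be sized to the data in bytes.

```sas
data name;
  infile 'G:\name.csv' dlm=',' dsd missover;
  length id $8 name $40 address $200;   /* byte lengths, assumed values */
  input id $ name $ address $;
run;
```

An informat with the colon modifier, for example address :$200., would have the same effect.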
|
a********s Posts: 188 | 41 Suppose your data file test.txt is under C:
data test;
infile "c:\test.txt" dlm=";";
input date mmddyy10. city $ state $ zip;
run; |
|
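R*******c Posts: 249 | 42 It is only a call; the results in R can be written to an xls or some other file, and then MATLAB reads them back.
How do I use system, though?
I searched online, and someone suggested system('R CMD BATCH infile outfile'); but I tried it and
it does not work. |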
|
f*********8 Posts: 165 | 43 id refid
1 NP_001003407/// NP_001003408 /// NP_002304 /// NP_006711
2 NP_001135417 /// NP_001604
3 NP_00499494
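I want to generate multiple observations from one record, but the number of obs per record varies. The special
separator is '///'. How should I handle this?
My problem is that the printed values are only 8 characters long, yet the values are not of fixed length. Once I write $12., the "/" gets read in as well (for example '/// NP_00100'). How should I change this code? Thanks a lot!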
data new;
infile "C:............" missover dlm="///" ;
input id $ refid $ @;
num=0;
do while (refid ne ' ');
num+1;
output;
input r |
|
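A*******s Posts: 3942 | 44 Comments:
1. dlm="/": by default SAS treats two or more consecutive delimiters as one.
2. :$12.: with a colon before the informat, SAS stops reading when it meets a space or a delimiter.
3. id 4.: I am not very sure about this one. If the informat of id is changed to $, SAS seems to read the trailing blanks
and refid together with it, keep the first eight characters, and then drop the trailing blanks. I don't know how to solve that, so I just
read id as numeric.
I changed the code a bit and it seems to run without errors. Sample code: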
data new;
infile "**********" missover dlm="/" ;
input id 4. refid :$12. @;
num=0;
do while (refid ne ' ');
num+1;
output;
input refid :$12. @;
end;
run;
proc print;
run |
|
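f*********8 Posts: 165 | 45 Thanks a lot.
I tried this id 4., so why did every id become missing?!
Is something wrong with the infile format? My data is saved as .csv.
The real id is actually char type and longer than 8, and as you said it pulls in the trailing blanks and part of
refid. So I used a numeric id to simplify the problem. The numeric id can be mapped back to the original id, though.
Could you advise once more: if id is char and longer than 8, with dlm=" " between id and refid but dlm='/' within
refid, is there a way to read both variables correctly at the same time? Thanks.
For example: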
id refid
14454544_a NP_001003407/// NP_001003408 /// NP_002304 /// NP_006711
2222222222222_b NP_001135417 /// NP_001604 |
|
D******n Posts: 2836 | 46 data a1;
infile './yourdata' missover;
length refid $ 32;
input id $ refid $ @;
do while (refid ne ' ');
refid=compress(refid,"/");
if (refid ne ' ') then output;
input refid $ @;
end;
run;
proc print; run; |
|
y****n Posts: 46 | 47 data temp;
length id refid $12;
infile cards truncover ;
input @;
id=scan(_infile_,1,' ///');
i=2;
do while (scan(_infile_,i,' ///') ne ' ' );
refid=Scan(_infile_,i,' ///');
output;
I=i+1;
end;
keep id refid;
cards;
1 NP_001003407/// NP_001003408 /// NP_002304 /// NP_006711
2 NP_001135417 /// NP_001604
3 NP_00499494
;
run; |
|
h*****d Posts: 295 | 48 Sorry, I cannot input Chinese right now.
The original data is saved in a txt file. The last character of some of the
data lines is missing. SAS jumps to the next line to read the first character
when such a missing value occurs.
Any idea how to make SAS know there is a character missing at the end of
these lines?
Thanks and bow~
data is like:
a134M
b246F
c321F
d234
e345M
g456
h987M
..
code I am trying
data temp;
infile xxxx;
input id $ 1. score 3.0 gender $ 1.;
run;
the output of the data is like
a134M
b246F
c |
|
a***r Posts: 420 | 49 I misunderstood. Here is another rough attempt to get the ball rolling:
data a(keep=a);
input A $ 15. B C $;
datalines;
11/asdsd/890.00 89 gh
123/yuu/8.9 89 ji
;
run;
data a;
set a;
file "e:\temp.txt";
put a;
run;
data b;
infile "e:\temp.txt" dlm='/';
input var1 var2 $ var3;
run; |
|
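w*********y Posts: 7895 | 50 I have been practicing SAS programming lately and ran into this problem. The output is always
wrong. After adding DLM=',' it is completely wrong; with only MISSOVER, the DAYS column
is mostly missing. Please take a look. Thank you all.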
data kids;
infile datalines dlm=',' missover;
input subj origin $ sex $ grade $ type $ missingdays @;
do until (missingdays=.);
output;
input days @;
end;
input;
datalines;
1 A M F0 SL 2,11,14
2 A M F0 AL 5,5,13,20,22
;
run; |
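A possible rewrite, offered as a sketch rather than the intended textbook answer. Two things stand out in the step above: the records use both blanks and commas as separators, so DLM= needs both characters, and the loop tests missingdays while reading into days, so the loop condition never sees the new value. Using one variable, here days, for both gives:

```sas
data kids;
  infile datalines dlm=' ,' missover;    /* blank and comma both delimit */
  input subj origin $ sex $ grade $ type $ days @;
  do while (days ne .);
    output;                              /* one observation per day value */
    input days @;                        /* MISSOVER: missing at end of line */
  end;
datalines;
1 A M F0 SL 2,11,14
2 A M F0 AL 5,5,13,20,22
;
run;
```

For the two sample records this writes 8 observations, 3 for subject 1 and 5 for subject 2.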
|