Page 5 - Collected discussions about infile - 话题女王

y****n
Posts: 46

I don't know if this is what you want.
data kids;
length temp_days $100;
infile datalines dlm=' ' truncover;
input subj origin $ sex $ grade $ type $ temp_days $;
/* write one observation per comma-separated value in temp_days */
wn=countw(temp_days,',');
do rec=1 to wn;
days=input(scan(temp_days,rec,','),best.);
output;
end;
drop temp_days wn;
datalines;
1 A M F0 SL 2,11,14
2 A M F0 AL 5,5,13,20,22
;
run;

s***1
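Posts: 343

From thread: Statistics board - Just took the SAS ADV exam today

May I ask the OP two questions? Thanks in advance!!
For question 11, the answer given in ziqidonglai's popular post is C. Could the OP explain where the code is wrong? When I did it
myself I chose B.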
The following SAS code is submitted:
data WORK.TEMP WORK.ERRORS / view=WORK.TEMP;
infile RAWDATA;
input Xa Xb Xc;
if Xa=. then output WORK.ERRORS;
else output WORK.TEMP;
run;
Which of the following is true of the WORK.ERRORS data set?
A. The data set is created when the DATA step is submitted.
B. The data set is created when the view TEMP is u

p********a
Posts: 5352

From thread: Statistics board - [Collection] A question about SAS data input

☆─────────────────────────────────────☆
footprint08 (just do it) wrote at (Sun Mar 21 02:44:22 2010, US Eastern):
id refid
1 NP_001003407/// NP_001003408 /// NP_002304 /// NP_006711
2 NP_001135417 /// NP_001604
3 NP_00499494
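I want to generate multiple observations from one record, but the number of obs per record varies. The special
separator is '///'. How should I handle this?
My problem is that the printed values are only 8 characters long, but the values are not fixed-length. Once I write $12., the "/" gets read in as well (for example '/// NP_00100'). How should I change this code? Thanks!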
data new;
infile "C:............" missover dlm="///" ;
input id $ refid $
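
One possible way to get one observation per refid, sketched under assumptions and not taken from the original thread: read the rest of each record into a single character variable and split it with SCAN, treating both '/' and blank as delimiters. The names line and i, the in-stream data, and the lengths $200 and $20 are placeholders chosen for illustration.

```sas
data new;
  infile datalines truncover;
  length refid $20;
  /* id by list input, then the remainder of the record as one string */
  input id line $200.;
  /* '/' and blank both count as delimiters, so neither the '///'
     nor the spaces around it end up inside refid */
  do i = 1 to countw(line, '/ ');
    refid = scan(line, i, '/ ');
    output;
  end;
  drop i line;
datalines;
1 NP_001003407/// NP_001003408 /// NP_002304 /// NP_006711
2 NP_001135417 /// NP_001604
3 NP_00499494
;
run;
```

Declaring the length of refid up front avoids the default 8-character truncation described in the question.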

p*****o
Posts: 543

From thread: Statistics board - A question about SAS MACRO VARIABLE

%macro none();
data tem;
infile "a.txt" pad;
length block1 - block4 $ 20;
INPUT
%do i = 1 %to 4;
block&i. $ 1-20
%end;
;
run;
%mend none;
%none();
If I want the $ 1-20 part to depend on i, for example $ (i-1)*20 - i*20, how can I do that?
Thanks!
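
A minimal sketch of one way to do this, assuming the blocks are 20 columns wide and the first one starts in column 1 (so the start column is (i-1)*20+1 rather than (i-1)*20): %EVAL does the integer arithmetic at macro time, and the generated INPUT statement then contains plain column ranges.

```sas
%macro none();
data tem;
  infile "a.txt" pad;
  length block1 - block4 $ 20;
  input
  %do i = 1 %to 4;
    /* generates: block1 $ 1-20  block2 $ 21-40  block3 $ 41-60  block4 $ 61-80 */
    block&i. $ %eval((&i - 1) * 20 + 1)-%eval(&i * 20)
  %end;
  ;
run;
%mend none;
%none();
```

Turning on options mprint before calling the macro shows the INPUT statement that was actually generated.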

p********a
Posts: 5352

From thread: Statistics board - Downloading data from yahoo finance, R or Python?

%macro getdata(tic);
FILENAME myurl URL "http://ichart.finance.yahoo.com/table.csv?s=&tic";
DATA &tic;
INFILE myurl FIRSTOBS=2 missover dsd;
format date yymmdd10.;
INPUT Date: yymmdd10. Open High Low Close Volume Adj_Close ;
*if date>=today()-180;
RUN;
%mend;
%getdata(SPY);

d********0
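Posts: 44

From thread: Statistics board - Question about SAS data input

Suppose data.csv has 20 columns of data, each column representing one variable (x1,x2,...,x20).
How can I read in only the variables I want? For example, I only need the data in columns 6 and 20 (x6,
x20).
I only know how to use infile and input to read all the variables in and then use x6 and x20. That is clearly not good.
Is there a more efficient way? Experts, please advise!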
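
One possible approach, as a sketch only: load each record into the automatic _INFILE_ buffer with an empty INPUT statement and pull out just the 6th and 20th comma-separated fields with SCAN. This assumes no header row, no quoted fields containing commas, and numeric values. The data set name want is a placeholder, and whether this is actually faster than reading all 20 variables and using KEEP has not been measured here.

```sas
data want;
  infile 'data.csv' truncover;
  input;   /* load the record into _infile_ without creating variables */
  /* the 'm' modifier makes consecutive commas count as empty fields,
     so field positions stay aligned when a value is missing */
  x6  = input(scan(_infile_, 6,  ',', 'm'), ?? best32.);
  x20 = input(scan(_infile_, 20, ',', 'm'), ?? best32.);
run;
```

The simpler alternative is to keep the usual input x1-x20 and add keep x6 x20 to the DATA step, which still parses every field but stores only the two that are needed.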

b2
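Posts: 427

From thread: Statistics board - Urgent help reading data, baozi as thanks, thanks in advance

I have a data set with 20 variables and about 7 million observations, stored in a csv file.
1 Will double-clicking the file and opening part of it in Excel damage the original file?
2 The twenty variables are spread over 3 columns in the csv file, separated by |. Specifically:
1) the number of variables in each column varies, i.e. for some observations column 1 may hold 5 variables, while for
other observations
the first column may hold 8 variables;
2) for some observations, the content of a single variable is split across different columns;
3) within the same variable, the length of the value differs from observation to observation;
I tried
data _null_;
/* list both delimiters in a single DLM= option */
infile 'path' dsd firstobs=2 dlm=',|';
input v1 $ v2 $ ... v20 $;
run;
or
1 read into SAS; 2 write each column out to a new csv file; but each column contains a different number of variables, which is
rather troublesome.
Could anyone give me some pointers? Thanks!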

y******0
Posts: 401

From thread: Statistics board - help on importing csv file to SAS

using Data xx; infile ..;

A****t
Posts: 141

From thread: Statistics board - How to split the A/B in one Excel column into two columns?

I tried this and it seems to work
A/B/C/D/E/F/G
A/B/C/D/E/F/G
A/B/C/D/E/F/G
data one;
infile "..." dlm='/';
input (x1-x7) ($);
run;

s*******f
Posts: 148

From thread: Statistics board - How to split the A/B in one Excel column into two columns?

DATA temp;
INFILE CARDS DLM='/';
INPUT a1 1-1 a2 3-3 (b1-b7) ($);
CARDS;
1 1 A/B/C/D/E/F/G
1 2 A/B/C/D/E/F/G
2 1 A/B/C/D/E/F/G
2 2 A/B/C/D/E/F/G
;
RUN;

s******y
Posts: 352

From thread: Statistics board - Finding words with SAS

This is where regex comes in handy.
data test;
infile cards truncover;
input text $150.;
Master_stat=^^prxmatch('/(?<=statistics\s{20})Masters|Masters(?=\s{20}statistics)/io',text);
cards;
The department of statistics Masters
The department of statistics Masters
masters statistics sucks
;
run;
proc print;
run;

t*****n
Posts: 167

From thread: Statistics board - A Base question

Has anyone done this question?
17.Given the following data step:
data WORK.GEO;
infile datalines;
input City $20.;
if City='Tulsa' then
State='OK';
Region='Central';
if City='Los Angeles' then
State='CA'
Region='Western';
datalines;
Tulsa
Los Angeles
Bangor
;
run;
After data step execution, what will data set WORK.GEO contain?
A.
City State Region

w*******t
Posts: 928

From thread: Statistics board - A question about sorting a date variable in SAS

data one;
infile cards;
input date $1-12;
format date1 mmddyy10.;
date1=input(compress(date,' .'), date11.);
cards;
09-Aug.-2009
11-Apr.-2009
;
run;
proc sort data=one; by date1; run;

A****t
Posts: 141

From thread: Statistics board - Question: Importing csv file into SAS 9----too many variables

1000?
data one;
infile "...." dlm=',' lrecl=100000;
input var1-var1000;
run;

j**********e
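Posts: 442

From thread: Statistics board - Question: I will send baozi for useful information

Also, how do I reference a macro variable? My files are named A(0),A(1),A(2)...
I tried writing:
%do num=0 %to 10000;
infile 'C:\research\A(&num).txt';
but SAS returned:
ERROR: Physical file does not exist, C:\research\A(&num).txt.
Could the experts take a look at what is wrong? Thanks!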


s******r
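Posts: 1524

From thread: Statistics board - Question: I will send baozi for useful information

Expert? Now I hardly dare to reply.
Plucking up my courage to answer anyway,
try
infile "C:\research\A(&num).txt"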
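
The reason double quotes help: the macro processor resolves macro variable references inside double-quoted strings but leaves single-quoted strings untouched. A small illustration, where the %let value is just an example:

```sas
%let num = 3;
/* single quotes: &num is left as literal text */
%put 'C:\research\A(&num).txt';
/* double quotes: &num is resolved to 3 before the string is used */
%put "C:\research\A(&num).txt";
```

The two lines written to the log show the difference directly, which is why the original ERROR message still contained the literal text &num.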

j**********e
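Posts: 442

From thread: Statistics board - Question: I will send baozi for useful information

You are far too modest. That does work. I have sent you 20 weibi (forum coins) as a small token of thanks.
One question: with the code above, there can only be a ":" between company and 000007, with no spaces at all. Is there a way to allow a space after the ":"?
I have just figured it out: it was a Chinese-character problem. A Chinese character occupies two columns, so reading the value after the colon runs into trouble (I don't understand exactly why). After I deleted the colon and typed it again under the English input method, it worked. But there are too many files, and fixing them one by one takes too long. Does anyone have a good approach?
Here is the whole macro (for testing, so num only runs from 0 to 1):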
%macro input_file;
%do num=0 %to 1;
data file_sub;
infile "C:\research\A (&num).txt"
firstobs=1

delimiter=":" truncover;
input col_1 $20. ;
n=_N_;
if n=1 then company=scan(col_1,2,':');
retain company;
if n=4 then date=in

j**********e
Posts: 442

From thread: Statistics board - Quick Question

Sorry, that doesn't work.
Part of the raw data:
date
4-Dec-08
18-Dec-04
13-Sep-07
15-Sep-07
8-May-07
Program:
data date;
infile 'C:\research\date.csv' firstobs=2 dsd missover;
input date date9.;
format date yymmdd10.;
run;
What was read:
.
2004-12-18
2007-09-13
2007-09-15
.
That is, the first and fifth observations cannot be read.
Also, how do I output a date variable in the form 20041218?
Thanks for the advice!
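
A likely cause, offered as a sketch and not as a confirmed diagnosis: 4-Dec-08 and 8-May-07 are only 8 characters long, so the formatted informat date9. runs past the end of the record and MISSOVER sets the value to missing. Using the colon modifier (and TRUNCOVER) lets SAS read whatever is there, and the YYMMDDN8. format prints dates without separators.

```sas
data date;
  /* truncover: use the characters that are present on a short record */
  infile 'C:\research\date.csv' firstobs=2 dsd truncover;
  /* colon modifier: read up to the next delimiter, then apply date9. */
  input date :date9.;
  /* yymmddn8. displays the value in the form 20041218 */
  format date yymmddn8.;
run;
```

Two-digit years are interpreted according to the YEARCUTOFF= system option, so it is worth checking that 08 and 04 land in the intended century.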

s******y
Posts: 352

From thread: Statistics board - A question about SAS data processing

That is so-called brotherhood, right.
Well, I think you asked how to add quotation marks to each word in the
sentence. The QUOTE function merely adds one at the beginning and one at the very
end.
So for this task, I would use regex. It is a one-line solution, but
you can certainly also loop through and add the quotation marks to each word
delimited by a space.
Anyway, here is the code:
data _null_;
infile cards truncover;
length quoted_line $600.;
input line $200.;
quoted_line=prxchange('s/([\w''""]+)(

w********5
Posts: 72

From thread: Statistics board - A question about SAS programming

This is my answer. My code is always very long and not efficient. Please
help simplify.
data data1;
input var1;
cards;
5
6
;
run;
data data2;
input var2;
cards;
5
6
;
run;
data new;
infile datalines dlm=" ";
input name $ var $ ;
datalines;
data1 var1
data2 var2
data2 var2
data4 var4
;
run;
proc sql;
select name into:name1-:name&SYSMAXLONG
from new;
select var into:col1-:col&&SYSMAXLONG
from new;
quit;
%put _user_;
option mprint mlogic;
%macro mutiple;
%do i=1 %to &sqlobs;
proc so

P******V
Posts: 83

From thread: Statistics board - A question from the real SAS ADV exam

11. The following SAS code is submitted:
data WORK.TEMP WORK.ERRORS/view=WORK.TEMP;
infile RAWDATA;
input Xa Xb Xc;
if Xa=. then output WORK.ERRORS;
else output WORK.TEMP;
run;
Which of the following is true of the WORK.ERRORS data set?
A The data set is created when the DATA step is submitted
B The data set is created when the view TEMP is used in another SAS step
C The data set is not created because the DATA statement contains a syntax
error.
D The descriptor portion of WORK.ERRORS is create

s*****0
Posts: 357

From thread: Statistics board - How to work on this dataset?

data tmp;
length number $15;
infile 'directory\filename.txt' DLM=',';
input number $ @@;
number = compress(number, "(')");
run;
"NUMBER" is still a character variable. If you prefer numerical variable,
use INPUT() to do the conversion.


g**a
Posts: 2129

From thread: Statistics board - How to work on this dataset?

data one;
infile 'filepath' LRECL=150000;
input @'(''' Var1 10. @@;
run;
proc print data=one;
run;
This assumes the length of the number is 10.
Change the record length (LRECL=) accordingly.


S********a
Posts: 359

From thread: Statistics board - A question about the December ADV real exam

ADV December real exam, question 11
The following SAS code is submitted:
data WORK.TEMP WORK.ERRORS / view=WORK.TEMP;
infile RAWDATA;
input Xa Xb Xc;
if Xa=. then output WORK.ERRORS;
else output WORK.TEMP;
run;
Which of the following is true of the WORK.ERRORS data set?
A. The data set is created when the DATA step is submitted.
B. The data set is created when the view TEMP is used in another SAS step.
C. The data set is not created bec

t*******8
Posts: 170

From thread: Statistics board - A question from Base 70 about @

Item 29
The following SAS program is submitted:
data WORK.INFO;
infile 'DATAFILE.TXT';
input @1 Company $20. @25 State $2. @;
if State=' ' then input @30 Year;
else input @30 City Year;
input NumEmployees;
run;
How many raw data records are read during each iteration of the DATA step?
A. 1 B. 2 C. 3 D. 4
I looked up the usage of @: (trailing @) prevents SAS from automatically reading a new
data record into the input buffer when a new INPUT statement is executed
within the same iter

w********5
Posts: 72

From thread: Statistics board - How to remove trailing characters from a string variable?

data name;
length name $100;
infile datalines dsd missover;
input name $;
datalines;
'KELLY SERVICES INC'
'FIDELITY HIGH INCOME INC'
'TALENT SERVICES -CL A'
;
run;
data name1;
set name;
Again=scan(name,1,' ')||" "||scan(name,2,' ');
run;

g********d
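Posts: 2022

From thread: Statistics board - How to remove trailing characters from a string variable?

I modified the code from the poster above a little.
Assume the longest company name has only 5 words. I changed the example for testing.
The attention variable is there to detect cases where "INC", "-CL" or "A" appear in the middle and need to be kept. It may
not be of much use to you, but it is a cautious line of thinking.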
data name;
length name $100;
infile datalines dsd missover;
input name $;
datalines;
'KELLY INC SERVICES INC'
'FIDELITY HIGH INCOME INC'
'A TALENT SERVICES -CL A'
'.'
;
run;
data name1;
set name;
A1=scan(name,1,' ');
A2=scan(name,2,' ');
A3=scan(name,3,' ');
A4=scan(name,4,' ');
A5=scan(name,5,' ');
array aa(5) A1-A5;
do i=1 to 5;
if aa(i)="INC" or aa(i)="-CL" or aa(i)="A" then aa(i)="";
e

s******r
Posts: 1524

From thread: Statistics board - How to remove trailing characters from a string variable?

It is a bad idea. You are facing some uncertainty.
For instance, if a company name is xxxxx INCER**, your code would fail.
With a limited set of words to remove, try something like
data name;
length name $100;
infile datalines dsd missover;
input name $;
if scan(name,-1)='INC' then
_name=substr(name,1, length(name)-3);
else if scan(name,-1,'-')='CL A' THEN
_name=substr(name,1, length(name)-6);
datalines;
'KELLY SERVICES INC'
'FIDELITY HIGH INCOME INC'
'TALENT SERVICES -CL A'
;
run;

p********a
Posts: 5352

From thread: Statistics board - A simple question

infile .........firstobs=2;

c**d
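Posts: 104

From thread: Statistics board - How to open multiple Excel files with SAS at once? Thanks!

Suppose you have a lot of Excel files under H:\Temp
/* get excel names */
/* PIPE runs the dir command and exposes its output as a fileref */
filename myxls pipe 'dir "H:\Temp" /b' LRECL=5000;
data myfile;
infile myxls length = len;
/* $VARYING reads exactly len characters, the length of the current line */
input fname $varying200. len;
run;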
/* save each into a macro variable */
proc sql;
select fname into :a1 - :a9999
/* do loop to input excel */

x**********t
Posts: 45

From thread: Statistics board - A question from base 123, thanks everyone in advance!!

(As in the title)
A raw data file is listed as below:
ranch, 1250
split,1190
condo,1400
twostory,1810
ranch,1500
split,1305
split,1615
The following SAS program is submitted using the raw data file as input:
data cd;
infile 'filename' dsd;
input style $ @;
if style='condo' or style='ranch' then input sqfeet;
run;
How many observations does the cd data set contain?
The correct answer is 7. Why isn't it 3?
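
A sketch of why the count is 7: the step has no explicit OUTPUT statement, so SAS writes one observation at the end of every iteration, whether or not the second INPUT runs. The trailing @ only holds the current record within an iteration; it is released when the iteration ends, so each of the 7 records starts a new iteration. The version below reads the same data in-stream (an assumption made so it runs without the external file):

```sas
data cd;
  infile datalines dsd;
  input style $ @;                 /* hold the record for a possible second INPUT */
  if style='condo' or style='ranch' then input sqfeet;
  /* no OUTPUT statement: one observation is written per iteration */
datalines;
ranch, 1250
split,1190
condo,1400
twostory,1810
ranch,1500
split,1305
split,1615
;
run;

proc print data=cd;
run;
```

The rows for split and twostory should simply show sqfeet as missing. Getting only the condo and ranch rows would need a subsetting IF or an explicit OUTPUT inside the condition.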

s********l
Posts: 245

From thread: Statistics board - help need for SAS macro

The code I wrote is:
%macro data(num);
%do i=0 %to &num;
data est&num;
infile "path\data&num";
input a b c d;
run;
proc append base=data1 data=est&num;
run;
%mend;
%data(num=100);
With the program above, I only got data1 combined with
data100. What's wrong with my program? I really need help from you! Many
thanks.

o******6
Posts: 538

From thread: Statistics board - help need for SAS macro

%macro data(num);
%do i=1 %to &num;
data est&i;
infile "path\data&i";
input a b c d;
run;
proc append base=newdata data=est&i force;
run;
%end;
%mend;
%data(num=100);

s********l
Posts: 245

From thread: Statistics board - help need for SAS macro

I have figured out the reason: since I assigned the macro variable num
=100, every time I invoke the macro the statement infile "path\data100"
is what gets executed.

o******6
Posts: 538

From thread: Statistics board - help need for SAS macro

%macro new(num);
filename combine (%do i=1 %to &num; "path\data&i..txt" %end;);
data newdata;
infile combine;
input a b c d;
run;
%mend;
%new(100);


s*******r
Posts: 769

From thread: Statistics board - A few SAS base questions from me too

1. Why is it A? Is it because a ";" is missing after State='CA'?
data WORK.GEO;
infile datalines;
input City $20.;
if City='Tulsa' then
State='OK';
Region='Central';
if City='Los Angeles' then
State='CA'
Region='Western';
datalines;
Tulsa
Los Angeles
Bangor
;
run;
After data step execution, what will data set WORK.GEO contain?
A.
City State Region

s*******r
Posts: 769

From thread: Statistics board - A few SAS base questions from me too

Why is the answer D?
----+----10---+----20---+----30
daisyyellow
The following SAS program is submitted:
data FLOWERS;
infile 'TYPECOLOR.DAT' truncover;
length
Type $ 5
Color $ 11;
input
Type $
Color $;
run;
What are the values of the variables Type and Color?
A. Type=daisy, Color=yellow
B. Type=daisy, Color=w
C. Type=daisy, Color=daisyyellow
D. Type=daisy, Color=
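
A sketch of the reasoning behind D: with list input, SAS reads the whole blank-delimited token daisyyellow for Type and stores only the first 5 characters because of the LENGTH statement. Nothing is left on the record for Color, and TRUNCOVER sets it to missing without moving to a new record. The version below uses in-stream data so it can be run without TYPECOLOR.DAT:

```sas
data FLOWERS;
  infile datalines truncover;
  length Type $ 5 Color $ 11;
  /* list input: Type consumes the entire token 'daisyyellow',
     but only 5 characters fit in the variable */
  input Type $ Color $;
datalines;
daisyyellow
;
run;

proc print data=FLOWERS;
run;
```

Column input (for example Type $ 1-5 Color $ 6-11) would be needed to split the token into daisy and yellow.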

p******r
Posts: 1279

From thread: Statistics board - Question 29 of the SAS base 70 questions...

data WORK.INFO;
infile 'DATAFILE.TXT';
input @1 Company $20. @25 State $2. @;
if State=' ' then input @30 Year;
else input @30 City Year;
input NumEmployees;
run;
How many raw data records are read during each iteration of the DATA step?
A. 1
B. 2
C. 3
D. 4
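The answer is A.
But it seems to me that you need to see the actual raw data to know whether each iteration reads 1 record or 2
records?
If the raw data looks like the following, wouldn't each iteration read 2 records??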
companyA NC raleigh 2001
200
companyB 2002
300
companyC CA losangeles 1998
..

p******r
Posts: 1279

From thread: Statistics board - Question 29 of the SAS base 70 questions...

p******r
Posts: 1279

From thread: Statistics board - Just finished the SAS base exam, adding a few notes

May I ask the OP what the answer to the following question is?
data WORK.INFO;
infile 'DATAFILE.TXT';
input @1 Company $20. @25 State $2. @;
if State=' ' then input @30 Year;
else input @30 City Year;
input NumEmployees;
run;
How many raw data records are read during each iteration of the DATA step?
A. 1
B. 2
C. 3
D. 4

g******k
Posts: 62

From thread: Statistics board - Question about Q20 of the December SAS advanced real exam

The following SAS program is submitted:
data WORK.TEMP;
length A B 3 X;
infile RAWDATA;
input A B X;
run;
What is the length of variable A?
A.3
B.8
C.WORK.TEMP is not created - X has an invalid length.
D.Unknown.
I ran it in SAS, and for length A B 3 X; the log says something like "expecting a numeric constant".
The answer given is A, and I don't know why. Could some expert explain?

s***1
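Posts: 343

From thread: Statistics board - A very newbie question: the difference between missover and truncover

It really is a newbie question, so please don't flame me, or at least go easy.
My understanding is that, in theory, when reading with a column format, if the actual length of the last variable's value is shorter than the defined
length, missover assigns a missing value while truncover assigns the actual value.
But the code below assigned the real values in both cases, and I can't figure out why.
data t;
infile cards missover; /* or truncover */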
input num 3.;
cards;
1
12
121
;
proc print;
run;
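
A possible explanation, offered as a sketch: in-stream CARDS/DATALINES records are padded with blanks (to 80 columns by default), so num 3. never runs past the end of a record and the two options behave the same. The difference should show up when the records really are short, for example in an external file. The fileref tmp and the data set names below are made up for illustration.

```sas
/* write three short records to a temporary file */
filename tmp temp;
data _null_;
  file tmp;
  put '1' / '12' / '121';
run;

/* missover: a field shorter than 3 columns is set to missing */
data t_missover;
  infile tmp missover;
  input num 3.;
run;

/* truncover: the characters that are present are used */
data t_truncover;
  infile tmp truncover;
  input num 3.;
run;

proc print data=t_missover;
run;
proc print data=t_truncover;
run;
```

Comparing the two printed data sets should make the textbook difference between the options visible.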

R*********i
Posts: 7643

From thread: Statistics board - A simple SAS question

Not a "veteran" or "senior", I just know some SAS coding.
Assume all your datasets are already in working directory, and "filenames"
is a .txt file.
data names;
infile "XXXX\XXXX\filenames";
input name $;
run;
proc sql noprint;
select name into: names separated by " "
from names;
quit;
data alldata;
set &names;
run;
There's a limit on the length of a macro variable. So if your file names are
too long or you have more than several thousand files, you will have to use
more than one macro variable to ...

l**********9
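Posts: 148

From thread: Statistics board - A simple SAS question

If you don't mind the trouble of adding delimiters to the original file, you can simply use infile ..... dsd, or use (dlm), and that will
do.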

D******n
Posts: 2836

From thread: Statistics board - SAS data input question

baozi....
data a1;
infile './temp.txt' truncover;
input type @;
do until(0);
input weight @;
if weight=. then leave;
else output;
end;
run;

b****e
Posts: 906

From thread: Statistics board - Problem reading a file into SAS

Just figured it out, sharing it here:
add the LRECL= option to the INFILE statement

h*********y
Posts: 183

From thread: Statistics board - A SAS application question

proc import
data step infile
libname with SAS/Access engines

a*******m
Posts: 6

From thread: Statistics board - A SAS question

The following SAS program is submitted:
data WORK.TEST;
drop City;
infile datalines;
input
Name $ 1-14 /
Address $ 1-14 /
City $ 1-12 ;
if City='New York ' then input @1 State $2.;
else input;
datalines;
Joe Conley
123 Main St.
Janesville
WI
Jane Ngyuen
555 Alpha Ave.
New York
NY
Jennifer Jason
666 Mt. Diablo
Eureka
CA
;
What will the data set WORK.TEST contain?
A.
Name Address State

p*****o
Posts: 543

From thread: Statistics board - a quick question importing txt into SAS

i only saw dsd before......
so would you mind telling me how to use it?
INFILE "test.txt" DLM=',' DSD MISSOVER FIRSTOBS=2 sds='"'---is it right?

c******5
Posts: 22

From thread: Statistics board - large dataset impot into SAS

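There is indeed mixed data. But I converted a similar data set without any problem. Let me explain in detail:
1. In Access I need the information from 3 of its tables. First, I converted them to Excel files one table at a time,
then used proc import to bring them into SAS one by one. That succeeded.
2. I want to put the information from the 3 tables into one table and do the analysis in SAS, but I was afraid that using
merge in SAS would be error-prone because of multiple entries (different tables have different numbers of multiple entries).
So I used a query in Access to put the three tables together first, converted the result to a single Excel
sheet, and finally loaded it into SAS. That is when the error message appeared.
So I think the mixed data types should not be the problem, because there was no error before. All I can think of is that
there are too many rows after combining, but others said earlier that SAS can handle very large data, so that should not be the problem
either. Or is it that SAS can tell my new Excel sheet was combined from different tables in Access,
and that causes the problem? I think this...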
