现在的位置: 首页 > 综合 > 正文

sas数据导入终极汇总-之二

2018年05月10日 ⁄ 综合 ⁄ 共 7440字 ⁄ 字号 评论关闭
从FTP读入数据
read raw data via FTP in SAS?
SAS has the ability to read raw data directly from FTP servers. Normally, you would use FTP to download the data to your local computer and then use SAS to read the data stored on your local computer. SAS allows you to bypass the FTP step and read the data
directly from the other computer via FTP without the intermediate step of downloading the raw data file to your computer. Of course, this assumes that you can reach the computer via the internet at the time you run your SAS program. The program below illustrates
how to do this. After the filename in you put ftp to tell SAS to access the data via FTP. After that, you supply the name of the file (in this case 'gpa.txt'. lrecl= is used to specify the width of your data. Be sure to choose a value that is at least as wide
as your widest record. cd= is used to specify the directory from where the file is stored. host= is used to specify the name of the site to which you want to FTP. user= is used to provide your userid (or anonymous if connecting via anonymous FTP). pass= is
used to supply your password (or your email address if connecting via anonymous FTP).
FILENAME in FTP 'gpa.txt' LRECL=80
CD='/local2/samples/sas/ats/'
HOST='cluster.oac.ucla.edu'
USER='joebruin'
PASS='yourpassword' ;
DATA gpa ;
INFILE in ;
INPUT gpa hsm hss hse satm satv gender ;
RUN;
PROC PRINT DATA=gpa(obs=10) ;
RUN;
读入多个数据文件
quarter1.dat
1 120321 1236 154669 211326
1 326264 1326 163354 312665
1 420698 1327 142336 422685
1 211368 1236 156327 655237
1 378596 1429 145678 366578
quarter2.dat
2 140362 1436 114641 362415
2 157956 1327 124869 345215
2 215547 1472 165578 412567
2 204782 1495 150479 364474
2 232571 1345 135467 332567
quarter3.dat
3 140357 1339 142693 205881
3 149964 1420 152367 223795
3 159852 1479 160001 254874
3 139957 1527 163567 263088
3 150047 1602 175561 277552
quarter4.dat
4 479574 1367 155997 36134
4 496207 1459 140396 35941
4 501156 1598 135489 39640
4 532982 1601 143269 38695
4 563222 1625 147889 39556
filename year ('d:\quarter1.dat' 'd:\quarter2.dat' 'd:\quarter3.dat' 'd:\quarter4.dat');
data temp;
infile year;
input quarter sales tax expenses payroll;
run;
proc print data = temp;
run;
读取excel 数据集
Reading an Excel file into SAS
Suppose that you have an Excel spreadsheet called auto.xls. The data for this spreadsheet are shown below.
MAKE MPG WEIGHT PRICE
AMC Concord 22 2930 4099
AMC Pacer 17 3350 4749
AMC Spirit 22 2640 3799
Buick Century 20 3250 4816
Buick Electra 15 4080 7827
Using the Import Wizard is an easy way to import data into SAS. The Import Wizard can be found on the drop down file menu. Although the Import Wizard is easy it can be time consuming if used repeatedly. The very last screen of the Import Wizard gives you the
option to save the statements SAS uses to import the data so that they can be used again. The following is an example that uses common options and also shows that the file was imported correctly.
PROC IMPORT OUT= WORK.auto1
DATAFILE= "C:\auto.xls"
DBMS=EXCEL REPLACE;
SHEET="auto1";
GETNAMES=YES;
MIXED=YES;
USEDATE=YES;
SCANTIME=YES;
RUN;
proc print data=auto1;
run;
Obs MAKE MPG WEIGHT PRICE
AMC Concord 22 2930 4099
AMC Pacer 17 3350 4749
Amc Spirit 22 2640 3799
Buick Century 20 3250 4816
Buick Electra 15 4080 7827
First we use the out= statement to tell SAS where to store the data once they are imported.
Next the datafile= statement tells SAS where to find the file we want to import.
The dbms= statement is used to identify the type of file being imported. This statement is redundant if the file you want to import already has an appropriate file extension, for example *.xls.
The replace statement will overwrite an existing file.
To specify which sheet SAS should import use the sheet="sheetname" statement. The default is for SAS to read the first sheet. Note that sheet names can only be 31 characters long.
The getnames=yes is the default setting and SAS will automatically use the first row of data as variable names. If the first row of your sheet does not contain variable names use the getnames=no.
SAS uses the first eight rows of data to determine whether the variable should be read as character or numeric. The default setting mixed=no assumes that each variable is either all character or all numeric. If you have a variable with both character and numeric
values or a variable with missing values use mixed=yes statement to be sure SAS will read it correctly.
Conveniently SAS reads date, time and datetime formats. The usedate=yes is the default statement and SAS will read date or time formatted data as a date. When usedate=no SAS will read date and time formatted data with a datetime format. Keep the default statement
scantime=yes to read in time formatted data as long as the variable does not also contain a date format.
Example 1: Making a permanent data file
What if you want the SAS data set created from proc import to be permanent? The answer is to use libname statement. Let's say that we have an Excel file called auto.xls in directory "d:\temp" and we want to convert it into a SAS data file (call it myauto) and
put it into the directory "c:\dissertation". Here is what we can do.
libname dis "c:\dissertation";
proc import datafile="d:\temp\auto.xls" out=dis.myauto replace;
run;
Example 2: Reading in a specific sheet
Sometimes you may only want to read a particular sheet from an Excel file instead of the entire Excel file. Let's say that we have a two-sheet Excel file called auto2.xls. The example below shows how to use the option sheet=sheetname to read the second sheet
called page2 in it.
proc import datafile="auto2.xls" out=auto1 replace;
sheet="page2";
run;
Example 3: Reading a file without variable names
What if the variables in your Excel file do not have variable names? The answer here is to use the statement getnames=no in proc import. Here is an example showing how to do this.
proc import datafile="a:\faq\auto.xls" out=auto replace;
getnames=no;
run;
Writing Excel files out from SAS
It is very easy to write out an Excel file using proc export in SAS version 8. Consider the following sample data file below.
Obs MAKE MPG WEIGHT PRICE
AMC 22 2930 4099
AMC 17 3350 4749
AMC 22 2640 3799
Buick 20 3250 4816
Buick 15 4080 7827
Here is a sample program that writes out an Excel file called mydata.xls into the directory "c:\dissertation".
proc export data=mydata outfile='c:\dissertation\mydata.xls' replace;
run;
SAS读入复杂分隔数据——字符长度不同,字符中间有空格作为间隔符
1.字符长度不同
data web;
length site $41;
input age site $ hits;
datalines;
12 http://www.site1.org/default.htm 123456
130 http://www.site2.com/index.htm 97654
254 http://www.site3.edu/department/index.htm 987654
;
proc print;
run;
Obs site age hits
http://www.site1.org/default.htm 12 123456
http://www.site2.com/index.htm 130 97654
http://www.site3.edu/department/index.htm 254 987654
或者用
data web;
input age site & $41. hits;
datalines;
12 http://www.site1.org/default.htm 123456
130 http://www.site2.com/index.htm 97654
254 http://www.site3.edu/department/index.htm 987654
;
proc print;
run;
Obs age site hits
12 http://www.site1.org/default.htm 123456
130 http://www.site2.com/index.htm 97654
254 http://www.site3.edu/department/index.htm 987654
2.字符有多个单词,单词之间用空格隔开
data fruit;
infile 'C:\messy.txt' delimiter = ' ' dsd;
length fruit $22;
input zip fruit $ pounds;
proc print;
run;
Obs fruit zip pounds
apples, grapes kiwi 10034 123456
oranges 92626 97654
pears apple 25414 987654
或者
data fruit;
input zip fruit & $22. pounds;
datalines;
10034 apples, grapes kiwi 123456
92626 oranges 97654
25414 pears apple 987654
;
proc print;
run;
Obs zip fruit pounds
10034 apples, grapes kiwi 123456
92626 oranges 97654
25414 pears apple 987654
没有格式库的情况下读入数据:
read a SAS data file when I don't have its format library
If you try to use a SAS data file that has permanent formats but you don't have the format library, you will get errors like this.
ERROR: The format $MAKEF was not found or could not be loaded.
ERROR: The format FORGNF was not found or could not be loaded.
Without the format library, SAS will not permit you to do anything with the data file. However, if you use options nofmterr; at the top of your program, SAS will go ahead and process the file despite the fact that it does not have the format library. You will
not be able to see the formatted values for your variables, but you will be able to process your data file. Here is an example.
OPTIONS nofmterr;
libname in "c:\";
PROC FREQ DATA=in.auto;
TABLES foreign make;
RUN;
高效的保留或者去掉部分变量的方式:
The following program creates exactly the same file, but is a more efficient program because SAS only reads the desired variables.
DATA auto2;
SET auto (KEEP = make mpg price);
RUN;
The drop data step option works in a similar way.

抱歉!评论已关闭.