Introduction to SAS
SAS, which stands for “Statistical Analysis System”, is a software suite developed by SAS Institute for advanced analytics. It was developed at North Carolina State University from 1966 to 1976 by James Goodnight and Anthony Barr to analyze agricultural data and improve crop yields. The project was funded by the National Institutes of Health.
SAS components include:
- Base SAS – Basic procedures and data management
- SAS/STAT – Statistical analysis
- SAS/GRAPH – Graphics and presentation
- Enterprise Miner – Data mining
- SAS Grid Manager – Management of the SAS grid computing environment
To work with SAS online, you have to create a SAS account. After you sign in to this account, you can access SAS Studio, which is free for student use.
You can access SAS Studio online through this link: SAS Studio
The SAS Studio window is laid out as follows.
You can see Server Files and Folders on the left side of the window.
We mainly use Files and Libraries to explore data.
1. Files (Home) – Used to create a SAS program file and store it.
2. Libraries – Shows all the libraries associated with SAS.
We can open a new program file in two different ways:
1. First click on Files (Home), then right-click on the New tab and click on SAS Program (F4).
2. Simply press the function key F4.
You can see that there are three windows, CODE, LOG and RESULTS, in our Program 1 file.
The LOG window shows errors, warnings and notes related to the program. It lets us check how the SAS program executed.
The CODE window is where we write all our code.
The RESULTS window shows the output of our program.
A dataset is a combination of rows and columns. Here we have a dataset of four rows and two columns. The columns are called variables, here Name and Score, and the rows are called observations.
The dataset shows the Name and Score of four persons.
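A dataset like the one described above could be created and displayed with a short program such as the sketch below. The dataset name scores and the four names and scores are invented for illustration; they are not the values from the original example.

```sas
/* Hypothetical example data: the names and scores are made up */
data scores;
    input Name $ Score;   /* Name is character ($), Score is numeric */
    datalines;
Asha 82
Ben 75
Chen 91
Dina 68
;
run;

/* Show the four observations and two variables in the RESULTS window */
proc print data=scores;
run;
```

Each line after datalines becomes one observation, and each name on the input statement becomes one variable.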
Libraries in SAS
The Libraries section lists the predefined and user libraries. It contains the following libraries:
When we open the MAPS library, it shows the following sub-folders.
These sub-folders represent the datasets associated with the MAPS library.
A library is a collection of datasets.
There are two types of libraries:
- Temporary or Work Library
- Permanent Library
Temporary or Work Library
Work is the default library in SAS. When a program creates a dataset without assigning a permanent library to it, the dataset is stored in the Work library, also called the temporary library. When you close the session and open it again, the Work library no longer shows that dataset, because this library does not keep datasets after the session is closed.
Permanent Library
A permanent library stores datasets permanently. If a program saves a dataset in this library, the dataset remains available in later sessions.
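The difference between the two library types can be sketched with a libname statement. The folder path below is a placeholder and the dataset name class_copy is made up; sashelp.class is a sample dataset that ships with SAS.

```sas
/* The path is a placeholder: replace it with a folder you can write to */
libname mylib '/home/your_user_id/mylib';

/* Two-level name: the copy is saved in the permanent library mylib */
data mylib.class_copy;
    set sashelp.class;
run;

/* One-level name: the copy goes to Work and is gone after the session ends */
data class_copy;
    set sashelp.class;
run;
```

After a new session starts, the libname statement has to be run again before mylib.class_copy can be used, but the dataset itself is still in the folder.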
SAS Data types
SAS has only two types of data or variables :
Character data type
It can include alphabetic characters, digits, special symbols, etc.
Numeric data type
It holds numeric values, written with the digits 0-9 and, where needed, a sign, a decimal point or scientific notation.
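As a small sketch of the two data types, the step below creates two character and two numeric variables. All variable names and values are made up for illustration.

```sas
data types_demo;
    length emp_name $ 20;       /* character variable, up to 20 characters */
    emp_name = 'A. Kumar #7';   /* letters, digits and symbols are allowed */
    emp_code = '00123';         /* quoted digits are still character data  */
    salary   = 45250.75;        /* numeric variable                        */
    growth   = -1.5E3;          /* numeric with sign and E-notation        */
run;

/* The Type column of the output lists each variable as Char or Num */
proc contents data=types_demo;
run;
```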
A variable or dataset name follows the rules below:
- It can be at most 32 characters long.
- It cannot contain blanks, so names like emp id or x 1 are not valid.
- It must start with a letter (A through Z) or an underscore (_). Examples: abc, Ask, _asd.
- It can include digits, but not as the first character.
- Names are case insensitive.
- It cannot contain a hyphen (-), so names like emp-id or a-1 are not valid.
- Special characters such as $ and % are not allowed.
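The naming rules above can be tried out with a step like this one. The names are hypothetical, and the invalid names are kept inside a comment so that the program still runs.

```sas
data _valid_names;      /* a dataset name may start with an underscore  */
    emp_id = 101;       /* underscore instead of a blank or a hyphen    */
    x1     = 5;         /* a digit is allowed after the first character */
    _total = 10;        /* a variable may start with an underscore      */
    EMP_ID = 102;       /* same variable as emp_id: case is ignored     */
run;

/* Not valid as names: emp id, emp-id, 1x, pay$ */
```

Because names are case insensitive, the last assignment overwrites emp_id rather than creating a fourth variable.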
A comment is a readable explanation or annotation in a program that helps us understand the code.
*message; type comment
A comment of the form *message; cannot contain semicolons or unmatched quotation marks.
It can span multiple lines. These are examples of comments:
*This is a comment;
*This is multiline comment.
This is the second line;
/*message*/ type comment
This type of comment is used more frequently. It cannot be nested, but it can span multiple lines.
/*This is comment*/
/*This is the first line.
This is the second line*/
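Both comment styles can be mixed in one program. The sketch below uses a made-up dataset named demo to show where each style fits.

```sas
* Statement-style comment: it ends at the first semicolon;

/* Block-style comment: it may contain semicolons; and
   an unmatched quote, as in it's, without ending early */

data demo;   /* a block comment can follow a statement on the same line */
    x = 1;   * so can a statement-style comment;
run;
```

The block style is the safer choice when the comment text itself needs a semicolon or an apostrophe.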
- A statement can start and end anywhere on a line. A semicolon (;) ends the statement.
- One statement can be written across several lines.
- Several statements can be written on one line.
- Spaces separate the components of a SAS statement.
- SAS keywords are not case sensitive.
- Each step in a SAS program normally ends with a Run statement (some procedures end with Quit instead).
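The rules above can be seen in one small program. The layout is deliberately unusual, and the dataset name layout_demo is made up.

```sas
/* One statement spread over several lines */
data
    layout_demo
    ;
    a = 1; b = 2; c = a + b;   /* several statements on one line */
RUN;                           /* keywords are not case sensitive */

/* Mixed-case keywords and two statements on one line also work */
PrOc PrInT DaTa=layout_demo; run;
```

SAS accepts this layout, but one statement per line with consistent indentation is much easier to read.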
SAS Data Set
The Data statement marks the start of a new SAS data set. The rules for creating a data set are:
- We create a new data set with the Data keyword followed by the name of the data set.
- To store a data set permanently, prefix the data set name with a library name.
- If the data set name is omitted, SAS creates a temporary data set with a generated name such as DATA1, DATA2, etc.
*Temporary Data set;
*Permanent Data set;
Data mylib.new; *mylib is permanent library and new is dataset ;
Data library1.data1 ; *library1 is permanent library and data1 is dataset name;
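The temporary cases can be sketched as complete steps. The names new and new2 and the variable x are only illustrative. A permanent data set additionally needs its library to be assigned first with a libname statement.

```sas
data new;         /* one-level name: stored as work.new */
    x = 1;
run;

data work.new2;   /* the Work library can also be named explicitly */
    x = 2;
run;

data;             /* name omitted: SAS generates a name such as DATA1 */
    x = 3;
run;
```

The LOG window reports the full two-level name of each data set that was created, which is a quick way to check where it was stored.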
A SAS program is made up of two types of steps:
1. DATA Step
2. PROC Step
The DATA step creates a dataset and stores it in SAS.
The syntax of the DATA step is:
Data data_set_name ; * Name of the data set;
Input var1 var2 ; * define variable in this data set;
new_var = expression ; * create a new variable;
Label var1 = 'label text' ; * Assign labels to variables;
Datalines; * Enter the data ;
run; * Execute data step;
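Filled in with concrete values, the template above might look like the following sketch. The dataset name exam, the variables and the marks are all hypothetical.

```sas
data exam;
    input Name $ Marks1 Marks2;          /* variables read from the data lines */
    Total = Marks1 + Marks2;             /* new variable computed per row      */
    label Marks1 = 'Marks in test 1'     /* descriptive labels for output      */
          Marks2 = 'Marks in test 2'
          Total  = 'Total marks';
    datalines;
Asha 40 42
Ben 35 38
;
run;
```

The datalines statement comes last in the step, the data values follow on the next lines, and a line containing only a semicolon ends the data.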
The PROC step applies a built-in procedure to analyze the data.
The syntax is :
PROC procedure_name options; *The procedure_name is the name of procedure;
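A few common procedures illustrate this pattern. The sketch uses sashelp.class, a sample dataset that ships with SAS; the output dataset name class_sorted is made up.

```sas
/* Summary statistics for two numeric variables */
proc means data=sashelp.class mean min max;
    var height weight;
run;

/* Sort by age, oldest first, into a temporary dataset */
proc sort data=sashelp.class out=work.class_sorted;
    by descending age;
run;

/* Print the first five observations of the sorted dataset */
proc print data=work.class_sorted (obs=5);
run;
```

Here mean, min and max are options of proc means, while var and by are statements that belong to the procedure.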