So I started using SAS® University Edition, which is a FREE version of SAS® software. Again, it's FREE, and that's the main reason why I want to relearn the language. The software was announced on March 24, 2014, and the download became available in May of that year. And for that, I salute Dr. Jim Goodnight. At least we can now learn SAS® without paying its expensive price tag, which matters especially for a single user like me.
The software runs on top of a virtual machine and requires a 64-bit processor. To install, just follow the instructions in this video. Although the installation in the video is done on Windows, it also works on Mac. Below is a screenshot of my SAS® Studio running on Safari.
What's in the box?
The software includes the following libraries:
- Base SAS® - Make programming fast and easy with the SAS® programming language, ODS graphics and reporting procedures;
- SAS/STAT® - Trust SAS® proven reliability with a wide variety of statistical methods and techniques;
- SAS/IML® - Use this matrix programming language for more specialized analyses and data exploration;
- SAS Studio - Reduce your programming time with autocomplete for hundreds of SAS® statements and procedures, as well as built-in syntax help;
- SAS/ACCESS® - Seamlessly connect with your data, no matter where it resides.
If you've been following this blog, you know I have been promoting free software (R, Python, and C/C++) for analysis, and the introduction of SAS® University Edition means one thing: a new topic to discuss in succeeding posts. So let's welcome this software by doing some analysis with it.
Analysis
Our goal here is to cover the basics needed to proceed with an analysis, and thus we have the following:
1. Importing and transforming the data;
2. Descriptive statistics;
3. Hypothesis testing: One-sample t test;
4. Creating a function; and,
5. Visualization.
Data
We'll again use the Volume of Palay Production (1994 to 2013, quarterly) from the Cordillera Administrative Region (CAR), Philippines. To reproduce this article, please click here to download the data.
- Importing and transforming the data
Working in SAS® Studio requires you to upload your data into it. To do this, go to the sidebar, click on the Folders tab, and there you will find the "up arrow" for upload. See the picture below.

    /* Imports the data */
    proc import datafile = "/folders/myfolders/palay.csv"
        out = work.palay
        dbms = csv;
        getnames = yes;
        datarow = 2;
    run;

Here, proc refers to procedure, and in this case we run the import procedure. The out option names the SAS® data set where the imported data is saved; here we save it in the Work library under the name palay. The getnames statement determines whether to generate SAS® variable names from the data values in the first record of the imported file. Finally, datarow starts reading data from the specified row number in the delimited text file.
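A quick way to confirm that the import worked is to list what ended up in the data set. The following is a minimal sketch, assuming the import above ran without errors and created work.palay; it uses the standard contents procedure, which is not part of the original post.

```sas
/* List the variable names, types, and number of observations in work.palay */
proc contents data = work.palay;
run;
```

If the variable names shown are not the province names, the getnames and datarow settings are the first things to check.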
I want to emphasize that descriptions of the arguments of the statements and procedures above are available in the software itself. SAS® Studio's autocomplete for hundreds of SAS® statements and procedures is very handy. So in the code that follows, we will describe selected statements only. Below is the autocomplete feature of SAS® Studio seen in action.
Now that we have the data in our workspace, let's do some transformation on it. In R, we always start by viewing the head of the data, that is, its first few observations, and we code it as head(data). Having that habit, here's how to do it in SAS®, in this case for the first five observations:

    proc print data = palay(obs = 5);
    run;

    Obs    Abra   Apayao   Benguet   Ifugao   Kalinga   Mt_Province
      1    1243     2934       148     3300     10553          2675
      2    4158     9235      4287     8063     35257          1920
      3    1787     1922      1955     1074      4544          6955
      4   17152    14501      3536    19607     31687          2715
      5    1266     2385      2530     3315      8520          2601

To print a range of observations instead, say the 5th through the 10th, add firstobs:

    proc print data = palay(firstobs = 5 obs = 10);
    run;

    Obs    Abra   Apayao   Benguet   Ifugao   Kalinga   Mt_Province
      5    1266     2385      2530     3315      8520          2601
      6    5576     7452       771    13134     28252          1242
      7     927     1099      2796     5134      3106          9145
      8   21540    17038      2463    14226     36238          2465
      9    1039     1382      2592     6842      4973          2624
     10    5424    10588      1064    13828     40140          1237

To print selected variables only, use keep:

    proc print data = palay(keep = benguet firstobs = 15 obs = 20);
    run;

    Obs   Benguet
     15      2847
     16      2942
     17      2119
     18       734
     19      2302
     20      2598

keep -- keeps the variables to be printed, while drop -- drops the variables, excluding them from the printing.
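The same data set options are not specific to printing; they can be attached wherever a data set is read. As a rough sketch, the data step below copies a subset into a new data set. The name palay_sub is hypothetical and only used for illustration.

```sas
/* Copy observations 15 to 20 of two provinces into a new data set */
/* palay_sub is a hypothetical name, not from the original post    */
data palay_sub;
    set palay(keep = abra benguet firstobs = 15 obs = 20);
run;

/* Print the subset to confirm what was kept */
proc print data = palay_sub;
run;
```

Note that the Obs column printed for palay_sub restarts at 1, since the subset is a data set of its own.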

    /* keeps the first five variables */
    proc print data = palay(keep = abra apayao benguet ifugao kalinga firstobs = 15 obs = 20);
    run;

    /* or */

    /* drops the 6th variable */
    proc print data = palay(drop = mt_province firstobs = 15 obs = 20);
    run;

    Obs    Abra   Apayao   Benguet   Ifugao   Kalinga
     15    1048     1427      2847     5526      4402
     16   25679    15661      2942    14452     33717
     17    1055     2191      2119     5882      7352
     18    5437     6461       734    10477     24494
     19    1029     1183      2302     6438      3316
     20   23710    12222      2598     8446     26659

- Perform descriptive statistics
As always, the next step is to look at the descriptive statistics of the data, and here's how to do it:

    proc means data = palay;
    run;

    Variable        N       Mean    Std Dev       Minimum    Maximum
    Abra           79   12874.38   16746.47   927.0000000   60303.00
    Apayao         79   16860.65   15448.15   401.0000000   54625.00
    Benguet        79    3237.39    1588.54   148.0000000    8813.00
    Ifugao         79   12414.62    5034.28       1074.00   21031.00
    Kalinga        79   30446.42   22245.71       2346.00   68663.00
    Mt_Province    79    4506.20    3815.71   382.0000000   13038.00


    /* Request specific statistics instead of the defaults */
    proc means data = palay min mean median mode cv std var kurt skew max;
    run;

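When only some of the provinces are of interest, the means procedure can be narrowed down. The snippet below is a small sketch rather than part of the original analysis: it uses the var statement to pick variables and the maxdec option to limit the decimal places shown.

```sas
/* Summary statistics for two provinces only, shown with two decimal places */
proc means data = palay n mean median std maxdec = 2;
    var abra apayao;   /* restrict the analysis to these variables */
run;
```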

    /* Save the plot to the folder */
    ods listing gpath = "/folders/myfolders/ODSEditorFiles";

    /* Plot the data */
    title "Scatter Plot Matrix";
    proc sgscatter data = palay;
        matrix abra apayao benguet ifugao kalinga mt_province /
            diagonal = (histogram kernel) ellipse;
    run;

    ods listing close;

- Hypothesis testing: One-sample t test
Let's perform a simple hypothesis test, the one-sample t test. Using a 0.05 level of significance, we'll test the null hypothesis that the true mean of Abra is 15000 against the two-sided alternative that it is not.
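To see what the test computes, the statistic can be rebuilt by hand: subtract the hypothesized mean from the sample mean and divide by the standard error, which is the standard deviation over the square root of n. The data step below is only an illustrative sketch; the data set name ttest_check and its variable names are hypothetical, and the inputs are the rounded Abra figures from the summary statistics (N = 79, mean 12874.4, standard deviation 16746.5).

```sas
/* Recompute the one-sample t statistic from rounded summary figures */
/* ttest_check and its variable names are hypothetical               */
data ttest_check;
    n    = 79;                               /* sample size */
    xbar = 12874.4;                          /* sample mean of Abra */
    s    = 16746.5;                          /* sample standard deviation */
    mu0  = 15000;                            /* hypothesized mean */
    se   = s / sqrt(n);                      /* standard error of the mean */
    t    = (xbar - mu0) / se;                /* t statistic */
    p    = 2 * (1 - probt(abs(t), n - 1));   /* two-sided p-value, n - 1 degrees of freedom */
run;

proc print data = ttest_check;
run;
```

Because the inputs are rounded, the values of t and p should land close to, but not exactly on, the t Value and Pr > |t| reported by the ttest procedure.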

    /* Save the plot to the folder */
    ods listing gpath = "/folders/myfolders/ODSEditorFiles";

    /* t-test on data */
    proc ttest data = palay(keep = abra) alpha = 0.05 h0 = 15000 sides = 2;
    run;

    ods listing close;

     N      Mean   Std Dev   Std Err   Minimum   Maximum
    79   12874.4   16746.5    1884.1     927.0   60303.0

       Mean      95% CL Mean      Std Dev     95% CL Std Dev
    12874.4   9123.4   16625.4    16746.5   14480.9   19859.1

    DF   t Value   Pr > |t|
    78     -1.13     0.2627

Since the p-value (0.2627) is greater than 0.05, we fail to reject the null hypothesis that the true mean of Abra is 15000.
- Creating a function
Let's create a function; we'll use the fcmp procedure. For illustration purposes, consider the standard normal density function,
$$\phi(x) = \frac{1}{\sqrt{2\pi}}\exp\left\{-\frac{x^2}{2}\right\}$$
In SAS® we code it as follows,

    proc fcmp outlib = work.func.stdnorm;   /* Save the function in package stdnorm of the data set work.func */
        function stdnorm(t);                /* Define the name of the function and its argument */
            fx = 1 / sqrt(2 * constant('PI')) * constant('E') ** (-(t ** 2) / 2);   /* Standard normal equation */
            return(fx);                     /* Return the value fx */
        endsub;                             /* end the subroutine */
    quit;                                   /* quit the procedure */

    /* Include the path work/func in compilation */
    options cmplib = work.func;
    run;

To evaluate the function in a do loop, consider the following:

    data sn_data;                 /* Define the name of the data */
        do x = -5 to 5 by 0.1;    /* Loop x from -5 to 5 in steps of 0.1 */
            y = stdnorm(x);       /* Evaluate the function at x */
            output;               /* Write x and y as one observation */
        end;
    run;

    proc print data = sn_data(obs = 5);   /* Print the first five observations */
    run;

    Obs      x             y
      1   -5.0    .000001487
      2   -4.9    .000002439
      3   -4.8    .000003961
      4   -4.7    .000006370
      5   -4.6    .000010141

fcmp is the best procedure to be included in SAS® version 9.2, and I'm just lucky to be relearning this language with this feature available, especially since it is FREE in SAS® Studio.
- Visualization
Now it's time for us to create some visual art. And SAS®, being proprietary software, has a lot to offer. We've demonstrated a few plots above already; this time let's plot the data points of sn_data generated from the stdnorm function we defined earlier. Here it is,

    /* Save the plot to the folder */
    ods listing gpath = "/folders/myfolders/ODSEditorFiles";

    proc sgplot data = sn_data;
        title1 "Scatter Plot of SN_DATA";
        title2 "by Al-Ahmadgaid Asaad";
        xaxis label = "x-axis" grid minor;   /* enables grid and minor ticks on x-axis */
        yaxis label = "y-axis" grid minor;   /* enables grid and minor ticks on y-axis */
        scatter x = x y = y /
            markerattrs = (size = 20 symbol = "circlefilled")
            filledoutlinedmarkers
            markerfillattrs = (color = "red")
            markeroutlineattrs = (color = "purple" thickness = 1)
            transparency = 0.7
            dataskin = matte;
    run;

    ods listing close;

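As a sanity check on the plotted points, the user-defined stdnorm values can be compared with the normal density that SAS® already ships as the pdf function. This is a sketch under the assumption that sn_data was created as in the earlier do loop; the names sn_check, y_builtin, and diff are hypothetical.

```sas
/* Compare stdnorm (defined earlier with fcmp) against the built-in normal density */
/* sn_check, y_builtin, and diff are hypothetical names                            */
data sn_check;
    set sn_data;
    y_builtin = pdf('normal', x);   /* built-in standard normal density at x */
    diff = abs(y - y_builtin);      /* absolute difference between the two */
run;

/* The largest difference should be essentially zero */
proc means data = sn_check max;
    var diff;
run;
```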
- Histogram

    /* Save the plot to the folder */
    ods listing gpath = "/folders/myfolders/ODSEditorFiles";

    proc sgplot data = palay;
        title1 "Histogram of Benguet";
        title2 "by Al-Ahmadgaid Asaad";
        xaxis minor grid offsetmin = 0.05 offsetmax = 0.05;
        yaxis minor grid;
        histogram benguet / nbins = 10 fill fillattrs = (color = "#FF6961") outline transparency = 0.2;
        density benguet / type = normal;
        density benguet / type = kernel lineattrs = (color = "purple");
        keylegend / location = inside position = topright across = 1;
    run;

    ods listing close;

- Historical

    /* Generates New Data Years */
    data years;
        do x = 1994 to 2013 by 0.25;
            output;
        end;
    run;

    /* Combine both data sets side by side (one-to-one reading) */
    data palay;
        set palay;
        set years;
    run;

    /* Save the plot to the folder */
    ods listing gpath = "/folders/myfolders/ODSEditorFiles";

    /* Series plot of abra and apayao */
    proc sgplot data = palay;
        title1 "Historical Plot of Abra and Apayao";
        title2 "Volume of Palay Production";
        footnote "Region: Cordillera Administrative Region (CAR)";
        series x = x y = abra;
        series x = x y = apayao;
        xaxis label = "Year" grid minor;
        yaxis label = "Volume of Production" grid minor;
    run;

    ods listing close;

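One thing to keep in mind with two set statements is that the data step stops as soon as the shorter data set runs out. The years data set has 77 rows (1994 to 2013 in steps of 0.25), while the summary statistics earlier report N = 79, so the combined data keeps only the first 77 rows. A possible alternative, sketched below, builds the time index directly from the row counter _n_. It assumes the freshly imported rows are in chronological order starting from the first quarter of 1994, and palay_ts is a hypothetical name.

```sas
/* Build the time index from the row number instead of a second data set */
/* Assumes rows are in chronological quarterly order starting in 1994 Q1 */
/* palay_ts is a hypothetical name                                       */
data palay_ts;
    set palay;
    x = 1994 + (_n_ - 1) * 0.25;   /* 1994.00, 1994.25, 1994.50, ... */
run;
```

With this approach every row of palay keeps its own x value, and the sgplot step above would only need data = palay_ts.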
Conclusion
In conclusion, it wasn't difficult for me to relearn SAS®, not only because I used it on a few papers back in college, but also because I have a programming background in R and Python, which I used as a basis for understanding the grammar of the language. Overall, the SAS® language is a high-level language: as we saw above, a simple statement gives you complete results with graphics, without lengthy code. And although I use R and Python as my primary tools for research, I am happy to add SAS® to them. And despite the popularity of R in analysis, I am looking forward to seeing more learners, students, researchers, and even bloggers using SAS®. That way, we can share ideas and techniques between the R, SAS®, and Python communities.
What about you? How's your experience with SAS® University Edition?
Data Source
Reference
- SAS® Documentation
- r4stats.com: Data Import. From http://r4stats.com/examples/data-import/ (accessed January 15, 2015)
- SAS Learning Module: Subsetting data in SAS. From http://www.ats.ucla.edu/stat/sas/modules/subset.htm (accessed January 15, 2015)