SAS continues to be one of the most widely used tools in the data science industry. Although people may have different opinions on its sustainability and features compared to other tools like R and Python, two things are for sure:
- A healthy market share – SAS continues to have the largest market share in terms of jobs. Even in advanced markets like the US and the UK, SAS's share of the job market would be at least 40%. In countries like India, it would be more than 70%.
- Ease of learning and incredible support – Among all the tools that I know, SAS would probably rate as the easiest to learn. The language is easy and can be learned quickly even by beginners.
These two reasons are good enough to consider SAS if you are just starting out in this industry. You can find more details about how SAS compares to other tools here.
Step 0: Why learn SAS?
A short video to prepare you for what lies ahead:
https://www.youtube.com/watch?v=ksp8CzIgb-E
Step 1: Download and install SAS
Download SAS University Edition by creating a SAS profile. You will also need to download VMware Player or Oracle VirtualBox. Here are the links:
Installation notes:
- SAS University Edition currently works only on 64-bit machines.
- You must first download VMware Player or Oracle VirtualBox and then download the matching version of SAS University Edition.
Step 2: Learn Base SAS
Take the Base SAS training at sas.com. This free training teaches you the basics of the SAS language in 24 hours.
SAS Programming 1: Essentials
Assignment / test: Solve the quiz at the end of each section of the course.
Step 3: Learn SQL
Now that you know Base SAS to some extent, you should learn another way to work with data in SAS: PROC SQL. Read this post to understand how PROC SQL helps: Comparison between PROC SQL and the DATA step
If you already know SQL, you will be thanking SAS for creating PROC SQL. Even if you don't know SQL, you may find that it makes your day-to-day data management jobs in SAS easier. You can look at this SUGI paper: Introduction to PROC SQL. If you need a more detailed tutorial, you can check this one: Introduction to PROC SQL
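If you have never written SQL, here is a rough idea of what a SQL-style summary looks like. This is only a sketch in Python, using the standard-library sqlite3 module as a stand-in for SAS; the table name, column names and rows are made up for illustration. Inside PROC SQL the SELECT ... GROUP BY statement has the same general shape, although SAS-specific details differ.

```python
import sqlite3

# Hypothetical sales rows (region, amount); not data from this article.
rows = [("north", 120.0), ("north", 80.0), ("south", 200.0)]

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)

# One declarative query counts and averages the rows in each region.
summary = conn.execute(
    "SELECT region, COUNT(*) AS n, AVG(amount) AS avg_amount "
    "FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

for region, n, avg_amount in summary:
    print(region, n, avg_amount)
```

The point is that you describe the result you want (one row per region) rather than writing the loop yourself.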
Step 4: Learn descriptive statistics
Let's start our statistical learning now. This is the right time to take the DataPeaker statistics course. The course uses Python to teach you the basics of descriptive statistics. If you already know them, you can skip this step.
Assignment: Assignments after each chapter of the course must be done in SAS. Your knowledge of the Base SAS course should be sufficient to complete them. If you need specific help, use SAS documentation.
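As a reference point for the assignments, here is a minimal Python sketch of the usual descriptive measures, using only the standard-library statistics module. The sample values are hypothetical. In SAS you would typically get similar summaries from a procedure such as PROC MEANS.

```python
import statistics

# Hypothetical sample of test scores, made up for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)           # arithmetic average
median = statistics.median(scores)       # middle value of the sorted data
mode = statistics.mode(scores)           # most frequent value
stdev_pop = statistics.pstdev(scores)    # population standard deviation (divides by n)
stdev_sample = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)

print(mean, median, mode, stdev_pop, stdev_sample)
```

Note the difference between the population and sample standard deviation; when you redo the exercise in SAS, check which of the two your output reports.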
Step 5: Learn inferential statistics
The aforementioned course also covers inferential statistics in Python, including topics such as hypothesis testing, t-tests and many others. If you already know them, you can skip this step.
Assignment: Assignments after each chapter of the previous course should be done in Python or Excel for now. We will revisit them once we have taken the next steps with the SAS course.
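To make the t-test concrete, here is a small sketch of a one-sample t statistic computed by hand in Python. The function name one_sample_t and the sample values are hypothetical, and the sketch stops at the statistic and degrees of freedom; you would still compare the result against a t table (or use a statistics package) to get a p-value.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (t statistic, degrees of freedom) for H0: population mean == mu0."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample standard deviation (n - 1)
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error, n - 1

# Hypothetical measurements, made up for illustration.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
t_stat, df = one_sample_t(sample, mu0=5.0)
print(f"t = {t_stat:.3f}, df = {df}")
```

A large absolute t value means the sample mean is far from mu0 relative to the noise in the data.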
Step 6: Learn ANOVA, linear and logistic regression in SAS
Training from sas.com: Introduction to ANOVA, Regression and Logistic Regression.
Assignment: Available in the course and from the Udacity course
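Before running regressions in SAS, it can help to see what a simple linear regression actually computes. The following is a minimal Python sketch of ordinary least squares with one predictor; fit_line and the data are hypothetical names and values chosen for illustration. In SAS, a procedure such as PROC REG does this fitting (and much more) for you.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x with one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data that lies exactly on y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With real data the points will not sit exactly on a line, and the fitted line is the one that minimises the sum of squared vertical distances.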
If you are working in SAS University Edition, you will have to skip steps 7, 9 and 10. SAS University Edition has its own limitations and cannot run decision trees or time series models.
Step 7: Learn decision trees
Now that you know some algorithms, let's look at decision trees. Here is an amazing post explaining how decision trees work:
- Decision Tree: Simplified
Here is a guide to running decision trees in Enterprise Miner, and here is a paper that implements them in Base SAS.
Step 8: Clustering and segmentation
First, watch the first 4 videos in this playlist for an introduction to k-means clustering. Next, read this guide about clustering in SAS. Alongside this guide, you can also use this chapter as a good reference.
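If you want to see the k-means idea in a few lines before watching the videos, here is a bare-bones one-dimensional sketch in Python. The function kmeans_1d, the points and the starting centers are all hypothetical, and real implementations (for example PROC FASTCLUS in SAS) handle multiple dimensions, initialisation and scaling far more carefully.

```python
def kmeans_1d(points, centers, max_iter=100):
    """Very small k-means on 1-D data: assign points, recompute centers, repeat."""
    centers = list(centers)
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster with the nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its old center).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # nothing moved, so we have converged
            break
        centers = new_centers
    return centers, clusters

# Hypothetical data with two obvious groups and deliberately poor starting centers.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```

The result depends on the starting centers, which is why practical tools run the algorithm from several starting points.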
Step 9: Learn time series forecasting
Here is a good introduction to start learning time series forecasting. Later, use this guide to forecast time series in SAS.
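For a first taste of forecasting, here is a sketch of simple exponential smoothing in Python, one of the most basic forecasting methods. This is not necessarily the method used in the linked guides; the function name exp_smooth_forecast, the series and the smoothing weight alpha are assumptions made for illustration.

```python
def exp_smooth_forecast(series, alpha):
    """One-step-ahead forecast by simple exponential smoothing.

    alpha (between 0 and 1) controls how strongly recent values are weighted.
    """
    level = series[0]  # start from the first observation
    for value in series[1:]:
        # Blend the new observation with the previous smoothed level.
        level = alpha * value + (1 - alpha) * level
    return level

# Hypothetical monthly values, made up for illustration.
series = [10, 12, 14]
forecast = exp_smooth_forecast(series, alpha=0.5)
print(forecast)
```

A higher alpha makes the forecast react faster to recent changes; a lower alpha gives a smoother, slower-moving forecast.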
Step 10: Learn IML
Here are a number of posts that can help you get up to speed with IML:
- Introduction to SAS IML
- Next steps in the PROC IML world
- Application of PROC IML in analytics
Step 11: Learn SAS macros
Below is a series of posts that can help you understand SAS macros:
- Introduction to SAS macros
- Iterative and conditional statements in SAS macros
- Introduction to SAS macro functions
Other useful resources for SAS:
- SAS booklet
- ATS UCLA learning path
- Examples of data analysis – examples on specific topics in SAS.