SAS Tips and Tricks

Summary

SAS Tips and Tricks consists of half days/full days of knowledge transfer in SAS programming language and are organized by an experienced and motivated SAS consultant who combines theory and demonstration to helps you refreshing your SAS skills, discover news technics and meet other experts.

Organized for employees, sessions are organized in our business center (near Brussels-North train station) and, to guaranty an optimal quality, the number of participants is limited to 8 persons. Contact us to assist to a session (scheduled or not) or click the register button in the course details, we will recontact you with more informations.

You cannot find a topic you would like to follow or would like to follow the session individually at your best conveniance? Please Contact us in order to see if we could organize a SAS Coaching that suits perfectly your needs.

Topic list

Reading flat files in SAS Data Step gives you a lot of flexibility in terms of data quality and data manipulation. However, problem may arise when files are huge with poor data quality. In this tips and tricks we will have a look to the 'heart' of the Data Step manipulation and learn to do as much as we can do in this important step: applying data quality rules, creating error datasets, selecting good variables and records (etc.)

It's not always easy to choose between Proc SQL and Data Step programming. In this tips and tricks, we will see the big differences between Proc SQL and Data Step and list the pro's and the con's of both of them.

Level= Intermediate

At the era of the big data, it's very common to have lot of variables to manipulate in blocks of variables. Arrays and Do loops are powerful data manipulation tools that help make code more efficient for repetitive operation.

Level= Intermediate

Log files contains tons of useful information, they help you when you have syntax errors, warnings and notes but you could use your logs in order to monitor your daily processes and identify the potential bottlenecks.

Level= Intermediate

SAS Variable lists allows you to simplify the variable notations in your Data Step code. Very often we forget that capability of the SAS programing language which can be particularly useful when lot of variables have similar name and have to processed together.

SAS macro language allows you to process dynamically all the files of a directory or all the files of a library . Different technic exist depending on the nature of the files. This tips and tricks discuss the case when your files are datasets present in a single library and discuss the case when we manipulate flat files in a folder structure.

Windows scheduler can be used to execute automatically SAS code at specific moment of the day or at specific days of the month. They execute the same code every time but, thanks to macro, the output result could change thanks to the environment (data, time, user (etc.).)

SAS hash objects are super fast in-memory technic allowing you to perform join operation, lookup, data check (etc.) this technic can be used on datasets but also when flat files are read allowing you to do data validation during the load phase of your data.

Gathering job statistics could help you to detect proactively problems in your data by comparing current values with their estimated values. The idea behind is to collect as much as possible information on your daily run processes (number of records, times it take, number of missing (etc.)) and store them in an historical table. If the values of the current run differs with the values of previous run for the same jobs, this could indicate potential problem.

Getting quickly metadata information on your datasets could be useful in some context of conditional execution of SAS programs (e.g.: Don't execute this step if empty dataset). Different methods exists to achieve that: using metadata functions, using Proc contents, or dictionary tables.

A Slowly Changing Dimension (SCD) is a dimension that stores and manages both current and historical data over time in a data warehouse. It is considered and implemented as one of the most critical ETL tasks in tracking the history of dimension records. Different SCD strategies could be implemented and this tips and tricks is focused on SCD1 where records are updated no history is kept and SCD2 where history is kept,

Using a SAS table to control job execution can be useful when lot of jobs have to be scheduled regularly but depending on circumstance some don't have to be scheduled. In normal situation we would have to adapt the job scheduler and this could be time consuming. You could use instead a SAS Dataset or Excel spreadsheet where all the jobs are listed with a flag variable indicating the list of jobs to schedule.

Level= Intermediate

SAS formats are instruction that SAS uses to write data values. You use formats to control the written appearance of data values, or, in some cases, to group data values together for analysis. You can create format manually with hardcoded values or from datasets, you can store them in permanent library, share them and also use build-in format (etc.) every thing you want to know about SAS format should be present in this tips and tricks.

SAS projects can be difficult to evaluate and consultant misunderstood when you are not familiar with SAS terminology. Data Step, Proc Step, Hash Object, Formats, Statements (etc.) if you are lost with this terminology and would like to have a summary, this tips and tricks is made for you !

Level= Intermediate

Indexes are special reference tables that the SAS engine can use to speed up data retrieval. Simply put, an index is a pointer to data in a table. In this tips and tricks we will discuss how you can create and use SAS indexes.

The SAS hash object is meant to enable you to quickly and efficiently store, search, and retrieve data based on lookup keys , Conceptually, the hash object provides programmers the means to easily define and utilize a hash table within the Data Step.

Proc Tabulate is one of the most powerful SAS reporting Procedure. It generates multidimensional reports like Excel pivot tables and is super flexible. This tips and tricks provides you all you need to become an expert in that Procedure!

Level= Expert

Transposing your data is a common problem in data manipulation. Very often data are stored in narrow structure for flexibility and storage purpose and have to be transposed in wide structure or vice versa. When Proc transpose can be used to your data, it's quite easy to convert your data. However, it becomes more difficult to do it with Data Step. This tips and tricks shows you how you can transpose your data thanks to Data Step.

Creating CSV from SAS can be done in Data Step, Proc export or using ODS CSV. When CSV are not simple line of records, it becomes impossible to use this technic and Data Step programming is required.

XML files are not simple line of records, they have a hierarchy and they are not easy to create programmatically. Sometimes software cannot create complex XML files and it becomes mandatory to pass through this step. In this tips and tricks, we will see how we can create XML in Data Step.

This tips and tricks is an introduction of the SAS macro language showing you the powerful capabilities of this languages.

The Output Delivery System (ODS) is a global statement used to redirect the output of your reports outside the SAS environment. In this tips and tricks we will discuss the ODS PDF, ODS RTF, ODS Excel, ODS CSVALL.