BeOptimized - Tips and Tricks

Demystify the SAS Program Data Vector

Reading flat files in a SAS Data Step gives you a lot of flexibility for data quality checks and data manipulation. However, problems may arise when the files are huge and the data quality is poor. In this Tips and Tricks, we will look at the 'heart' of Data Step processing and learn to do as much as we can in this important step: applying data quality rules, creating error datasets, selecting the right variables and records, and so on.

Details

The Program Data Vector (PDV) is the heart of SAS Data Step processing. In this Tips and Tricks, you will learn what the PDV is and how you can take advantage of it to optimize your SAS Data Step code and better understand your log.
More precisely, we will answer the following questions:

What is the Program Data Vector?

  • When is it created, where is it stored, and how can I see it?
  • Using KEEP and DROP to select variables
  • Using the subsetting IF and WHERE statements to select records
  • Understanding when PDV variables are reset to missing
  • Retaining variable values across Data Step iterations
  • Understanding the automatic PDV variables _N_ and _ERROR_

How can I make sure the loaded data is of good quality?
  • Count the number of records matching a bad pattern versus those matching a good pattern
  • Create a separate dataset to store bad data for further investigation
  • Generate an email or stop the current process when certain events are triggered

Prerequisites

You are a SAS beginner, or you used SAS a long time ago and would like to refresh your skills.

