Understanding Generation Data Sets
Definition of Generation Data Set
A generation data set is an archived version of a SAS data set that is stored as part of a
generation group. A generation data set is created each time the file is replaced. Each
generation data set in a generation group has the same root member name, but each has a
different version number. The most recent version of the generation data set is called the
base version.
You can request generations for a SAS data file only. You cannot request generations for
a SAS view.
Note: Generation data sets provide historical versions of a data set; they do not track
observation updates for an individual data set. To log each time an observation is
added, deleted, or updated, see “Understanding an Audit Trail” on page 621.
CAUTION:
Do not use operating system tools to manage generation data sets. Doing so
can limit access to the generation group files.
Terminology for Generation Data Sets
The following terms are relevant to generation data sets:
base version
is the most recently created version of a data set. Its name does not have the four-
character suffix for the generation number.
generation group
is a group of data sets that represent a series of replacements to the original data set.
The generation group consists of the base version and a set of historical versions.
generation number
is a monotonically increasing number that identifies one of the historical versions in
a generation group. For example, the data set named Air#272 has a generation
number of 272.
GENMAX=
is an output data set option that requests generations for a data set and specifies the
maximum number of versions (including the base version and all historical versions)
to keep for a given data set. The default is GENMAX=0, which means that the
generation data sets feature is not in effect.
GENNUM=
is an input data set option that references a specific version from a generation group.
Positive numbers are absolute references to a historical version by its generation
number. Negative numbers are relative references to historical versions. For
example, GENNUM=-1 refers to the youngest version.
historical versions
are the older copies of the base version of a data set. Names for historical versions
have a four-character suffix for the generation number, such as #003.
oldest version
is the oldest version in a generation group.
rolling over
specifies the process of the version number moving from 999 to 000. When the
generation number reaches 999, its next value is 000.
youngest version
is the version that is chronologically closest to the base version.
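As a rough illustration of the GENMAX= and GENNUM= options defined above, the following sketch creates a small generation group and then reads one historical version by absolute and by relative reference. The data set name DEMO, the variable X, and the output data set names are hypothetical and chosen only for this example. The sketch assumes the WORK library and that the GENMAX= setting stays in effect when the data set is replaced.

```sas
/* Create the first version of WORK.DEMO (hypothetical name) and request
   generations: up to three versions in total. */
data demo (genmax=3);
   x=1;
run;

/* Replace DEMO once. The replaced data set is kept as DEMO#001. */
data demo;
   x=2;
run;

/* Absolute reference: generation number 1, that is, DEMO#001. */
data from_absolute;
   set demo (gennum=1);
run;

/* Relative reference: -1 is the youngest historical version, which is
   also DEMO#001 after a single replacement. */
data from_relative;
   set demo (gennum=-1);
run;

/* Both copies are expected to match, because they were read from the
   same historical version. */
proc compare base=from_absolute compare=from_relative;
run;
```

A reference to DEMO with no GENNUM= option reads the base version.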
Invoking Generation Data Sets
To invoke generation data sets and to specify the maximum number of versions to
maintain, include the output data set option GENMAX= when creating or replacing a
data set. For example, the following DATA step creates a new data set and requests that
up to four copies be kept (one base version and three historical versions):
data a (genmax=4);   /* keep up to four versions of A */
   x=1;
   output;
run;
Once the GENMAX= data set option is in effect, the data set member name is limited to
28 characters (rather than 32) because the last four characters are reserved
for a version number. When the GENMAX= data set option is not in effect, the member
name can be up to 32 characters. See the GENMAX= data set option in SAS Data Set
Options: Reference.
Understanding How a Generation Group Is Maintained
The first time a data set with generations in effect is replaced, SAS keeps the replaced
data set and appends a four-character version number to its member name. The suffix
consists of # and a three-digit number. For example, for a data set named A, the replaced
data set becomes A#001. When the data set is replaced for the second time, the replaced
data set becomes A#002, which is then the version that is chronologically closest to the
base version. After three replacements, the result is:
A
base (current) version
A#003
most recent (youngest) historical version
A#002
second most recent historical version
A#001
oldest historical version
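The sequence above can be reproduced with a short sketch such as the following. It assumes the WORK library and that the GENMAX= setting remains in effect for later replacements without being specified again. The variable X is illustrative only; its value records which step wrote each version.

```sas
/* Initial creation with generations in effect (up to four versions). */
data a (genmax=4);
   x=0;
run;

/* First replacement: the data set written above is kept as A#001. */
data a;
   x=1;
run;

/* Second replacement: the previous base version is kept as A#002. */
data a;
   x=2;
run;

/* Third replacement: the previous base version is kept as A#003. */
data a;
   x=3;
run;

/* Print the base version, then the youngest and oldest historical versions. */
title 'Base version of A';
proc print data=a;
run;

title 'Youngest historical version (A#003)';
proc print data=a (gennum=-1);
run;

title 'Oldest historical version (A#001)';
proc print data=a (gennum=1);
run;

/* Clear the title so that it does not carry over to later output. */
title;
```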
With GENMAX=4, a fourth replacement deletes the oldest version, which is A#001. As
replacements occur, SAS always keeps four copies. For example, after ten replacements,
the result is:
A
base (current) version
A#010
most recent (youngest) historical version
A#009
second most recent historical version
632 Chapter 28 SAS Data Files
