SAS: The Power To Know: November 2013

Friday, 22 November 2013

Review: An Introduction to the SAS System Part-3

Multiple Data Sets: Overview
One of SAS’s greatest strengths is its ability to combine and
process more than one data set at a time. The main tools used to
do this are the set, merge and update statements, along with the
by statement and first. and last. variables.
We’ll look at the following situations:
• Concatenating datasets by observation
• Interleaving several datasets based on a single variable value
• One-to-one matching
• Simple Merge Matching, including table lookup
• More complex Merge Matching

Concatenating Data Sets by Observation
The simplest operation concerning multiple data sets is to
concatenate data sets by rows to form one large data set from
several other data sets. To do this, list the sets to be concatenated
on a set statement; each data set will be processed in turn,
creating an output data set in the usual way.
For example, suppose we wish to create a data set called last by
concatenating the data sets first, second, and third.
data last;
set first second third;
run;
If there are variables in some of the data sets which are not in the
others, those variables will be set to missing (. or ’ ’) in
observations derived from the data sets which lacked the variable in
question.

Concatenating Data Sets (cont’d)
Consider two data sets clerk and manager:
Data set clerk                    Data set manager
Name   Store    Position  Rank    Name  Store    Position  Staff
Joe    Central  Sales     5       Fred  Central  Manager   10
Harry  Central  Sales     5       John  Mall     Manager   12
Sam    Mall     Stock     3
The SAS statements to concatenate the data sets are:
data both;
set clerk manager;
run;
resulting in the following data set:
Name Store Position Rank Staff
Joe Central Sales 5 .
Harry Central Sales 5 .
Sam Mall Stock 3 .
Fred Central Manager . 10
John Mall Manager . 12
Note that the variable staff is missing for all observations from set
clerk, and rank is missing for all observations from manager. The
observations are in the same order as the input data sets.

Concatenating Data Sets with proc append
If the two data sets you wish to concatenate contain exactly the
same variables, you can save resources by using proc append
instead of the set statement, since the set statement must process
each observation in the data sets, even though they will not be
changed. Specify the “main” data set using the base= argument
and the data set to be appended using the new= argument. For
example, suppose we wish to append the observations in a data set
called new to a data set called master.enroll. Assuming that
both data sets contained the same variables, you could use proc
append as follows:
proc append base=master.enroll new=new;
run;
SAS will print an error message and refuse to append if the new
data set contains variables that are not in the base data set, unless
the force option is specified on the proc append statement.
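
As a minimal sketch of the mismatched-variables case, the step below appends a data set that carries one extra variable. The data sets master and extra and the variable note are hypothetical, and the example uses data= (new= is an alias for it) together with the force option.

```sas
/* Hypothetical base data set with two variables. */
data master;
   input id score;
   datalines;
1 90
2 85
;
run;

/* Hypothetical new data set with an extra variable, note. */
data extra;
   input id score note $;
   datalines;
3 70 late
;
run;

/* With force, the observation is appended and note is dropped */
/* with a warning. Without force, the append is refused.       */
proc append base=master data=extra force;
run;
```

When the variable lists really are identical, force is not needed and can be left off.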

Interleaving Datasets based on a Single Variable
If you want to combine several datasets so that observations
sharing a common value are all adjacent to each other, you can list
the datasets on a set statement, and specify the variable to be
used on a by statement. Each of the datasets must be sorted by the
variable on the by statement.
For example, suppose we had three data sets A, B, and C, and each
contained information about employees at different locations:
Set A                  Set B                 Set C
Loc  Name   Salary     Loc  Name  Salary     Loc  Name  Salary
NY   Harry  25000      LA   John  18000      NY   Sue   19000
NY   Fred   20000      NY   Joe   25000      NY   Jane  22000
NY   Jill   28000      SF   Bill  19000      SF   Sam   23000
SF   Bob    19000      SF   Amy   29000      SF   Lyle  22000
Notice that there are not equal numbers of observations from the
different locations in each data set.

Interleaving Datasets (cont’d)
To combine the three data sets, we would use a set statement
combined with a by statement.
data all;
set a b c;
by loc;
run;
which would result in the following data set:
Loc  Name   Salary
LA   John   18000
NY   Harry  25000
NY   Fred   20000
NY   Jill   28000
NY   Joe    25000
NY   Sue    19000
NY   Jane   22000
SF   Bob    19000
SF   Bill   19000
SF   Amy    29000
SF   Sam    23000
SF   Lyle   22000
Similar results could be obtained through a proc sort on the
concatenated data set, but this technique is more efficient and
allows for further processing by including programming statements
before the run;.

One-to-one matching
To combine variables from several data sets where there is a
one-to-one correspondence between the observations in each of the
data sets, list the data sets to be joined on a merge statement. The
output data set created will have as many observations as the
largest data set on the merge statement. If more than one data set
has variables with the same name, the value from the rightmost
data set on the merge statement will be used.
You can use as many data sets as you want on the merge
statement, but remember that they will be combined in the order
in which the observations occur in the data set.

Example: one-to-one matching
For example, consider the data sets personal and business:
Data set personal        Data set business
Name  Age  Eyes          Name  Job      Salary
Joe   23   Blue          Joe   Clerk    20000
Fred  30   Green         Fred  Manager  30000
Sue   24   Brown         Sue   Cook     24000
To merge the variables in business with those in personal, use
data both;
merge personal business;
run;
to result in data set both
Name Age Eyes Job Salary
Joe 23 Blue Clerk 20000
Fred 30 Green Manager 30000
Sue 24 Brown Cook 24000
Note that the observations are combined in the exact order in
which they were found in the input data sets.

Simple Match Merging
When there is not an exact one-to-one correspondence between
data sets to be merged, the variables to use to identify matching
observations can be specified on a by statement. The data sets
being merged must be sorted by the variables specified on the by
statement.
Notice that when there is exactly one observation with each by
variable value in each data set, this is the same as the one-to-one
merge described above. Match merging is especially useful if you’re
not sure exactly which observations are in which data sets.
By using the in= data set option, explained later, you can
determine from which data set(s) a merged observation is derived.

Simple Match Merging (cont’d)
Suppose we have students’ test scores stored in two separate data
sets, scores1 and scores2:
Data set scores1         Data set scores2
ID  Score1  Score2       ID  Score3  Score4
7   20      18           7   19      12
9   15      19           10  12      20
12  9       15           12  10      19
Clearly a one-to-one merge would not be appropriate.
Pay particular attention to ID 9, which appears only in the first
data set, and ID 10 which appears only in the second set.
To merge the observations, combining those with common values of
ID we could use the following SAS statements:
data both;
merge scores1 scores2;
by id;
run;

Simple Match Merging (cont’d)
Here’s the result of the merge:
ID Score1 Score2 Score3 Score4
7 20 18 19 12
9 15 19 . .
10 . . 12 20
12 9 15 10 19
Notes
1. All datasets must be sorted by the variables on the by
statement.
2. If an observation was missing from one or more data sets, the
values of the variables which were found only in the missing
data set(s) are set to missing.
3. If there are multiple occurrences of a common variable in the
merged data sets, the value from the rightmost data set is used.

Table Lookup
Consider a dataset containing a patient name and a room number,
and a second data set with doctors names corresponding to each of
the room numbers. There are many observations with the same
room number in the first data set, but exactly one observation for
each room number in the second data set. Such a situation is called
table lookup, and is easily handled with a merge statement
combined with a by statement.
Data set patients      Data set doctors
Patient   Room         Doctor     Room
Smith     215          Reed       215
Jones     215          Ellsworth  217
Williams  215          . . .
Johnson   217
Brown     217
. . .

Table Lookup (cont’d)
The following statements combine the two data sets.
data both;
merge patients doctors;
by room;
run;
resulting in data set both
Patient Room Doctor
Smith 215 Reed
Jones 215 Reed
Williams 215 Reed
Johnson 217 Ellsworth
Brown 217 Ellsworth
Notes: . . .
• As always, both data sets must be sorted by the variables on
the by list.
• The data set with one observation per by variable must be the
second dataset on the merge statement.
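
Because the sorting requirement is easy to overlook, here is a minimal sketch of the full lookup, assuming the patients and doctors data sets shown above are not yet in room order.

```sas
/* Sort both inputs by the lookup key before the match merge. */
proc sort data=patients;
   by room;
run;

proc sort data=doctors;
   by room;
run;

/* Each patient picks up the doctor assigned to the same room. */
data both;
   merge patients doctors;
   by room;
run;
```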

Updating Data Sets
When you’re combining exactly two data sets with the goal of
updating some or all of the first data set’s values with values from
the second data set, you can use the update statement.
An update statement accepts exactly two data sets, and must be
followed by a by statement. An update statement is similar to a
merge statement except that
• the update statement will not overwrite non-missing values in
data set one with missing values from data set two, and
• the update statement doesn’t create any observations until all
the observations for a by group are processed. Thus, the first
data set should contain exactly one observation for each by
group, while the second data set can contain multiple
observations, with later observations supplementing or
overriding earlier ones.

Example: update statement
Set orig                   Set upd
ID  Account  Balance       ID  Account  Balance
1   2443     274.40        1   .        699.00
2   4432     79.95         2   2232     .
3   5002     615.00        2   .        189.95
                           3   6100     200.00
Data set orig can be updated with the values in upd using the
following statements:
data orig;
update orig upd;
by id;
run;
resulting in the updated data set:
ID Account Balance
1 2443 699.00
2 2232 189.95
3 6100 200.00

More Complex Merging
Keep the following in mind when performing more complex merges:
• If the merged data sets have variables in common (in addition
to the variables on the by statement), and the values differ, the
values from the rightmost data set in the merge statement are
used.
• If there are multiple occurrences of observations with the same
value for the by variable(s) in the data sets being merged, they
are combined one-to-one. If there are unequal numbers of these
observations in any of the data sets being merged, the last
value from the data set with fewer observations is reused for all
the other observations with matching values.
• Various problems arise if variables in data sets being merged
have different attributes. Try to resolve these issues before
merging the data.

More Complex Merging (cont’d)
The following example, although artificial, illustrates some of the
points about complex merging:
Data set one    Data set two        Data set three
a  b  c         a  b  d             a  b  c   d
1  3  20        1  3  17            1  3  20  17
1  3  19        1  5  12            1  5  19  12
1  7  22        2  9  21     =>     1  7  22  12
2  9  18        2  3  15            2  9  18  21
2  3  22        2  6  31            2  3  22  15
                                    2  6  22  31
The data sets were merged with the following statements:
data three;
merge one two;
by a;
run;

in= data set option
When creating observations from multiple data sets, it is often
helpful to know which data set an observation should come from. It
should be clear that when merging large data sets, additional tools
will be necessary to determine exactly how observations are created.
The in= data set option is one such tool. The option is associated
with one or more of the data sets on a merge statement, and
specifies the name of a temporary variable which will have a value
of 1 if that data set contributed to the current observation and a 0
otherwise.
Two common uses of the in= variable are to make sure that only
complete records are output, and to create a data set of problem
observations which were missing from one of the merged data sets.
The next slide provides an example of these ideas, using the test
scores data from a previous example.

Example of in= data set option
data both problem;
merge scores1(in=one) scores2(in=two);
by id;
if one and two then output both;
else output problem;
run;
The resulting data sets are shown below; note that the in=
variables are not output to the data sets which are created.
Data set both
Obs Id Score1 Score2 Score3 Score4
1 7 20 18 19 12
2 12 9 15 10 19
Data set problem
Obs Id Score1 Score2 Score3 Score4
1 9 15 19 . .
2 10 . . 12 20

Programming with by statements
The power of the by statement coupled with the merge and set
statements is enough to solve most problems, but occasionally more
control is needed.
Internally, SAS creates two temporary variables for each variable
on the by statement of a data step. first.variable is equal to 1
if the current observation is the first occurrence of this value of
variable and 0 otherwise. Similarly, last.variable is equal to 1
if the current observation is the last occurrence of the value of
variable and 0 otherwise.
When there are several by variables, remember that a new
by-group begins when the value of any variable on the by statement
changes; first.variable and last.variable are also set to 1 whenever
a variable to the left of variable on the by statement changes.
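
To see how the flags behave with two by variables, the following sketch writes them to the log. The data set visits and its variables are hypothetical and are entered already sorted by clinic and patient.

```sas
/* Hypothetical data, in clinic/patient order. */
data visits;
   input clinic $ patient visit;
   datalines;
A 1 1
A 1 2
B 1 1
B 2 1
;
run;

data _null_;
   set visits;
   by clinic patient;
   /* Write the automatic first. and last. flags to the log. */
   put clinic= patient= visit=
       first.clinic= last.clinic=
       first.patient= last.patient=;
run;
```

On the third observation the value of patient is unchanged from the previous observation, but first.patient is still 1 because clinic, which is to its left on the by statement, has changed.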

Application: Finding Duplicate Observations I
Many data sets are arranged so that there should be exactly one
observation for each unique combination of variable values. In the
simplest case, there may be an identifier like a social security or
student identification number, and we want to check to make sure
there are not multiple observations with the same value for that
variable.
If the data set is sorted by the identifier variable (say, ID), code like
the following will identify the duplicates:
data check;
set old;
by id;
if first.id and ^last.id;
run;
Data set check now contains the first observation for each value of
ID that occurs more than once.
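
If every observation with a repeated identifier is wanted, rather than one per identifier, a small variation on the same idea can be used. This is a sketch; the output data set name alldups is hypothetical.

```sas
data alldups;
   set old;
   by id;
   /* An id that occurs exactly once is both first and last. */
   /* Keep everything else.                                  */
   if not (first.id and last.id);
run;
```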

Example of first. and last. variables
Suppose we have a data set called grp with several observations for
each value of a variable called group. We wish to output one
observation for each group containing the three highest values of
the variable x encountered for that group.
data max;
set grp;
by group;
retain x1-x3; * preserve values btwn obs;
if first.group then do; * initialize;
x1 = .; x2 = .; x3 = .;
end;
if x >= x1 then do;
x3 = x2; x2 = x1; x1 = x; end;
else if x >= x2 then do;
x3 = x2; x2 = x; end;
else if x >= x3 then x3 = x;
if last.group then output; * output one obs per group;
keep group x1-x3;
run;

Example of first. and last. variables (cont’d)
Here are the results of a simple example of the previous program:
Set grp            Set max
Group  X           Group  X1  X2  X3
1      16          1      19  18  17
1      12    =>    2      30  20  14
1      19          3      59  45  18
1      15
1      18
1      17
2      10
2      20
2      8
2      14
2      30
3      59
3      45
3      2
3      18

Sorting datasets
For procedures or data steps which need a data set to be sorted by
one or more variables, there are three options available:
1. You can use proc sort; to sort your data. Note that SAS
stores information about the sort in the dataset header, so that
it will not resort already sorted data.
2. You can enter your data in sorted order. If you choose this
option, make sure that you use the correct sorting order.* To
prevent proc sort from resorting your data, use the
sortedby= data set option when you create the data set.
3. You can create an index for one or more combinations of
variables which will be stored along with the data set.
* EBCDIC Sorting Sequence (IBM mainframes):
blank .<(+|&!$*);^-/,%_>?`:#@'="abcdefghijklmnopqr~stuvwxyz{ABCDEFGHI}JKLMNOPQR\STUVWXYZ0123456789
ASCII Sorting Sequence (most other computers):
blank !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
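
The first two options can be sketched as follows. The data set names old, sorted and entered are hypothetical, as are the variables id and x.

```sas
/* Option 1: let SAS sort the data, writing to a new data set. */
proc sort data=old out=sorted;
   by id;
run;

/* Option 2: the data lines are already in id order, so record */
/* that fact with the sortedby= data set option.               */
data entered(sortedby=id);
   input id x;
   datalines;
1 10
2 20
3 30
;
run;
```

Note that sortedby= is only an assertion. It is up to you to make sure the data really are in that order.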

Indexed Data Sets
If you will be processing a data set using by statements, or
subsetting your data based on the value(s) of one or more variables,
you may want to store an index based on those variables to speed
future processing.
proc datasets is used to create an index for a SAS data set.
Suppose we have a data set called company.employ, with variables
for branch and empno. To store a simple index (based on a single
variable), statements like the following are used:
proc datasets library=company;
modify employ;
index create branch;
run;
More than one index for a data set can be specified by including
multiple index statements. The index will be used whenever a by
statement with the indexed variable is encountered.

Indexed Data Sets (cont’d)
To create a composite index (based on more than one variable),
you need to provide a label for the index. (This label need not be
specified when you access the data set.) You can have multiple
composite indices within the same data set:
proc datasets library=company;
modify employ;
index create brnum = (branch idnum);
run;
In the previous example, the composite index would mean the data
set is also indexed for branch, but not for idnum.
Note: If you are moving or copying an indexed data set, be sure to
use a SAS procedure like proc copy, datasets, or cport rather
than system utilities, to ensure that the index gets correctly copied.

Formats and Informats
Formats control the appearance of variable values in both
procedures and with the put statement of the data step. Most
procedures also use formats to group values using by statements.
Informats are used in conjunction with the input statement to
specify the way that variables are to be read if they are not in the
usual numeric or character format.
SAS provides a large number of predefined formats, as well as the
ability to write your own formats, for input as well as output.
If you include a format or attrib statement in the data step
when the data set is created, the formats will always be associated
with the data. Alternatively, you can use a format statement
within a procedure.
The system option nofmterr will eliminate errors produced by
missing formats.

Basic Formats
Numeric formats are of the form w. or w.d, representing a field
width of w, and containing d decimal places.
put x 6.; *write x with field width of 6;
format price 8.2; *use field width of 8 and 2 d.p. for price;
The bestw. format can be used if you’re not sure about the
number of decimals. (For example, best6. or best8..)
Simple character formats are of the form $w., where w is the
desired field width. (Don’t forget the period.)
put name $20.; * write name with field width of 20;
format city $50.; * use field width of 50 for city;
You can also use formats with the put function to create character
variables formatted to your specifications:
x = 8;
charx = put(x,8.4);
creates a character variable called charx equal to 8.0000

Informats
The basic informats are the same as the basic formats, namely w.d
for numeric values, and $w. for character variables.
By default, leading blanks are stripped from character values. To
retain them, use the $charw. informat.
When you specify a character informat wider than the default of 8
columns, SAS automatically will make sure the variable is big
enough to hold your input values.
Some Other SAS Informats
Name Description Name Description
hexw. numeric hexadecimal $hexw. character hexadecimal
octalw. numeric octal $octalw. character octal
bzw.d treat blanks as zeroes ew.d scientific notation
rbw.d floating point binary ibw.d integer binary
pdw.d packed decimal $ebcdicw. EBCDIC to ASCII

Writing User-defined formats using proc format
• You can specify several different value statements within a
single invocation of proc format.
• Each value must be given a name (which does not end in a
number), followed by a set of range/value pairs.
• The keywords low, high, and other can be used to construct
ranges.
• If you wish a range of values to be formatted using some other
format, enclose its name (including the period) in square
brackets ([ ]) as the value for that range.
• Character values should be enclosed in quotes; the name of a
character format must begin with a dollar sign ($).
• If a variable falls outside of the specified ranges, it is formatted
using the usual defaults.
User-defined Format: Examples
For a variable with values from 1 to 5, the format qval. displays 1
and 2 as low, 3 as medium and 4 and 5 as high.
The format mf. displays values of 1 as male, 2 as female and all
other values as invalid.
The format tt. displays values up to .001 as undetected, and all
other values in the usual way.
proc format;
value qval 1-2='low' 3='medium' 4-5='high';
value mf 1='male' 2='female' other='invalid';
value tt low-.001='undetected';
run;
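
Two of the rules above are not covered by these examples: passing a range to another format with square brackets, and character formats. A minimal sketch of both follows; the format names small and $sex are hypothetical.

```sas
proc format;
   /* Values strictly below .001 print as text. Everything */
   /* else is handed to the best8. format.                 */
   value small low-<.001='undetected' other=[best8.];
   /* A character format: the name starts with a dollar    */
   /* sign and the values are quoted.                      */
   value $sex 'M'='male' 'F'='female' other='invalid';
run;
```

The range low-<.001 excludes the endpoint, whereas low-.001 would include it.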

Recoding Values using Formats
Since many SAS procedures use formatted values to produce
groups, you can often recode variables by simply changing their
formats. This is more efficient than processing an entire data set,
and leaves the original variable unchanged for future use.
Suppose we have a survey which asks people how long they’ve been
using computers (measured in years), and how happy they are with
the computer they are using (measured on a scale of 1 to 5). We
wish to produce a cross tabulation of these results, that is a table
where rows represent levels of one variable, columns represent the
level of another variable, and the entries represent the number of
observations which fall into the row/column categories. Often the
unformatted table will have many empty cells - it is common
practice to group categories or values to create more meaningful
tables.

Recoding Values using Formats (cont’d)
If the two variables in our survey are called years and happy, the
following program would produce a cross tabulation:
proc freq;
tables years*happy/nocol norow nocum nopct;
run;
TABLE OF YEARS BY HAPPY
YEARS HAPPY
Frequency| 1| 2| 3| 4| 5| Total
---------+--------+--------+--------+--------+--------+
0.5 | 1 | 2 | 1 | 2 | 0 | 6
---------+--------+--------+--------+--------+--------+
1 | 0 | 0 | 2 | 2 | 0 | 4
---------+--------+--------+--------+--------+--------+
1.5 | 0 | 0 | 0 | 1 | 0 | 1
---------+--------+--------+--------+--------+--------+
2 | 0 | 0 | 2 | 2 | 3 | 7
---------+--------+--------+--------+--------+--------+
3 | 1 | 1 | 0 | 1 | 1 | 4
---------+--------+--------+--------+--------+--------+
8 | 0 | 0 | 0 | 0 | 1 | 1
---------+--------+--------+--------+--------+--------+
10 | 0 | 1 | 0 | 0 | 0 | 1
---------+--------+--------+--------+--------+--------+
12 | 0 | 0 | 0 | 1 | 0 | 1
---------+--------+--------+--------+--------+--------+
Total 2 4 5 9 5 25

Recoding Values using Formats (cont’d)
To make the table more useful, we define the following formats:
proc format;
value yy 0-1='<=1yr' 1.5-3='1-3yr' 3.5-high='>3yr';
value hh 1-2='Low' 3='Medium' 4-5='High';
run;
proc freq;
tables years*happy/nocol norow nocum nopct;
format years yy. happy hh.;
run;
TABLE OF YEARS BY HAPPY
YEARS HAPPY
Frequency|Low |Medium |High | Total
---------+--------+--------+--------+
<=1yr | 3 | 3 | 4 | 10
---------+--------+--------+--------+
1-3yr | 2 | 2 | 8 | 12
---------+--------+--------+--------+
>3yr | 1 | 0 | 2 | 3
---------+--------+--------+--------+
Total 6 5 14 25

SAS Date and Time Values
There are three types of date and time values which SAS can
handle, shown with their internal representation in parentheses:
• Time values (number of seconds since midnight)
• Date values (number of days since January 1, 1960)
• Datetime values (number of seconds since midnight, January 1, 1960)
You can specify a date and/or time as a constant in a SAS program
by surrounding the value in quotes, and following it with a t, a d or
a dt. The following examples show the correct format to use:
3PM => '3:00p't or '15:00't or '15:00:00't
January 4, 1937 => '4jan37'd
9:15AM November 3, 1995 => '3nov95:9:15'dt or '3nov95:9:15:00'dt
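
A short sketch makes the internal representation visible by printing a few constants without a format. The variable names are hypothetical.

```sas
data _null_;
   d  = '01jan1960'd;         /* day zero of the SAS calendar    */
   t  = '15:00't;             /* 3PM, in seconds since midnight  */
   dt = '01jan1960:00:01'dt;  /* one minute after the origin     */
   put d= t= dt=;
run;
```

The log line should read d=0 t=54000 dt=60, since SAS counts dates in days and datetimes in seconds from January 1, 1960.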

Date and Time Informats and Formats
SAS provides three basic informats for reading dates which are
recorded as all numbers, with or without separators:
• ddmmyyw. - day, month, year (021102, 6-3-2002, 4/7/2000)
• mmddyyw. - month, day, year (110202, 3-6-2002, 7/4/2000)
• yymmddw. - year, month, day (021102, 2002-3-6, 2000/7/4)
These informats will correctly recognize any of the following
separators: blank : - . /, as well as no separator.
For output, the ddmmyyXw., mmddyyXw. and yymmddXw. formats are
available, where “X” specifies the separator as follows:
B - blank C - colon(:) D - dash(-)
N - no separator P - period(.) S - slash(/)
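
As an illustration, the sketch below reads the same calendar date with two different informats and writes it back with two of the separator formats. The variable names us and eu are hypothetical.

```sas
data _null_;
   /* July 4, 2000 written month-first and day-first. */
   us = input('7/4/2000', mmddyy10.);
   eu = input('4/7/2000', ddmmyy10.);
   /* yymmdd with a dash (d) and ddmmyy with a period (p). */
   put us yymmddd10. +1 eu ddmmyyp10.;
run;
```

Both variables hold the same date value, and the line written to the log should be 2000-07-04 04.07.2000.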

Other Date and Time Informats and Formats
Name Width(default) Examples
datew. 5-9 (7) 26jun96
datetimew. 7-40 (16) 4JUL96:01:30p
julianw. 5-7 (5) 96200 1995001
monyyw. 5-7 (5) jan95 mar1996
timew. 2-20 (8) 6:00:00 3:15:00p
The above formats are valid for input and output. In addition, the
yymmnw. format reads values with only a year and month. The
nldatew. format/informat provides natural language support for
written dates (like July 4, 1996), based on the value of the
locale option. The monnamew. and yearw. formats display parts
of dates.

SAS and Y2K
Since SAS stores its date/time values as the number of seconds or
days from a fixed starting point, there is no special significance of
the year 2000 to SAS. But when dates are input to SAS with only
two digits, there is no way to tell whether they should be
interpreted as years beginning with 19 or 20. The yearcutoff
option controls this decision.
Set the value of this option to the first year of a hundred year span
to be used to resolve two digit dates. For example, if you use the
statement
options yearcutoff=1950;
then two digit dates will be resolved as being in the range of 1950
to 2049; i.e. years from 50 to 99 will be interpreted as beginning
with 19, and years from 00 to 49 will be interpreted as beginning
with 20.
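
The boundary can be checked with a short sketch that reads two-digit years on either side of the cutoff. The variable names are hypothetical.

```sas
options yearcutoff=1950;

data _null_;
   old = input('04jan50', date7.);  /* 50 is the first year of the span */
   new = input('04jan49', date7.);  /* 49 is the last year of the span  */
   put old date9. +1 new date9.;
run;
```

With this setting the log should show 04JAN1950 04JAN2049.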

Date and Time Functions
datepart – Extract date from datetime value
dateonly = datepart(fulldate);
day, month, year – Extract part of a date value
day = day(date);
dhms – Construct value from date, hour, minute and second
dtval = dhms(date,9,15,0);
mdy – Construct date from month, day and year
date = mdy(mon,day,1996);
time – Returns the current time of day
now = time();
today – Returns the current date
datenow = today();
intck – Returns the number of intervals between two values
days = intck('day',then,today());
intnx – Increments a value by a number of intervals
tomrw = intnx('day',today(),1);
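
Several of these functions can be combined in one step. The sketch below uses fixed dates so that the results do not depend on the day it is run; all variable names are hypothetical.

```sas
data _null_;
   start = mdy(7, 9, 1998);                    /* July 9, 1998        */
   stamp = dhms(start, 9, 15, 0);              /* 9:15 AM on that day */
   gap   = intck('day', start, '07aug1998'd);  /* days between dates  */
   next  = intnx('day', start, 29);            /* 29 days later       */
   back  = datepart(stamp);                    /* date part of stamp  */
   put gap= next date9. +1 back date9.;
run;
```

Here gap is 29, next is August 7, 1998, and back recovers the original date, July 9, 1998.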

Application: Blue Moons
When there are two full moons in a single month, the second is
known as a blue moon. Given that July 9, 1998 is a full moon, and
there are 29 days between full moons, in what months will the next
few blue moons occur?
First, we create a data set which has the dates of all the full moons,
by starting with July 9, 1998, and incrementing the date by 29
days.
data fullmoon;
date = '09jul98'd;
do i=1 to 500;
month = month(date);
year = year(date);
output;
date = intnx('day',date,29);
end;
run;

Application: Blue Moons (cont’d)
Now we can use the put function to create a variable with the full
month name and the year.
data bluemoon;
set fullmoon;
by year month;
if last.month and not first.month then do;
when = put(date,monname.) || ", " || put(date,year.);
output;
end;
run;
proc print data=bluemoon noobs;
var when;
run;
The results look like this:
December, 1998
August, 2000
May, 2002
. . .

Customized Output: put statement
The put statement is the reverse of the input statement, but in
addition to variable names, formats and pointer control, you can
also print text. Most of the features of the input statement work
in a similar fashion in the put statement. For example, to print a
message containing the value of a variable called x, you could use
a put statement like:
put 'the value of x is ' x;
To print the values of the variables x and y on one line and name
and address on a second line, you could use:
put x 8.5 y best10. / name $20. @30 address ;
Note the use of (optional) formats and pointer control.
By default, the put statement writes to the SAS log; you can
override this by specifying a filename or fileref on a file statement.
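
A minimal sketch of redirecting put output to a file follows. It assumes a data set address with variables name and address, and the file name labels.txt is hypothetical.

```sas
data _null_;
   set address;
   /* Send put output to an external file instead of the log. */
   file 'labels.txt';
   put name $20. @30 address;
run;
```

Using data _null_ avoids creating an output data set when the step exists only to write a report.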

Additional Features of the put statement
By default, the put statement puts a newline after the last item
processed. To prevent this (for example, to build a single line with
multiple put statements), use a trailing @ at the end of the put
statement.
The n* operator repeats a string n times. Thus
put 80*"-";
prints a line full of dashes.
Following a variable name with an equal sign causes the put
statement to include the variable’s name in the output. For
example, the statements
x = 8;
put x=;
result in X=8 being printed to the current output file. The keyword
_all_ on the put statement prints out the values of all the variables
in the data set in this named format.
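
These features can be tried together in a short sketch that writes to the log. The loop variable i is hypothetical.

```sas
data _null_;
   do i = 1 to 3;
      put i @;     /* the trailing @ holds the output line */
   end;
   put;            /* an empty put releases the held line  */
   put 20*'-';     /* a row of twenty dashes               */
   put _all_;      /* every variable, in name=value form   */
run;
```

The three values of i appear on a single line because each put statement in the loop ends with a trailing @.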

Headers with put statements
You can print headings at the top of each page by specifying a
header= specification on the file statement with the label of a set
of statements to be executed. For example, to print a table
containing names and addresses, with column headings at the top
of each page, you could use statements like the following:
options ps=50;
data _null_;
set address;
file "output" header = top print;
put @1 name $20. @25 address $30.;
return;
top:
put @1 'Name' @25 'Address' / @1 20*'-' @25 30*'-';
return;
run;
Note the use of the two return statements. The print option is
required when using the header= option on the file statement.

Output Delivery System (ODS)
To provide more flexibility in producing output from SAS data
steps and procedures, SAS introduced the ODS. Using ODS, output
can be produced in any of the following formats (the parenthesized
keyword is used to activate a particular ODS stream):
• SAS data set (OUTPUT)
• Normal listing (LISTING) - monospaced font
• Postscript output (PRINTER) - proportional font
• PDF output (PDF) - Portable Document Format
• HTML output (HTML) - for web pages
• RTF output (RTF) - for inclusion in word processors
Many procedures produce ODS objects, which can then be output
in any of these formats. In addition, the ods option of the file
statement, and the _ods_ option of the put statement allow you to
customize ODS output.

ODS Destinations
You can open an ODS output stream with the ODS command and a
destination keyword. For example, to produce HTML formatted
output from the print procedure:
ods html file="output.html";
proc print data=mydata;
run;
ods html close;
Using the print and ods options of the file statement, you can
customize ODS output:
ods printer;
data _null_;
file print ods;
... various put statements ...
run;
ods printer close;

SAS System Options
SAS provides a large number of options for fine tuning the way the
program behaves. Many of these are system dependent, and are
documented online and/or in the appropriate SAS Companion.
You can specify options in three ways:
1. On the command line when invoking SAS, for example
sas -nocenter -nodms -pagesize 20
2. In the system wide config.sas file, or in a local config.sas
file (see the SAS Companion for details).
3. Using the options statement:
options nocenter pagesize=20;
Note that you can precede the name of options which do not take
arguments with no to shut off the option. You can display the
value of all the current options by running proc options.
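
As a small sketch, the statements below change two options, display one of them, and then turn centering back on. It assumes the option= argument of proc options, which limits the display to a single option.

```sas
/* Turn centering off and shorten the page. */
options nocenter pagesize=20;

/* Show the current value of one option in the log. */
proc options option=pagesize;
run;

/* Turn centering back on by dropping the no prefix. */
options center;
```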

Review: An Introduction to the SAS System Part-2

Some Common Options
Option Argument Description
Options which are useful when invoking SAS
dms - Use display manager windows
stdio - Obey UNIX standard input and output
config filename Use filename as configuration file
Options which control output appearance
center - Center output on the page
date - Include today’s date on each page
number - Include page numbers on output
linesize number Print in a width of number columns
pagesize number Go to a new page after number lines
ovp - Show emphasis by overprinting
Options which control data set processing
obs number Process a maximum of number obs.
firstobs number Skip first number observations
replace - Replace permanent data sets?

Application: Rescanning Input
Suppose we have an input file which has a county name on one line
followed by one or more lines containing x and y coordinates of the
boundaries of the county. We wish to create a separate observation,
including the county name, for each set of coordinates.
A segment of the file might look like this:
alameda
-121.55 37.82 -121.55 37.78 -121.55 37.54 -121.50 37.53
-121.49 37.51 -121.48 37.48
amador
-121.55 37.82 -121.59 37.81 -121.98 37.71 -121.99 37.76
-122.05 37.79 -122.12 37.79 -122.13 37.82 -122.18 37.82
-122.20 37.87 -122.25 37.89 -122.27 37.90
calaveras
-121.95 37.48 -121.95 37.48 -121.95 37.49 -122.00 37.51
-122.05 37.53 -122.07 37.55 -122.09 37.59 -122.11 37.65
-122.14 37.70 -122.19 37.74 -122.24 37.76 -122.27 37.79
-122.27 37.83 -122.27 37.85 -122.27 37.87 -122.27 37.90
. . .

Application: Rescanning Input (cont’d)
Note that we don’t know how many observations (or data lines)
belong to each county.
data counties;
length county $ 12 name $ 12;
infile "counties.dat";
retain county;
input name $ @; * hold current line for rescanning;
if indexc(name,'0123456789') = 0 then do;
county = name;
delete; * save county name, but do not output;
end;
else do;
x = input(name,12.); * do numeric conversion;
input y @@; * hold the line to read more x/y pairs;
end;
drop name;
run;

Application: Reshaping a Data Set I
Since SAS procedures are fairly rigid about the organization of
their input, it is often necessary to use the data step to change the
shape of a data set. For example, repeated measurements on a
single subject may be on several observations, and we may want
them all on the same observation. In essence, we want to perform
the following transformation:
Subj Time  X
 1    1   10
 1    2   12
 · · ·                    Subj  X1  X2  · · ·  Xn
 1    n    8      =>       1    10  12  · · ·   8
 2    1   19               2    19   7  · · ·  21
 2    2    7
 · · ·
 2    n   21

Application: Reshaping a Data Set I (cont’d)
Since we will be processing multiple input observations to produce
a single output observation, a retain statement must be used to
remember the values from previous inputs. The combination of a
by statement and first. and last. variables allows us to create
the output observations at the appropriate time, even if there are
incomplete records.
data two;
set one;
by subj;
array xx x1-xn;
retain x1-xn;
if first.subj then do i=1 to dim(xx); xx{i} = .;end;
xx{time} = x;
if last.subj then output;
drop time x;
run;
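The program above uses x1-xn as a placeholder. As a concrete sketch, assume three time points and a small made-up data set in which subject 2 has no record at time 2:

```sas
/* hypothetical long-format data, already in subj order */
data one;
   input subj time x;
   datalines;
1 1 10
1 2 12
1 3 8
2 1 19
2 3 21
;
run;

data two;
   set one;
   by subj;
   array xx{3} x1-x3;
   retain x1-x3;
   /* reset the retained values at the start of each subject */
   if first.subj then do i = 1 to dim(xx);
      xx{i} = .;
   end;
   xx{time} = x;
   /* write one observation per subject */
   if last.subj then output;
   drop i time x;
run;

proc print data=two;
run;
```

Resetting the array when first.subj is true is what keeps a value from the previous subject from leaking into an incomplete record.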

Application: Reshaping a Data Set II
A similar problem to the last is the case where the data for several
observations is contained on a single line, and it is necessary to
convert each of the lines of input into several observations. Suppose
we have test scores for three different tests, for each of several
subjects; we wish to create three separate observations from each of
these sets of test scores:
data scores;
* assume set three contains id, group and score1-score3;
set three;
array sss score1-score3;
do time = 1 to dim(sss);
score = sss{time};
output;
end;
drop score1-score3;
run;

Output Data Sets
Many SAS procedures produce output data sets containing data
summaries (means, summary, univariate), information about
statistical analyses (reg, glm, nlin, tree) or transformed variables
(standard, score, cancorr, rank); some procedures can produce
multiple output data sets. These data sets can be manipulated just
like any other SAS data sets.
Recall that the statistical functions like mean() and std() can
calculate statistical summaries for variables within an observation;
output data sets are used to calculate summaries of variables over
the whole data set.
When you find that you are looping through an entire data set to
calculate a single quantity which you then pass on to another data
step, consider using an output data set instead.

Using ODS to create data sets
Many procedures use the output delivery system to provide
additional control over the output data sets that they produce. To
find out if ODS tables are available for a particular procedure, use
the following statement before the procedure of interest:
ods trace on;
Each table will produce output similar to the following on the log:
Output Added:
-------------
Name: ExtremeObs
Label: Extreme Observations
Template: base.univariate.ExtObs
Path: Univariate.x.ExtremeObs
-------------
Once the path of a table of interest is located, you can produce a
data set with the ods output statement, specifying the path with
an equal sign followed by the output data set name.

ODS Output Data Set: Example
The univariate procedure provides printed information about
extreme observations, but this information is not available through
the out= data set. To put this information in a data set, first find
the appropriate path by using the ods trace statement, and then
use an ODS statement like the following:
ods output Univariate.x.ExtremeObs=extreme;
proc univariate data=mydata;
var x;
run;
ods output close;
The data set extreme will now contain information about the
extreme values.
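The tracing step that locates the path might look like the following sketch, which assumes the same data set mydata and variable x used above:

```sas
/* write the name, label, template and path of each table to the log */
ods trace on;
proc univariate data=mydata;
   var x;
run;
/* stop tracing once the path has been found */
ods trace off;
```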

Output Data Sets: Example I
It is often useful to have summary information about a data set
available when the data set is being processed. Suppose we have a
data set called new, with a variable x, and we wish to calculate a
variable px equal to x divided by the maximum value of x.
proc summary data=new;
var x;
output out=sumnew max=maxx;
run;
data final;
if _n_ = 1 then set sumnew(keep=maxx);
set new;
px = x / maxx;
run;
The automatic variable _n_ will be 1 for the first observation only;
the single observation in sumnew gets read at this time. The set
new; statement then reads in the original data.

Output Data Sets: Example II
Now suppose we have two classification variables called group and
trtmnt, and we wish to use the maximum value for each
group/trtmnt combination in the transformation. If the data set
had already been sorted, the following statements could be used:
proc summary nway data=new;
class group trtmnt;
var x;
output out=sumnew max=maxx;
run;
data final;
merge new sumnew(keep=group trtmnt maxx);
by group trtmnt;
px = x / maxx;
run;
The nway option limits the output data set to contain observations
for each unique combination of the variables given in the class
statement.
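If new is not already in group/trtmnt order, the merge needs a sort step first. A minimal sketch, to be run before the data step that performs the merge:

```sas
/* put new in the order required by the by statement of the merge */
proc sort data=new;
   by group trtmnt;
run;
```

The class statement in proc summary does not itself require sorted input; only the by statement in the merge does.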

Output Data Sets: Example III
Suppose we have a data set called hdata, consisting of three
variables: hospital, time and score, representing the score of
some medical exam taken at three different times at three different
hospitals, and we’d like to produce a plot with three lines: one for
the means of each of the three hospitals over time. The following
statements could be used:
proc means noprint nway data=hdata;
class hospital time;
var score;
output out=hmeans mean=mscore;
run;
The noprint option suppresses the usual printing which is the
default for proc means. You could achieve similar results using a
by statement instead of a class statement, but the data set would
need to be sorted.
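A sketch of the by statement alternative, assuming the same data set hdata, could look like this:

```sas
/* a by statement requires the data to be sorted by the by variables */
proc sort data=hdata;
   by hospital time;
run;

proc means noprint data=hdata;
   by hospital time;
   var score;
   output out=hmeans mean=mscore;
run;
```

The nway option is not needed in this version because each by group produces a single summary observation.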

Output Data Sets: Example III (cont’d)
The transformation which the previous program produced can be
thought of as the following:
hospital  time  score
mercy      1     132
mercy      2     125
. . .                         hospital  time  _type_  _freq_  mscore
mercy      1     144          city       1      3       2     146.500
mercy      2     224          city       2      3       2     120.000
county     1     119          city       3      3       2     128.000
county     2     125    =>    county     1      3       2     125.500
. . .                         county     2      3       2     127.000
county     2     129          county     3      3       2     117.000
county     3     113          mercy      1      3       3     131.667
city       1     144          mercy      2      3       2     174.500
city       2     121          mercy      3      3       1     121.000
. . .
city       1     149
city       3     122
The _type_ variable indicates the observations are for a unique
combination of levels of two class variables, while the _freq_
variable is the number of observations which were used for the
computation of that statistic.

Plotting the Means
The following program plots the mean score against time, with one line per hospital:
symbol1 interpol=join
value=plus;
symbol2 interpol=join
value=square;
symbol3 interpol=join
value=star;
title "Means versus Time";
proc gplot data=hmeans;
plot mscore*time=hospital;
run;

Application: Finding Duplicate Observations II
In a previous example, duplicates were found by using by
processing and first. and last. variables. If the data set were
very large, or not already sorted by id, that program would not be
very efficient. In this case, an output data set from proc freq
might be more useful. Once again, assume the identifier variable is
called id. The following statements will produce a data set with
the id values of the duplicate observations.
proc freq data=old noprint;
tables id/ out=counts(rename = (count = n) keep=id count);
run;
data check;
set counts;
if n > 1;
run;
Note that, even though count is renamed to n, the original variable
name (count) is used in the keep= option, because keep= is applied
before rename=.

SAS Macro Language: Overview
At its simplest level, the SAS Macro language allows you to assign
a string value to a variable and to use the variable anywhere in a
SAS program:
%let header = "Here is my title";
. . .
proc print ;
var x y z;
title &header;
run;
This would produce exactly the same result as if you typed the
string "Here is my title" in place of &header in the program.
Notice that the substitution is very simple - the text of the macro
variable is substituted for the macro symbol in the program.

SAS Macro Language: Overview (cont’d)
The macro facility can be used to replace pieces of actual programs
by creating named macros:
%macro readsome;
data one;
infile "myfile";
input x y z;
if
%mend;
%readsome x > 1; run;
The final statement is equivalent to typing
data one;
infile "myfile";
input x y z;
if x > 1; run;
Once again, all that is performed is simple text replacement.

SAS Macro Language: Overview (cont’d)
A large part of the macro facility’s utility comes from the macro
programming statements which are all preceded by a percent sign
(%). For example, suppose we need to create 5 data sets, named
sales1, sales2, etc., each reading from a corresponding data file
insales1, insales2, etc.
%macro dosales;
%do i=1 %to 5;
data sales&i;
infile "insales&i";
input dept $ sales;
run;
%end;
%mend dosales;
%dosales;
Note that, until the last line is entered, no actual SAS statements
are carried out; the macro is only compiled.

Defining and Accessing Macro Variables in the Data Step
You can set a macro variable to a value in the data step using the
call symput function. The format is
call symput(name,value);
where name is a string or character variable containing the name of
the macro variable to be created, and value is the value the macro
variable will have.
To access a macro variable in a data step, you can use the symget
function.
value = symget(name);
name is a string or character variable containing the name of the
macro variable to be accessed.
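As a small sketch of symget in use (the macro variable cutoff is hypothetical, and the data set new with variable x is borrowed from the earlier examples), note that symget returns a character value, so it is converted with input before being compared with a number:

```sas
/* cutoff is a hypothetical macro variable */
%let cutoff = 100;

data check;
   set new;
   /* symget returns a character string, so convert it to a number */
   limit = input(symget("cutoff"), best12.);
   /* keep only observations above the cutoff */
   if x > limit;
run;
```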

call symput: Example
Suppose we want to put the maximum value of a variable in a title.
The following program shows how.
data new;
retain max -1e20;
set salary end = last;
if salary > max then max = salary;
if last then call symput("maxsal",max);
drop max;
run;
title "Salaries of employees (Maximum = &maxsal)";
proc print data=salary;
run;
Note that no ampersand is used in call symput, but you must use
an ampersand to reference the macro variable later in your
program.
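One detail worth knowing: when call symput is given a numeric value, SAS converts it to a right-justified string, so the title may contain leading blanks before the number. A possible variation, assuming a SAS release that provides call symputx (SAS 9 and later), is sketched below:

```sas
data _null_;
   retain max -1e20;
   set salary end=last;
   if salary > max then max = salary;
   /* symputx left-aligns the value and trims blanks before storing it */
   if last then call symputx("maxsal", max);
run;

title "Salaries of employees (Maximum = &maxsal)";
proc print data=salary;
run;
```

On releases without call symputx, wrapping the value as trim(left(put(max, best12.))) in call symput should give a similar result.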

An Alternative to the macro facility
As an alternative to the previous example, we can use the put
statement to write a SAS program, and then use the %include
statement to execute the program. Using this technique, the
following statements recreate the previous example:
proc means data=salary noprint;
var salary;
output out=next max=maxsal;
data _null_;
set next;
file "tmpprog.sas";
put 'title "Salaries of employees (Maximum =' maxsal ')";';
put 'proc print data=salary;';
put 'run;';
run;
%include "tmpprog.sas";
Pay special attention to quotes and semicolons in the generated
program.

Another Alternative to the Macro Facility
In addition to writing SAS statements to a file, SAS provides the
call execute function. This function takes a quoted string, a
character variable, or a SAS expression which resolves to a
character variable and then executes its input when the current
data step is completed.
For example, suppose we have a data set called new which contains
a variable called maxsal. We could generate a title statement
containing this value with statements like the following.
data _null_;
set new;
call execute('title "Salaries of employees (Maximum = '
|| put(maxsal,6.) || ')";');
run;

call execute Example
As a larger example of the use of the call execute function,
consider the problem of reading a list of filenames from a SAS data
set and constructing the corresponding data steps to read the files.
The following program performs the same function as the earlier
macro example.
data _null_;
set files;
call execute('data ' || name || ';');
call execute('infile "' || trim(name) || '";');
call execute('input x y;');
call execute('run;');
run;
Be careful with single and double quotes, and make sure the
generated statements follow the rules of SAS syntax.

Application: Reading a Series of Files
Suppose we have a data set containing the names of files to be read,
and we wish to create data sets of the same name from the data in
those files. First, we use the call symput function in the data step
to create a series of macro variables containing the file names:
data _null_;
set files end=last;
n + 1;
call symput("file"||left(n),trim(name));
if last then call symput("num",n);
run;
Since macros work by simple text substitution, it is important that
there are no blanks in either the macro name or value, thus the use
of left and trim.

Application: Reading a Series of Files (cont’d)
Now we can write a macro to loop over the previously defined file
names and create the data sets.
%macro readem;
%do i=1 %to &num;
data &&file&i;
infile "&&file&i";
input x y;
run;
%end;
%mend;
%readem;
Notice that the macro variable is referred to as &&file&i, to force
the macro substitution to be scanned twice. If we used just a single
ampersand, SAS would look for a macro variable called &file.

Macros with Arguments
Consider the following program to print duplicate cases with
common values of the variables a and b in data set one:
data one;
input a b y @@;
datalines;
1 1 12 1 1 14 1 2 15 1 2 19 1 3 15 2 1 19 2 4 16 2 4 12 2 8 18 3 1 19
proc summary data=one nway ;
class a b;
output out=next(keep = a b _freq_ rename=(_freq_ = count));
data dups;
merge one next;
by a b;
if count > 1;
proc print data=dups;
run;
If we had a simple way of changing the input data set name and the
list of variables on the by statement, we could write a general
macro for printing duplicates in a data set.

Macros with Arguments (cont’d)
To add arguments to a macro, simply replace the parts of the
program in question with macro variables (beginning with &), and
list the variables in the argument list (without the &).
%macro prntdups(data,by);
proc summary data=&data nway ;
class &by;
output out=next(keep = &by _freq_ rename=(_freq_ = count));
run;
data dups;
merge &data next;
by &by;
if count > 1;
run;
proc print data=dups;
run;
%mend prntdups;
Then the program on the previous slide would be replaced by:
%prntdups(one,a b);

Accessing Operating System Commands
If you need to run an operating system command from inside a
SAS program, you can use the x command. Enclose the command
in quotes after the x, and end the statement with a semicolon. The
command will be executed as soon as it is encountered.
For example, in an earlier program, a file called tmpprog.sas was
created to hold program statements which were later executed. To
remove the file after the statements were executed (on a UNIX
system) you could use the SAS statement:
x 'rm tmpprog.sas';
Other interfaces to the operating system may be available. For
example, on UNIX systems the pipe keyword can be used on a
filename statement to have SAS read from or write to a process
instead of a file. See the SAS Companion for your operating system
for more details.
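As an illustration of the pipe keyword on a UNIX system, the following sketch reads the output of a shell command as if it were a file. The fileref dirlist and the ls command are assumptions chosen for the example:

```sas
/* hypothetical fileref: the quoted command is run by the UNIX shell */
filename dirlist pipe 'ls *.dat';

data files;
   length name $ 64;
   /* each line written by the command becomes one observation */
   infile dirlist truncover;
   input name $64.;
run;
```

A data set built this way has the same layout as the files data set used in the earlier examples, with one file name per observation in the variable name.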

Transporting Data Sets
It is sometimes necessary to move a SAS data set from one
computer to another. The internal format of SAS data sets is not
the same on all computers, so to make it possible to transfer data
sets from one computer to another, SAS provides what is known as
a transport format. Whenever you move a SAS data set from one
computer to another, you must first convert it into transport
format.
Keep in mind that SAS data sets are in general readable only by
SAS. Thus, an alternative (or perhaps a backup) method for
transporting data sets is to write them in a human-readable way,
using, for example, put statements. Human-readable files can be
processed by SAS (or some other program), and are generally easier
to move around than SAS transport data sets.

SAS/CONNECT
SAS also provides a product called SAS/CONNECT which lets you
initiate a SAS job on a remote computer from a local SAS display
manager session. It also provides two procedures, proc upload and
proc download to simplify transporting data sets. If
SAS/CONNECT is available on the machines between which the
data set needs to be moved, it may be the easiest way to move the
data set.
SAS/CONNECT must be run from the display manager. When
you connect with the other system, you will be prompted for a
login name and a password (if appropriate). Once you’re
connected, the rsubmit display manager command will submit jobs
to the remote host, even though the log and output will be
managed by the local host.

Creating a Dataset in transport format
proc copy can be used to create a transport format file, but the
critical step is to use the xport keyword in the libname statement.
The specified libname is then the name of the transport format file
which SAS will create, not a directory as is usually the case.
Suppose we wanted to create a SAS transport data file named
move.survey from a SAS data set named save.results.
libname save "/my/sas/dir";
libname move xport "move.survey";
proc copy in=save out=move;
select results;
run;
If you transfer the transport data set using a program like ftp,
make sure that you use binary (image) mode to transfer
the file.
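On the receiving computer the process is reversed. A sketch, assuming the transport file was transferred as move.survey and that /my/new/dir is a hypothetical directory for the restored data set:

```sas
/* the xport engine reads the transport file created on the other machine */
libname move xport "move.survey";
/* hypothetical directory that will hold the native data set */
libname new "/my/new/dir";

proc copy in=move out=new;
run;
```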

proc transpose
Occasionally it is useful to switch the roles of variables and
observations in a data set. The transpose procedure takes
care of this task.
To understand the workings of proc transpose, consider a data
set with four observations and three variables (x, y and z). Here’s
the transformation proc transpose performs:
Original data           Transposed data
 X   Y   Z              _NAME_  COL1  COL2  COL3  COL4
12  19  14                X      12    21    33    14
21  15  19      =>        Y      19    15    27    32
33  27  82                Z      14    19    82    99
14  32  99
The real power of proc transpose becomes apparent when it’s
used with a by statement.

proc transpose with a by statement
When a by statement is used with proc transpose, a variety of
manipulations which normally require programming can be
achieved automatically.
For example, consider a data set with several observations for each
subject, similar to a previous example:
subj time x
1 1 12
1 2 15
1 3 19
2 1 17
2 3 14
3 1 21
3 2 15
3 3 18
Notice that there is no observation for subject 2 at time 2.

proc transpose with a by statement (cont’d)
To make sure proc transpose understands the structure that we
want in the output data set, an id statement is used to specify
time as the variable which defines the new variables being created.
The prefix= option controls the name of the new variables:
proc transpose data=one out=two prefix=value;
by subj;
id time;
run;
The results are shown below:
subj _NAME_ value1 value2 value3
1 x 12 15 19
2 x 17 . 14
3 x 21 15 18
Notice that the missing value for subject 2, time 2 was handled
correctly.

proc contents
Since SAS data sets cannot be read like normal files, it is important
to have tools which provide information about data sets. proc
print can show what’s in a data set, but it may not always be
appropriate. The var and libname windows of the display manager
are other useful tools, but to get printed information or to
manipulate that information, you should use proc contents.
Among other information, proc contents provides the name,
type, length, format, informat and label for each variable in the
data set, as well as the creation date and time and the number of
observations. To use proc contents, specify the data set name as
the data= argument of the proc contents statement; to see the
contents of all the data sets in a directory, define an appropriate
libname for the directory, and provide a data set name of the form
libname._all_.
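Both forms are sketched below, assuming the libname save and the data set results from the transport example:

```sas
libname save "/my/sas/dir";

/* contents of a single data set */
proc contents data=save.results;
run;

/* contents of every data set in the library */
proc contents data=save._all_;
run;
```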

Options for proc contents
The short option limits the output to just a list of variable names.
The position option orders the variables by their position in the
data set, instead of the default alphabetical order. This can be
useful when working with double-dashed variable lists (for example, x--z).
The directory option provides a list of all the data sets in the
library that the specified data set comes from, along with the usual
output for the specified data set.
The nods option, when used in conjunction with a data set of the
form libname._all_, limits the output to a list of the data sets in
the specified libname, with no output for the individual data sets.
The out= option specifies the name of an output data set to
contain the information about the variables in the specified data
set(s). The program on the next slide uses this data set to write a
human readable version of a SAS data set.

Using the output data set from proc contents
%macro putdat(indata,outfile);
proc contents noprint data=&indata out=fcon; run;
data _null_;
file "tmpprog.sas";
set fcon end=last;
length form $ 8;
if _n_ = 1 then do;
put "data _null_;"/"set &indata;";
put 'file "&outfile";' / "put " @;
end;
put name @;
if type = 2 then
form = "$"||trim(left(put(length,3.)))||".";
else form = "best12.";
put form @;
if ^last then put "+1 " @;
else put ";" / "run;" ;
run;
%include "tmpprog.sas";
x 'rm tmpprog.sas';
%mend putdat;
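A call to the macro might look like the following sketch. The libname save, the data set results and the output file name results.txt are assumptions for illustration, and the rm command inside the macro means this applies to a UNIX system:

```sas
libname save "/my/sas/dir";

/* write a human-readable copy of save.results to results.txt */
%putdat(save.results, results.txt);
```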

The Display Manager
When SAS is invoked, it displays three windows to help you
interact with your programs and output:
• Program Window - for editing and submitting SAS statements
• Log Window - for viewing and saving log output
• Output Window - for viewing and saving output
Some commands which open other useful windows include:
• assist - menu driven version of SAS
• dir - shows data sets in a library
• var - shows variables in a data set
• notepad - simple text window
• options - view and change system options
• filename - view current filename assignments
• help - interactive help system
• libname - view current libname assignments

Appearance of Display Manager
OUTPUT----------------------------------------------------------------------------------+
|Command ===> |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
LOG-------------------------------------------------------------------------------------+
|File Edit View Locals Globals Help |
| |
| |
|This is version 6.11 of the SAS System. |
| |
| |
|NOTE: AUTOEXEC processing beginning; file is /app/sas611/autoexec.sas. |
| |
|NOTE: SAS initialization used: |
| real time 0.567 seconds |
| cpu time 0.421 seconds |
| |
PROGRAM EDITOR--------------------------------------------------------------------------+
|Command ===> |
| |
|00001 |
|00002 |
|00003 |
|00004 |
|00005 |
|00006 |
|00007 |
|00008 |
|00009 |
|00010 |
+---------------------------------------------------------------------------------------+

Entering Display Manager Commands
You can type display manager commands on the command line of
any display manager window. (To switch from menu bar to
command line select Globals -> Options -> Command line; to
switch back to menu bar, enter the command command.)
You can also enter display manager commands from the program
editor by surrounding them in quotes, and preceding them by dm,
provided that the display manager is active.
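For instance, the following statement, submitted from the program editor, is a common way to clear the log and output windows before a new run. It is a sketch that assumes the display manager is active:

```sas
/* switch to each window in turn and clear it */
dm 'log; clear; output; clear;';
```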
Some useful display manager commands which work in any window
include:
• clear - clear the contents of the window
• end - close the window
• endsas - end the sas session
• file "filename" - save contents of the window to filename
• prevcmd - recall previous display manager command

The Program Editor
There are a number of special display manager commands available
in the program editor.
• submit - submit all the lines in the editor to SAS
• subtop - submit the first line in the editor to SAS
• recall - place submitted statements back in the editor
• include "file" - place the contents of file in the editor
• hostedit - on UNIX systems, invoke the editor described in
the editcmd system option
Typing control-x when the cursor is in the program editor toggles
between insert and overwrite mode.
You can close the program window with the display manager
command program off.

Using the Program Editor
There are two types of commands which can be used with the
program editor
• Command line commands are entered in the Command ===>
prompt, or are typed into a window when menus are in effect.
• Line commands are entered by typing over the numbered lines
on the left hand side of the editor window. Many of the line
commands allow you to operate on multiple selected lines of
text.
In addition, any of the editor or other display manager commands
can be assigned to a function or control key, as will be explained
later.
Note: The undo command can be used to reverse the effect of
editing commands issued in the display manager.

Editor Line Commands
Commands followed by <n> optionally accept a number to act on
multiple lines.
Inserting Lines                      Deleting Lines
i<n>   insert after current line     d<n>   delete lines
ib<n>  insert before current line    dd     block delete
Moving Lines                         Copying Lines
m<n>   move lines                    c<n>   copy lines
mm     block move                    cc     block copy
Other Commands
>><n>  indent lines                  <<<n>  remove indentation
tc     connect text                  ts     split text
Type block commands on the starting and ending lines of the block
and use the b (before) or a (after) line command to mark the line before or
after which the block should be placed.

Defining Function and Control Keys
You can define function keys, control keys, and possibly other keys
depending on your operating system, through the keys window of
the display manager.
To define a function key to execute a display manager command,
enter the name of the command in the right hand field next to the
key you wish to define.
To define a function key to execute an editor line command, enter
the letter(s) corresponding to the command preceded by a colon (:)
in the right hand field.
To define a function key to insert text into the editor, precede the
text to be inserted with a tilde (~) in the right hand field.
Some display manager commands only make sense when defined
through keys. For example the command home puts the cursor on
the command line of a display manager window.

More on Function Keys
You can define a function key to perform several commands by
separating the commands with a semicolon (;).
Function keys defining display manager commands are executed
before any text which is on the command line. Thus, you can enter
arguments to display manager commands before hitting the
appropriate function key.
To set function keys without using the keys window, use the
display manager keydef command, for example:
keydef f2 rfind
Keys set in this way will only be in effect for the current session.

Cutting and Pasting
If block moves and/or copies do not satisfy your editing needs, you
can cut and paste non-rectangular blocks of text. Using these
commands generally requires that keys have been defined for the
display manager commands mark, cut, and paste, or for home (which moves
the cursor to the command line so that the commands can be typed there).
To define a range of text, issue the mark command at the beginning
and end of the text you want cut or pasted. Then issue the cut
(destructive) or store (non-destructive) command. Finally, place
the cursor at the point to which you want the text moved, and
issue the paste command.
When using cut or store, you can optionally use the append
option, which allows you to build up the contents of the paste
buffer with several distinct cuts or copies, or the buffer=name
option, to create or use a named buffer.

Searching and Changing Text
The display manager provides commands for
searching and, in the program editor or notepad, changing text.
These commands include:
• find string - search forward for string
• bfind string - search backward for string
• rfind - repeat previous find command
• change old new - change old to new
• rchange - repeat previous change command
Each of these commands takes a variety of options:
Scope of Search: next, first, last, prev, all
Component of Search: word, prefix, suffix
Case Independence: icase
If there are blanks or special symbols in any of the strings,
surround the entire string in single or double quotes.

Using the Find and Change Commands
• To change every occurrence of the string “sam” to “fred”,
ignoring the case of the first string, enter
change sam fred all icase
• To selectively change the word cat to dog, use
change cat dog word
followed by repeated use of rfind, to find the next occurrence
of the word, and rchange if a change is desired.
• To count the number of occurrences of the word fish, use
find fish all
and the count will be displayed under the command line.
If an area of text is marked (using the display manager mark
command), then find and/or change commands apply only to the
marked region.

Customizing the Display Manager
Key definitions entered through the keys window are automatically
stored from session to session.
The color display manager command allows you to customize the
colors of various components of SAS windows. The sascolor
window allows you to change colors through a window.
To save the geometry and location of display manager windows,
issue the display manager command wsave from within a given
window, or wsave all to save for all windows.