For more information about DATA step match-merges, see SAS Statements: Reference.
Using Subqueries to Select Data
A table join combines multiple tables into a new table. A subquery (enclosed in
parentheses) selects rows from one table based on values in another table. A subquery, or
inner query, is a query expression that is nested as part of another query expression.
Depending on the clause that contains it, a subquery can return a single value or multiple
values. Subqueries are most often used in the WHERE and the HAVING expressions.
Single-Value Subqueries
A single-value subquery returns a single row and column. It can be used in a WHERE or
HAVING clause with a comparison operator. The subquery must return only one value,
or else the query fails and an error message is printed to the log.
This query uses a subquery in its WHERE clause to select US states that have a
population greater than Belgium. The subquery is evaluated first, and then it returns the
population of Belgium to the outer query.
libname sql 'SAS-library';
proc sql;
title 'U.S. States with Population Greater than Belgium';
select Name 'State' , population format=comma10.
from sql.unitedstates
where population gt
(select population from sql.countries
where name = "Belgium");
Internally, this is what the query looks like after the subquery has executed:
proc sql;
title 'U.S. States with Population Greater than Belgium';
select Name 'State', population format=comma10.
from sql.unitedstates
where population gt 10162614;
The outer query lists the states whose populations are greater than the population of
Belgium.
96 Chapter 3 Retrieving Data from Multiple Tables
Output 3.25 Single-Value Subquery
Multiple-Value Subqueries
A multiple-value subquery can return more than one value from one column. It is used in
a WHERE or HAVING expression that contains IN or a comparison operator that is
modified by ANY or ALL. This example displays the populations of oil-producing
countries. The subquery first returns all countries that are found in the OilProd table. The
outer query then matches countries in the Countries table to the results of the subquery.
libname sql 'SAS-library';
proc sql outobs=5;
title 'Populations of Major Oil Producing Countries';
select name 'Country', Population format=comma15.
from sql.countries
where Name in
(select Country from sql.oilprod);
Output 3.26 Multiple-Value Subquery Using IN
If you use the NOT IN operator in this query, then the query result will contain all the
countries that are not contained in the OilProd table.
Using Subqueries to Select Data 97
libname sql 'SAS-library';
proc sql outobs=5;
title 'Populations of NonMajor Oil Producing Countries';
select name 'Country', Population format=comma15.
from sql.countries
where Name not in
(select Country from sql.oilprod);
Output 3.27 Multiple-Value Subquery Using NOT IN
Correlated Subqueries
The previous subqueries have been simple subqueries that are self-contained and that
execute independently of the outer query. A correlated subquery requires a value or
values to be passed to it by the outer query. After the subquery runs, it passes the results
back to the outer query. Correlated subqueries can return single or multiple values.
This example selects all major oil reserves of countries on the continent of Africa.
libname sql 'SAS-library';
proc sql;
title 'Oil Reserves of Countries in Africa';
select * from sql.oilrsrvs o
where 'Africa' =
(select Continent from sql.countries c
where c.Name = o.Country);
The outer query selects the first row from the OilRsrvs table and then passes the value of
the Country column, Algeria, to the subquery. At this point, the subquery internally
looks like this:
(select Continent from sql.countries c
where c.Name = 'Algeria');
The subquery selects that country from the Countries table. The subquery then passes the
country's continent back to the WHERE clause in the outer query. If the continent is
Africa, then the country is selected and displayed. The outer query then selects each
subsequent row from the OilRsrvs table and passes the individual values of Country to
the subquery. The subquery returns the appropriate values of Continent to the outer
query for comparison in its WHERE clause.
98 Chapter 3 Retrieving Data from Multiple Tables

Get SAS 9.4 SQL Procedure User's Guide, Fourth Edition, 4th Edition now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.