Keep your programs from performing tasks they weren’t meant to do.
One of the more exciting new features in NetBSD and OpenBSD is systrace, a system call access manager. With systrace, a system administrator can specify which programs can make which system calls, and how those calls can be made. Proper use of systrace can greatly reduce the risks inherent in running poorly written or exploitable programs. Systrace policies can confine users in a manner completely independent of Unix permissions. You can even define the errors that the system calls return when access is denied, to allow programs to fail in a more proper manner. Proper use of systrace requires a practical understanding of system calls and what functionality programs must have to work properly.
First of all, what exactly are system calls? A system call is a function that lets you talk to the operating-system kernel. If you want to allocate memory, open a TCP/IP port, or perform input/output on the disk, you’ll need to use a system call. System calls are documented in section 2 of the manpages.
Unix also supports a wide variety of C library calls. These are often confused with system calls but are actually just standardized routines for things that could be written within a program. For example, you could easily write a function to compute square roots within a program, but you could not write a function to allocate memory without using a system call. If you’re in doubt whether a particular function is a system call or a C library function, check the online manual.
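To make the distinction concrete, here is a minimal sketch in Python (used here only for illustration; it is not part of the original hack). It assumes CPython on a Unix-like system, where the functions in the os module are thin wrappers around the corresponding system calls, while math.sqrt() is ordinary computation that never needs the kernel.

```python
import math
import os

# Library routine: pure computation in user space, no kernel involvement.
root = math.sqrt(2.0)

# System calls: os.pipe(), os.write(), os.read(), and os.close() wrap
# pipe(2), write(2), read(2), and close(2) (assuming CPython on Unix).
read_fd, write_fd = os.pipe()
os.write(write_fd, b"hello")
data = os.read(read_fd, 5)
os.close(read_fd)
os.close(write_fd)
```

Only the second group of calls crosses into the kernel, so only those are the kind of operation a system-call policy can permit or deny.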
You may find an occasional system call that is not documented in the online manual, such as break(). You’ll need to dig into other resources to identify these calls (break() in particular is a very old system call used within libc, but not by programmers, so it seems to have escaped being documented in the manpages).
Systrace denies all actions that are not explicitly permitted and logs the rejection using syslog. If a program running under systrace has a problem, you can find out which system call the program wants to use and decide if you want to add it to your policy, reconfigure the program, or live with the error.
Systrace has several important pieces: policies, the policy generation tools, the runtime access management tool, and the sysadmin real-time interface. This hack gives a brief overview of policies; in [Hack #16], we’ll learn about the systrace tools.
The systrace(1) manpage includes a full description of the syntax used for policy descriptions, but I generally find it easier to look at some examples of a working policy and then go over the syntax in detail. Since named has been a subject of recent security discussions, let’s look at the policy that OpenBSD 3.2 provides for named.
Before reviewing the named policy, let’s review some commonly known facts about the name server daemon’s system-access requirements. Zone transfers and large queries occur on port 53/TCP, while basic lookup services are provided on port 53/UDP. OpenBSD chroots named into /var/named by default and logs everything to /var/log/messages.
Each systrace policy is stored in a file named after the full path of the program, with the slashes replaced by underscores. The policy file usr_sbin_named contains quite a few entries that allow access beyond binding to port 53 and writing to the system log. The file starts with:
# Policy for named that uses named user and chroots to /var/named
# This policy works for the default configuration of named.
Policy: /usr/sbin/named, Emulation: native
The Policy statement gives the full path to the program this policy is for. You can’t fool systrace by giving the same name to a program elsewhere on the system. The Emulation entry shows which ABI this policy is for. Remember, BSD systems expose ABIs for a variety of operating systems. Systrace can theoretically manage system-call access for any ABI, although only native and Linux binaries are supported at the moment.
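The file-naming convention mentioned earlier can be sketched in a few lines of Python. The helper name policy_filename is hypothetical, and the handling of the leading slash is an assumption inferred from the usr_sbin_named example rather than taken from the systrace documentation.

```python
def policy_filename(program_path: str) -> str:
    """Return the policy file name for an absolute program path.

    Assumption: the leading slash is dropped and every remaining slash
    becomes an underscore, as in /usr/sbin/named -> usr_sbin_named.
    """
    if not program_path.startswith("/"):
        raise ValueError("expected an absolute path")
    return program_path.lstrip("/").replace("/", "_")
```

Calling policy_filename("/usr/sbin/named") gives the file name used in this hack. Because the name is built from the full path, a binary with the same base name in another directory maps to a different policy file.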
The remaining lines define a variety of system calls that the program may or may not use. The sample policy for named includes 73 lines of system-call rules. The most basic look like this:
native-accept: permit
When /usr/sbin/named tries to use the accept() system call to accept a connection on a socket, under the native ABI, it is allowed. Other rules are far more restrictive. Here’s a rule for bind(), the system call that lets a program request a TCP/IP port to attach to:
native-bind: sockaddr match "inet-*:53" then permit
sockaddr is the name of an argument taken by the bind() system call. The match keyword tells systrace to compare the given variable with the string inet-*:53, according to the standard shell pattern-matching (globbing) rules. So, if the variable sockaddr matches the string inet-*:53, the call is permitted. This program can bind to port 53, over both TCP and UDP protocols. If an attacker had an exploit to make named attach a command prompt on a high-numbered port, this systrace policy would prevent that exploit from working.
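The effect of the match keyword can be approximated with Python's fnmatch module, which implements shell-style globbing. This is only a sketch: the helper bind_permitted is hypothetical, and the sample address strings in the usage note are assumptions about how systrace renders a socket address, not output captured from a real system.

```python
from fnmatch import fnmatchcase

# Pattern taken from the native-bind rule in the named policy.
PATTERN = "inet-*:53"


def bind_permitted(sockaddr: str) -> bool:
    """Approximate 'sockaddr match "inet-*:53" then permit'.

    Returns True when the address string matches the glob pattern.
    In systrace, a call that matches no permit rule is denied.
    """
    return fnmatchcase(sockaddr, PATTERN)
```

With assumed address strings such as "inet-[0.0.0.0]:53" and "inet-[0.0.0.0]:31337", the first matches the pattern and the second does not, which is how a bind to a high-numbered port would fall outside the rule.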
At first glance, this seems wrong:
native-chdir: filename eq "/" then permit
native-chdir: filename eq "/namedb" then permit
The eq keyword compares one string to another and requires an exact match. If the program tries to go to the root directory, or to the directory /namedb, systrace will allow it. Why would you possibly want to allow named to access the root directory? The next entry explains why:
native-chroot: filename eq "/var/named" then permit
named can use the native chroot() system call to change its root directory to /var/named, but to no other directory. At this point, the /namedb directory is actually /var/named/namedb. We also know that named logs to syslog. To do this, it will need access to /dev/log:
native-connect: sockaddr eq "/dev/log" then permit
This program can use the native connect() system call to talk to /dev/log and only /dev/log. That device hands the connections off elsewhere.
We’ll also see some entries for system calls that do not exist:
native-fsread: filename eq "/" then permit
native-fsread: filename eq "/dev/arandom" then permit
native-fsread: filename eq "/etc/group" then permit
Systrace aliases certain system calls with very similar functions into groups. You can disable this functionality with a command-line switch and only use the exact system calls you specify, but in most cases these aliases are quite useful and shrink your policies considerably. The two aliases are fsread and fswrite. fsread is an alias for stat(), lstat(), readlink(), and access() under the native and Linux ABIs. fswrite is an alias for unlink(), mkdir(), and rmdir(), in both the native and Linux ABIs.
As open() can be used to either read or write a file, it is aliased by both fsread and fswrite, depending on how it is called. So named can read certain /etc files, it can list the contents of the root directory, and it can access the groups file.
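The alias grouping described above can be written down as a small lookup. This is a sketch built only from the calls listed in this hack; the function name policy_name and the opened_for_writing flag are hypothetical, and a real systrace build may alias additional calls.

```python
# System calls grouped under each alias, as listed in the text above.
FSREAD = {"stat", "lstat", "readlink", "access"}
FSWRITE = {"unlink", "mkdir", "rmdir"}


def policy_name(syscall: str, opened_for_writing: bool = False) -> str:
    """Return the name a policy rule would use for a system call."""
    if syscall == "open":
        # open() lands in either group, depending on how it is called.
        return "fswrite" if opened_for_writing else "fsread"
    if syscall in FSREAD:
        return "fsread"
    if syscall in FSWRITE:
        return "fswrite"
    # Calls that are not aliased keep their own name.
    return syscall
```

Under this sketch, a read-only open() of /etc/group and a stat() of the same file are both covered by a single fsread rule.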
Systrace supports two optional keywords at the end of a policy statement, errorcode and log. The errorcode is the error that is returned when the program attempts to access this system call. Programs will behave differently depending on the error that they receive. named will react differently to a “permission denied” error than it will to an “out of memory” error. You can get a complete list of error codes from the errno manpage. Use the error name, not the error number. For example, here we return an error for nonexistent files:
filename sub "<non-existent filename>" then deny[enoent]
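If you want to check what a given error name means before putting it in a rule, Python's errno module exposes the same symbolic names as the errno manpage. The sketch below uses a hypothetical helper, lookup_error, and assumes the lowercase name from the policy corresponds to the uppercase C constant; the numeric value and message text are platform-dependent.

```python
import errno
import os


def lookup_error(name: str):
    """Map an error name such as 'enoent' to (number, message).

    Assumption: the name used in the policy is the errno constant
    written in lowercase. Raises AttributeError for unknown names.
    """
    number = getattr(errno, name.upper())
    return number, os.strerror(number)
```

For example, lookup_error("enoent") returns the platform's number for ENOENT along with its human-readable message, which is what the confined program will see when the rule denies the call.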
If you put the word log at the end of your rule, successful system calls will be logged. For example, if we wanted to log each time named attached to port 53, we could edit the policy statement for the bind() call to read:
native-bind: sockaddr match "inet-*:53" then permit log
You can also choose to filter rules based on user ID and group ID, as the example here demonstrates.
native-setgid: gid eq "70" then permit
This very brief overview covers the vast majority of the rules you will see. For full details on the systrace grammar, read the systrace manpage. If you want some help with creating your policies, you can also use systrace’s automated mode [Hack #16].
The original article that this hack is based on is available online at http://www.onlamp.com/pub/a/bsd/2003/01/30/Big_Scary_Daemons.html.
—Michael Lucas