Using Resources

A module can’t accomplish its task without using system resources such as memory, I/O ports, I/O memory, and interrupt lines, as well as DMA channels if you use old-fashioned DMA controllers like the Industry Standard Architecture (ISA) one.

As a programmer, you are already accustomed to managing memory allocation; writing kernel code is no different in this regard. Your program obtains a memory area using kmalloc and releases it using kfree. These functions behave like malloc and free, except that kmalloc takes an additional argument, the priority. Usually, a priority of GFP_KERNEL or GFP_USER will do. The GFP acronym stands for “get free page.” (Memory allocation is covered in detail in Chapter 7.)

Beginning driver programmers may initially be surprised at the need to allocate I/O ports, I/O memory,[11] and interrupt lines explicitly. After all, it is possible for a kernel module to simply access these resources without telling the operating system about it. Although system memory is anonymous and may be allocated from anywhere, I/O memory, ports, and interrupts have very specific roles. For instance, a driver needs to be able to allocate the exact ports it needs, not just some ports. But drivers cannot just go about making use of these system resources without first ensuring that they are not already in use elsewhere.

I/O Ports and I/O Memory

The job of a typical driver is, for the most part, writing and reading I/O ports and I/O memory. Access to I/O ports and I/O memory (collectively called I/O regions) happens both at initialization time and during normal operations.

Unfortunately, not all bus architectures offer a clean way to identify I/O regions belonging to each device, and sometimes the driver must guess where its I/O regions live, or even probe for the devices by reading and writing to “possible” address ranges. This problem is especially acute on the ISA bus, which is still used for simple devices that plug into a personal computer and is very popular in the industrial world in its PC/104 implementation (see Section 15.3 in Chapter 15).

Regardless of the features (or lack of features) of the bus being used by a hardware device, the device driver should be guaranteed exclusive access to its I/O regions in order to prevent interference from other drivers. For example, if a module probing for its hardware should happen to write to ports owned by another device, weird things would undoubtedly happen.

The developers of Linux chose to implement a request/free mechanism for I/O regions, mainly as a way to prevent collisions between different devices. The mechanism has long been in use for I/O ports and was recently generalized to manage resource allocation at large. Note that this mechanism is just a software abstraction that helps system housekeeping, and may or may not be enforced by hardware features. For example, unauthorized access to I/O ports doesn’t produce any error condition equivalent to “segmentation fault”—the hardware can’t enforce port registration.

Information about registered resources is available in text form in the files /proc/ioports and /proc/iomem, although the latter was only introduced during 2.3 development. We’ll discuss version 2.4 now, introducing portability issues at the end of the chapter.

Ports

A typical /proc/ioports file on a recent PC that is running version 2.4 of the kernel will look like the following:

 0000-001f : dma1
 0020-003f : pic1
 0040-005f : timer
 0060-006f : keyboard
 0080-008f : dma page reg
 00a0-00bf : pic2
 00c0-00df : dma2
 00f0-00ff : fpu
 0170-0177 : ide1
 01f0-01f7 : ide0
 02f8-02ff : serial(set)
 0300-031f : NE2000
 0376-0376 : ide1
 03c0-03df : vga+
 03f6-03f6 : ide0
 03f8-03ff : serial(set)
 1000-103f : Intel Corporation 82371AB PIIX4 ACPI
  1000-1003 : acpi
  1004-1005 : acpi
  1008-100b : acpi
  100c-100f : acpi
 1100-110f : Intel Corporation 82371AB PIIX4 IDE
 1300-131f : pcnet_cs
 1400-141f : Intel Corporation 82371AB PIIX4 ACPI
 1800-18ff : PCI CardBus #02
 1c00-1cff : PCI CardBus #04
 5800-581f : Intel Corporation 82371AB PIIX4 USB
 d000-dfff : PCI Bus #01
  d000-d0ff : ATI Technologies Inc 3D Rage LT Pro AGP-133

Each entry in the file specifies (in hexadecimal) a range of ports locked by a driver or owned by a hardware device. In earlier versions of the kernel the file had the same format, but without the “layered” structure that is shown through indentation.
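
As a rough illustration of the format just described, the following user-space sketch parses one line of /proc/ioports into its range, owner, and nesting level. The type io_entry and the function parse_io_line are hypothetical names introduced here, and the rule that each nesting level adds two spaces of indentation is an assumption drawn from the listing above, not a documented guarantee.

```c
#include <stdio.h>
#include <string.h>

/* One parsed line of /proc/ioports (hypothetical helper type). */
struct io_entry {
    unsigned long start;   /* first port in the range */
    unsigned long end;     /* last port in the range (inclusive) */
    int depth;             /* nesting level inferred from indentation */
    char owner[64];        /* text after the " : " separator */
};

/* Parse a single line; returns 0 on success, -1 if the line is malformed. */
static int parse_io_line(const char *line, struct io_entry *e)
{
    int spaces = 0;

    while (line[spaces] == ' ')
        spaces++;
    /* Assumption: each nesting level adds two spaces of indentation. */
    e->depth = spaces / 2;

    /* Both range bounds are hexadecimal; the owner runs to end of line. */
    if (sscanf(line + spaces, "%lx-%lx : %63[^\n]",
               &e->start, &e->end, e->owner) != 3)
        return -1;
    return e->end >= e->start ? 0 : -1;
}
```

Feeding the function each line read from the file with fgets would be enough to rebuild the layered view in memory. The same format is used by /proc/iomem, so the sketch should apply there as well.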

The file can be used to avoid port collisions when a new device is added to the system and an I/O range must be selected by moving jumpers: the user can check what ports are already in use and set up the new device to use an available I/O range. Although you might object that most modern hardware doesn’t use jumpers any more, the issue is still relevant for custom devices and industrial components.

But what is more important than the ioports file itself is the data structure behind it. When the software driver for a device initializes itself, it can find out what port ranges are already in use; if the driver needs to probe I/O ports to detect the new device, it will be able to avoid probing those ports that are already in use by other drivers.
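
The idea of skipping ranges that are already taken can be sketched in user space as a plain overlap test. This is only a model of the reasoning: inside the kernel a driver would ask the registry itself through the functions introduced below, and the names port_range, ranges_overlap, and first_free_candidate are hypothetical.

```c
#include <stddef.h>

/* An inclusive range of I/O ports, as listed in /proc/ioports. */
struct port_range {
    unsigned long start;
    unsigned long end;
};

/* Two inclusive ranges overlap when each one starts before the other ends. */
static int ranges_overlap(unsigned long s1, unsigned long e1,
                          unsigned long s2, unsigned long e2)
{
    return s1 <= e2 && s2 <= e1;
}

/*
 * Return the index of the first candidate base address whose window
 * [base, base + len - 1] touches no busy range, or -1 if none is free.
 */
static int first_free_candidate(const unsigned long *bases, size_t nbases,
                                unsigned long len,
                                const struct port_range *busy, size_t nbusy)
{
    size_t i, j;

    if (len == 0)
        return -1;
    for (i = 0; i < nbases; i++) {
        unsigned long end = bases[i] + len - 1;

        for (j = 0; j < nbusy; j++)
            if (ranges_overlap(bases[i], end, busy[j].start, busy[j].end))
                break;
        if (j == nbusy)     /* no busy range was hit */
            return (int)i;
    }
    return -1;
}
```

With the sample file shown earlier, a 0x20-port device offered the bases 0x300, 0x310, and 0x320 would have to skip the first two, because both windows collide with the NE2000 range at 0300-031f.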

ISA probing is in fact a risky task, and several drivers distributed with the official Linux kernel refuse to perform probing when loaded as modules, to avoid the risk of destroying a running system by poking around in ports where some yet-unknown hardware may live. Fortunately, modern (as well as old-but-well-thought-out) bus architectures are immune to all these problems.

The programming interface used to access the I/O registry is made up of three functions:

 int check_region(unsigned long start, unsigned long len);
 struct resource *request_region(unsigned long start,
                                 unsigned long len, char *name);
 void release_region(unsigned long start, unsigned long len);

check_region may be called to see if a range of ports is available for allocation; it returns a negative error code (such as -EBUSY or -EINVAL) if the answer is no. request_region will actually allocate the port range, returning a non-NULL pointer value if the allocation succeeds. Drivers don’t need to use or save the actual pointer returned—checking against NULL is all you need to do.[12] Code that needs to work only with 2.4 kernels need not call check_region at all; in fact, it’s better not to, since things can change between the calls to check_region and request_region. If you want to be portable to older kernels, however, you must use check_region because request_region used to return void before 2.4. Your driver should call release_region, of course, to release the ports when it is done with them.

The three functions are actually macros, and they are declared in <linux/ioport.h>.

The typical sequence for registering ports is the following, as it appears in the skull sample driver. (The function skull_probe_hw is not shown here because it contains device-specific code.)

#include <linux/ioport.h>
#include <linux/errno.h>

static int skull_detect(unsigned int port, unsigned int range)
{
 int err;

 if ((err = check_region(port, range)) < 0)
  return err; /* busy */
 if (skull_probe_hw(port, range) != 0)
  return -ENODEV; /* not found */
 request_region(port, range, "skull"); /* "Can't fail" */
 return 0;
}

This code first looks to see if the required range of ports is available; if the ports cannot be allocated, there is no point in looking for the hardware. The actual allocation of the ports is deferred until after the device is known to exist. The request_region call should never fail; the kernel only loads a single module at a time, so there should not be a problem with other modules slipping in and stealing the ports during the detection phase. Paranoid code can check, but bear in mind that kernels prior to 2.4 define request_region as returning void.
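
For code that targets only 2.4, where request_region returns a pointer, one possible ordering is to request the range first, probe second, and release the range if the probe fails. The sketch below models that ordering in user space so that it can be compiled and exercised outside the kernel. Everything prefixed with model_ is a hypothetical stand-in for the real kernel call, fake_hw_base is a made-up hardware address, and skull_detect_24 is not part of the skull sources.

```c
#include <errno.h>
#include <stddef.h>

/* --- User-space stand-ins (hypothetical) for the kernel's port registry --- */
#define MAX_REGIONS 16

struct region {
    unsigned long start, len;
    const char *name;
    int used;
};

static struct region registry[MAX_REGIONS];

/* Mimics the 2.4 contract described above: non-NULL on success, NULL if busy. */
static struct region *model_request_region(unsigned long start,
                                           unsigned long len, const char *name)
{
    struct region *slot = NULL;
    int i;

    if (len == 0)
        return NULL;
    for (i = 0; i < MAX_REGIONS; i++) {
        struct region *r = &registry[i];

        if (!r->used) {
            if (!slot)
                slot = r;           /* remember the first free slot */
            continue;
        }
        if (start < r->start + r->len && r->start < start + len)
            return NULL;            /* overlaps a registered range */
    }
    if (!slot)
        return NULL;                /* toy registry is full */
    slot->start = start;
    slot->len = len;
    slot->name = name;
    slot->used = 1;
    return slot;
}

static void model_release_region(unsigned long start, unsigned long len)
{
    int i;

    for (i = 0; i < MAX_REGIONS; i++)
        if (registry[i].used && registry[i].start == start &&
            registry[i].len == len)
            registry[i].used = 0;
}

/* Pretend the device answers at exactly one base address. */
static unsigned long fake_hw_base = 0x300;

static int model_probe_hw(unsigned long port, unsigned long range)
{
    (void)range;
    return port == fake_hw_base ? 0 : -1;
}

/* Request first, probe second, release on failure. */
static int skull_detect_24(unsigned long port, unsigned long range)
{
    if (!model_request_region(port, range, "skull"))
        return -EBUSY;              /* somebody else owns the ports */
    if (model_probe_hw(port, range) != 0) {
        model_release_region(port, range);
        return -ENODEV;             /* not found: give the ports back */
    }
    return 0;
}
```

Holding the range while probing closes the window between the check and the allocation, at the cost of one extra release on the failure path.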

Any I/O ports allocated by the driver must eventually be released; skull does it from within cleanup_module:

static void skull_release(unsigned int port, unsigned int range)
{
 release_region(port,range);
}

The request/free approach to resources is similar to the register/unregister sequence described earlier for facilities and fits well in the goto-based implementation scheme already outlined.

Memory

Similar to what happens for I/O ports, I/O memory information is available in the /proc/iomem file. This is a fraction of the file as it appears on a personal computer:

 00000000-0009fbff : System RAM
 0009fc00-0009ffff : reserved
 000a0000-000bffff : Video RAM area
 000c0000-000c7fff : Video ROM
 000f0000-000fffff : System ROM
 00100000-03feffff : System RAM
  00100000-0022c557 : Kernel code
  0022c558-0024455f : Kernel data
 20000000-2fffffff : Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge
 68000000-68000fff : Texas Instruments PCI1225
 68001000-68001fff : Texas Instruments PCI1225 (#2)
 e0000000-e3ffffff : PCI Bus #01
 e4000000-e7ffffff : PCI Bus #01
  e4000000-e4ffffff : ATI Technologies Inc 3D Rage LT Pro AGP-133
  e6000000-e6000fff : ATI Technologies Inc 3D Rage LT Pro AGP-133
 fffc0000-ffffffff : reserved

Once again, the values shown are hexadecimal ranges, and the string after the colon is the name of the “owner” of the I/O region.

As far as driver writing is concerned, the registry for I/O memory is accessed in the same way as for I/O ports, since they are actually based on the same internal mechanism.

To obtain and relinquish access to a certain I/O memory region, the driver should use the following calls:

 int check_mem_region(unsigned long start, unsigned long len);
 struct resource *request_mem_region(unsigned long start,
                                     unsigned long len, char *name);
 void release_mem_region(unsigned long start, unsigned long len);

A typical driver will already know its own I/O memory range, and the sequence shown previously for I/O ports will reduce to the following:

 if (check_mem_region(mem_addr, mem_size)) {
  printk("drivername: memory already in use\n");
  return -EBUSY;
 }
 request_mem_region(mem_addr, mem_size, "drivername");

Resource Allocation in Linux 2.4

The current resource allocation mechanism was introduced in Linux 2.3.11 and provides a flexible way of controlling system resources. This section briefly describes the mechanism. However, the basic resource allocation functions (request_region and the rest) are still implemented (via macros) and are still universally used because they are backward compatible with earlier kernel versions. Most module programmers will not need to know about what is really happening under the hood, but those working on more complex drivers may be interested.

Linux resource management is able to control arbitrary resources, and it can do so in a hierarchical manner. Globally known resources (the range of I/O ports, say) can be subdivided into smaller subsets—for example, the resources associated with a particular bus slot. Individual drivers can then further subdivide their range if need be.

Resource ranges are described via a resource structure, declared in <linux/ioport.h>:

 struct resource {
  const char *name;
  unsigned long start, end;
  unsigned long flags;
  struct resource *parent, *sibling, *child;
 };

Top-level (root) resources are created at boot time. For example, the resource structure describing the I/O port range is created as follows:

 struct resource ioport_resource = 
    { "PCI IO", 0x0000, IO_SPACE_LIMIT, IORESOURCE_IO };

Thus, the name of the resource is PCI IO, and it covers a range from zero through IO_SPACE_LIMIT, which, according to the hardware platform being run, can be 0xffff (16 bits of address space, as happens on the x86, IA-64, Alpha, M68k, and MIPS), 0xffffffff (32 bits: SPARC, PPC, SH) or 0xffffffffffffffff (64 bits: SPARC64).

Subranges of a given resource may be created with allocate_resource. For example, during PCI initialization a new resource is created for a region that is actually assigned to a physical device. When the PCI code reads those port or memory assignments, it creates a new resource for just those regions, and allocates them under ioport_resource or iomem_resource.

A driver can then request a subset of a particular resource (actually a subrange of a global resource) and mark it as busy by calling __request_region, which returns a pointer to a new struct resource data structure that describes the resource being requested (or returns NULL in case of error). The structure is already part of the global resource tree, and the driver is not allowed to use it at will.

An interested reader may enjoy looking at the details by browsing the source in kernel/resource.c and looking at the use of the resource management scheme in the rest of the kernel. Most driver writers, however, will be more than adequately served by request_region and the other functions introduced in the previous section.

This layered mechanism brings a couple of benefits. One is that it makes the I/O structure of the system apparent within the data structures of the kernel. The result shows up in /proc/ioports, for example:

 e800-e8ff : Adaptec AHA-2940U2/W / 7890
  e800-e8be : aic7xxx

The range e800-e8ff is allocated to an Adaptec card, which has identified itself to the PCI bus driver. The aic7xxx driver has then requested most of that range—in this case, the part corresponding to real ports on the card.

The other advantage to controlling resources in this way is that it partitions the port space into distinct subranges that reflect the hardware of the underlying system. Since the resource allocator will not allow an allocation to cross subranges, it can block a buggy driver (or one looking for hardware that does not exist on the system) from allocating ports that belong to more than one range, even if some of those ports are unallocated at the time.
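
The layered behavior can be modeled with a small user-space resource tree. This is a simplified sketch and not the code in kernel/resource.c: the toy_ names and the TOY_BUSY flag are hypothetical, and only the structure layout follows the declaration shown earlier. A request descends into a non-busy subrange that contains it, and fails if it hits a busy range or does not fit entirely inside one subrange.

```c
#include <stddef.h>
#include <stdlib.h>

#define TOY_BUSY 0x1UL   /* stand-in for the kernel's "busy" flag */

struct toy_resource {
    const char *name;
    unsigned long start, end;   /* inclusive bounds */
    unsigned long flags;
    struct toy_resource *parent, *sibling, *child;
};

/*
 * Try to link 'new' as a direct child of 'root', keeping children sorted.
 * Returns NULL on success, or the resource in the way ('root' itself when
 * 'new' does not fit inside it).
 */
static struct toy_resource *toy_insert(struct toy_resource *root,
                                       struct toy_resource *new)
{
    struct toy_resource **p, *tmp;

    if (new->end < new->start ||
        new->start < root->start || new->end > root->end)
        return root;
    p = &root->child;
    for (;;) {
        tmp = *p;
        if (!tmp || tmp->start > new->end) {
            new->sibling = tmp;     /* free gap found: link it in */
            *p = new;
            new->parent = root;
            return NULL;
        }
        p = &tmp->sibling;
        if (tmp->end < new->start)
            continue;               /* entirely before 'new': keep going */
        return tmp;                 /* overlap */
    }
}

/* Allocate and mark busy [start, start + n - 1]; NULL on any conflict. */
static struct toy_resource *toy_request_region(struct toy_resource *parent,
                                               unsigned long start,
                                               unsigned long n,
                                               const char *name)
{
    struct toy_resource *res, *conflict;

    if (n == 0)
        return NULL;
    res = calloc(1, sizeof(*res));
    if (!res)
        return NULL;
    res->name = name;
    res->start = start;
    res->end = start + n - 1;
    res->flags = TOY_BUSY;

    for (;;) {
        conflict = toy_insert(parent, res);
        if (!conflict)
            return res;
        if (conflict != parent && !(conflict->flags & TOY_BUSY)) {
            parent = conflict;      /* descend into the non-busy subrange */
            continue;
        }
        free(res);                  /* busy, or crosses a subrange boundary */
        return NULL;
    }
}
```

In this model, with subranges e800-e8ff and e900-e9ff linked under a root covering 0000-ffff, a request for 0xbf ports at e800 lands under the first subrange as e800-e8be, while a request for 0x20 ports at e8f0 is refused because it would straddle both subranges.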



[11] The memory areas that reside on the peripheral device are commonly called I/O memory to differentiate them from system RAM, which is customarily called memory.

[12] The actual pointer is used only when the function is called internally by the resource management subsystem of the kernel.