Handling Requests: A Simple Introduction

The most important function in a block driver is the request function, which performs the low-level operations related to reading and writing data. This section discusses the basic design of the request procedure.

The Request Queue

When the kernel schedules a data transfer, it adds the request to a list, ordered so as to maximize system performance. The queue of requests is then passed to the driver’s request function, which has the following prototype:

void request_fn(request_queue_t *queue);

The request function should perform the following tasks for each request in the queue:

  1. Check the validity of the request. This test is performed by the macro INIT_REQUEST, defined in blk.h; the test consists of looking for problems that could indicate a bug in the system’s request queue handling.

  2. Perform the actual data transfer. The CURRENT variable (a macro, actually) can be used to retrieve the details of the current request. CURRENT is a pointer to struct request, whose fields are described in the next section.

  3. Clean up the request just processed. This operation is performed by end_request, a static function whose code resides in blk.h. end_request handles the management of the request queue and wakes up processes waiting on the I/O operation. It also manages the CURRENT variable, ensuring that it points to the next unsatisfied request. The driver passes the function a single argument, which is 1 in case of success and 0 in case of failure. When end_request is called with an argument of 0, an “I/O error” message is delivered to the system logs (via printk).

  4. Loop back to the beginning, to consume the next request.

Based on the previous description, a minimal request function, which does not actually transfer any data, would look like this:

void sbull_request(request_queue_t *q)
{
    while(1) {
        INIT_REQUEST;
        printk(KERN_ALERT "request %p: cmd %i sec %lu (nr. %lu)\n", CURRENT,
               CURRENT->cmd,
               CURRENT->sector,
               CURRENT->current_nr_sectors);
        end_request(1); /* success */
    }
}

Although this code does nothing but print messages, running this function provides good insight into the basic design of data transfer. It also demonstrates a couple of features of the macros defined in <linux/blk.h>. The first is that, although the while loop looks like it will never terminate, the fact is that the INIT_REQUEST macro performs a return when the request queue is empty. The loop thus iterates over the queue of outstanding requests and then returns from the request function. Second, the CURRENT macro always describes the request to be processed. We get into the details of CURRENT in the next section.
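The hidden return is easier to see outside the kernel. What follows is a minimal user-space sketch, not the code from <linux/blk.h>: every mock_ and MOCK_ name is a hypothetical stand-in invented for this illustration, and the queue is reduced to a singly linked list. It shows only the control flow described above, in which a macro returns from the enclosing function once the queue is empty.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space stand-in for struct request (illustration only). */
struct mock_request {
    int cmd;
    unsigned long sector;
    unsigned long current_nr_sectors;
    struct mock_request *next;
};

static struct mock_request *mock_queue_head; /* pending requests, in order */
static int mock_completed;                   /* requests finished so far */

/* Plays the role of CURRENT: always the request at the head of the queue. */
#define MOCK_CURRENT (mock_queue_head)

/* Like INIT_REQUEST as described above, this hides a return statement. */
#define MOCK_INIT_REQUEST \
    do { if (MOCK_CURRENT == NULL) return; } while (0)

/* Plays the role of end_request: report a failure, then advance the queue. */
static void mock_end_request(int uptodate)
{
    assert(MOCK_CURRENT != NULL);
    if (!uptodate)
        fprintf(stderr, "I/O error, sector %lu\n", MOCK_CURRENT->sector);
    mock_queue_head = mock_queue_head->next;
    mock_completed++;
}

static void mock_request_fn(void)
{
    while (1) {
        MOCK_INIT_REQUEST; /* leaves the function once the queue is empty */
        printf("request %p: cmd %d sec %lu (nr. %lu)\n",
               (void *)MOCK_CURRENT, MOCK_CURRENT->cmd,
               MOCK_CURRENT->sector, MOCK_CURRENT->current_nr_sectors);
        mock_end_request(1); /* success */
    }
}

/* Queue three requests, run the loop, and return how many were completed. */
static int mock_run_demo(void)
{
    static struct mock_request r[3] = {
        { 0, 0, 2, NULL }, { 1, 2, 2, NULL }, { 0, 8, 4, NULL }
    };

    r[0].next = &r[1];
    r[1].next = &r[2];
    r[2].next = NULL;
    mock_queue_head = &r[0];
    mock_completed = 0;
    mock_request_fn();
    return mock_completed;
}
```

Calling mock_run_demo() from a main function prints one line per queued request and returns 3; the while(1) loop ends only because the macro returns.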

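A block driver using the request function just shown will actually work, but only for a short while. It is possible to make a filesystem on the device and access it for as long as the data remains in the system’s buffer cache; because the function never stores anything, data written to the device is lost once its buffers leave the cache.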

This empty (but verbose) function can still be run in sbull by defining the symbol SBULL_EMPTY_REQUEST at compile time. If you want to understand how the kernel handles different block sizes, you can experiment with blksize= on the insmod command line. The empty request function shows the internal workings of the kernel by printing the details of each request.

The request function has one very important constraint: it must be atomic. It is not usually called in direct response to user requests, and it does not run in the context of any particular process. It can be called at interrupt time, from tasklets, or from any number of other places. Thus, it must not sleep while carrying out its tasks.

Performing the Actual Data Transfer

To understand how to build a working request function for sbull, let’s look at how the kernel describes a request within a struct request. The structure is defined in <linux/blkdev.h>. By accessing the fields in the request structure, usually by way of CURRENT, the driver can retrieve all the information needed to transfer data between the buffer cache and the physical block device.[48] CURRENT is just a pointer into blk_dev[MAJOR_NR].request_queue. The following fields of a request hold information that is useful to the request function:

kdev_t rq_dev;

The device accessed by the request. By default, a single request function serves every device (that is, every minor number) managed by the driver; rq_dev can be used to extract the minor device being acted upon. The CURRENT_DEV macro is simply defined as DEVICE_NR(CURRENT->rq_dev).

int cmd;

This field describes the operation to be performed; it is either READ (from the device) or WRITE (to the device).

unsigned long sector;

The number of the first sector to be transferred in this request.

unsigned long current_nr_sectors;
unsigned long nr_sectors;

The number of sectors to transfer for the current request. The driver should refer to current_nr_sectors and ignore nr_sectors (which is listed here just for completeness). See Section 12.4.2 later in this chapter for more detail on nr_sectors.

char *buffer;

The area in the buffer cache to which data should be written (cmd==READ) or from which data should be read (cmd==WRITE).

struct buffer_head *bh;

The structure describing the first buffer in the list for this request. Buffer heads are used in the management of the buffer cache; we’ll look at them in detail shortly in Section 12.4.1.1.

There are other fields in the structure, but they are primarily meant for internal use in the kernel; the driver is not expected to use them.
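To make the fields concrete, here is a small user-space sketch of how a driver might turn them into a device index and a byte range. It rests on assumptions that the text above does not state: the device number is taken to be 16 bits wide with the minor number in the low 8 bits, DEVICE_NR is taken to map the minor number straight to a device index, and the sector size is passed in by the caller. All fld_ and FLD_ names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a 16-bit device number with the minor in the low 8 bits. */
#define FLD_MINORBITS 8
#define FLD_MINOR(dev) ((unsigned int)((dev) & ((1U << FLD_MINORBITS) - 1)))

/* Assumption: the driver maps each minor number straight to a device index. */
#define FLD_DEVICE_NR(dev) FLD_MINOR(dev)

/* Hypothetical subset of the request fields listed above. */
struct fld_request {
    unsigned short rq_dev;
    unsigned long sector;
    unsigned long current_nr_sectors;
};

/* What the driver needs to know before it can move any data. */
struct fld_span {
    unsigned int devno;   /* which device the request addresses */
    unsigned long offset; /* first byte to transfer */
    unsigned long length; /* number of bytes to transfer */
};

static struct fld_span fld_span_of(const struct fld_request *req,
                                   unsigned long hardsect)
{
    struct fld_span span;

    assert(req != NULL && hardsect > 0);
    span.devno = FLD_DEVICE_NR(req->rq_dev);
    span.offset = req->sector * hardsect;
    span.length = req->current_nr_sectors * hardsect;
    return span;
}
```

With 512-byte sectors, a request for minor number 2 that starts at sector 10 and covers 2 sectors maps to device index 2, byte offset 5120, and a length of 1024 bytes.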

The implementation of the working request function in the sbull device is shown here. In the following code, the Sbull_Dev structure serves the same function as Scull_Dev, introduced in Section 3.6 in Chapter 3.

void sbull_request(request_queue_t *q)
{
    Sbull_Dev *device;
    int status;

    while(1) {
        INIT_REQUEST;  /* returns when queue is empty */

        /* Which "device" are we using? */
        device = sbull_locate_device (CURRENT);
        if (device == NULL) {
            end_request(0);
            continue;
        }

        /* Perform the transfer and clean up. */
        spin_lock(&device->lock);
        status = sbull_transfer(device, CURRENT);
        spin_unlock(&device->lock);
        end_request(status);
    }
}

This code looks little different from the empty version shown earlier; it concerns itself with request queue management and pushes off the real work to other functions. The first, sbull_locate_device, looks at the device number in the request and finds the right Sbull_Dev structure:

static Sbull_Dev *sbull_locate_device(const struct request *req)
{
    int devno;
    Sbull_Dev *device;

    /* Check if the minor number is in range */
    devno = DEVICE_NR(req->rq_dev);
    if (devno >= sbull_devs) {
        static int count = 0;
        if (count++ < 5) /* print the message at most five times */
            printk(KERN_WARNING "sbull: request for unknown device\n");
        return NULL;
    }
    device = sbull_devices + devno; /* Pick it out of device array */
    return device;
}

The only “strange” feature of the function is the conditional statement that limits it to reporting five errors. This is intended to avoid clobbering the system logs with too many messages, since end_request(0) already prints an “I/O error” message when the request fails. The static counter is a standard way to limit message reporting and is used several times in the kernel.
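The static-counter idiom can be tried in isolation. The following user-space sketch uses fprintf in place of printk; the names warn_limited, count_printed, and MAX_WARNINGS are made up for this illustration and do not come from sbull.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_WARNINGS 5 /* same limit as in sbull_locate_device */

/* Print msg unless the limit has been reached; return 1 if it was printed. */
static int warn_limited(const char *msg)
{
    static int count = 0; /* keeps its value from one call to the next */

    assert(msg != NULL);
    if (count++ < MAX_WARNINGS) {
        fprintf(stderr, "%s\n", msg);
        return 1;
    }
    return 0;
}

/* Issue the same warning several times and count how many got through. */
static int count_printed(int attempts)
{
    int printed = 0;
    int i;

    for (i = 0; i < attempts; i++)
        printed += warn_limited("sbull: request for unknown device");
    return printed;
}
```

Because the counter is static, it is shared by every call: the first eight attempts print five messages, and any later attempt prints nothing.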

The actual I/O of the request is handled by sbull_transfer:

static int sbull_transfer(Sbull_Dev *device, const struct request *req)
{
    int size;
    u8 *ptr;

    ptr = device->data + req->sector * sbull_hardsect;
    size = req->current_nr_sectors * sbull_hardsect;

    /* Make sure that the transfer fits within the device. */
    if (ptr + size > device->data + sbull_blksize*sbull_size) {
        static int count = 0;
        if (count++ < 5)
            printk(KERN_WARNING "sbull: request past end of device\n");
        return 0;
    }

    /* Looks good, do the transfer. */
    switch(req->cmd) {
        case READ:
            memcpy(req->buffer, ptr, size); /* from sbull to buffer */
            return 1;
        case WRITE:
            memcpy(ptr, req->buffer, size); /* from buffer to sbull */
            return 1;
        default:
            /* can't happen */
            return 0;
    }
}

Since sbull is just a RAM disk, its “data transfer” reduces to a memcpy call.
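The same idea can be exercised in user space. The sketch below is not the sbull code: the demo_ and DEMO_ names, the 512-byte sector size, and the 8-sector disk are assumptions chosen for illustration. It also differs from sbull_transfer in one respect: the range check compares sector counts rather than pointers, so no out-of-range pointer is ever formed.

```c
#include <assert.h>
#include <string.h>

#define DEMO_HARDSECT 512UL /* assumed sector size, in bytes */
#define DEMO_NSECTORS 8UL   /* assumed device size, in sectors */

enum { DEMO_READ = 0, DEMO_WRITE = 1 }; /* illustrative command values */

static unsigned char demo_disk[DEMO_HARDSECT * DEMO_NSECTORS];

/* Return 1 on success and 0 on failure, as end_request expects. */
static int demo_transfer(int cmd, unsigned long sector,
                         unsigned long nr_sectors, char *buffer)
{
    unsigned char *ptr;
    size_t size;

    /* Make sure that the transfer fits within the device. */
    if (sector > DEMO_NSECTORS || nr_sectors > DEMO_NSECTORS - sector)
        return 0;

    assert(buffer != NULL);
    ptr = demo_disk + sector * DEMO_HARDSECT;
    size = nr_sectors * DEMO_HARDSECT;

    switch (cmd) {
    case DEMO_READ:
        memcpy(buffer, ptr, size); /* from disk to buffer */
        return 1;
    case DEMO_WRITE:
        memcpy(ptr, buffer, size); /* from buffer to disk */
        return 1;
    default:
        return 0; /* unknown command */
    }
}

/* Write one sector, read it back, and return 1 if the contents match. */
static int demo_roundtrip(void)
{
    char out[DEMO_HARDSECT];
    char in[DEMO_HARDSECT];

    memset(out, 'A', sizeof out);
    memset(in, 0, sizeof in);
    if (!demo_transfer(DEMO_WRITE, 3, 1, out))
        return 0;
    if (!demo_transfer(DEMO_READ, 3, 1, in))
        return 0;
    return memcmp(in, out, sizeof out) == 0;
}
```

A request that starts at sector 7 and asks for 2 sectors runs past the end of the 8-sector disk, so demo_transfer rejects it before it touches the buffer.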



[48] Actually, not all blocks passed to a block driver need be in the buffer cache, but that’s a topic beyond the scope of this chapter.
