Initialization and Shutdown

As already mentioned, init_module registers any facility offered by the module. By facility, we mean a new functionality, be it a whole driver or a new software abstraction, that can be accessed by an application.

Modules can register many different types of facilities; for each facility, there is a specific kernel function that accomplishes this registration. The arguments passed to the kernel registration functions are usually a pointer to a data structure describing the new facility and the name of the facility being registered. The data structure usually embeds pointers to module functions, which is how functions in the module body get called.
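The calling convention just described can be imitated in user space. The following is only a sketch: struct facility_ops, register_facility, open_facility, and the registry table are hypothetical names invented for illustration, not kernel interfaces. It shows a registration call that takes a pointer to a descriptor and a name, and how the functions in the module body are later reached through the pointers embedded in that descriptor.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical descriptor: the module fills in pointers to its own functions. */
struct facility_ops {
    int (*open)(void);
    int (*release)(void);
};

#define MAX_FACILITIES 4

/* Hypothetical registry standing in for a kernel-side table. */
struct facility_slot {
    const char *name;
    const struct facility_ops *ops;
};
struct facility_slot registry[MAX_FACILITIES];

/* Stand-in for a registration function: a descriptor pointer plus a name. */
int register_facility(const struct facility_ops *ops, const char *name)
{
    int i, free_slot = -1;

    for (i = 0; i < MAX_FACILITIES; i++) {
        if (registry[i].name == NULL) {
            if (free_slot < 0)
                free_slot = i;
        } else if (strcmp(registry[i].name, name) == 0) {
            return -EBUSY;      /* the name is already registered */
        }
    }
    if (free_slot < 0)
        return -ENOMEM;         /* the table is full */
    registry[free_slot].name = name;
    registry[free_slot].ops = ops;
    return 0;
}

/* The "kernel" side reaches module code only through registered pointers. */
int open_facility(const char *name)
{
    int i;

    for (i = 0; i < MAX_FACILITIES; i++)
        if (registry[i].name && strcmp(registry[i].name, name) == 0)
            return registry[i].ops->open();
    return -ENODEV;
}

/* Module side: functions in the module body and the structure embedding them. */
int skull_opens;                /* counts calls, for demonstration only */
int skull_open(void) { skull_opens++; return 0; }
int skull_release(void) { return 0; }
const struct facility_ops skull_ops = { skull_open, skull_release };
```

After register_facility(&skull_ops, "skull") succeeds, a call to open_facility("skull") ends up in skull_open, while a second registration under the same name is refused with -EBUSY.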

The items that can be registered exceed the list of device types mentioned in Chapter 1. They include serial ports, miscellaneous devices, /proc files, executable domains, and line disciplines. Many of those registrable items support functions that aren’t directly related to hardware but remain in the “software abstractions” field. Those items can be registered because they are integrated into the driver’s functionality anyway (/proc files and line disciplines, for example).

There are other facilities that can be registered as add-ons for certain drivers, but their use is so specific that it’s not worth talking about them; they use the stacking technique, as described earlier in Section 2.3. If you want to probe further, you can grep for EXPORT_SYMBOL in the kernel sources and find the entry points offered by different drivers. Most registration functions are prefixed with register_, so another possible way to find them is to grep for register_ in /proc/ksyms.

Error Handling in init_module

If any errors occur when you register facilities, you must undo any registration activities performed before the failure. An error can happen, for example, if there isn’t enough memory in the system to allocate a new data structure or because a resource being requested is already being used by other drivers. Though unlikely, it might happen, and good program code must be prepared to handle this event.

Linux doesn’t keep a per-module registry of facilities that have been registered, so the module must back out of everything itself if init_module fails at some point. If you ever fail to unregister what you obtained, the kernel is left in an unstable state: you can’t register your facilities again by reloading the module because they will appear to be busy, and you can’t unregister them because you’d need the same pointer you used to register and you’re not likely to be able to figure out the address. Recovery from such situations is tricky, and you’ll often be forced to reboot in order to be able to load a newer revision of your module.

Error recovery is sometimes best handled with the goto statement. We normally hate to use goto, but in our opinion this is one situation (well, the only situation) where it is useful. In the kernel, goto is often used as shown here to deal with errors.

The following sample code (using fictitious registration and unregistration functions) behaves correctly if initialization fails at any point.

 int init_module(void)
 {
  int err;

  /* registration takes a pointer and a name */
  err = register_this(ptr1, "skull");
  if (err) goto fail_this;
  err = register_that(ptr2, "skull");
  if (err) goto fail_that;
  err = register_those(ptr3, "skull");
  if (err) goto fail_those;

  return 0; /* success */

  fail_those: unregister_that(ptr2, "skull");
  fail_that: unregister_this(ptr1, "skull");
  fail_this: return err; /* propagate the error */
 }

This code attempts to register three (fictitious) facilities. The goto statement is used in case of failure to cause the unregistration of only the facilities that had been successfully registered before things went bad.
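To watch the unwinding happen, the same control flow can be exercised in user space with stubs. This is a sketch under stated assumptions: fake_register, fake_unregister, fail_step, and try_init are hypothetical helpers made up for the demonstration, and the three steps stand in for register_this, register_that, and register_those.

```c
#include <errno.h>

int registered[3];      /* registered[i] is 1 while facility i is registered */
int unregister_calls;   /* how many rollback calls the last attempt made */
int fail_step = -1;     /* step forced to fail; -1 means no failure */

/* Hypothetical stand-in for a registration function. */
static int fake_register(int i)
{
    if (i == fail_step)
        return -EBUSY;
    registered[i] = 1;
    return 0;
}

/* Hypothetical stand-in for the matching unregistration function. */
static void fake_unregister(int i)
{
    registered[i] = 0;
    unregister_calls++;
}

/* Same shape as the sample init_module: each label undoes one earlier step. */
int init_sketch(void)
{
    int err;

    err = fake_register(0);
    if (err) goto fail_this;
    err = fake_register(1);
    if (err) goto fail_that;
    err = fake_register(2);
    if (err) goto fail_those;

    return 0; /* success */

fail_those: fake_unregister(1);
fail_that:  fake_unregister(0);
fail_this:  return err; /* propagate the error */
}

/* Reset the bookkeeping, force a failure at the given step, and run init. */
int try_init(int step)
{
    registered[0] = registered[1] = registered[2] = 0;
    unregister_calls = 0;
    fail_step = step;
    return init_sketch();
}

int count_registered(void)
{
    return registered[0] + registered[1] + registered[2];
}
```

Calling try_init with 0, 1, or 2 fails at that step and leaves nothing registered, with zero, one, or two rollback calls respectively; try_init(-1) succeeds and leaves all three facilities registered.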

Another option, requiring no hairy goto statements, is keeping track of what has been successfully registered and calling cleanup_module in case of any error. The cleanup function will only unroll the steps that have been successfully accomplished. This alternative, however, requires more code and more CPU time, so in fast paths you’ll still resort to goto as the best error-recovery tool.

The return value of init_module, err, is an error code. In the Linux kernel, error codes are negative numbers belonging to the set defined in <linux/errno.h>. If you want to generate your own error codes instead of returning what you get from other functions, you should include <linux/errno.h> in order to use symbolic values such as -ENODEV, -ENOMEM, and so on. It is always good practice to return appropriate error codes, because user programs can turn them into meaningful strings using perror or similar means. (However, it’s interesting to note that several versions of modutils returned a “Device busy” message for any error returned by init_module; the problem has only been fixed in recent releases.)
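The negative-error-code convention can be tried out in user space, where the same symbolic names come from <errno.h>. In this sketch, fake_init and describe_error are hypothetical functions written for illustration; strerror is the standard C library call that yields the kind of message perror would print.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical initialization routine following the kernel convention:
 * 0 on success, a negative errno value on failure.
 */
int fake_init(int have_memory, int have_device)
{
    if (!have_memory)
        return -ENOMEM;
    if (!have_device)
        return -ENODEV;
    return 0;
}

/* Turn a kernel-style return value into a human-readable string. */
const char *describe_error(int err)
{
    if (err >= 0)
        return "success";
    return strerror(-err);   /* strerror expects the positive errno value */
}
```

Note the sign flip: the code returns -ENOMEM, but strerror (like perror, through errno) works on the positive value.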

Obviously, cleanup_module must undo any registration performed by init_module, and it is customary (but not mandatory) to unregister facilities in the reverse order used to register them:

 void cleanup_module(void)
 {
  unregister_those(ptr3, "skull");
  unregister_that(ptr2, "skull");
  unregister_this(ptr1, "skull");
  return;
 }

If your initialization and cleanup are more complex than dealing with a few items, the goto approach may become difficult to manage, because all the cleanup code must be repeated within init_module, with several labels intermixed. Sometimes, therefore, a different layout of the code proves more successful.

What you’d do to minimize code duplication and keep everything streamlined is to call cleanup_module from within init_module whenever an error occurs. The cleanup function, then, must check the status of each item before undoing its registration. In its simplest form, the code looks like the following:

 struct something *item1;
 struct somethingelse *item2;
 int stuff_ok;

 void cleanup_module(void)
 {
  if (item1)
   release_thing(item1);
  if (item2)
   release_thing2(item2);
  if (stuff_ok)
   unregister_stuff();
  return;
 }

 int init_module(void)
 {
  int err = -ENOMEM;

  item1 = allocate_thing(arguments);
  item2 = allocate_thing2(arguments2);
  if (!item1 || !item2)
   goto fail;
  err = register_stuff(item1, item2);
  if (!err)
   stuff_ok = 1;
  else
   goto fail;
  return 0; /* success */ 
   
  fail:
  cleanup_module();
  return err;
 }

As shown in this code, you may or may not need external flags to mark success of the initialization step, depending on the semantics of the registration/allocation function you call. Whether or not flags are needed, this kind of initialization scales well to a large number of items and is often better than the technique shown earlier.

The Usage Count

The system keeps a usage count for every module in order to determine whether the module can be safely removed. The system needs this information because a module can’t be unloaded if it is busy: you can’t remove a filesystem type while the filesystem is mounted, and you can’t drop a char device while a process is using it, or you’ll experience some sort of segmentation fault or kernel panic when wild pointers get dereferenced.

In modern kernels, the system can automatically track the usage count for you, using a mechanism that we will see in the next chapter. There are still times, however, when you will need to adjust the usage count manually. Code that must be portable to older kernels must still use manual usage count maintenance as well. To work with the usage count, use these three macros:

MOD_INC_USE_COUNT

Increments the count for the current module

MOD_DEC_USE_COUNT

Decrements the count

MOD_IN_USE

Evaluates to true if the count is not zero

The macros are defined in <linux/module.h>, and they act on internal data structures that shouldn’t be accessed directly by the programmer. The internals of module management changed a lot during 2.1 development and were completely rewritten in 2.1.18, but the use of these macros did not change.

Note that there’s no need to check for MOD_IN_USE from within cleanup_module, because the check is performed by the system call sys_delete_module (defined in kernel/module.c) in advance.

Proper management of the module usage count is critical for system stability. Remember that the kernel can decide to try to unload your module at absolutely any time. A common module programming error is to start a series of operations (in response, say, to an open request) and increment the usage count at the end. If the kernel unloads the module halfway through those operations, chaos is ensured. To avoid this kind of problem, you should call MOD_INC_USE_COUNT before doing almost anything else in a module.
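The ordering rule can be illustrated with a user-space sketch. The three macros here are stand-ins that we define ourselves on a plain integer; the real ones come from <linux/module.h> and act on kernel-internal structures. The functions skull_dev_open, skull_dev_release, and module_in_use are hypothetical.

```c
#include <errno.h>

/* Stand-ins for the real macros, defined here only so the sketch runs. */
int use_count;
#define MOD_INC_USE_COUNT (use_count++)
#define MOD_DEC_USE_COUNT (use_count--)
#define MOD_IN_USE        (use_count != 0)

/* Hypothetical open method: take the reference before doing anything else. */
int skull_dev_open(int hardware_ok)
{
    MOD_INC_USE_COUNT;          /* first, so the module can't vanish mid-open */

    if (!hardware_ok) {
        MOD_DEC_USE_COUNT;      /* give the reference back on failure */
        return -ENODEV;
    }
    return 0;
}

/* Hypothetical release method: drop the reference as the last step. */
int skull_dev_release(void)
{
    MOD_DEC_USE_COUNT;
    return 0;
}

int module_in_use(void)
{
    return MOD_IN_USE;
}
```

The point of the layout is that every exit path from the open method leaves the count balanced: a failed open decrements what it incremented, and a successful one is matched later by the release method.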

You won’t be able to unload a module if you lose track of the usage count. This situation may very well happen during development, so you should keep it in mind. For example, if a process gets destroyed because your driver dereferenced a NULL pointer, the driver won’t be able to close the device, and the usage count won’t fall back to zero. One possible solution is to completely disable the usage count during the debugging cycle by redefining both MOD_INC_USE_COUNT and MOD_DEC_USE_COUNT to no-ops. Another solution is to use some other method to force the counter to zero (you’ll see this done in Section 5.1.4 in Chapter 5). Sanity checks should never be circumvented in a production module. For debugging, however, sometimes a brute-force attitude helps save development time and is therefore acceptable.
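Redefining the macros to no-ops might look like the following sketch. SKULL_DEBUG_NOCOUNT is a hypothetical symbol chosen for this example, and the initial macro definitions are user-space stand-ins so that the fragment compiles on its own; in a real module the originals would come from <linux/module.h>.

```c
/* Stand-ins for the real macros, present only to make the sketch compile. */
int dbg_use_count;
#define MOD_INC_USE_COUNT (dbg_use_count++)
#define MOD_DEC_USE_COUNT (dbg_use_count--)

/* Hypothetical debugging switch; never define it in a production build. */
#define SKULL_DEBUG_NOCOUNT 1

#ifdef SKULL_DEBUG_NOCOUNT
#  undef  MOD_INC_USE_COUNT
#  define MOD_INC_USE_COUNT do { } while (0)   /* no-op while debugging */
#  undef  MOD_DEC_USE_COUNT
#  define MOD_DEC_USE_COUNT do { } while (0)   /* no-op while debugging */
#endif

/* Hypothetical driver methods: their source does not change at all. */
int debug_open(void)
{
    MOD_INC_USE_COUNT;
    return 0;
}

int debug_release(void)
{
    MOD_DEC_USE_COUNT;
    return 0;
}
```

With the switch defined, the count never moves, so a crashed process cannot leave it stuck above zero; removing the one #define restores normal accounting.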

The current value of the usage count is found in the third field of each entry in /proc/modules. This file shows the modules currently loaded in the system, with one entry for each module. The fields are the name of the module, the number of bytes of memory it uses, and the current usage count. This is a typical /proc/modules file:

parport_pc    7604 1 (autoclean)
lp      4800 0 (unused)
parport     8084 1 [parport_probe parport_pc lp]
lockd     33256 1 (autoclean)
sunrpc     56612 1 (autoclean) [lockd]
ds      6252 1 
i82365     22304 1 
pcmcia_core   41280 0 [ds i82365]

Here we see several modules in the system. Among other things, the parallel port modules have been loaded in a stacked manner, as we saw in Figure 2-2. The (autoclean) marker identifies modules managed by kmod or kerneld (see Chapter 11); the (unused) marker means exactly that. Other flags exist as well. In Linux 2.0, the second (size) field was expressed in pages (4 KB each on most platforms) rather than bytes.
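A program that wants the usage count can pull the first three fields out of each line. The following is a minimal sketch, assuming the layout shown above (name, size, usage count, then optional flags); struct mod_entry, parse_modules_line, and usage_count_of are hypothetical names.

```c
#include <stdio.h>

/* Hypothetical record for the three leading fields of a /proc/modules line. */
struct mod_entry {
    char name[64];
    unsigned long size;     /* bytes (pages in Linux 2.0) */
    int use_count;
};

/* Parse one line; returns 0 on success, -1 if the fields are missing. */
int parse_modules_line(const char *line, struct mod_entry *entry)
{
    if (sscanf(line, "%63s %lu %d",
               entry->name, &entry->size, &entry->use_count) != 3)
        return -1;
    return 0;
}

/* Convenience wrapper: the usage count of a line, or -1 on a parse error. */
int usage_count_of(const char *line)
{
    struct mod_entry entry;

    if (parse_modules_line(line, &entry) < 0)
        return -1;
    return entry.use_count;
}
```

Anything after the third field, such as (autoclean) or the bracketed list of dependent modules, is simply ignored by this parser.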

Unloading

To unload a module, use the rmmod command. Its task is much simpler than loading, since no linking has to be performed. The command invokes the delete_module system call, which calls cleanup_module in the module itself if the usage count is zero or returns an error otherwise.

The cleanup_module implementation is in charge of unregistering every item that was registered by the module. Only the exported symbols are removed automatically.

Explicit Initialization and Cleanup Functions

As we have seen, the kernel calls init_module to initialize a newly loaded module, and calls cleanup_module just before module removal. In modern kernels, however, these functions often have different names. As of kernel 2.3.13, a facility exists for explicitly naming the module initialization and cleanup routines; using this facility is the preferred programming style.

Consider an example. If your module names its initialization routine my_init (instead of init_module) and its cleanup routine my_cleanup, you would mark them with the following two lines (usually at the end of the source file):

 module_init(my_init);
 module_exit(my_cleanup);

Note that your code must include <linux/init.h> to use module_init and module_exit.

The advantage of doing things this way is that each initialization and cleanup function in the kernel can have a unique name, which helps with debugging. These functions also make life easier for those writing drivers that work either as a module or built directly into the kernel. However, use of module_init and module_exit is not required if your initialization and cleanup functions use the old names. In fact, for modules, the only thing they do is define init_module and cleanup_module as new names for the given functions.
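How a macro can give a function a second name is easy to try in user space. The sketch below assumes a GCC-compatible compiler on an ELF platform such as Linux, since it relies on the alias function attribute. The two macro definitions are our own simplified stand-ins, not copies of the ones in <linux/init.h>, which may differ in detail; is_loaded is a hypothetical helper.

```c
/* Simplified stand-ins: declare the old name as an alias for the new one. */
#define module_init(fn) int init_module(void) __attribute__((alias(#fn)))
#define module_exit(fn) void cleanup_module(void) __attribute__((alias(#fn)))

static int loaded;   /* set while the pretend module is initialized */

static int my_init(void)
{
    loaded = 1;
    return 0;
}

static void my_cleanup(void)
{
    loaded = 0;
}

/* After these two lines, init_module and cleanup_module exist as symbols. */
module_init(my_init);
module_exit(my_cleanup);

int is_loaded(void)
{
    return loaded;
}
```

Calling init_module() here runs my_init, and calling cleanup_module() runs my_cleanup, even though neither old name appears as a function definition in the source.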

If you dig through the kernel source (in versions 2.2 and later), you will likely see a slightly different form of declaration for module initialization and cleanup functions, which looks like the following:

 static int __init my_init(void)
 {
  ....
 }

 static void __exit my_cleanup(void)
 {
  ....
 }

The attribute __init, when used in this way, will cause the initialization function to be discarded, and its memory reclaimed, after initialization is complete. It only works, however, for built-in drivers; it has no effect on modules. __exit, instead, causes the omission of the marked function when the driver is not built as a module; again, in modules, it has no effect.

The use of __init (and __initdata for data items) can reduce the amount of memory used by the kernel. There is no harm in marking module initialization functions with __init, even though currently there is no benefit either. Management of initialization sections has not been implemented yet for modules, but it’s a possible enhancement for the future.
