Understanding the Linux Kernel

14.2. The Page Cache

The page cache, which is thankfully much simpler than the buffer cache, is a disk cache for the data accessed by page I/O operations. As we shall see in Chapter 15, all access to regular files made by the read( ), write( ), and mmap( ) system calls goes through the page cache. The unit of information kept in the cache is a whole page, since page I/O operations transfer whole pages of data. A page does not necessarily contain physically adjacent disk blocks, so it cannot be identified by a device number and a block number. Instead, a page in the page cache is identified by a file's inode and by the offset within the file.
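To make the identification scheme concrete, here is a minimal user-space sketch of a lookup key built from a file and an offset. It is not kernel code: the names page_key, make_page_key, page_align_down, and page_key_equal are hypothetical, an inode number stands in for the inode object, and a 4 KB page size is assumed purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed page size for this sketch; real page sizes depend on the architecture. */
#define SKETCH_PAGE_SIZE 4096UL

/* Hypothetical key: which file (by inode number) and which page of it. */
struct page_key {
    uint64_t ino;     /* stands in for the file's inode */
    uint64_t offset;  /* page-aligned byte offset within the file */
};

/* Round a byte offset down to the start of the page that contains it. */
uint64_t page_align_down(uint64_t byte_offset)
{
    return byte_offset & ~(uint64_t)(SKETCH_PAGE_SIZE - 1);
}

/* Build the key for the page holding the given byte of the given file. */
struct page_key make_page_key(uint64_t ino, uint64_t byte_offset)
{
    struct page_key key = { ino, page_align_down(byte_offset) };
    return key;
}

/* Two keys name the same cached page only if both file and offset match. */
int page_key_equal(struct page_key a, struct page_key b)
{
    return a.ino == b.ino && a.offset == b.offset;
}
```

With these definitions, any two byte offsets that fall inside the same page of the same file produce equal keys, while the same offset in two different files does not. No device number or block number appears anywhere in the key.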

There are three main activities related to the page cache: adding a page when accessing a file portion not already in the cache, removing a page when the cache gets too big, and finding the page that includes a given file offset.

14.2.1. Page Cache Data Structures

The page cache makes use of two main data structures:

A page hash table: Lets the kernel quickly derive the page descriptor address for the page associated with a specified inode and file offset
An inode queue: A list of page descriptors corresponding to pages of data of a particular file (distinguished by a unique inode)

Manipulation of the page cache involves adding and removing entries from these data structures, as well as updating the fields in all inode objects referencing cached files.
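The interplay of the two structures can be sketched in user-space C. This is an illustration under stated assumptions, not the kernel's implementation: every identifier (sketch_page, sketch_inode, sketch_hash, and the three functions) is hypothetical, the hash function is a toy, the table size is arbitrary, and a per-inode page counter stands in for the inode fields that must be kept up to date.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT   12  /* assumes 4 KB pages */
#define SKETCH_HASH_BUCKETS 64  /* arbitrary size for this sketch */

struct sketch_inode;

/* Hypothetical page descriptor: one per cached page. */
struct sketch_page {
    struct sketch_inode *inode;     /* file that owns the page */
    uint64_t offset;                /* page-aligned offset within the file */
    struct sketch_page *hash_next;  /* next page in the same hash bucket */
    struct sketch_page *ino_next;   /* inode queue links */
    struct sketch_page *ino_prev;
};

/* Hypothetical inode: heads the queue of its cached pages. */
struct sketch_inode {
    uint64_t ino;
    struct sketch_page *pages;      /* head of the inode queue */
    size_t nr_pages;                /* inode field updated on add/remove */
};

struct sketch_page *sketch_hash[SKETCH_HASH_BUCKETS];

/* Toy hash mixing the inode number with the page number. */
unsigned sketch_bucket(const struct sketch_inode *inode, uint64_t offset)
{
    uint64_t h = inode->ino * 31u + (offset >> SKETCH_PAGE_SHIFT);
    return (unsigned)(h % SKETCH_HASH_BUCKETS);
}

/* Find: walk one hash bucket looking for a matching (inode, offset) pair. */
struct sketch_page *sketch_find_page(struct sketch_inode *inode, uint64_t offset)
{
    struct sketch_page *p = sketch_hash[sketch_bucket(inode, offset)];
    while (p && !(p->inode == inode && p->offset == offset))
        p = p->hash_next;
    return p;
}

/* Add: link the page into the hash table and into the inode queue. */
void sketch_add_page(struct sketch_inode *inode, uint64_t offset,
                     struct sketch_page *page)
{
    unsigned b = sketch_bucket(inode, offset);
    assert(sketch_find_page(inode, offset) == NULL);
    page->inode = inode;
    page->offset = offset;
    page->hash_next = sketch_hash[b];
    sketch_hash[b] = page;
    page->ino_prev = NULL;
    page->ino_next = inode->pages;
    if (inode->pages)
        inode->pages->ino_prev = page;
    inode->pages = page;
    inode->nr_pages++;
}

/* Remove: unlink the page from both structures and update the inode. */
void sketch_remove_page(struct sketch_page *page)
{
    struct sketch_inode *inode = page->inode;
    struct sketch_page **pp = &sketch_hash[sketch_bucket(inode, page->offset)];
    while (*pp && *pp != page)
        pp = &(*pp)->hash_next;
    assert(*pp == page);
    *pp = page->hash_next;
    if (page->ino_prev)
        page->ino_prev->ino_next = page->ino_next;
    else
        inode->pages = page->ino_next;
    if (page->ino_next)
        page->ino_next->ino_prev = page->ino_prev;
    inode->nr_pages--;
    page->inode = NULL;
    page->hash_next = page->ino_next = page->ino_prev = NULL;
}
```

The hash table answers the question "is this (inode, offset) pair cached?" without scanning every page of the file, while the inode queue lets code that works on a whole file visit all of its cached pages without touching the hash table. Each add or remove must keep both structures and the inode's bookkeeping consistent, which is why the sketch updates all three in a single function.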

14.2.1.1. The page hash table

When a process reads a large file, the page ...

Understanding the Linux Kernel by Daniel P. Bovet, Marco Cesati
