RAID on Linux

It’s important to understand that when I refer to a RAID array, I’m talking about a block device and not a filesystem. You can think of the relationship between the two in much the same way you might think of the relationship between a house and its foundation: if the foundation is weak, the house will eventually collapse. The filesystem, which represents the house in my analogy, is built on top of a block device. Normally, a block device is a single hard disk, but RAID introduces another layer (see Figure 1-5): it groups many block devices into a single virtual device.

Figure 1-5. Filesystems are built on block devices; RAID introduces an intermediary layer.

This means that Linux interacts with an array through a single block device that has a single major and minor number. Behind that device, the array points to many physical disks, each with its own major and minor numbers. Programmers might think of this model the same way they think of an array data type, hence the use of the word “array” in the RAID acronym.
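To make the array analogy concrete, here is a minimal sketch of how a single virtual device could translate a logical block number into a position on one of its member disks. It assumes the simplest layout, in which members are concatenated end to end (as in linear mode). The function name `locate_block` and the member sizes are hypothetical, chosen only for illustration; this is not how the kernel driver is written.

```python
from bisect import bisect_right
from itertools import accumulate


def locate_block(member_sizes, logical_block):
    """Map a logical block on a concatenated array to
    (member_index, block_within_member).

    member_sizes is a list of member sizes in blocks (hypothetical values).
    """
    if logical_block < 0 or logical_block >= sum(member_sizes):
        raise IndexError("block lies outside the array")
    # Cumulative end offsets, e.g. sizes [100, 250, 50] -> [100, 350, 400]
    ends = list(accumulate(member_sizes))
    # The first member whose end offset is greater than the block holds it
    member = bisect_right(ends, logical_block)
    start = ends[member - 1] if member else 0
    return member, logical_block - start
```

With members of 100, 250, and 50 blocks, logical block 100 is the first block of the second member, so `locate_block([100, 250, 50], 100)` returns `(1, 0)`. The caller sees one 400-block device and never needs to know which disk holds a given block.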

Each piece of hardware connected to a Linux system is assigned a major and a minor number. The major number refers to a specific group of hardware (such as Small Computer System Interface, or SCSI, disks), while the minor number uniquely identifies each installed piece of hardware within the group (for example, each individual SCSI disk). Since RAID is merely an intermediary layer, and because it works just like any other block device, you can build any type of filesystem on top of it. When working with Linux’s RAID implementation, you can even build arrays on top of other arrays, or combine arrays with other types of storage management such as Logical Volume Management (LVM).
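If you want to see these numbers for yourself, the following sketch reads them from a device file using Python’s standard `os` and `stat` modules. The helper names `split_device_number` and `device_numbers` are my own, and any path you pass in (for example, a disk such as /dev/sda) is an assumption about your system rather than something guaranteed to exist.

```python
import os
import stat


def split_device_number(rdev):
    """Split a combined device number into (major, minor)."""
    return os.major(rdev), os.minor(rdev)


def device_numbers(path):
    """Return (major, minor) for the block device file at path.

    Raises ValueError if path is not a block device.
    """
    st = os.stat(path)
    if not stat.S_ISBLK(st.st_mode):
        raise ValueError(f"{path} is not a block device")
    # st_rdev holds the device number that the device file refers to
    return split_device_number(st.st_rdev)
```

Calling `device_numbers()` on an array’s device file and then on each of its member disks should show the model described above: one major and minor pair for the array, and a separate pair for every disk behind it.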

The Linux device names for accessing software RAID devices are designated md. While you might assume that md stands for metadevice, that’s incorrect (although the abbreviation is used that way by many people). The md in Linux software RAID actually refers to the kernel subsystem that handles arrays: the multiple devices driver. /dev/md[0-255] represents the default block devices used for accessing software RAID on Linux, allowing a total of 256 software RAID devices on a single Linux system.
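Note that /dev/md[0-255] is shorthand for a numeric range (/dev/md0 through /dev/md255) and not a shell glob, where the brackets would match single characters instead. As a small sketch, the range can be expanded and checked against the device files actually present. The function names here are hypothetical, and the result of the second function depends entirely on the system where it runs.

```python
import os

MD_DEVICE_COUNT = 256  # /dev/md0 through /dev/md255, per the text above


def md_device_paths(count=MD_DEVICE_COUNT):
    """Return the default md device paths: /dev/md0 ... /dev/md(count-1)."""
    return [f"/dev/md{n}" for n in range(count)]


def present_md_devices():
    """Return the default md paths that currently exist on this system."""
    return [path for path in md_device_paths() if os.path.exists(path)]
```

On a machine with no software RAID configured, `present_md_devices()` may simply return an empty list.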

RAID under Linux is available as part of the kernel. The kernel supports five different RAID levels: linear mode, striping (RAID-0), mirroring (RAID-1), RAID-4, and RAID-5. The RAID subsystem can be compiled statically into the kernel or used as a loadable module. Chapter 3 covers software RAID implementation under Linux. If you are already familiar with RAID from an architectural standpoint, you can skip ahead to Chapter 3 and start rebuilding your kernel.

With the popularity of Linux increasing daily, many manufacturers have begun to release Linux drivers for hardware RAID cards and offer full-scale technical support for such RAID cards. Many of these companies have gone one step further and released drivers that are open source (http://opensource.org). Some companies that have not been kind enough to release drivers have still released technical information about their hardware that has allowed open source developers to write drivers. This growing industry support allows Linux, and open source, to more effectively compete with commercial systems and legacy operating systems.

Linux professionals have done considerable work to bring high-performance, open source filesystems to Linux. These filesystems include IBM’s Journaled File System (JFS), ext3, SGI’s XFS, and ReiserFS. However, improving the performance and reliability of a filesystem can be a wasted effort if equal consideration is not given to the block devices on which these filesystems are built. Likewise, you’d be foolish to spend your time building a reliable, high-performance RAID system without considering the filesystem that you are going to use. In fact, in many cases, limitations of filesystems like ext2 will prevent you from fully realizing the potential of a RAID device. Chapter 6 provides a brief overview of some high-performance filesystems.
