Chapter 18
Data Storage and Retrieval
Retrieving, processing, and storing data is what computing is all about. This chapter looks at two different recipes that take on this task. The first processes HTML documents to identify and use any links they contain. This is not as easy as it first appears, so the recipe covers some of the work needed to ensure that no links are missed and that regular text is not mistaken for a link. The second recipe reads kernel state from the Linux kernel’s /proc pseudo-filesystem and converts it into CSV, which spreadsheet software can import and use to create graphs.
These two recipes balance each other. The first reads data that is really intended for graphical desktop software, namely a web browser. The second creates data that desktop spreadsheet software can interpret. Although purely text-based, the shell can play a part in both parsing and creating data for graphical software.