Chapter 22 Computer Memory and Data Structures

This chapter dovetails with the one on computer performance. It will describe in more detail the way a computer program's memory is laid out and how data is encoded in memory. It will then move on to some of the most important data structures in common use and explain how their physical layout gives rise to their performance characteristics.

To make things concrete, this chapter uses the C language. C is a low-level language that gives you fine-grained control over how a program uses memory. The main Python interpreter, like much of the world's most important code, is written in C because C lets you write very efficient programs. This chapter is not a crash course in C, but it will give you enough to understand how the key data structures are implemented and how they form the basis of Python.

22.1 Virtual Memory, the Stack, and the Heap

One of the most important jobs of an operating system is to allow multiple different processes on the computer to share the same physical RAM. It does this by providing each process with a “virtual address space” (VAS), which it can use to store the data it is operating on. The process can refer to data in any location from 0 to 2^32 - 1 in 32-bit operating systems and 0 to 2^64 - 1 in 64-bit operating systems. Each location contains exactly 1 byte of data, and the finite range of valid addresses puts a hard (but very large) upper limit on the amount ...
