4.2 CACHE COHERENCE AND MEMORY CONSISTENCY

Attaching private caches to processors speeds up program execution by making memory latency match the processor speed; read/write operations then take about the same time as arithmetic and logic unit (ALU) operations. Table 4.1 summarizes the terminology used to describe cache coherence. A cache is useful because most tasks or applications display temporal locality and spatial locality. Temporal locality means that data accessed now are likely to be accessed again in the near future. Spatial locality means that data located near the currently accessed data are likely to be accessed in the near future. For this reason, data load/store operations between the shared memory and the caches take place in units of blocks. Figure 4.2 shows the relation between the blocks stored in the shared memory and their copies in the cache of a certain processor. The cache holds a subset of the blocks, each identified by a tag that records the block's address in the shared memory. Each block held in the cache occupies a row called a line. A line contains the following components (a short code sketch after this list illustrates the layout):

1. Valid bit (V), which indicates whether the data in the line are coherent with the block in the shared memory

2. Index, which is the address of the line in the cache

3. Tag, which refers to the address of the block in the shared memory

4. Data, which comprise the data stored in the block
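To make the layout concrete, the following C sketch models a cache line with the four components above and a simple read path that loads a whole block on a miss. The direct-mapped organization, the 32-bit addresses, the 64-byte block size, the 128-line cache, and all names (cache_line_t, split_address, cache_read) are illustrative assumptions, not values or interfaces given in the text.

/* Sketch of a direct-mapped cache line and read path (illustrative only). */
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE  64                /* bytes per block (assumed)          */
#define NUM_LINES   128               /* lines in the cache (assumed)       */
#define OFFSET_BITS 6                 /* log2(BLOCK_SIZE)                   */
#define INDEX_BITS  7                 /* log2(NUM_LINES)                    */

typedef struct {
    bool     valid;                   /* V: line holds a coherent copy      */
    uint32_t tag;                     /* block address in the shared memory */
    uint8_t  data[BLOCK_SIZE];        /* the block's data                   */
} cache_line_t;

static cache_line_t cache[NUM_LINES]; /* the index is the array position    */

/* Split a shared-memory address into tag, index, and byte offset. */
static void split_address(uint32_t addr, uint32_t *tag,
                          uint32_t *index, uint32_t *offset)
{
    *offset = addr & (BLOCK_SIZE - 1);
    *index  = (addr >> OFFSET_BITS) & (NUM_LINES - 1);
    *tag    = addr >> (OFFSET_BITS + INDEX_BITS);
}

/* Read one byte: a hit requires a valid line whose tag matches; on a miss
 * the whole block is copied from shared memory, exploiting spatial locality. */
uint8_t cache_read(uint32_t addr, const uint8_t *shared_memory)
{
    uint32_t tag, index, offset;
    split_address(addr, &tag, &index, &offset);

    cache_line_t *line = &cache[index];
    if (!line->valid || line->tag != tag) {                 /* miss */
        uint32_t block_base = addr & ~(uint32_t)(BLOCK_SIZE - 1);
        memcpy(line->data, shared_memory + block_base, BLOCK_SIZE);
        line->tag   = tag;
        line->valid = true;
    }
    return line->data[offset];                              /* hit  */
}

In this sketch the index is implicit in the array position of the line, the tag disambiguates which shared-memory block currently occupies that line, and the block-sized memcpy on a miss reflects the block-based transfers described above.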

Table 4.1 Terminology Used to Describe Cache Coherence

Term        Meaning
Block       Group of contiguous words or data stored in shared memory
Broadcast   When information is sent to all caches
Cache       A small ...
