Cache Coherence

Cache effects are probably the most significant factor limiting the scalability of shared-memory multiprocessors. The various forms of processor cache, including Translation Lookaside Buffers (TLBs), code and data caches, and branch prediction tables, all play a critical role in the performance of pipelined machines like the Pentium, Pentium Pro, Pentium II, and Pentium III. For the sake of performance, each CPU in a multiprocessor configuration retains its own private cache memory, as depicted in Figure 5-10. We have seen that multiple threads executing inside the Windows 2000 kernel or running device driver code concurrently can attempt to access the same memory locations. When one processor changes a memory location held in its local cache, that change must be propagated to every other processor that holds its own private copy of the same shared-memory location. This is known as the cache coherence problem in shared-memory multiprocessors, and it also has significant performance ramifications.
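The stale-copy hazard described above can be illustrated with a toy model. The sketch below is not how Pentium-family hardware maintains coherence. It is a minimal simulation that assumes a write-through private cache per CPU and an optional invalidate-on-write step. The names `ToyCPU` and `run` are hypothetical and exist only for this illustration.

```python
class ToyCPU:
    """Hypothetical CPU with a private cache in front of shared memory."""

    def __init__(self, name, memory, coherent):
        self.name = name
        self.memory = memory      # shared dict: address -> value
        self.cache = {}           # private cache: address -> value
        self.coherent = coherent  # True: invalidate peer copies on write
        self.peers = []           # other CPUs sharing the same memory

    def read(self, addr):
        # Cache miss: fetch the value from shared memory.
        if addr not in self.cache:
            self.cache[addr] = self.memory.get(addr, 0)
        # Cache hit: return the private copy, which may be stale.
        return self.cache[addr]

    def write(self, addr, value):
        # Assumed write-through policy: update the cache and memory together.
        self.cache[addr] = value
        self.memory[addr] = value
        if self.coherent:
            # Invalidate every peer's private copy of this address.
            for peer in self.peers:
                peer.cache.pop(addr, None)


def run(coherent):
    """CPU B caches an address, CPU A overwrites it, then B reads it again."""
    memory = {0x10: 1}
    a = ToyCPU("A", memory, coherent)
    b = ToyCPU("B", memory, coherent)
    a.peers, b.peers = [b], [a]

    b.read(0x10)         # B now holds a private copy of the value 1
    a.write(0x10, 2)     # A changes the shared location
    return b.read(0x10)  # what does B see?


print("without coherence:", run(coherent=False))  # B still sees the old value 1
print("with invalidation:", run(coherent=True))   # B refetches and sees 2
```

Without the invalidation step, CPU B keeps returning the value it cached before CPU A's write. With it, B's next read misses and refetches from memory, which is also where the extra cost of coherence shows up.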

Maintaining cache coherence in a shared-memory multiprocessor is absolutely necessary for programs to execute correctly. Although program execution threads operate independently of each other for the most part, sometimes they must interact. Whenever they read and write common or shared-memory data structures, threads must communicate and coordinate accesses to these memory locations. This coordination inevitably has performance ...
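As a rough software-level illustration of threads coordinating access to a shared data structure, the sketch below serializes updates to a shared counter with a lock. It uses Python's standard `threading` module rather than any Windows 2000 kernel primitive, and the function name `count_with_lock` and its parameters are hypothetical.

```python
import threading


def count_with_lock(num_threads=4, increments=10000):
    """Have several threads increment one shared counter under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Only one thread at a time may read-modify-write the counter.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(count_with_lock())  # 40000: every increment is preserved
```

Every acquisition of the lock is a point where threads may have to wait for one another, which is one concrete form the cost of coordination takes.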
