False Sharing

The atomic unit of a cache is a line. A typical cache line may hold a large number of bytes. A 128-byte cache line is typical. When a 4-byte integer is loaded from main memory, it is not loaded in isolation. The whole line containing it is loaded into the cache at once. Similarly, when that integer value is invalidated by another cache (running on a different processor), the whole cache line is invalidated. It follows that the physical memory layout of variables can play a role in SMP scalability.

Take the HTStats class discussed earlier. If you eliminate the smpDmz character array, you will end up with the two locks close to one another as in:

 class HTStats { int httpReqs; int httpBytes; pthread_mutex_t lockHttp; int sslReqs; ...

Get Efficient C++ Performance Programming Techniques now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.