Store-to-Load Forwarding

Background

The Pentium® 4 processor implements 24 Store Buffers (sometimes referred to as the Store Forwarding Buffers) into which store data is written and held pending its ultimate delivery to the cache or to system memory. If Hyper-Threading is enabled, the Store Buffers are partitioned into two groups of 12 buffers each, with each group dedicated to handling stores performed by one of the logical processors.

When a store is executed, it is posted in the Store Buffer that the Allocator reserved for it. The store cannot be performed to the cache (if it is a write to cacheable memory) or to system memory (if it is a write to uncacheable memory) until the instruction is retired. The processor has a very deep ...
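
The buffer counts and the retire-before-write rule described above can be sketched as a toy software model. This is only an illustration, not a description of the actual hardware: the class and method names (StoreBufferModel, allocate, execute, retire, drain) are hypothetical, a dictionary stands in for the cache or system memory, and the oldest-first drain order and the behavior when no buffer is free are simplifying assumptions. Only the figures of 24 buffers, 12 per logical processor with Hyper-Threading enabled, and the rule that a store is not performed until its instruction retires come from the text.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

TOTAL_STORE_BUFFERS = 24  # figure stated in the text


def buffers_per_logical_processor(hyper_threading: bool) -> int:
    """Store Buffers available to one logical processor."""
    # With Hyper-Threading enabled the 24 buffers are split into two groups of 12.
    return TOTAL_STORE_BUFFERS // 2 if hyper_threading else TOTAL_STORE_BUFFERS


@dataclass
class StoreBufferEntry:
    address: Optional[int] = None
    data: Optional[int] = None
    executed: bool = False  # store has been posted in the buffer
    retired: bool = False   # owning instruction has retired


class StoreBufferModel:
    """Hypothetical model of one logical processor's Store Buffers."""

    def __init__(self, hyper_threading: bool = False) -> None:
        self.capacity = buffers_per_logical_processor(hyper_threading)
        self.entries: List[StoreBufferEntry] = []  # oldest first
        self.memory: Dict[int, int] = {}           # stands in for cache or system memory

    def allocate(self) -> Optional[StoreBufferEntry]:
        """Reserve a buffer for a store (the Allocator's role in the text)."""
        if len(self.entries) >= self.capacity:
            return None  # assumption: no free buffer, so allocation must wait
        entry = StoreBufferEntry()
        self.entries.append(entry)
        return entry

    def execute(self, entry: StoreBufferEntry, address: int, data: int) -> None:
        """Post the store in its reserved buffer."""
        entry.address = address
        entry.data = data
        entry.executed = True

    def retire(self, entry: StoreBufferEntry) -> None:
        """Mark the store's instruction as retired."""
        entry.retired = True

    def drain(self) -> int:
        """Perform stores that are both posted and retired; return how many."""
        committed = 0
        # Assumption: stores leave the buffers oldest first.
        while self.entries and self.entries[0].executed and self.entries[0].retired:
            entry = self.entries.pop(0)
            self.memory[entry.address] = entry.data
            committed += 1
        return committed
```

In this sketch, calling drain() after execute() but before retire() performs nothing, which mirrors the rule that a posted store stays in its buffer until the instruction is retired.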
