Avoid Branching

Modern processors tend not to be particularly branch-friendly. In the “good old days,” one processor instruction would finish before the next was issued. If an instruction calculated the target of a branch, or set a condition code on which a branch decision depended, its result was available for immediate use by the next instruction. This made for a simple processor architecture, though generally not a particularly fast one.

Almost all modern processors are pipelined. A pipelined processor breaks instruction execution into a number of stages. A simple pipeline might have five stages: instruction fetch, instruction decode, operand fetch, operation execution, and result ...
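As a rough illustration of what avoiding a branch can look like in C++, the sketch below counts the elements of a vector that exceed a threshold in two ways. The function names `count_above_branching` and `count_above_branchless` are hypothetical, chosen only for this example. The second version relies on the fact that a `bool` converts to 0 or 1, so the comparison result can be added directly instead of steering a conditional jump. Whether this actually runs faster depends on the compiler, the processor, and how predictable the data is, so treat it as a sketch to measure rather than a guaranteed win.

```cpp
#include <cstddef>
#include <vector>

// Branching version: the loop body contains a data-dependent "if",
// which a compiler may translate into a conditional jump.
std::size_t count_above_branching(const std::vector<int>& values, int threshold) {
    std::size_t count = 0;
    for (int v : values) {
        if (v > threshold) {
            ++count;
        }
    }
    return count;
}

// Branch-free formulation: the comparison yields false or true, which
// converts to 0 or 1 and is added to the running total unconditionally.
std::size_t count_above_branchless(const std::vector<int>& values, int threshold) {
    std::size_t count = 0;
    for (int v : values) {
        count += static_cast<std::size_t>(v > threshold);
    }
    return count;
}
```

An optimizing compiler may well generate the same machine code for both functions, so the source-level rewrite is best confirmed by inspecting the generated assembly or by timing both versions on representative data.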
