Optimizing SelectionDAG

After the IR is converted into a SelectionDAG, many opportunities arise to optimize the DAG itself. These optimizations take place in the DAGCombiner phase. Some of these opportunities arise from the set of architecture-specific instructions that the target provides.
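To give a feel for what "combining" a DAG means, here is a minimal toy sketch in C++. It is not LLVM's DAGCombiner and uses none of LLVM's classes: the `Op`, `Node`, `ToyDAG`, and `combine` names are hypothetical and introduced only for illustration. The sketch walks a small expression DAG bottom-up and applies a few local rewrites (constant folding, `x + 0 -> x`, `x * 1 -> x`, `x * 0 -> 0`), which is the general flavor of rewrite a DAG combiner performs.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Toy node kinds (hypothetical). Real SelectionDAG nodes carry target-
// independent or target-specific opcodes and typed values.
enum class Op { Const, Input, Add, Mul };

struct Node {
  Op op;
  int value;   // meaningful for Const nodes only
  Node *lhs;   // operands; null for leaves
  Node *rhs;
  Node(Op o, int v, Node *l, Node *r) : op(o), value(v), lhs(l), rhs(r) {}
};

// Owns every node so that raw pointers between nodes stay valid.
struct ToyDAG {
  std::vector<std::unique_ptr<Node>> nodes;

  Node *make(Op op, int value = 0, Node *l = nullptr, Node *r = nullptr) {
    nodes.push_back(std::make_unique<Node>(op, value, l, r));
    return nodes.back().get();
  }
};

// Simplify the expression rooted at n and return the (possibly new) root.
Node *combine(ToyDAG &dag, Node *n) {
  if (n->op != Op::Add && n->op != Op::Mul)
    return n;  // leaves have nothing to combine

  // Simplify operands first so the rules below see their final form.
  Node *l = combine(dag, n->lhs);
  Node *r = combine(dag, n->rhs);

  // Constant folding: both operands are known.
  if (l->op == Op::Const && r->op == Op::Const) {
    int v = (n->op == Op::Add) ? l->value + r->value : l->value * r->value;
    return dag.make(Op::Const, v);
  }

  // Canonicalize: Add and Mul are commutative, so keep a constant on the right.
  if (l->op == Op::Const)
    std::swap(l, r);

  if (r->op == Op::Const) {
    if (n->op == Op::Add && r->value == 0) return l;  // x + 0 -> x
    if (n->op == Op::Mul && r->value == 1) return l;  // x * 1 -> x
    if (n->op == Op::Mul && r->value == 0) return r;  // x * 0 -> 0
  }

  // Nothing changed: reuse the original node.
  if (l == n->lhs && r == n->rhs)
    return n;
  return dag.make(n->op, 0, l, r);
}
```

For instance, building `(x + 0) * (2 + -1)` with `ToyDAG::make` and passing the root to `combine` returns the original `x` node: the left operand simplifies to `x`, the right operand folds to the constant 1, and the multiplication by 1 is then dropped.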

Let's take an example:

#include <arm_neon.h>
unsigned hadd(uint32x4_t a) {
  return a[0] + a[1] + a[2] + a[3];
}

The preceding example in IR looks like the following:

define i32 @hadd(<4 x i32> %a) nounwind {
  %vecext = extractelement <4 x i32> %a, i32 3
  %vecext1 = extractelement <4 x i32> %a, i32 2
  %add = add i32 %vecext, %vecext1
  %vecext2 = extractelement <4 x i32> %a, i32 1
  %add3 = add i32 %add, %vecext2
  %vecext4 = extractelement <4 x i32> %a, i32 0
  %add5 = add i32 %add3, %vecext4
  ret i32 %add5
...
