Make sure you're keeping all processors busy with parallel builds
If you're running a multiprocessor (SMP) system with a moderate amount of RAM, you can usually cut build times substantially by running make in parallel. By default, make builds serially, running one job at a time; on a dual-processor machine, a parallel build can finish in roughly half the time.
To let make run more than one job (child process) at a time while building, use the -j switch followed by the maximum number of jobs:
rob@mouse:~/linux$ make -j4; make -j4 modules
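Rather than hard-coding the job count, you can derive it from the number of processors. This is only a minimal sketch: the two-jobs-per-processor multiplier is an assumption chosen to match the dual-processor timings later in this hack (where -j4 did best), and the variable names are made up for illustration.

```shell
# Count online processors. getconf is widely available; nproc (GNU coreutils)
# is the fallback, and we assume a single processor if neither works.
cpus=$(getconf _NPROCESSORS_ONLN 2>/dev/null || nproc 2>/dev/null || echo 1)

# Assumption: two jobs per processor, so one can compute while another waits on disk.
jobs=$((cpus * 2))

# Print the command instead of running it; remove the echo to actually build.
echo make -j"$jobs"
```

Treat the multiplier as a starting point and time a few values on your own hardware before settling on one.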
Some projects aren't designed to handle parallel builds and can get confused if parts of the project are built before the dependencies they rely on have completed. If you run into build errors, the safest course is to start again from scratch, this time without the -j switch.
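If you would rather not babysit the build, the fallback can be scripted. The sketch below is one possible approach, not something the project's Makefile provides: build_with_fallback is a hypothetical helper name, and it assumes the project has a working clean target to stand in for "starting from scratch".

```shell
# Hypothetical helper: try a parallel build, and if it fails, clean the tree
# and rebuild serially. MAKE may be set to use a different make binary.
build_with_fallback() {
    ${MAKE:-make} -j4 "$@" && return 0
    echo "parallel build failed; retrying serially from a clean tree" >&2
    # Assumption: the project defines a 'clean' target.
    ${MAKE:-make} clean && ${MAKE:-make} "$@"
}
```

You would call it with the usual targets, for example build_with_fallback bzImage. If the serial build fails too, the problem is not the -j switch.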
By way of comparison, here are some sample timings. They were taken on an otherwise unloaded dual PIII/600 with 1GB RAM. For each test, I built a bzImage for Linux 2.4.19 (redirecting STDOUT to /dev/null), and removed the source tree before starting the next one.
time make bzImage:
real 7m1.640s
user 6m44.710s
sys 0m25.260s

time make -j2 bzImage:
real 3m43.126s
user 6m48.080s
sys 0m26.420s

time make -j4 bzImage:
real 3m37.687s
user 6m44.980s
sys 0m26.350s

time make -j10 bzImage:
real 3m46.060s
user 6m53.970s
sys 0m27.240s
As you can see, there is a significant improvement just by adding the -j2 switch. Wall-clock time dropped from just over 7 minutes to 3 minutes and 43 seconds. Increasing to -j4 saved us about five more seconds, but jumping all the way to -j10 actually hurt performance, running about eight seconds slower than -j4. Notice how user and system seconds are virtually the same across all four runs. In the end, you need to shovel the same sized pile of bits, but -j on a multiprocessor machine simply lets you spread it around to more people with shovels.
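To put a number on the improvement, divide the serial wall-clock time by each parallel one. The snippet below is just arithmetic on the "real" figures reported above, converted to seconds by hand; the variable names are illustrative.

```shell
# "real" time of the serial build: 7m1.640s = 421.640 seconds.
serial=421.640

# Each input line is a -j setting and its "real" time in seconds.
report=$(awk -v base="$serial" '{ printf "%-5s speedup %.2fx\n", $1, base / $2 }' <<'EOF'
-j2 223.126
-j4 217.687
-j10 226.060
EOF
)
echo "$report"
```

All three parallel runs come out a little under 2x, which is about the best you can hope for with two processors sharing the same pile of work.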
Of course, bits all eventually end up in the bit bucket anyway. But hey, if nothing else, performance timings are a great way to keep your cage warm.