Strictly speaking, many optimizations that can be applied to JavaScript can be applied to any language. Going down to the CPU level, the rule is the same: minimize work. In JavaScript, the CPU-level work is so abstracted from the programmer that it can be difficult to ascertain how much work is actually going on. If you use a few tried-and-tested methods, it is a safe bet that your code will benefit, although only performing empirical tests will prove this conclusively.
Computationally expensive calculations can have their values
precalculated and stored in a lookup table. You can then quickly pull
the values out of the lookup table using a simple integer index. As long
as accessing a value from the lookup table is a cheaper operation than
calculating the value from scratch, an application will benefit from
better performance. JavaScript’s trigonometry functions are a good
example of where you can use lookup tables to speed things up. In this
section, the Math.sin()
function will
be superseded by a lookup table, and we’ll build an animated graphical
application to utilize it.
The Math.sin() function accepts a single argument: an angle, measured in radians. It returns a value between −1 and 1. The angle argument has an effective range of 0 to 2π radians, or about 6.28318. This is not very useful for indexing into a lookup table, as a range of just seven possible integer values (0 through 6) is far too coarse. The solution is to dispense with radians completely and allow the lookup table to accept integer indexes between 0 and 4,095. This granularity should be enough for most applications, but you can make it finer by specifying a larger steps argument:
var fastSin = function (steps) {
    var table = [],
        ang = 0,
        angStep = (Math.PI * 2) / steps;
    do {
        table.push(Math.sin(ang));
        ang += angStep;
    } while (ang < Math.PI * 2);
    return table;
};
The fastSin() function divides 2π radians into the number of steps specified in the argument, stores the sine value for each step in an array, and returns that array.
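Once the table exists, an angle in radians must be mapped onto an integer index before lookup. Here is a minimal sketch of that mapping; the for-loop variant of fastSin() (which guarantees exactly steps entries) and the radToIndex() helper are illustrative additions, not from the text:

```javascript
// A for-loop variant of fastSin(), producing exactly `steps` entries.
var fastSin = function (steps) {
    var table = [],
        angStep = (Math.PI * 2) / steps;
    for (var i = 0; i < steps; i++) {
        table.push(Math.sin(i * angStep));
    }
    return table;
};

var sinTable = fastSin(4096);

// Map an angle in radians to a table index in the range 0-4095.
// The rounding and the & 4095 wrap are illustrative choices.
var radToIndex = function (rad) {
    return Math.round((rad / (Math.PI * 2)) * 4096) & 4095;
};

var approx = sinTable[radToIndex(Math.PI / 2)]; // very close to 1
```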
Testing the JavaScript Math.sin()
against a lookup table yields the
results shown in Figure 1-2.
Across most browsers, there appears to be an approximately 20%
increase in performance, with an even more pronounced improvement in
Google Chrome. If the calculated values within the lookup table had come
from a more complex function than Math.sin()
, then the performance gains would
be even more significant; the speed of accessing the lookup table
remains constant regardless of the initial work required to fill in the
values.
The following application uses the fastSin()
lookup table to create a hypnotic
animated display. Figure 1-3 shows the
output.
<!DOCTYPE html>
<html>
<head>
<title>Fast Sine Demonstration</title>
<script type="text/javascript"
    src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js">
</script>
<style type="text/css">
    #draw-target {
        width: 480px;
        height: 320px;
        background-color: #000;
        position: relative;
    }
</style>
<script type="text/javascript">
$(document).ready(function() {
    (function() {
        var fastSin = function(steps) {
            var table = [],
                ang = 0,
                angStep = (Math.PI * 2) / steps;
            do {
                table.push(Math.sin(ang));
                ang += angStep;
            } while (ang < Math.PI * 2);
            return table;
        };
The fastSin()
function is
called, and the created sine lookup table is referenced in sinTable[]
.
var sinTable = fastSin(4096),
    $drawTarget = $('#draw-target'),
    divs = '',
    i, bars,
    x = 0;
The drawGraph()
function draws
a sine wave by updating the height and position of numerous
one-pixel-wide div
s. Table 1-1 shows the
arguments.
var drawGraph = function(ang, freq, height) {
    var height2 = height * 2;
    for (var i = 0; i < 480; i++) {
        bars[i].style.top = 160 - height +
            sinTable[(ang + (i * freq)) & 4095] * height + 'px';
        bars[i].style.height = height2 + 'px';
    }
};
Table 1-1. Arguments passed to drawGraph()

| Argument | Description |
|---|---|
| ang | The start angle for the sine wave. |
| freq | The frequency of the sine wave. Defines the “tightness” of the wave. |
| height | The height (amplitude) of the wave; also affects the thickness of the lines drawn. |
The following loop creates 480 one-pixel vertical div
elements. The div
s are then appended to $drawTarget
. All the div
s are then referenced in the bars[]
array for use in drawGraph()
.
for (i = 0; i < 480; i++) {
    divs += '<div style = "position:absolute;width:1px;height:40px;' +
        'background-color:#0d0; top:0px; left: ' + i + 'px;"></div>';
}
$drawTarget.append(divs);
bars = $drawTarget.children();
setInterval()
repeatedly calls
drawGraph()
with continuously
changing parameters to create an animated effect:
setInterval(function() {
    drawGraph(x * 50,
        32 - (sinTable[(x * 20) & 4095] * 16),
        50 - (sinTable[(x * 10) & 4095] * 20));
    x++;
}, 20);

    })();
});
</script>
</head>
<body>
<div id="draw-target">
</div>
</body>
</html>
In JavaScript, all numbers are represented in a floating-point
format. In contrast to languages such as C++ and Java, int
and float
types are not explicitly declared. This
is a surprising omission, and a legacy of JavaScript’s early years as a
simple language intended for web designers and amateurs. JavaScript’s
single number type does help you avoid many numeric type errors.
However, integers are fast, CPU-friendly, and the preferred choice for
many programming tasks in other languages.
JavaScript’s number representation is defined in the ECMAScript
Language Specification as
“double-precision 64-bit format IEEE 754 values as specified in the
IEEE Standard for Binary Floating-Point Arithmetic.” This gives a
(somewhat huge) range of large numbers (±1.7976931348623157 ×
10^308) or small numbers (±5 ×
10^−324). Beware, though: floating-point
numbers are subject to rounding errors; for example, alert(0.1 + 0.2)
displays
0.30000000000000004, not 0.3 as expected!
However, a closer look at the ECMAScript Standard reveals that JavaScript has several internal operations defined to deal with integers:
ToInteger
Converts to an integer
ToInt32
Converts to a signed 32-bit integer
ToUint32
Converts to an unsigned 32-bit integer
ToUint16
Converts to an unsigned 16-bit integer
You cannot use these operations directly; rather, they are called under the hood to convert numbers into an appropriate integer type for JavaScript’s rarely used bitwise operators. Though sometimes incorrectly dismissed as slow and irrelevant to web programming, some of these operators possess quirky abilities that can be useful for optimization.
Bitwise operators convert numbers into 32-bit integers, with a numerical range of −2,147,483,648 to 2,147,483,647. Numbers outside this range will be adjusted to fit.
During the halcyon days of computing, when 16 KB of RAM was considered a lot, binary numbers were a programmer’s staple diet. The sort of low-level programming used for the computers of the day required a good understanding of binary and hexadecimal notation. Binary numbers are rarely used in web programming, but they still have their place in areas such as hardware drivers and networking.
Everyone is familiar with the base-10 number system. In the first row of Table 1-2, each column from right to left represents an increasing power of 10. By multiplying the numbers in the second row by their corresponding power of 10 and then adding all the results (or products) together, we end up with a final number:
(3 × 1,000) + (9 × 1) = 3,009
The principle is exactly the same for base-2, or binary, numbers. However, instead of the columns increasing in powers of 10, they increase in powers of 2. The only digits required in the second row are either 0 or 1, also known as a bit. The simple on-off nature of binary numbers is perfect for emulating in digital electronic circuits. Table 1-3 shows the binary representation of the base-10 number 69:
(1 × 64) + (1 × 4) + (1 × 1) = 69
How can a binary number be negated (its sign changed)? A system called two's complement is used as follows:
Invert each bit in the binary number, so 01000101 becomes 10111010.
Add 1, so 10111010 becomes 10111011 (−69).
The topmost bit acts as a sign, where 0 means positive and 1 means negative. Go through the same procedure again, and we are back to +69.
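The two steps above can be checked directly in JavaScript, because the ~ operator performs the bit inversion; this snippet is an added illustration, not from the text:

```javascript
var x = 69;

// Step 1: invert every bit (ones complement).
var inverted = ~x;            // -70 when read as a signed 32-bit integer

// Step 2: add 1 to complete the two's complement negation.
var negated = inverted + 1;   // -69

// Applying the same procedure again restores the original value.
var restored = ~negated + 1;  // 69
```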
JavaScript’s bitwise operators act on the binary digits, or bits, within an integer number.
The & operator performs a binary AND on the operands, where the
resultant bit will be set only if the equivalent bit is set in both
operands. So, 0x0007 & 0x0003
gives 0x0003
. This can be a very
fast way of checking whether an object possesses a desired set of
attributes or flags. Table 1-4
shows the available flags for a pet
object. For example, a small, old,
brown dog would have a flags value of 64 + 16 + 8 + 2 =
90.
Searching for pet
s with
certain flags is simply a case of performing a bitwise AND with a
search value. The following code searches for any pet
that is big, young, and white (it may
be either a cat or dog, as this is not specified):
var searchFlags = 128 + 32 + 4;
var pets = []; // This is an array full of pet objects.
var numPets = pets.length;
for (var i = 0; i < numPets; i++) {
    // The parentheses are required: === binds more tightly than &.
    if ((searchFlags & pets[i].flags) === searchFlags) {
        /* Found a Match! Do something. */
    }
}
With a total of 32 bits available in an integer to represent various flags, this can be much faster than checking flags stored as separate properties or other types of conditional testing; for example:
var search = ['big', 'young', 'white'];
var pets = []; // This is an array full of pet objects.
var numPets = pets.length;
for (var i = 0; i < numPets; i++) {
    // The following inner loop makes things much slower.
    for (var c = 0; c < search.length; c++) {
        // Check if the property exists in the pet object.
        if (pets[i][search[c]] == undefined) break;
    }
    if (c == search.length) {
        /* Found a Match! Do something. */
    }
}
The &
operator can also
act in a similar way to the modulus operator (%
), which returns the remainder after
division. The following code will ensure that the variable value
is always between 0 and 7:
value &= 7; // Equivalent to value % 8;
The equivalence to the % operator holds only if the value after the & is one less than a power of 2 (1, 3, 7, 15, 31, ...), and only for non-negative integers.
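A quick check of the mask trick; the negative-number comparison is an added observation, not from the text:

```javascript
var a = 13;
console.log(a & 7);    // 5, same as 13 % 8
console.log(13 % 8);   // 5

// The two differ for negative values: & operates on the two's
// complement bit pattern, while % keeps the sign of the dividend.
console.log(-1 & 7);   // 7
console.log(-1 % 8);   // -1
```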
The | operator performs a binary OR on the operands, where the
resultant bit will be set if the equivalent bit is set in either
operand. So, 0x0007 | 0x0003
gives 0x0007
. Effectively, it
merges the bits together.
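Merging bits with | is handy for building the kind of flags value used in the pet example. A small sketch, using the big, young, and white flag values from earlier in the text:

```javascript
var BIG = 128, YOUNG = 32, WHITE = 4;

// Combine individual flags into a single value with OR.
var flags = BIG | YOUNG | WHITE;   // 164

// Setting a flag that is already set has no further effect.
flags |= YOUNG;                    // still 164
```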
The ^ operator performs a binary exclusive OR on the operands, where
the resultant bit will be set if only one of the equivalent bits is
set in either operand. So, 0x0000 ^
0x0001
gives 0x0001
,
and 0x0001 ^ 0x0001
gives
0x0000
. This can act as a
shorthand way of toggling a variable:
toggle ^= 1;
Each time toggle ^= 1;
is
executed, the toggle
value will
flip between 1 and 0 (assuming it is 1 or 0 to start with). Here is
the equivalent code using if-else
:
if (toggle) {
    toggle = 0;
} else {
    toggle = 1;
}
or using the ternary operator (?
):
toggle = toggle ? 0 : 1;
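The same XOR trick generalizes beyond 0 and 1: x ^= a ^ b flips x between any two integer values a and b. This is an extra example, not from the text:

```javascript
var a = 480, b = 320;

// Start at one of the two values.
var x = a;

// Each XOR with (a ^ b) flips x to the other value.
x ^= a ^ b;   // now 320
x ^= a ^ b;   // back to 480
```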
The ~ operator performs a ones complement, or inversion of all bits. So,
in binary, 11100111 would become 00011000. If the number in question
is a signed integer (where the topmost bit represents the sign),
then the ~
operator is equivalent
to changing the sign and subtracting 1.
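Because ~ pushes its operand through the internal ToInt32 conversion, a double application (~~x) is sometimes used as a quick way to drop the fractional part of a number. This is an added note, not from the text:

```javascript
// ~x is the same as -x - 1 for 32-bit integers.
console.log(~5);      // -6
console.log(~-6);     // 5

// Double inversion truncates toward zero (drops the fraction).
console.log(~~4.7);   // 4
console.log(~~-4.7);  // -4
```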
The << operator performs a binary shift left by a specified number of
bits. All bits are moved to the left, the topmost bit is lost, and a
0 is fed into the bottommost bit. This is the equivalent of an
unsigned integer multiplication of x
by 2^numBits
. Here are some
examples:
y = 5 << 1; // y = 10; equivalent to 5 * Math.pow(2, 1).
y = 5 << 2; // y = 20; equivalent to 5 * Math.pow(2, 2).
y = 5 << 3; // y = 40; equivalent to 5 * Math.pow(2, 3).
Tests reveal no performance benefit over using the standard
multiply operator (*
).
The >> operator performs a binary shift right by a specified number of
bits. All bits are moved to the right, with the exception of the
topmost bit, which is preserved as the sign. The bottommost bit is
lost. This is the equivalent of a signed integer division of
x
by 2^numBits
. Here are some
examples:
y = 10 >> 1; // y = 5; equivalent to Math.floor(10 / Math.pow(2, 1)).
y = 10 >> 2; // y = 2; equivalent to Math.floor(10 / Math.pow(2, 2)).
y = 10 >> 3; // y = 1; equivalent to Math.floor(10 / Math.pow(2, 3)).
Tests reveal no performance benefit over using the standard
divide operator (/
).
The following code looks pretty useless:
x = y >> 0;
However, it forces JavaScript to call its internal integer conversion functions, resulting in the fractional part of the number being lost. Effectively, it performs a fast Math.floor() operation for positive numbers; for negative numbers it truncates toward zero instead (so -10.75 >> 0 gives -10, while Math.floor(-10.75) gives -11). Figure 1-4 shows that for Internet Explorer 8, Google Chrome, and Safari 5.0, there is a speed increase.
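A quick sketch of the trick; the negative-number case is an added observation:

```javascript
console.log(10.75 >> 0);          // 10
console.log(Math.floor(10.75));   // 10

// For negatives the two disagree: >> 0 truncates toward zero.
console.log(-10.75 >> 0);         // -10
console.log(Math.floor(-10.75));  // -11
```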
Rarely used, the >>> operator is similar to the >> operator, but the topmost bit
(sign bit) is not preserved and is set to 0. The bottommost bit is
lost. For positive numbers, this is the same as the >>
operator. For negative numbers,
however, the result is a positive number. Here are some
examples:
y = 10 >>> 1; // y = 5;
y = -10 >>> 2; // y = 1073741821;
y = -10 >>> 3; // y = 536870910;
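The >>> operator also has a quirky zero-shift use: x >>> 0 leaves the bits alone but reinterprets them as an unsigned 32-bit integer. This is an added example, not from the text:

```javascript
// A zero-bit unsigned shift reveals the unsigned 32-bit value.
console.log(-1 >>> 0);   // 4294967295
console.log(-10 >>> 0);  // 4294967286
console.log(10 >>> 0);   // 10 (positive values are unchanged)
```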
Looping in any programming language adds a certain amount of overhead beyond the code within the loop. Loops usually maintain a counter and/or check for the termination condition, both of which take time.
Removing the loop overhead provides some performance benefits. A typical JavaScript loop looks like this:
for (var i = 0; i < 8; i++) { /*** do something here ***/ }
By executing this instead, you can completely eliminate the loop overhead:
/*** do something here ***/
/*** do something here ***/
/*** do something here ***/
/*** do something here ***/
/*** do something here ***/
/*** do something here ***/
/*** do something here ***/
/*** do something here ***/
However, with a loop of just eight iterations, the improvement
is not worth the effort. Assuming do
something here
is a simple statement (e.g., x++
), removing the loop might execute the
code 300% faster, but this is at the microsecond level; 0.000003
seconds versus 0.000001 seconds is not going to make a noticeable
difference. If do something here
is
a big and slow function call, then the figures read more like 0.100003
seconds versus 0.100001 seconds. Again, too small an improvement to be
worthwhile.
There are two factors that determine whether loop unrolling will provide a tangible benefit:
The number of iterations. In practice, many iterations (probably thousands) are needed to make a difference.
The proportion of time the inner loop code takes versus the loop overhead. Complex inner loop code that is many times slower to execute than the loop overhead will show a smaller improvement. This is because most of the time is being spent inside the inner loop code, not the loop overhead.
It is not practical to entirely unroll loops that require hundreds or thousands of iterations. The solution is to use a technique that is a variation of Duff’s device. This works by performing partial unrolling of a loop. For example, a loop of 1,000 iterations can be broken into 125 iterations of code that is unrolled eight times:
// Credit: from Jeff Greenberg's site via an anonymous donor.
var testVal = 0;
var n = iterations % 8;
while (n--) {
    testVal++;
}
n = parseInt(iterations / 8);
while (n--) {
    testVal++; testVal++; testVal++; testVal++;
    testVal++; testVal++; testVal++; testVal++;
}
The first while
loop takes
into account situations where the number of iterations is not divisible by the number of unrolled
code lines. For example, 1,004 iterations requires a loop of 4 normal
iterations (1004 % 8
), followed by
125 unrolled iterations of 8 each (parseInt(1004 / 8)
). Here is a slightly
improved version:
var testVal = 0;
var n = iterations >> 3; // Same as: parseInt(iterations / 8).
while (n--) {
    testVal++; testVal++; testVal++; testVal++;
    testVal++; testVal++; testVal++; testVal++;
}
n = iterations - testVal; // testVal has kept a tally, so do the remainder here.
while (n--) {
    testVal++;
}
Duff’s device refers to a specific C-language optimization for unrolling loops that was developed by Tom Duff in 1983. He does not claim credit for loop unrolling as a general principle. Loop unrolling is common practice in assembly language, where tiny optimizations can make a difference in areas such as large memory copies and clears. Optimizing compilers may also perform automatic loop unrolling.
For loops of 10,000 iterations with trivial code, this returns significant performance gains. Figure 1-5 shows the results. Should we now optimize all our loops like this? No, not yet. The test is unrealistic: it is unlikely that incrementing a local variable is all we want to do within a loop.
A better test involves iterating through an array and calling a function with the array contents. This is much more along the lines of what will happen inside real applications:
// Initialize 10000 items.
var items = [];
for (var i = 0; i < 10000; i++) {
    items.push(Math.random());
}

// A function to do some useful work.
var processItem = function (x) {
    return Math.sin(x) * 10;
};

// The slow way.
var slowFunc = function () {
    var len = items.length;
    for (var i = 0; i < len; i++) {
        processItem(items[i]);
    }
};

// The 'fast' way.
var fastFunc = function () {
    var idx = 0;
    var i = items.length >> 3;
    while (i--) {
        processItem(items[idx++]); processItem(items[idx++]);
        processItem(items[idx++]); processItem(items[idx++]);
        processItem(items[idx++]); processItem(items[idx++]);
        processItem(items[idx++]); processItem(items[idx++]);
    }
    i = items.length - idx;
    while (i--) {
        processItem(items[idx++]);
    }
};
Figure 1-6 shows the improvement. Ouch. The real work within the loops has made the loop unrolling benefit a drop in the ocean. It is somewhat akin to ordering the 4,000-calorie Mega Burger Meal and hoping the diet soda will make things less fattening. Considering the 10,000 iterations, this is a disappointing set of results.
Figure 1-6. Unrolled loop with nontrivial inner loop code (10,000 iterations). A disappointing turnout. Bigger is better.
The moral of this story is that JavaScript loops are actually rather efficient, and you need to place micro-optimizations within the context of real application behavior to realistically test their benefits.