The assembly line metaphor illustrates how OpenGL works behind the scenes, but a photography metaphor is more useful when thinking about a 3D application’s workflow. When my wife makes an especially elaborate Indian dinner, she often asks me to take a photo of the feast for her personal blog. I usually perform the following actions to achieve this:

It turns out that each of these actions has an analogue in OpenGL, although they typically occur in a different order. Setting aside the issue of lighting (which we’ll address in a future chapter), an OpenGL program performs the following actions:

The product of the model and view matrices is
known as the *model-view matrix*. When rendering an
object, OpenGL ES 1.1 transforms every vertex first by the model-view
matrix and then by the projection matrix. With OpenGL ES 2.0, you can
perform any transforms you want, but it’s often useful to follow the same
model-view/projection convention, at least in simple scenarios.

Later we’ll go over each of the three transforms (projection, view, model) in detail, but first we need to get some preliminaries out of the way. OpenGL has a unified way of dealing with all transforms, regardless of how they’re used. With ES 1.1, the current transformation state can be configured by loading matrices explicitly, like this:

    float projection[16] = { ... };
    float modelview[16] = { ... };

    glMatrixMode(GL_PROJECTION);
    glLoadMatrixf(projection);

    glMatrixMode(GL_MODELVIEW);
    glLoadMatrixf(modelview);

With ES 2.0, there is no inherent concept of model-view and projection; in fact, `glMatrixMode` and `glLoadMatrixf` do not exist in 2.0. Rather, matrices are loaded into *uniform variables* that are then consumed by shaders. Uniforms are a type of shader connection that we’ll learn about later, but you can think of them as constants that shaders can’t modify. They’re loaded like this:

    float projection[16] = { ... };
    float modelview[16] = { ... };

    GLint projectionUniform = glGetUniformLocation(program, "Projection");
    glUniformMatrix4fv(projectionUniform, 1, 0, projection);

    GLint modelviewUniform = glGetUniformLocation(program, "Modelview");
    glUniformMatrix4fv(modelviewUniform, 1, 0, modelview);

ES 1.1 provides additional ways of manipulating matrices that do not exist in 2.0. For example, the following 1.1 snippet loads an identity matrix and multiplies it by two other matrices:

    float view[16] = { ... };
    float model[16] = { ... };

    glMatrixMode(GL_MODELVIEW);
    glLoadIdentity();
    glMultMatrixf(view);
    glMultMatrixf(model);

The default model-view and projection matrices are identity matrices. The identity transform is effectively a no-op, as shown in Equation 2-2.
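Written out in row-vector form, with the vertex on the left of the `*` and the identity matrix on the right, the identity transform looks like this:

$$
\begin{pmatrix} v_x & v_y & v_z & 1 \end{pmatrix}
*
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix} v_x * 1 & v_y * 1 & v_z * 1 & 1 \end{pmatrix}
$$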

For details on how to multiply a vector with a matrix, or a matrix with another matrix, check out the code in the appendix.
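As a rough illustration of the arithmetic involved, here is a minimal sketch of multiplying a 4D row vector by a 4×4 matrix stored as 16 floats, in the same order as the arrays shown earlier. The names `Vec4`, `Mat4`, `Transform`, and `Identity` are hypothetical and chosen for this sketch only; they are not the appendix library’s API.

```cpp
#include <array>

// Hypothetical helper types for illustration; not the appendix library.
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<float, 16>;   // four rows of four floats each

// Multiply a 4D row vector by a 4x4 matrix: result = v * m.
// Component j of the result is the dot product of v with column j of m.
Vec4 Transform(const Vec4& v, const Mat4& m)
{
    Vec4 result = {0, 0, 0, 0};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            result[col] += v[row] * m[row * 4 + col];
    return result;
}

// The identity matrix leaves any vector unchanged.
const Mat4 Identity = {
    1, 0, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1
};
```

Calling `Transform(v, Identity)` returns `v` unchanged, which is the no-op behavior described above.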

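It’s important to note that this book uses row vector notation rather than column vector notation. In Equation 2-2, both the left side, $(v_x \;\; v_y \;\; v_z \;\; 1)$, and the right side, $(v_x{*}1 \;\; v_y{*}1 \;\; v_z{*}1 \;\; 1)$, are 4D row vectors. That equation could, however, be expressed in column vector notation like so:

$$
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
*
\begin{pmatrix} v_x \\ v_y \\ v_z \\ 1 \end{pmatrix}
=
\begin{pmatrix} v_x * 1 \\ v_y * 1 \\ v_z * 1 \\ 1 \end{pmatrix}
$$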

Sometimes it helps to think of a 4D row vector as being a 1×4 matrix, and a 4D column vector as being a 4×1 matrix. (*n*×*m* denotes the dimensions of a matrix where *n* is the number of rows and *m* is the number of columns.)

Figure 2-6 shows a trick for figuring out whether it’s legal to multiply two quantities in a certain order: the inner numbers should match. The outer numbers tell you the dimensions of the result. Applying this rule, we can see that it’s legal to multiply the two matrices shown in Equation 2-2: the 4D row vector (effectively a 1×4 matrix) on the left of the `*` and the 4×4 matrix on the right are multiplied to produce a 1×4 matrix (which also happens to be a 4D row vector).

From a coding perspective, I find that row vectors are more natural than column vectors because they look like tiny C-style arrays. It’s valid to think of them as column vectors if you’d like, but if you do so, be aware that the ordering of your transforms will flip around. Ordering is crucial because matrix multiplication is not commutative.

Consider this snippet of ES 1.1 code:

    glLoadIdentity();
    glMultMatrixf(A);
    glMultMatrixf(B);
    glMultMatrixf(C);
    glDrawArrays(...);

With row vectors, you can think of each successive transform as being *pre*multiplied with the current transform, so the previous snippet is equivalent to the following:

$$
\mathbf{v}' = \mathbf{v} * C * B * A
$$

With column vectors, each successive transform is *post*multiplied, so the code snippet is actually equivalent to the following:

$$
\mathbf{v}' = A * B * C * \mathbf{v}
$$

Regardless of whether you prefer row or column vectors, you should always think of the last transformation in your code as being the first one to be applied to the vertex. To make this apparent with column vectors, use parentheses to show the order of operations:

$$
\mathbf{v}' = A * (B * (C * \mathbf{v}))
$$

This illustrates another reason why I like row vectors; they make OpenGL’s reverse-ordering characteristic a little more obvious.
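To see the reverse ordering in numbers, the following sketch emulates the matrix stack on the CPU under the row-vector convention. Everything here is an illustrative assumption: `MultMatrix` is a hypothetical stand-in for `glMultMatrixf`, and the other helper names do not come from OpenGL or from the appendix library.

```cpp
#include <array>

// Hypothetical helper types and names, for illustration only.
using Vec4 = std::array<float, 4>;
using Mat4 = std::array<float, 16>;   // four rows of four floats each

// Matrix product a * b.
Mat4 Multiply(const Mat4& a, const Mat4& b)
{
    Mat4 r{};   // value-initialized to all zeros
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                r[i * 4 + j] += a[i * 4 + k] * b[k * 4 + j];
    return r;
}

// Row vector times matrix: v * m.
Vec4 Transform(const Vec4& v, const Mat4& m)
{
    Vec4 r{};
    for (int col = 0; col < 4; ++col)
        for (int row = 0; row < 4; ++row)
            r[col] += v[row] * m[row * 4 + col];
    return r;
}

// Stand-in for glMultMatrixf under the row-vector convention:
// the incoming matrix is premultiplied with the current transform.
void MultMatrix(Mat4& current, const Mat4& next)
{
    current = Multiply(next, current);
}

Mat4 MakeIdentity()
{
    return {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  0, 0, 0, 1};
}

Mat4 MakeScale(float s)
{
    return {s, 0, 0, 0,  0, s, 0, 0,  0, 0, s, 0,  0, 0, 0, 1};
}

Mat4 MakeTranslation(float tx, float ty, float tz)
{
    return {1, 0, 0, 0,  0, 1, 0, 0,  0, 0, 1, 0,  tx, ty, tz, 1};
}
```

If the code multiplies by a translation of (10, 0, 0) and then by a uniform scale of 2, the vertex (1, 0, 0) is scaled first and translated second, landing at x = 12. Swapping the two calls translates first and scales second, giving x = 22. The two results differ because matrix multiplication is not commutative.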

Enough of this mathematical diversion; let’s get back to the photography metaphor and see how it translates into OpenGL. OpenGL ES 1.1 provides a set of helper functions that can generate a matrix and multiply the current transformation by the result, all in one step. We’ll go over each of these helper functions in the coming sections. Since ES 2.0 does not provide helper functions, we’ll also show what they do behind the scenes so that you can implement them yourself.

Recall that there are three matrices involved in OpenGL’s setup: the projection, view, and model matrices.

We’ll go over each of these three transforms in reverse so that we can present the simplest transformations first.

The three most common operations when positioning an object in a scene are scale, translation, and rotation.

The most trivial helper function is `glScalef`:

    float scale[16] = { sx, 0,  0,  0,
                        0,  sy, 0,  0,
                        0,  0,  sz, 0,
                        0,  0,  0,  1 };

    // The following two statements are equivalent.
    glMultMatrixf(scale);
    glScalef(sx, sy, sz);

The matrix for scale and its derivation are shown in Equation 2-3.
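In the row-vector notation used here, multiplying a vertex by the scale matrix works out as follows; each component is simply multiplied by its own scale factor:

$$
\begin{pmatrix} v_x & v_y & v_z & 1 \end{pmatrix}
*
\begin{pmatrix}
s_x & 0 & 0 & 0 \\
0 & s_y & 0 & 0 \\
0 & 0 & s_z & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
=
\begin{pmatrix} s_x v_x & s_y v_y & s_z v_z & 1 \end{pmatrix}
$$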

Figure 2-7 depicts a scale transform where $s_x = s_y = 0.5$.

Another simple helper transform is `glTranslatef`, which shifts an object by a fixed amount:

    float translation[16] = { 1,  0,  0,  0,
                              0,  1,  0,  0,
                              0,  0,  1,  0,
                              tx, ty, tz, 1 };

    // The following two statements are equivalent.
    glMultMatrixf(translation);
    glTranslatef(tx, ty, tz);

Intuitively, translation is achieved with addition, but recall that homogeneous coordinates allow us to express all transformations using multiplication, as shown in Equation 2-4.
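Worked through in row-vector notation, the trailing 1 in the homogeneous vertex is what picks up the bottom row of the translation matrix, so the multiplication produces the expected additions:

$$
\begin{pmatrix} v_x & v_y & v_z & 1 \end{pmatrix}
*
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
t_x & t_y & t_z & 1
\end{pmatrix}
=
\begin{pmatrix} v_x + t_x & v_y + t_y & v_z + t_z & 1 \end{pmatrix}
$$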

Figure 2-8 depicts a translation transform where $t_x = 0.25$ and $t_y = 0.5$.

You might recall the rotation transform from the fixed-function variant (ES 1.1) of the HelloArrow sample:

glRotatef(m_currentAngle, 0, 0, 1);

This applies a counterclockwise rotation about the z-axis. The first argument is an angle in degrees; the latter three arguments define the axis of rotation. The ES 2.0 renderer in HelloArrow was a bit tedious because it computed the matrix manually:

    #include <cmath>
    ...
    float radians = m_currentAngle * Pi / 180.0f;
    float s = std::sin(radians);
    float c = std::cos(radians);
    float zRotation[16] = {
        c, s, 0, 0,
       -s, c, 0, 0,
        0, 0, 1, 0,
        0, 0, 0, 1
    };

    GLint modelviewUniform = glGetUniformLocation(m_simpleProgram, "Modelview");
    glUniformMatrix4fv(modelviewUniform, 1, 0, &zRotation[0]);

Figure 2-9 depicts a rotation transform where the angle is 45°.

Rotation about the z-axis is relatively simple, but rotation around an arbitrary axis requires a more complex matrix. For ES 1.1, `glRotatef` generates the matrix for you, so there’s no need to get too concerned with its contents. For ES 2.0, check out the appendix to see how to implement this.
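To give a feel for what such a matrix contains, here is a minimal sketch of the standard axis-angle rotation matrix, laid out for row vectors so that it reduces to the z-rotation array shown earlier when the axis is (0, 0, 1). The name `MakeRotation` and the `Mat4` alias are hypothetical; this is not the appendix implementation. The sketch assumes a nonzero axis and normalizes it, as `glRotatef` does.

```cpp
#include <array>
#include <cmath>

// Hypothetical alias for illustration: four rows of four floats each.
using Mat4 = std::array<float, 16>;

// Builds a counterclockwise rotation of `degrees` about the axis (x, y, z),
// arranged for row vectors (v * m). The axis must be nonzero.
Mat4 MakeRotation(float degrees, float x, float y, float z)
{
    const float Pi = 3.14159265f;

    // Normalize the axis so callers can pass any nonzero direction.
    float length = std::sqrt(x * x + y * y + z * z);
    x /= length;
    y /= length;
    z /= length;

    float radians = degrees * Pi / 180.0f;
    float s = std::sin(radians);
    float c = std::cos(radians);
    float t = 1.0f - c;

    return {
        c + x * x * t,      x * y * t + z * s,  x * z * t - y * s,  0,
        x * y * t - z * s,  c + y * y * t,      y * z * t + x * s,  0,
        x * z * t + y * s,  y * z * t - x * s,  c + z * z * t,      0,
        0,                  0,                  0,                  1
    };
}
```

With the axis set to (0, 0, 1), the upper-left corner collapses to `c, s` over `-s, c`, matching the hand-written `zRotation` array.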

By itself, `glRotatef` rotates only around the origin, so what if you want to rotate around an arbitrary point **p**? To accomplish this, use a three-step process: translate by −**p**, perform the rotation, and then translate by +**p**.

For example, to change HelloArrow to rotate around (0, 1) rather than the center, you could do this:

    glTranslatef(0, +1, 0);
    glRotatef(m_currentAngle, 0, 0, 1);
    glTranslatef(0, -1, 0);
    glDrawArrays(...);

Remember, the last transform in your code is actually the first one that gets applied!
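The same idea can be checked on the CPU with plain arithmetic. The sketch below applies the steps in the order the vertex actually experiences them: shift the pivot to the origin, rotate, then shift back. `Point2` and `RotateAboutPoint` are hypothetical names used only for this illustration.

```cpp
#include <cmath>

// Hypothetical 2D point type for illustration.
struct Point2 { float x, y; };

// Rotates v counterclockwise by `degrees` about the pivot p.
Point2 RotateAboutPoint(Point2 v, Point2 p, float degrees)
{
    const float Pi = 3.14159265f;
    float radians = degrees * Pi / 180.0f;
    float s = std::sin(radians);
    float c = std::cos(radians);

    // Step 1: translate by -p so the pivot sits at the origin.
    float x = v.x - p.x;
    float y = v.y - p.y;

    // Step 2: rotate about the origin (same math as the z-rotation matrix).
    float rx = x * c - y * s;
    float ry = x * s + y * c;

    // Step 3: translate by +p to move the pivot back where it was.
    return { rx + p.x, ry + p.y };
}
```

A quick sanity check is that the pivot itself never moves: rotating (0, 1) about (0, 1) by any angle returns (0, 1).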

The simplest way to create a view matrix is with the popular `LookAt` function. It’s not built into OpenGL ES, but it’s easy enough to implement it from scratch. `LookAt` takes three parameters: a camera position, a target location, and an “up” vector to define the camera’s orientation (see Figure 2-10).

Using the three input vectors, `LookAt` produces a transformation matrix that would otherwise be cumbersome to derive using the fundamental transforms (scale, translation, rotation). Example 2-1 is one possible implementation of `LookAt`.

Example 2-1. LookAt

    mat4 LookAt(const vec3& eye, const vec3& target, const vec3& up)
    {
        vec3 z = (eye - target).Normalized();
        vec3 x = up.Cross(z).Normalized();
        vec3 y = z.Cross(x).Normalized();

        mat4 m;
        m.x = vec4(x, 0);
        m.y = vec4(y, 0);
        m.z = vec4(z, 0);
        m.w = vec4(0, 0, 0, 1);

        vec4 eyePrime = m * -eye;
        m = m.Transposed();
        m.w = eyePrime;

        return m;
    }

Note that Example 2-1 uses custom types like `vec3`, `vec4`, and `mat4`. This isn’t pseudocode; it’s actual code from the C++ vector library in the appendix. We’ll discuss the library later in the chapter.

Until this point, we’ve been dealing with transformations that are typically used to modify the model-view rather than the projection. ES 1.1 operations such as `glRotatef` and `glTranslatef` always affect the current matrix, which can be changed at any time using `glMatrixMode`. Initially the matrix mode is `GL_MODELVIEW`.

What’s the distinction between projection and model-view? Novice OpenGL programmers sometimes think of the projection as being the “camera matrix,” but this is an oversimplification, if not completely wrong; the position and orientation of the camera should actually be specified in the model-view. I prefer to think of the projection as being the camera’s “zoom lens” because it affects the field of view.

Camera position and orientation should always go in the model-view, not the projection. OpenGL ES 1.1 depends on this to perform correct lighting calculations.

Two types of projections commonly appear in computer graphics: perspective and orthographic. Perspective projections cause distant objects to appear smaller, just as they do in real life. You can see the difference in Figure 2-11.

An orthographic projection is usually appropriate only for 2D graphics, so that’s what we used in HelloArrow:

    const float maxX = 2;
    const float maxY = 3;
    glOrthof(-maxX, +maxX, -maxY, +maxY, -1, 1);

The arguments for `glOrthof` specify the distance of the six bounding planes from the origin: left, right, bottom, top, near, and far. Note that our example arguments create an aspect ratio of 2:3; this is appropriate since the iPhone’s screen is 320×480. The ES 2.0 renderer in HelloArrow reveals how the orthographic projection is computed:

    float a = 1.0f / maxX;
    float b = 1.0f / maxY;
    float ortho[16] = {
        a, 0,  0, 0,
        0, b,  0, 0,
        0, 0, -1, 0,
        0, 0,  0, 1
    };

When an orthographic projection is centered around the origin, it’s really just a special case of the scale matrix that we already presented in Scale:

    float sx = 1.0f / maxX;
    float sy = 1.0f / maxY;
    float sz = -1;

    float scale[16] = { sx, 0,  0,  0,
                        0,  sy, 0,  0,
                        0,  0,  sz, 0,
                        0,  0,  0,  1 };

Since HelloCone (the example you’ll see later in this chapter) will have true 3D rendering, we’ll give it a perspective matrix using the `glFrustumf` command, like this:

glFrustumf(-1.6f, 1.6, -2.4, 2.4, 5, 10);

The arguments to `glFrustumf` are the same as those of `glOrthof`. Since `glFrustumf` does not exist in ES 2.0, HelloCone’s 2.0 renderer will compute the matrix manually, like this:

    void ApplyFrustum(float left, float right, float bottom, float top, float near, float far)
    {
        float a = 2 * near / (right - left);
        float b = 2 * near / (top - bottom);
        float c = (right + left) / (right - left);
        float d = (top + bottom) / (top - bottom);
        float e = - (far + near) / (far - near);
        float f = -2 * far * near / (far - near);

        mat4 m;
        m.x.x = a; m.x.y = 0; m.x.z = 0; m.x.w = 0;
        m.y.x = 0; m.y.y = b; m.y.z = 0; m.y.w = 0;
        m.z.x = c; m.z.y = d; m.z.z = e; m.z.w = -1;
        // The last element is 0 in the standard frustum matrix, so that the
        // output w depends only on the vertex depth.
        m.w.x = 0; m.w.y = 0; m.w.z = f; m.w.w = 0;

        glUniformMatrix4fv(projectionUniform, 1, 0, m.Pointer());
    }

When a perspective projection is applied, the field of view is in the shape of a frustum. The viewing frustum is just a chopped-off pyramid with the eye at the apex of the pyramid (see Figure 2-12).

A viewing frustum can also be computed based on the angle of the pyramid’s apex (known as *field of view*); some developers find this more intuitive than specifying all six planes. The function in Example 2-2 takes four arguments: the field-of-view angle, the aspect ratio of the pyramid’s base, and the near and far planes.
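As a rough sketch of the trigonometry behind such a function, the code below converts a field-of-view angle and aspect ratio into the left, right, bottom, and top planes that a frustum function expects. It assumes the angle is the *vertical* field of view in degrees and that the aspect ratio is width divided by height; the names `FrustumPlanes` and `PlanesFromFieldOfView` are hypothetical, and the book’s own example may be organized differently. The far plane is not needed for this step; it passes through to the frustum call unchanged.

```cpp
#include <cmath>

// Hypothetical result type: the four side planes at the near distance.
struct FrustumPlanes { float left, right, bottom, top; };

// Derives the side planes of a symmetric frustum from a vertical
// field-of-view angle (in degrees), an aspect ratio (width / height),
// and the distance to the near plane.
FrustumPlanes PlanesFromFieldOfView(float degrees, float aspectRatio,
                                    float nearPlane)
{
    const float Pi = 3.14159265f;

    // Half of the apex angle, converted to radians, gives the slope of
    // the top plane; scaling by the near distance gives its height.
    float top = nearPlane * std::tan(degrees * Pi / 360.0f);
    float bottom = -top;

    return { bottom * aspectRatio, top * aspectRatio, bottom, top };
}
```

For example, a 90° field of view with the near plane at distance 2 yields a top plane at roughly 2, since the tangent of 45° is 1.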
