
iPhone 3D Programming by Philip Rideout

Vertex Submission: Above and Beyond VBOs

The manner in which you submit vertex data to OpenGL ES can have a huge impact on performance. The most obvious tip is something I mentioned early in this book: use vertex buffer objects whenever possible. They eliminate costly memory transfers. VBOs don’t help as much with older devices, but using them is a good habit to get into.

Batch, Batch, Batch

VBO usage is just the tip of the iceberg. Another best practice that you’ll hear a lot about is draw call batching. The idea is simple: try to render as much as possible in as few draw calls as possible. Consider how you’d go about drawing a human head. Perhaps your initial code does something like Example 9-1.

Example 9-1. Highly unoptimized OpenGL ES sequence

glBindTexture(...);  // Bind the skin texture.
glDrawArrays(...);   // Render the head.
glDrawArrays(...);   // Render the nose.
glLoadMatrixfv(...); // Shift the model-view to the left side.
glDrawArrays(...);   // Render the left ear.

glBindTexture(...);  // Bind the eyeball texture.
glDrawArrays(...);   // Render the left eye.
glLoadMatrixfv(...); // Shift the model-view to the right side.

glBindTexture(...);  // Bind the skin texture.
glDrawArrays(...);   // Render the right ear.

glBindTexture(...);  // Bind the eyeball texture.
glDrawArrays(...);   // Render the right eye.
glLoadMatrixfv(...); // Shift the model-view to the center.

glBindTexture(...);  // Bind the lips texture.
glDrawArrays(...);   // Render the lips.

Right off the bat, you should notice that the head and nose can be “batched” into a single VBO. You can also do a bit of rearranging to reduce the number of texture binding operations. Example 9-2 shows the result after this tuning.

Example 9-2. OpenGL ES sequence after initial tuning

glBindTexture(...);  // Bind the skin texture.
glDrawArrays(...);   // Render the head and nose.
glLoadMatrixfv(...); // Shift the model-view to the left side.
glDrawArrays(...);   // Render the left ear.
glLoadMatrixfv(...); // Shift the model-view to the right side.
glDrawArrays(...);   // Render the right ear.

glBindTexture(...);  // Bind the eyeball texture.
glLoadMatrixfv(...); // Shift the model-view to the left side.
glDrawArrays(...);   // Render the left eye.
glLoadMatrixfv(...); // Shift the model-view to the right side.
glDrawArrays(...);   // Render the right eye.
glLoadMatrixfv(...); // Shift the model-view to the center.

glBindTexture(...);  // Bind the lips texture.
glDrawArrays(...);   // Render the lips.

Try combing through the code again to see whether anything can be eliminated. Sure, you might be saving a little bit of memory by using a single VBO to represent the ear, but suppose it’s a rather small VBO. If you add two instances of the ear geometry to your existing “head and nose” VBO, you can eliminate the need for changing the model-view matrix, plus you can use fewer draw calls. Similar guidance applies to the eyeballs. Example 9-3 shows the result.

Example 9-3. OpenGL ES sequence after second pass of tuning

glBindTexture(...);  // Bind the skin texture.
glDrawArrays(...);   // Render the head and nose and ears.

glBindTexture(...);  // Bind the eyeball texture.
glDrawArrays(...);   // Render both eyes.

glBindTexture(...);  // Bind the lips texture.
glDrawArrays(...);   // Render the lips.
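Folding the ears into the head geometry means baking each ear's placement into its vertex positions ahead of time. Here is a minimal sketch of that idea, assuming the meshes are plain triangle lists drawn with glDrawArrays and that each placement is a pure translation. The vec2 and vec3 structs are stand-ins for the book's vector library, and AppendTranslated is a hypothetical helper name.

```cpp
#include <cassert>
#include <vector>

// Stand-ins for the book's vector types (assumption).
struct vec2 { float x, y; };
struct vec3 { float x, y, z; };

struct Vertex {
    vec3 Position;
    vec3 Normal;
    vec2 TexCoord;
};

// Appends a copy of `part` to `batch`, shifting every position by `offset`.
// A pure translation leaves normals untouched; a rotation or scale would
// require transforming the normals as well.
void AppendTranslated(std::vector<Vertex>& batch,
                      const std::vector<Vertex>& part,
                      vec3 offset)
{
    assert(&batch != &part); // appending a vector to itself is not supported
    batch.reserve(batch.size() + part.size());
    for (Vertex v : part) {
        v.Position.x += offset.x;
        v.Position.y += offset.y;
        v.Position.z += offset.z;
        batch.push_back(v);
    }
}
```

You would call this once per ear at load time, then upload the combined array to a single VBO. The trade-off is a slightly larger buffer in exchange for fewer matrix changes and draw calls.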

You’re not done yet. Remember texture atlases, first presented in Animation with Sprite Sheets? By tweaking your texture coordinates and combining the skin texture with the eye and lip textures, you can reduce the rendering code to only two lines:

glBindTexture(...);  // Bind the atlas texture.
glDrawArrays(...);   // Render the head and nose and ears and eyes and lips.
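The texture coordinate tweak amounts to squeezing each original coordinate into the rectangle that its image occupies inside the atlas. The following is a rough sketch under the assumption that the original coordinates lie in [0, 1] and that the texture does not rely on repeat wrapping. AtlasRegion and RemapToAtlas are hypothetical names, and vec2 is a stand-in for the book's vector type.

```cpp
// Stand-in for the book's vector type (assumption).
struct vec2 { float x, y; };

// Hypothetical description of one sub-image inside the atlas,
// expressed in normalized [0, 1] atlas coordinates.
struct AtlasRegion {
    vec2 Origin; // corner of the sub-image at the original (0, 0)
    vec2 Size;   // width and height of the sub-image
};

// Maps a coordinate authored against the standalone texture to the
// matching location inside the atlas.
vec2 RemapToAtlas(vec2 uv, const AtlasRegion& region)
{
    return vec2{ region.Origin.x + uv.x * region.Size.x,
                 region.Origin.y + uv.y * region.Size.y };
}
```

For instance, if the eyeball image occupies the lower-right quarter of the atlas, its region would have an origin of (0.5, 0) and a size of (0.5, 0.5). You would run every eyeball vertex through this function before uploading the VBO.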

Note

Pixologic’s ZBrush application is a favorite with artists for generating texture atlases.

OK, I admit this example was rather contrived. Production code rarely issues OpenGL calls in a neat linear sequence as I’ve done in these examples. Real-world code is usually organized into subroutines, with plenty of stuff going on between the draw calls. But the same principles apply. From the GPU’s perspective, your application is merely a linear sequence of OpenGL calls. If you think about your code in this distilled manner, potential optimizations can be easier to spot.

Interleaved Vertex Attributes

You might hear the term interleaved data being thrown around in regard to OpenGL optimizations. It is indeed a good practice, but it’s actually nothing special. In fact, every sample in this book has been using interleaved data (for a diagram, flip back to Figure 1-8 in Chapter 1). Using our C++ vector library, much of this book’s sample code declares a plain old data (POD) structure representing a vertex, like this:

struct Vertex {
    vec3 Position;
    vec3 Normal;
    vec2 TexCoord;
};

When we create the VBO, we populate it with an array of Vertex objects. When it comes time to render the geometry, we usually do something like Example 9-4.

Example 9-4. Using interleaved attributes

glBindBuffer(...);
GLsizei stride = sizeof(Vertex);

// ES 1.1
glVertexPointer(3, GL_FLOAT, stride, 0);
glNormalPointer(GL_FLOAT, stride, offsetof(Vertex, Normal));
glTexCoordPointer(2, GL_FLOAT, stride, offsetof(Vertex, TexCoord));

// ES 2.0
glVertexAttribPointer(positionAttrib, 3, GL_FLOAT, GL_FALSE, stride, 0);
glVertexAttribPointer(normalAttrib, 3, GL_FLOAT,
                      GL_FALSE, stride, offsetof(Vertex, Normal));
glVertexAttribPointer(texCoordAttrib, 2, GL_FLOAT, 
                      GL_FALSE, stride, offsetof(Vertex, TexCoord));

OpenGL does not require you to arrange VBOs in the previous manner. For example, consider a small VBO with only three vertices. Instead of arranging it like this:

Position-Normal-TexCoord-Position-Normal-TexCoord-Position-Normal-TexCoord

you could lay it out like this:

Position-Position-Position-Normal-Normal-Normal-TexCoord-TexCoord-TexCoord

This is perfectly acceptable (but not advised); Example 9-5 shows the way you’d submit it to OpenGL.

Example 9-5. Suboptimal vertex layout

glBindBuffer(...);

// ES 1.1
glVertexPointer(3, GL_FLOAT, sizeof(vec3), 0);
glNormalPointer(GL_FLOAT, sizeof(vec3), sizeof(vec3) * VertexCount);
glTexCoordPointer(2, GL_FLOAT, sizeof(vec2), 
                  2 * sizeof(vec3) * VertexCount);

// ES 2.0
glVertexAttribPointer(positionAttrib, 3, GL_FLOAT, 
                      GL_FALSE, sizeof(vec3), 0);
glVertexAttribPointer(normalAttrib, 3, GL_FLOAT,
                      GL_FALSE, sizeof(vec3),
                      sizeof(vec3) * VertexCount);
glVertexAttribPointer(texCoordAttrib, 2, GL_FLOAT, 
                      GL_FALSE, sizeof(vec2), 
                      2 * sizeof(vec3) * VertexCount);

When you submit vertex data in this manner, you’re forcing the driver to reorder the data to make it amenable to the GPU.
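If your content pipeline hands you separate attribute arrays, you can do the reordering yourself once at load time. This is only a sketch: Interleave is a hypothetical helper, and vec2 and vec3 are stand-ins for the book's vector library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-ins for the book's vector types (assumption).
struct vec2 { float x, y; };
struct vec3 { float x, y, z; };

struct Vertex {
    vec3 Position;
    vec3 Normal;
    vec2 TexCoord;
};

// Builds one interleaved array from three separate attribute arrays.
// All three inputs must describe the same number of vertices.
std::vector<Vertex> Interleave(const std::vector<vec3>& positions,
                               const std::vector<vec3>& normals,
                               const std::vector<vec2>& texCoords)
{
    assert(positions.size() == normals.size());
    assert(positions.size() == texCoords.size());

    std::vector<Vertex> vertices(positions.size());
    for (std::size_t i = 0; i < vertices.size(); ++i) {
        vertices[i].Position = positions[i];
        vertices[i].Normal = normals[i];
        vertices[i].TexCoord = texCoords[i];
    }
    return vertices;
}
```

The resulting array can be uploaded as is and submitted with a stride of sizeof(Vertex), in the manner of Example 9-4.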

Optimize Your Vertex Format

One aspect of vertex layout you might be wondering about is the ordering of attributes. With OpenGL ES 2.0 and newer Apple devices, the order has little or no impact on performance (assuming you’re using interleaved data). On first- and second-generation iPhones, Apple recommends the following order:

  1. Position

  2. Normal

  3. Color

  4. Texture coordinate (first stage)

  5. Texture coordinate (second stage)

  6. Point size

  7. Bone weight

  8. Bone index

You might be wondering about the two “bone” attributes. Stay tuned; we’ll discuss them later in the chapter.

Another way of optimizing your vertex format is shrinking the size of the attribute types. In this book, we’ve been a bit sloppy by using 32-bit floats for just about everything. Don’t forget there are other types you can use. For example, floating point is often overkill for color, since colors usually don’t need as much precision as other attributes.

// ES 1.1
// Lazy iPhone developer:
glColorPointer(4, GL_FLOAT, sizeof(Vertex), offset);

// Rock Star iPhone developer!
glColorPointer(4, GL_UNSIGNED_BYTE, sizeof(Vertex), offset);

// ES 2.0
// Lazy:
glVertexAttribPointer(color, 4, GL_FLOAT, GL_FALSE, stride, offset);

// Rock star!
// GL_TRUE normalizes each byte from 0-255 to 0.0-1.0.
glVertexAttribPointer(color, 4, GL_UNSIGNED_BYTE,
                      GL_TRUE, stride, offset);
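Switching to byte colors means converting your existing float colors before they go into the VBO. Below is a minimal sketch of that conversion. Color4f, Color4ub, ToByte, and PackColor are hypothetical names rather than anything from the book's sample code.

```cpp
#include <cstdint>

struct Color4f  { float r, g, b, a; };        // 16 bytes per color
struct Color4ub { std::uint8_t r, g, b, a; }; //  4 bytes per color

// Converts one channel from [0, 1] to [0, 255], clamping values that
// fall outside the range and rounding to the nearest integer.
std::uint8_t ToByte(float c)
{
    if (!(c > 0.0f)) return 0;  // also catches NaN
    if (c >= 1.0f) return 255;
    return static_cast<std::uint8_t>(c * 255.0f + 0.5f);
}

Color4ub PackColor(const Color4f& c)
{
    return Color4ub{ ToByte(c.r), ToByte(c.g), ToByte(c.b), ToByte(c.a) };
}
```

The packed color takes a quarter of the space. In ES 2.0, passing GL_TRUE for the normalized argument of glVertexAttribPointer asks OpenGL to map the bytes back to the 0.0 to 1.0 range before they reach the shader.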

Warning

Don’t use GL_FIXED. Because of the iPhone’s architecture, fixed-point numbers actually require more processing than floating-point numbers. Fixed-point is available only to comply with the Khronos specification.

Apple recommends aligning vertex attributes in memory according to their native alignment. For example, a 4-byte float should be aligned on a 4-byte boundary. Sometimes you can deal with this by adding padding to your vertex format:

struct Vertex {
    vec3 Position;
    unsigned char Luminance;
    unsigned char Alpha;
    unsigned short Padding;
};
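One way to keep yourself honest about alignment is to let the compiler check the layout. The sketch below repeats the padded structure and adds compile-time checks. It assumes a typical platform with 4-byte floats and 2-byte shorts; vec3 is a stand-in for the book's vector type and IsAligned is a hypothetical helper.

```cpp
#include <cstddef>

// Stand-in for the book's vector type (assumption).
struct vec3 { float x, y, z; };

struct Vertex {
    vec3 Position;
    unsigned char Luminance;
    unsigned char Alpha;
    unsigned short Padding; // rounds the vertex up to a 4-byte multiple
};

// True if `offset` is a multiple of `alignment`.
constexpr bool IsAligned(std::size_t offset, std::size_t alignment)
{
    return offset % alignment == 0;
}

// The float data must start on a float boundary...
static_assert(IsAligned(offsetof(Vertex, Position), alignof(float)),
              "Position is misaligned");
// ...and the stride must preserve that alignment for every later vertex.
static_assert(IsAligned(sizeof(Vertex), alignof(float)),
              "Vertex stride breaks float alignment");
// Assumes 4-byte floats and 2-byte shorts.
static_assert(sizeof(Vertex) == 16, "Unexpected vertex size");
```

If someone later adds or removes a field and throws the stride off, the build fails instead of the frame rate quietly dropping.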

Use the Best Topology and Indexing

Apple’s general advice (at the time of this writing) is to prefer GL_TRIANGLE_STRIP over GL_TRIANGLES. Strips require fewer vertices but usually at the cost of more draw calls. Sometimes you can reduce the number of draw calls by introducing degenerate triangles into your vertex buffer.
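To make the degenerate triangle trick concrete, here is a rough sketch that stitches two strips together. It assumes indexed strips drawn with glDrawElements and GL_UNSIGNED_SHORT indices; the same idea works for nonindexed strips if you duplicate whole vertices instead. AppendStrip is a hypothetical helper name.

```cpp
#include <cassert>
#include <vector>

typedef unsigned short Index; // matches GL_UNSIGNED_SHORT

// Appends strip `b` to strip `a` so that both can be drawn with one
// GL_TRIANGLE_STRIP call. The repeated indices form zero-area triangles
// that the GPU discards.
void AppendStrip(std::vector<Index>& a, const std::vector<Index>& b)
{
    assert(&a != &b); // appending a strip to itself is not supported
    if (b.empty()) return;

    if (!a.empty()) {
        a.push_back(a.back());  // repeat the last index of the first strip
        a.push_back(b.front()); // repeat the first index of the second strip

        // A strip flips its winding on every other triangle, so the second
        // strip must start at an even position to keep its facing.
        if (a.size() % 2 != 0)
            a.push_back(b.front());
    }
    a.insert(a.end(), b.begin(), b.end());
}
```

Stitching the strips 0-1-2-3 and 4-5-6-7 yields 0-1-2-3-3-4-4-5-6-7: four extra indices, but one draw call instead of two.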

Strips versus separate triangles, indexed versus nonindexed; these all have trade-offs. You’ll find that many developers have strong opinions, and you’re welcome to review all the endless debates on the forums. In the end, experimentation is the only reliable way to determine the best tessellation strategy for your unique situation.

Imagination Technologies provides code for converting lists into strips. Look for PVRTTriStrip.cpp in the OpenGL ES 1.1 version of the PowerVR SDK (first discussed in Texture Compression with PVRTC). It also provides a sample app to show it off (Demos/OptimizeMesh).
