Chapter 4. Object Creation

The biggest difference between time and space is that you can’t reuse time.

Merrick Furst

“I thought that I didn’t need to worry about memory allocation. Java is supposed to handle all that for me.” This is a common perception, which is both true and false. Java handles low-level memory allocation and deallocation and comes with a garbage collector. Further, it prevents access to these low-level memory-handling routines, making the memory safe. So memory access should not cause corruption of data in other objects or in the running application, which is potentially the most serious problem that can occur with memory access violations. In a C or C++ program, problems of illegal pointer manipulations can be a major headache (e.g., deleting memory more than once, runaway pointers, bad casts). They are very difficult to track down and are likely to occur when changes are made to existing code. Java deals with all these possible problems and, at worst, will throw an exception immediately if memory is incorrectly accessed.

However, Java does not prevent you from using excessive amounts of memory nor from cycling through too much memory (e.g., creating and dereferencing many objects). Contrary to popular opinion, you can get memory leaks by holding on to objects without releasing references. This stops the garbage collector from reclaiming those objects, resulting in increasing amounts of memory being used.[27] In addition, Java does not provide for large numbers of objects to be created simultaneously (as you could do in C by allocating a large buffer), which eliminates one powerful technique for optimizing object creation.

Creating objects costs time and CPU effort for an application. Garbage collection and memory recycling cost more time and CPU effort. The difference in object usage between two algorithms can make a huge difference. In Chapter 5, I cover algorithms for appending basic data types to StringBuffer objects. These can be an order of magnitude faster than some of the conversions supplied with Java. A significant portion of the speedup is obtained by avoiding extra temporary objects used and discarded during the data conversions.[28]

Here are a few general guidelines for using object memory efficiently:

  • Avoid creating objects in frequently used routines. Because these routines are called frequently, you will likely be creating objects frequently, and consequently adding heavily to the overall burden of object cycling. By rewriting such routines to avoid creating objects, possibly by passing in reusable objects as parameters, you can decrease object cycling.

  • Try to presize any collection object to be as big as it will need to be. It is better for the object to be slightly bigger than necessary than to be smaller than it needs to be. This recommendation really applies to collections that implement size increases in such a way that objects are discarded. For example, Vector grows by creating a new larger internal array object, copying all the elements from and discarding the old array. Most collection implementations have similar implementations for growing the collection beyond its current capacity, so presizing a collection to its largest potential size reduces the number of objects discarded.

  • When multiple instances of a class need access to a particular object in a variable local to those instances, it is better to make that variable a static variable rather than have each instance hold a separate reference. This reduces the space taken by each object (one less instance variable) and can also reduce the number of objects created if each instance creates a separate object to populate that instance variable.

  • Reuse exception instances when you do not specifically require a stack trace (see Section 6.1).

This chapter presents many other standard techniques to avoid using too many objects, and identifies some known inefficiencies when using some types of objects.

Object-Creation Statistics

Objects need to be created before they can be used, and garbage-collected when they are finished with. The more objects you use, the heavier this garbage-cycling impact becomes. General object-creation statistics are actually quite difficult to measure decisively, since you must decide exactly what to measure, what size to pregrow the heap space to, how much garbage collection impacts the creation process if you let it kick in, etc.

For example, on a medium Pentium II, with heap space pregrown so that garbage collection does not have to kick in, you can get around half a million to a million simple objects created per second. If the objects are very simple, even more can be garbage-collected in one second. On the other hand, if the objects are complex, with references to other objects, and include arrays (like Vector and StringBuffer) and nonminimal constructors, the statistics plummet to less than a quarter of a million created per second, and garbage collection can drop way down to below 100,000 objects per second. Each object creation is roughly as expensive as a malloc in C, or a new in C++, and there is no easy way of creating many objects together, so you cannot take advantage of efficiencies you get using bulk allocation.

There are already runtime systems that use generational garbage collection, minimize object-creation overhead, and optimize native-code compilation. By doing this they reach up to three million objects created and collected per second (on a Pentium II), and it is likely that the average Java system should improve to get closer to that kind of performance over time. But these figures are for basic tests, optimized to show the maximum possible object-creation throughput. In a normal application with varying size objects and constructor chains, these sorts of figures cannot be obtained or even approached. Also bear in mind that you are doing nothing else in these tests apart from creating objects. In most applications, you are usually doing something with all those objects, making everything much slower but significantly more useful. Avoidable object creation is definitely a significant overhead for most applications, and you can easily run through millions of temporary objects using inefficient algorithms that create too many objects. In Chapter 5, we look at an example that uses the StreamTokenizer class. This class creates and dereferences a huge number of objects while it parses a stream, and the effect is to slow down processing to a crawl. The example in Chapter 5 presents a simple alternative to using a StreamTokenizer, which is 100 times faster: a large percentage of the speedup is gained from avoiding cycling through objects.

Note that different VM environments produce different figures. If you plot object size against object-creation time for various environments, most plots are monotonically increasing, i.e., it takes more time to create larger objects. But there are discrepancies here too. For example, Netscape Version 4 running on Windows has the peculiar behavior that objects of size 4 and 12 ints are created fastest (refer to http://www.javaworld.com/javaworld/jw-09-1998/jw-09-speed.html). Also, note that JIT VMs actually have a worse problem with object creation relative to other VM activities, because JIT VMs can speed up almost every other activity, but object creation is nearly as slow as if the JIT compiler was not there.



[27] Ethan Henry and Ed Lycklama have written a nice article discussing Java memory leaks in the February 2000 issue of Dr. Dobb’s Journal. This article is available online from the Dr. Dobb’s web site, http://www.ddj.com.

[28] Up to Java 1.3. Data-conversion performance is targeted by JavaSoft, however, so some of the data conversions may speed up after 1.3.

Get Java Performance Tuning now with the O’Reilly learning platform.

O’Reilly members experience books, live events, courses curated by job role, and more from O’Reilly and nearly 200 top publishers.