Strings Versus char Arrays

In one of my first programming courses, taught in C, our instructor made an interesting comment. He said, “C has lightning-fast string handling because it has no string type.” He went on to explain this apparent paradox by pointing out that in C, any null-terminated sequence of bytes can be treated as a string: all the standard string-handling functions follow this convention. Because the convention is adhered to fairly rigorously, you are not restricted to the standard string-handling functions. Any string manipulation you want can be performed directly on the byte array, allowing you to bypass or rewrite any string-handling function you need to speed up. Because you are not forced to go through a restricted set of manipulation functions, it is always possible to optimize code with your own hand-crafted functions. Furthermore, some string-manipulating functions operate directly on the original byte array rather than creating a copy of it. This can be a source of bugs, but it is another reason speed can be optimized.

In Java, the inability to subclass String or access its internal char array means you cannot use the techniques applied in C. Even if you could subclass String, that would not solve the second problem: many other methods operate on or return copies of a String. Generally, there is no way to avoid using String objects for code external to your application classes. But internally, you can provide your own char array ...
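
To make the contrast concrete, here is a minimal sketch of working on a char array internally and converting to a String only at the boundaries. The class name CharArrayDemo and both helper methods are hypothetical names invented for this illustration, not code from the text, and the uppercase helper assumes ASCII letters only.

```java
public class CharArrayDemo {

    // Hypothetical helper: upper-cases ASCII letters in buf[start..end) in place.
    // No new String or array is allocated.
    public static void toUpperAsciiInPlace(char[] buf, int start, int end) {
        for (int i = start; i < end; i++) {
            char c = buf[i];
            if (c >= 'a' && c <= 'z') {
                buf[i] = (char) (c - ('a' - 'A'));
            }
        }
    }

    // Hypothetical helper: reverses buf[start..end) in place.
    public static void reverseInPlace(char[] buf, int start, int end) {
        for (int i = start, j = end - 1; i < j; i++, j--) {
            char tmp = buf[i];
            buf[i] = buf[j];
            buf[j] = tmp;
        }
    }

    public static void main(String[] args) {
        // One copy when entering the char-array world...
        char[] buf = "hello world".toCharArray();

        // ...then any number of manipulations without intermediate copies.
        toUpperAsciiInPlace(buf, 0, 5);
        reverseInPlace(buf, 6, buf.length);

        // One copy when handing a String back to external code.
        String result = new String(buf);
        System.out.println(result); // prints "HELLO dlrow"
    }
}
```

The equivalent String-based code would typically create a new String at each step (for example, through substring, toUpperCase, and concatenation), whereas this sketch copies only on the way in and on the way out. Whether that saving matters in practice depends on the workload and should be measured.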
