6.6. Keep Initials Together

The next example shows what lengths we have to go to in the absence of lookbehinds. The script tidies up reference lists: it keeps multiple initials together (though they may be split from the surname) and keeps a single initial together with the surname. Thus in T. H. H. Huxley the initials must stay together, but the surname can go on the following line; this is achieved by applying nobreak to just the initials. At the same time, where there is no space between the initials, one is added. In A. Huxley, on the other hand, the initial mustn't be left on its own at the end of a line, so we apply nobreak to the initial and the first letter of the surname. The first regex tries to match two or more initials; the positive lookahead checks for the first two letters of the surname (we want to be moderately certain that we're dealing with initials) but doesn't consume them, so we in fact capture just the initials. In the replace loop, each set of initials is checked for spacing: a space is inserted after any period not followed by one.
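The matching and spacing logic described above can be tried out in plain JavaScript, outside InDesign. What follows is only a sketch: the pattern and the names MULTI_INITIALS, findInitials and normalizeInitials are assumptions made for illustration, not the script's actual code, and the step that applies nobreak in InDesign is left out.

```javascript
// Hypothetical pattern (an assumption, not the book's regex):
// two or more initials, followed by a capital and a lower-case letter
// that the lookahead checks but does not consume.
var MULTI_INITIALS = /(?:[A-Z]\. ?)+[A-Z]\.(?= ?[A-Z][a-z])/g;

// Return every run of two or more initials found in the text.
function findInitials(text) {
  return text.match(MULTI_INITIALS) || [];
}

// Within each run of initials, insert a space after any period
// that is not already followed by one (the final period is left alone).
function normalizeInitials(text) {
  return text.replace(MULTI_INITIALS, function (initials) {
    return initials.replace(/\.(?! |$)/g, ". ");
  });
}
```

With this sketch, findInitials("T. H. H. Huxley") captures only the initials, and normalizeInitials("T.H.H. Huxley") spaces them out; a single initial such as the one in A. Huxley is deliberately not matched here.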

The second part of the script first matches every single initial together with the letter that follows it, the first letter of the surname. The lookahead now matches just a lower-case letter, again to have some check that we're dealing with an initial. At this stage, all upper-case letters preceding a name are captured, but by setting the find preference to skip the nobreak feature, any initials processed by ...
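The second match can be sketched in the same way in plain JavaScript. The pattern and the names SINGLE_INITIAL and findSingleInitials are again assumptions made for illustration. Plain JavaScript has no find preferences, so the skipping of text that already carries nobreak is imitated here with a hypothetical list of protected [start, end) ranges; in InDesign that filtering is done by the find settings rather than by code like this.

```javascript
// Hypothetical pattern (an assumption, not the book's regex):
// one initial plus the first letter of the surname; the lookahead
// requires a lower-case letter next but does not consume it.
var SINGLE_INITIAL = /[A-Z]\. ?[A-Z](?=[a-z])/g;

// Return the matches that do not start inside a protected range.
// protectedRanges is a hypothetical stand-in for text that already
// has nobreak applied: an array of [start, end) index pairs.
function findSingleInitials(text, protectedRanges) {
  var hits = [];
  var m;
  SINGLE_INITIAL.lastIndex = 0; // restart the global regex for each call
  while ((m = SINGLE_INITIAL.exec(text)) !== null) {
    var covered = false;
    for (var i = 0; i < protectedRanges.length; i++) {
      if (m.index >= protectedRanges[i][0] && m.index < protectedRanges[i][1]) {
        covered = true;
      }
    }
    if (!covered) {
      hits.push({ start: m.index, end: m.index + m[0].length, text: m[0] });
    }
  }
  return hits;
}
```

For A. Huxley the sketch returns the span "A. H", which is the range that would receive nobreak. For T. H. H. Huxley the last initial also matches on its own, which is why the already processed range has to be excluded.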

