Expanding and Compressing Tabs

Credit: Alex Martelli

Problem

You want to convert tabs in a string to the appropriate number of spaces, or vice versa.

Solution

Changing tabs to the appropriate number of spaces is a reasonably frequent task, easily accomplished with Python strings’ built-in expandtabs method. Because strings are immutable, the method returns a new string object (a modified copy of the original one). However, it’s easy to rebind a string variable name from the original to the modified-copy value:

mystring = mystring.expandtabs()

This doesn’t change the string object to which mystring originally referred, but it does rebind the name mystring to a newly created string object in which tabs are expanded into runs of spaces.
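
As a small illustration of the rebinding described above, the following sketch (the sample string and variable names are made up for this example) shows that expandtabs leaves the original string untouched and that an optional tab size can be passed; the default is 8:

```python
original = "a\tbc\tdef"

expanded = original.expandtabs()    # default tab size of 8
narrow = original.expandtabs(4)     # explicit tab size of 4

# The original string object is unchanged; only new strings were created.
print(repr(original))   # 'a\tbc\tdef'
print(repr(expanded))   # 'a       bc      def'
print(repr(narrow))     # 'a   bc  def'
```

Each tab is replaced by just enough spaces to reach the next multiple of the tab size, which is why the runs of spaces differ in length.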

Changing spaces into tabs is a rare and peculiar need. Compression, if that’s what you’re after, is far better performed in other ways, so Python doesn’t offer a built-in way to unexpand spaces into tabs. We can, of course, write one. String processing tends to be fastest in a split/process/rejoin approach, rather than with repeated overall string transformations:

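import re

def unexpand(astring, tablen=8):
    # Split into alternating runs of non-spaces and spaces (tabs expanded first).
    pieces = re.split(r'( +)', astring.expandtabs(tablen))
    lensofar = 0    # column reached at the end of the current piece
    for i, piece in enumerate(pieces):
        thislen = len(piece)
        lensofar += thislen
        # startswith is safe on the empty pieces that re.split can return.
        if piece.startswith(' '):
            # Spaces left over after the last tab stop inside this run,
            # capped at thislen so a run that crosses no tab stop is kept as is.
            numblanks = min(lensofar % tablen, thislen)
            # Tabs needed to cover the rest of the run (integer division).
            numtabs = (thislen - numblanks + tablen - 1) // tablen
            pieces[i] = '\t'*numtabs + ' '*numblanks
    return ''.join(pieces)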
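
One way to sanity-check any unexpand-style function is the round-trip property: expanding its output should reproduce the expanded input. The sketch below is only an illustration; roundtrips and naive_unexpand are hypothetical helpers introduced here, not part of the recipe. The naive candidate ignores columns on purpose, to show why tracking the running length matters:

```python
def roundtrips(unexpand_func, text, tablen=8):
    """True if expanding the candidate's output reproduces the expanded input."""
    expected = text.expandtabs(tablen)
    return unexpand_func(text, tablen).expandtabs(tablen) == expected

def naive_unexpand(text, tablen=8):
    # Deliberately naive (hypothetical): swaps every block of tablen spaces
    # for a tab without looking at the column where the block starts.
    return text.expandtabs(tablen).replace(' ' * tablen, '\t')

aligned = 'abcdefgh' + ' ' * 8 + 'x'    # run of spaces starts on a tab stop
unaligned = 'ab' + ' ' * 8 + 'x'        # run of spaces starts at column 2

print(roundtrips(naive_unexpand, aligned))     # True
print(roundtrips(naive_unexpand, unaligned))   # False
```

The same roundtrips helper can be pointed at a column-aware function such as the recipe's unexpand to check it against a handful of sample strings.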

Discussion

If expandtabs didn’t exist, we could write it up as a function. Here is a regular expression-based approach, similar to the one used in the recipe’s unexpand function:

def expand_with_re(astring, tablen=8):
    import re
    pieces = re.split(r'(\t)', astring)
    lensofar = 0
    for i in range(len(pieces)):
        if pieces[i]=='\t':
            pieces[i] = ' '*(tablen-lensofar%tablen)
        lensofar += len(pieces[i])
    return ''.join(pieces)

When the regular expression contains a (parenthesized) group, re.split returns the splitters too, which makes it easy to massage the pieces list into the form we want for the final ''.join. However, splitting the string on '\t' and then interleaving space joiners of suitable lengths looks a bit better in this case:

def expand(astring, tablen=8):
    result = []
    for piece in astring.split('\t'):
        result.append(piece)
        result.append(' '*(tablen-len(piece)%tablen))
    return ''.join(result[:-1])
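
To make the difference between the two splitting styles concrete, here is a short sketch (the sample string and variable names are made up for illustration) comparing re.split with and without a capturing group against the plain split string method:

```python
import re

text = 'a\tbc\t\td'

without_group = re.split(r'\t', text)    # separators are dropped
with_group = re.split(r'(\t)', text)     # separators are kept as list items
plain_split = text.split('\t')           # same pieces as without_group

print(without_group)   # ['a', 'bc', '', 'd']
print(with_group)      # ['a', '\t', 'bc', '\t', '', '\t', 'd']
print(plain_split)     # ['a', 'bc', '', 'd']
```

Splitting on n tabs always yields n+1 pieces, which is why a function that appends a run of spaces after every piece has to drop the final one, as expand does with result[:-1].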

See Also

Documentation for the expandtabs method of string objects in the Library Reference; Perl Cookbook Recipe 1.7.
