Removing punctuation in Pandas

When you compare strings in your data, punctuation often doesn't matter, and stray punctuation can make otherwise identical values fail to match. In this recipe, you'll learn how to remove punctuation from a column in a DataFrame.

Getting ready

Part of the power of Pandas is applying a custom function to an entire column at once. Create a DataFrame from the customer data, and use the following recipe to update the last_name column.
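
If you don't have the customer data at hand, a small stand-in DataFrame is enough to follow along. The sketch below is only an illustration: the rows and the first_name column are made up, and only the customers variable and the last_name column come from the recipe.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real customer data
customers = pd.DataFrame({
    "first_name": ["Mary", "John", "Ana"],
    "last_name": ["O'Brien", "Smith-Jones", "St. Clair"],
})

print(customers)
```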

How to do it…

import string

# Set of ASCII punctuation characters to strip
exclude = set(string.punctuation)

def remove_punctuation(x):
    """
    Helper function to remove punctuation from a string
    x: any string
    """
    try:
        x = ''.join(ch for ch in x if ch not in exclude)
    except TypeError:
        # Non-string values (for example, NaN) are returned unchanged
        pass
    return x

# Apply the function to the DataFrame
customers.last_name = customers.last_name.apply(remove_punctuation)
...
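
As an alternative to applying a custom function, the same cleanup can be done with Pandas' built-in string methods. The following is a minimal sketch, not the recipe's own method: it assumes a last_name column of strings with possible missing values, and the sample rows are hypothetical.

```python
import string

import pandas as pd

# Hypothetical sample data; None stands in for a missing last name
customers = pd.DataFrame(
    {"last_name": ["O'Brien", "Smith-Jones", None, "St. Clair"]}
)

# Translation table that deletes every ASCII punctuation character
table = str.maketrans("", "", string.punctuation)

# The .str accessor leaves missing values as missing,
# so no try/except is needed
customers["last_name"] = customers["last_name"].str.translate(table)

print(customers)
```

Like the helper function, this only strips the ASCII characters in string.punctuation, so typographic characters such as curly quotes would need to be added to the table separately.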
