Standardizing a Social Security number in Pandas

When you work with personally identifiable information (PII), whether in medicine, human resources, or any other industry, the data arrives in a variety of formats. A Social Security number, for example, may be written with hyphens, with spaces, with no separators at all, or only in part. In this recipe, you'll learn how to standardize the commonly seen formats.

Getting ready

Import Pandas, and create a new DataFrame to work with:

import pandas as pd

lc = pd.DataFrame({
    'people': ["cole o'brien", "lise heidenreich", "zilpha skiles", "damion wisozk"],
    'age': [24, 35, 46, 57],
    'ssn': ['6439', '689 24 9939', '306-05-2792', '992245832'],
    'birth_date': ['2/15/54', '05/07/1958', '19XX-10-23', '01/26/0056'],
    'customer_loyalty_level': ['not ...
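The sample 'ssn' values above show four different shapes: a partial number, space-separated, hyphen-separated, and unseparated. The following is a minimal sketch of one way such values could be standardized, not necessarily the approach this recipe takes. The helper name standardize_ssn, the XXX-XX-XXXX target format, and the choice to return None for values without exactly nine digits are all assumptions made for illustration.

```python
import re

import pandas as pd

# The four sample values from the 'ssn' column of the DataFrame above.
ssn = pd.Series(['6439', '689 24 9939', '306-05-2792', '992245832'])


def standardize_ssn(value):
    """Hypothetical helper: format a value as XXX-XX-XXXX.

    Returns None when the value does not contain exactly nine digits.
    """
    # Drop spaces, hyphens, and any other non-digit characters.
    digits = re.sub(r'\D', '', str(value))
    # Assumption: incomplete values are flagged as missing, not padded.
    if len(digits) != 9:
        return None
    return f'{digits[:3]}-{digits[3:5]}-{digits[5:]}'


standardized = ssn.apply(standardize_ssn)
print(standardized)
```

Stripping every non-digit first means the space-separated, hyphen-separated, and unseparated forms all pass through the same code path. The partial value '6439' has only four digits, so this sketch marks it as missing instead of guessing the rest.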
