I have been working with data produced by the U.S. Census Bureau, which uses FIPS codes to identify geographies. When I read the data in with pandas, the FIPS codes are converted to numbers, which strips their leading zeros (for example, 06 becomes 6). After trying to force the type to string (which doesn’t work currently), I decided to create a workaround. Here it is:
def fix_fips(fips, total_length):
    """Takes a broken FIPS and repairs it"""
    fips = str(fips)
    current_length = len(fips)
    if current_length < total_length:
        # Pad on the left with as many zeros as are missing
        number_of_leading_zeros = total_length - current_length
        leading_zeros = '0' * number_of_leading_zeros
        fips = leading_zeros + fips
    return fips
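As an aside, Python's built-in str.zfill does the same left-padding, so the helper could probably be shortened to a one-liner. A minimal sketch, where fix_fips_zfill is a hypothetical name used only for comparison and not part of the original workaround:

```python
def fix_fips_zfill(fips, total_length):
    """Left-pad a FIPS code with zeros using str.zfill (hypothetical helper name)."""
    return str(fips).zfill(total_length)


# A few spot checks
print(fix_fips_zfill(6, 2))      # state code read in as the integer 6 -> 06
print(fix_fips_zfill(1001, 5))   # county code read in as the integer 1001 -> 01001
print(fix_fips_zfill("48", 2))   # already the right length, left unchanged -> 48
```

Either version can be passed to apply in the same way, since both take the value and the target length.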
Say I have some state-level data with a two-character FIPS code read into a pandas DataFrame. I would correct the mangled data with:
df['State FIPS code'] = df['State FIPS code'].apply(fix_fips, args=(2,))
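To see the whole round trip in one place, here is a small self-contained sketch. The CSV text is made-up sample data standing in for a real Census file, and it uses pandas' vectorized astype(str).str.zfill(2) as an alternative to apply, so treat it as one possible approach rather than the only one:

```python
import io

import pandas as pd

# Made-up sample data standing in for a Census file
csv_text = "State FIPS code,State\n01,Alabama\n06,California\n48,Texas\n"

df = pd.read_csv(io.StringIO(csv_text))
# The leading zeros are gone: 01 and 06 were parsed as the integers 1 and 6
print(df["State FIPS code"].tolist())  # [1, 6, 48]

# Convert back to strings and left-pad with zeros to two characters
df["State FIPS code"] = df["State FIPS code"].astype(str).str.zfill(2)
print(df["State FIPS code"].tolist())  # ['01', '06', '48']
```

One caveat: this assumes the column has no missing values. If it does, pandas reads the column as floats and the string conversion would need extra handling first.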
Hope this helps other data ninjas out there!