Python – Useful String Operations Part 2

January 24, 2018 Reece


More Python Quirks and Simple Solutions – String Operations – Part 2

The String… In the last post we looked at the basics of getting the left, right, and mid parts out of strings with a simple set of functions that, amazingly, are not native to Python (I come from the Excel world). If you haven’t looked at it and want to, check it out here.

In this part, we are looking at a few other useful functions.

Multiple String Replacement

I use this almost daily. Working in transportation, I deal with a lot of data that has city names. A lot of cities have multiple spellings (e.g. Saint Joseph can show up as St. Joe, St Joseph, etc.), which can be a nightmare when trying to normalize data.

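This function handles that problem with ease. You pass in the crap data and a dictionary that has all of the misspellings as the keys and the correct spellings as the values. Note that the pattern is anchored, so a value is only replaced when the whole string matches one of the keys.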

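import re

def multiple_replace(text, replacements):
    """Replace text with its value in replacements when it matches a key."""
    # Build one regular expression from the dictionary keys; the ^...$ anchors
    # mean a key must match the whole string, not just part of it
    regex = re.compile("^(%s)$" % "|".join(map(re.escape, replacements)))

    # For each match, look up the corresponding value in the dictionary
    return regex.sub(lambda mo: replacements[mo.group(0)], text)

So to use this on a dataframe, pass the function to apply along with the dictionary. apply returns a new column rather than changing the dataframe in place, so assign the result back:

my_dataframe['BadData'] = my_dataframe['BadData'].apply(multiple_replace, replacements=dict_MyFix)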
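If you want to try the idea without a dataframe, here is a minimal, self-contained sketch that runs on a plain list. The city spellings and the names `city_fixes` and `raw_cities` are made up for illustration, and the function is restated so the snippet runs on its own.

```python
import re


def multiple_replace(text, replacements):
    """Replace text with its mapped value when the whole string matches a key."""
    # Same anchored pattern as the function above: only whole-string matches count
    regex = re.compile("^(%s)$" % "|".join(map(re.escape, replacements)))
    return regex.sub(lambda mo: replacements[mo.group(0)], text)


# Hypothetical sample data, made up for this example
city_fixes = {"St. Joe": "Saint Joseph", "St Joseph": "Saint Joseph"}
raw_cities = ["St. Joe", "St Joseph", "Saint Joseph", "St Joseph Township"]

cleaned = [multiple_replace(city, city_fixes) for city in raw_cities]
print(cleaned)
# ['Saint Joseph', 'Saint Joseph', 'Saint Joseph', 'St Joseph Township']
```

The last entry is left alone because "St Joseph" is only part of the string, which is what the anchors are there for.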

Did you find this useful? Do you know of a better way? Please share your thoughts in the comments!


Comment below or contact me. Keep it classy.

