More Python Quirks and Simple Solutions – String Operations – Part 2
In the last post we looked at the basics of getting the left, right, and mid parts out of strings with a simple set of functions that, amazingly, are not native to Python (I come from the Excel world). If you haven't looked at it and want to, check it out here.
In this part, we are looking at a few other useful functions.
Multiple String Replacement
I use this almost daily. Working in transportation, I deal with a lot of data that has city names. A lot of cities can have multiple spellings (e.g. Saint Joseph: St. Joe, St Joseph, etc.), which can be a nightmare when trying to normalize data.
This function handles that problem with ease. You pass in the crap data and a dictionary that has all of the misspellings as the keys and the correct spellings as the values.
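import re

def multiple_replace(text, replacements):
    '''Return the correct spelling if text exactly matches a key in replacements.'''
    try:
        # Build one anchored pattern from the escaped dictionary keys,
        # so only a whole-string match is replaced
        regex = re.compile("^(%s)$" % "|".join(map(re.escape, replacements)))
        # For each match, look up the corresponding value in the dictionary
        return regex.sub(lambda mo: replacements[mo.group(0)], text)
    except (TypeError, KeyError):
        # Non-string input (e.g. None or NaN) or an empty dictionary: return the input unchanged
        return text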
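Here is a quick sketch of how the function behaves on plain strings. The `city_fixes` dictionary and the sample inputs are made up for illustration, and the function is repeated (with the parameter named `replacements` so it does not shadow the built-in `dict`) so the snippet runs on its own.

```python
import re

def multiple_replace(text, replacements):
    """Return the correct spelling if text exactly matches a key, else text."""
    try:
        # Anchored alternation of the escaped keys: whole-string matches only
        regex = re.compile("^(%s)$" % "|".join(map(re.escape, replacements)))
        return regex.sub(lambda mo: replacements[mo.group(0)], text)
    except (TypeError, KeyError):
        # Non-string input or an empty dictionary: hand the value back untouched
        return text

# Hypothetical lookup table: misspellings as keys, preferred spelling as values
city_fixes = {
    "St. Joe": "Saint Joseph",
    "St Joseph": "Saint Joseph",
    "St. Joseph": "Saint Joseph",
}

print(multiple_replace("St. Joe", city_fixes))      # known variant, gets replaced
print(multiple_replace("Kansas City", city_fixes))  # not in the table, unchanged
print(multiple_replace("St. Joe, MO", city_fixes))  # extra text, so no whole-string match
print(multiple_replace(None, city_fixes))           # non-string input, returned as-is
```

Because the pattern is wrapped in `^` and `$`, a value is only replaced when the entire string is one of the keys. That suits a column of city names, but it will not touch a misspelling buried inside a longer string.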
So to use this on a DataFrame column, you use the apply function.
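A minimal sketch of what that might look like with pandas. The DataFrame, the `city` and `city_clean` column names, and the `city_fixes` dictionary are all assumptions for the sake of the example, and the function is repeated so the snippet is self-contained.

```python
import re
import pandas as pd

def multiple_replace(text, replacements):
    """Return the correct spelling if text exactly matches a key, else text."""
    try:
        regex = re.compile("^(%s)$" % "|".join(map(re.escape, replacements)))
        return regex.sub(lambda mo: replacements[mo.group(0)], text)
    except (TypeError, KeyError):
        # Missing values (None/NaN) are not strings, so they pass through unchanged
        return text

# Hypothetical lookup table and sample data
city_fixes = {"St. Joe": "Saint Joseph", "St Joseph": "Saint Joseph"}
df = pd.DataFrame({"city": ["St. Joe", "St Joseph", "Kansas City", None]})

# Apply the function to every value in the column
df["city_clean"] = df["city"].apply(lambda name: multiple_replace(name, city_fixes))

print(df)
```

The first two rows should come back as Saint Joseph, Kansas City is left alone, and the missing value stays missing rather than raising an error.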
Did you find this useful? Do you know of a better way? Please share your thoughts in the comments!