Python – Pandas Easy Batch Data Export

Python – Pandas Easy Batch Data Export
June 27, 2017 Reece

Python Logo

Intro

As an analyst for a day job, I have found myself using tools like Python more and more. Though the business environment I am in still leans heavily on Excel. I routinely have found myself doing a bulk of my analysis in Python but the end user needs something in Excel for them to review the results. A lot of times those outputs rely on more than one table or DataFrame. I got tired of writing the same old code for each data frame that I wanted to export so I came up with a simple solution leveraging the mighty for loop.

Please note, I am self-taught via the Google Search bar. I am not what some would call a Professional Programmer since I didn’t go to school for it. One thing I have learned is that there are many ways to accomplish something, especially in Python. If you have a better way, please share it in the comments. Be respectful of others in the comment section or your comment will be deleted. We’re just having fun.

Batch Export Data Frames To Excel

The following function works well, at least for me, to batch export data frames. It takes lists in as arguments and loops through them to export the data frames.

def TablesToExcel(dfs:list, names:list, bInclIndex:list,destString:list, _G):
    '''
    pass globals() into the _G variable when calling\n
    dfs = List of Data Frames to export\n
    names = List of Names for the tabs\n
    destString = Save to location\n
    bInclIndex = list of True|False for each Data Frame to include the Index
    '''
    writer1 = pd.ExcelWriter(destString)
    i = 0
    for df in (dfs):
        d = _G[df]
        n = names[i]
        bIndex = bInclIndex[i]
        if(len(d.index)>0): 
            print("Exporting: " + df + " | Sheet Name: " + n)
            d.to_excel(writer1, sheet_name=n, verbose=False, index_label=True, index=bIndex)
        else:
            print("Skipping Export of: " + df + " | Dataframe is empty.")
        i = i+1
    writer1.save()



It is important to note here that when passing in the list of data frames, they should be passed as strings, not the actual data frames.

# Define the variables
dataframeList = ['df1', 'df2', 'df3']
nameList = ['My Table 1', 'My Table 2', 'My Table 3']
inclIndexList = [True, False, False]
dest = r"C:\export\location\myexport.xlsx"

# Calling the function
TablesToExcel(dfs=dataframeList, names=nameList, bInclIndex=inclIndexList, destString=dest, _G=globals())</pre>
<pre>

Easy. This really comes in handy when you have to export a dozen tables to excel.

Batch Export Data Frames To SQL

The process for exporting to SQL is roughly the same. On this function, I have added the ability to toggle whether or not to export a table using a 1 or 0 instead of messing with commenting out the code.

def TablesToSQL(dfs:list, names:list, bInclIndex:list, expConfirm:list, engine, _G):
    '''
    dfs = List of Data Frame names to export \n
    names = List of Export names for SQL server \n
    bInclIndex = list of True|False for each Data Frame to include the Index \n
    engine = SQL Alchemy Engine that is setup for the connection \n
    _G = globals() \n
    expConfirm = List of 1|0 for each dataframe to confirm export (1) or skip (0)
    '''
    i = 0 
    for df in (dfs):
        d = _G[df]
        n = names[i]
        bI = bInclIndex[i]
        conf = expConfirm[i]
        if (conf == 1 and len(d.index)>0):
            print("Exporting to SQL: " + df + " | Dest Table: " + n)
            d.to_sql(n, engine, index=bI, if_exists='replace')
        else:
            print("Skipping: " + df)
        i = i+1

The above code in practice:

# Define the variables
dataframeList = ['df1', 'df2', 'df3']
nameList = ['tbl1', 'tbl2', 'tbl3']
inclIndexList = [True, False, False]
exportConfirm = [0,1,1]

# Calling the function
# ASSUMING YOU ALREADY HAVE A SQL ENGINE DEFINED AS engine
TablesToSql(dataframeList, nameList, inclIndexList, exportConfirm, engine, globals())

The above example would export ‘df2’ and ‘df3’ as ‘tbl2’ and ‘tbl3’ respectively to your SQL database, skipping the first one. Note that I didn’t cover creating the ‘engine’ to connect to the database as that is out of scope for this post.

Hope you were able to find this helpful.

Cheers,
Reece

Questions?

Comment below or contact me.

0 Comments

Leave a reply

Your email address will not be published. Required fields are marked *

*