python – pandas Combine Excel Spreadsheets

Posted by: admin, March 9, 2020

Questions:

I have an Excel workbook with many tabs.
Each tab has the same set of headers as all others.
I want to combine all of the data from each tab into one data frame (without repeating the headers for each tab).

So far, I’ve tried:

import pandas as pd
xl = pd.ExcelFile('file.xlsx')
df = xl.parse()

Can I pass something as the parse argument that means “all spreadsheets”?
Or is this the wrong approach?

Thanks in advance!

Update: I tried:

a=xl.sheet_names
b = pd.DataFrame()
for i in a:
    b.append(xl.parse(i))
b

But it’s not “working”.

Answers:

This is one way to do it — load all sheets into a dictionary of dataframes and then concatenate all the values in the dictionary into one dataframe.

import pandas as pd

# Set sheet_name to None in order to load all sheets into a dict of dataframes
# (older pandas versions spelled this argument sheetname)
df = pd.read_excel('tmp.xlsx', sheet_name=None)

# Then concatenate all dataframes, ignoring each sheet's own index
# to avoid overlapping index values later (see comment by @bunji)
cdf = pd.concat(df.values(), ignore_index=True)

print(cdf)
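
The loop in the question's update can also be made to work. DataFrame.append returned a new dataframe instead of changing b in place, so its result was thrown away on every pass (and the method was removed in pandas 2.0). A minimal sketch of that approach, assuming every sheet shares the same headers: collect the parsed sheets in a plain list and concatenate once at the end. The function name combine_workbook is made up for illustration.

```python
import pandas as pd


def combine_workbook(xl):
    """Parse every sheet of an ExcelFile-like object and stack the rows.

    `xl` is assumed to expose `sheet_names` and `parse(name)`,
    as pd.ExcelFile does.
    """
    frames = []
    for name in xl.sheet_names:
        # list.append mutates the list in place, so nothing is lost here
        frames.append(xl.parse(name))
    # One concat at the end; ignore_index renumbers rows 0..n-1
    return pd.concat(frames, ignore_index=True)


# Usage, assuming 'file.xlsx' exists and an Excel engine is installed:
# combined = combine_workbook(pd.ExcelFile('file.xlsx'))
```

Concatenating once after the loop also avoids copying the growing dataframe on every iteration.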