  • Post last modified:November 22, 2024
How to Read Excel Multiple Sheets in Pandas

How do you read multiple sheets from an Excel file in pandas? Use the pd.read_excel() function with its sheet_name parameter, which selects sheets by name or by position. When you request more than one sheet, read_excel() returns a dict of DataFrames: each key is the sheet name (or the position you passed) and each value is the DataFrame for that sheet.

Key Points –

  • pd.read_excel() is the core function for reading Excel files, and it can read several sheets in one call when you list them in sheet_name.
  • The sheet_name parameter can accept a string for a single sheet, a list of sheet names, or None to read all sheets.
  • Setting sheet_name=None reads all sheets in the Excel file into a dictionary of DataFrames.
  • You can also use sheet indices (0 for the first sheet, 1 for the second, etc.) in the sheet_name parameter.
  • You can loop over the sheet names and use read_excel() to read each sheet individually if customized processing is needed.
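
The last point can be sketched as follows. This is a minimal example, not the only way to do it: it first writes a small two-sheet workbook to a temporary folder so that it runs anywhere (the file location and sample rows are made up, and writing .xlsx files assumes an engine such as openpyxl is installed). It then lists the sheet names with pd.ExcelFile and reads each sheet separately.

```python
import os
import tempfile

import pandas as pd

# Build a small two-sheet workbook so the sketch is runnable anywhere.
# The temporary path and the sample rows are made up for illustration.
path = os.path.join(tempfile.mkdtemp(), "courses_schedule.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"Courses": ["Spark", "Pandas"], "Fee": [25000, 20000]}).to_excel(
        writer, sheet_name="Technologies", index=False)
    pd.DataFrame({"Courses": ["Spark", "Pandas"], "Days": ["MON, THU", "MON, WED"]}).to_excel(
        writer, sheet_name="Schedule", index=False)

# Open the workbook once, then read each sheet with its own processing.
frames = {}
with pd.ExcelFile(path) as xls:
    for name in xls.sheet_names:
        df = pd.read_excel(xls, sheet_name=name)
        df["Sheet"] = name  # example of a per-sheet customization
        frames[name] = df

print(list(frames))
```

Opening the file through pd.ExcelFile avoids re-opening the workbook for every sheet, which can matter for large files.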

Pandas Read Excel Multiple Sheets

The sheet_name param also takes a list of sheet names, which lets you read multiple sheets in one call. Note that when you read multiple sheets, read_excel() returns a dict of DataFrames rather than a single DataFrame. Each key in the dict is a sheet name and each value is the corresponding DataFrame.

This param accepts a str, int, list, or None. It defaults to 0, meaning the first sheet. When None is used, it reads all sheets from the Excel file.
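
To make the accepted values concrete, here is a short sketch of how the return type and the dict keys change with sheet_name. It writes its own throwaway workbook first (the path and rows are made up, and an Excel writer engine such as openpyxl is assumed to be installed).

```python
import os
import tempfile

import pandas as pd

# Throwaway workbook with two sheets; path and rows are illustrative only.
path = os.path.join(tempfile.mkdtemp(), "courses_schedule.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"Courses": ["Spark"], "Fee": [25000]}).to_excel(
        writer, sheet_name="Technologies", index=False)
    pd.DataFrame({"Courses": ["Spark"], "Days": ["MON, THU"]}).to_excel(
        writer, sheet_name="Schedule", index=False)

# Default (sheet_name=0): a single DataFrame for the first sheet.
first = pd.read_excel(path)

# List of positions: a dict keyed by the positions you passed.
by_position = pd.read_excel(path, sheet_name=[0, 1])

# None: a dict of every sheet, keyed by sheet name.
all_sheets = pd.read_excel(path, sheet_name=None)

print(type(first).__name__)
print(list(by_position))
print(list(all_sheets))
```

Note that the dict keys mirror what you pass in: positions give integer keys, names give string keys.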

Let’s see an example. I have an Excel file with two sheets named 'Technologies' and 'Schedule'.


import pandas as pd

# Read excel file with sheet name
dict_df = pd.read_excel('c:/apps/courses_schedule.xlsx', 
                   sheet_name=['Technologies','Schedule'])

Since we are reading two sheets from the Excel file, this function returns a dict of DataFrames. You can get each DataFrame from the dict as follows.


# Get each DataFrame from the dict by its sheet name
technologies_df = dict_df.get('Technologies')
schedule_df = dict_df.get('Schedule')

# Print the DataFrames
print(technologies_df)
print(schedule_df)

The 'Technologies' sheet is converted into a DataFrame as follows. Note that empty cells are converted into NaN in the DataFrame.


# Output:
  Courses    Fee Duration  Discount
0   Spark  25000  50 Days      2000
1  Pandas  20000  35 Days      1000
2    Java  15000      NaN       800
3  Python  15000  30 Days       500
4     PHP  18000  30 Days       800

And the 'Schedule' sheet is converted into a DataFrame as follows.


# Output:
  Courses      Days                 Time
0   Spark  MON, THU   7:00 AM to 9:00 AM
1  Pandas  MON, WED  8:00 AM to 10:00 AM
2    Java  WEN, FRI   7:00 PM to 9:00 PM
3  Python  TUE, THU   6:00 PM to 8:00 PM
4     PHP  WEN, THU  8:00 AM to 10:00 AM

FAQ on Read Excel Multiple Sheets in Pandas

How do I read multiple sheets from an Excel file using Pandas?

To read multiple sheets from an Excel file using Pandas, use the pd.read_excel() function and specify the sheet_name parameter. Pass a list of sheet names or positions to read specific sheets, or None to read every sheet.

How can I access a specific sheet after reading multiple sheets?

After reading multiple sheets from an Excel file using Pandas, the data is stored in a dict of DataFrames where the keys are the sheet names (or the positions, if you passed integers). To access a specific sheet, use its key to retrieve the corresponding DataFrame.

Is there a way to read all sheets into a single DataFrame?

You can concatenate all sheets into a single DataFrame using the pd.concat() function. After reading multiple sheets into a dictionary of DataFrames, you can use pd.concat() to combine them into a single DataFrame.
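
A minimal sketch of that approach is shown here. To keep it self-contained, the dict is built in memory as a stand-in for what read_excel() would return with sheet_name=None. The sheet names 'Batch1' and 'Batch2' and their rows are hypothetical. Concatenation is most useful when the sheets share the same columns, since columns missing from a sheet are filled with NaN.

```python
import pandas as pd

# Stand-in for the dict returned by pd.read_excel(..., sheet_name=None).
# Sheet names and rows are hypothetical.
dict_df = {
    "Batch1": pd.DataFrame({"Courses": ["Spark", "Pandas"], "Fee": [25000, 20000]}),
    "Batch2": pd.DataFrame({"Courses": ["Java", "Python"], "Fee": [15000, 15000]}),
}

# Passing the dict makes the keys an outer index level named "Sheet".
combined = pd.concat(dict_df, names=["Sheet", "Row"])

# Move the sheet name into a regular column and renumber the rows.
combined = combined.reset_index(level="Sheet").reset_index(drop=True)

print(combined)
```

Keeping the sheet name as a column preserves where each row came from. If you do not need it, pd.concat(dict_df.values(), ignore_index=True) is a shorter alternative.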

What if I want to skip a specific sheet while reading?

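read_excel() has no parameter that excludes a sheet. The skiprows and skipfooter parameters skip rows at the beginning or end of each sheet, not whole sheets. To skip a specific sheet, either pass sheet_name a list containing only the sheets you want, or read all sheets and remove the unwanted one from the dictionary afterwards.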
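
One way to leave a sheet out without reading it at all is sketched here: list the sheet names with pd.ExcelFile, filter out the unwanted one, and pass the rest to sheet_name. The workbook is created on the fly so the example runs as-is (the path and rows are made up, and an Excel writer engine such as openpyxl is assumed to be installed).

```python
import os
import tempfile

import pandas as pd

# Throwaway workbook with two sheets; path and rows are illustrative only.
path = os.path.join(tempfile.mkdtemp(), "courses_schedule.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"Courses": ["Spark"], "Fee": [25000]}).to_excel(
        writer, sheet_name="Technologies", index=False)
    pd.DataFrame({"Courses": ["Spark"], "Days": ["MON, THU"]}).to_excel(
        writer, sheet_name="Schedule", index=False)

skip = "Schedule"  # the sheet we do not want to load

with pd.ExcelFile(path) as xls:
    # Keep every sheet name except the one to skip.
    wanted = [name for name in xls.sheet_names if name != skip]
    dict_df = pd.read_excel(xls, sheet_name=wanted)

print(list(dict_df))
```

If the sheets are already loaded, dict_df.pop('Schedule', None) removes the entry from the dict instead.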

How do I handle missing values while reading multiple sheets?

Empty cells are read as NaN (Not a Number) automatically. To treat other values as missing while reading multiple sheets, use the na_values parameter of pd.read_excel(). It takes a list of additional values that should be converted to NaN during the reading process, and it applies to every sheet being read.
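
As a rough illustration, the sketch below writes a workbook in which the placeholder text 'TBD' marks a missing duration, then reads all sheets with na_values=['TBD']. The placeholder, the path, and the rows are assumptions made up for this example, and an Excel writer engine such as openpyxl is assumed to be installed.

```python
import os
import tempfile

import pandas as pd

# Throwaway workbook; "TBD" is a made-up placeholder for a missing value.
path = os.path.join(tempfile.mkdtemp(), "courses_schedule.xlsx")
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({
        "Courses": ["Spark", "Java", "Python"],
        "Duration": ["50 Days", "TBD", "30 Days"],
    }).to_excel(writer, sheet_name="Technologies", index=False)
    pd.DataFrame({
        "Courses": ["Spark", "Java", "Python"],
        "Days": ["MON, THU", "TBD", "TUE, THU"],
    }).to_excel(writer, sheet_name="Schedule", index=False)

# Read every sheet and treat "TBD" as a missing value in all of them.
dict_df = pd.read_excel(path, sheet_name=None, na_values=["TBD"])

for name, df in dict_df.items():
    print(name, int(df.isna().sum().sum()))
```

After reading, the usual DataFrame methods such as fillna() or dropna() can be applied to each sheet in the dict.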

Conclusion

In this article, you have learned how to read an Excel file with multiple sheets and convert each sheet to a pandas DataFrame. Since read_excel() returns a dict of DataFrames in that case, you have also learned how to get each DataFrame from the dict.

Happy Learning !!