In this post I’ll demonstrate how to open an Excel file in Python using Pandas, a (the) module for data manipulation. I love using Pandas, and I cannot recommend it enough.
Loading Data from Excel
The following code, which I found here, will create a dictionary for each sheet in the Excel file.
import pandas xl_file = pandas.ExcelFile( file_name ) dfs = { sheet: xl_file.parse( sheet ) for sheet in xl_file.sheet_names }
If you know that you only have one sheet in your Excel file, you may do the following.
import pandas xl_file = pandas.ExcelFile( file_name ) df = xl_file.parse( xl_file.sheet_names[0] )
Changing Data Types
Usually Pandas does a great job of guessing the data type of the imported data, but in case you need to manually change a column from a string to a float, or vice versa, you would use the astype()
method of the DataFrame
object.
# some serial numbers that should be strings df['API'].astype( str ) # some depths that should be floats instead of ints df['MD'].astype( float )
Update
Since I’ve written this, a new tool has come out called PyXLL that allows users to write Excel add-ins in Python, and perform other tasks. Learn more here: https://www.pyxll.com/blog/tools-for-working-with-excel-and-python/
One thought on “Open an Excel File in Pandas”
Comments are closed.