Solution to fix the “Excel file format cannot be determined, you must specify an engine manually” error in Python

When you try to read an Excel file in Python with pandas, you may receive the message “Excel file format cannot be determined, you must specify an engine manually”. It means that pandas could not work out the file’s format on its own, so you must either name the engine yourself or give it a file it can read. We’ll show you some ways to fix this error in the sections below.

What causes the “Excel file format cannot be determined, you must specify an engine manually” error?

When you open an Excel file to work with, Excel creates a temporary file next to it that looks like this: ~$employees.xlsx.

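Our project folder then contains two .xlsx files with nearly identical names. The temporary one is not a real workbook, so if Python picks it up (for example, when a script reads every .xlsx file in the folder), pandas cannot determine its format. This is the main cause of this error.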
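If a script collects workbooks from a folder, one way to keep the temporary file out of the way is to skip every name that starts with ~$. The following is only a minimal sketch under that assumption; the helper name real_workbooks is hypothetical and not part of pandas.

```python
from pathlib import Path


def real_workbooks(folder):
    """Return the .xlsx files in `folder`, skipping Excel's '~$' temporary files."""
    return sorted(
        path
        for path in Path(folder).glob("*.xlsx")
        if not path.name.startswith("~$")
    )


# Hypothetical usage: print the name of each real workbook in the current folder.
# for path in real_workbooks("."):
#     print(path.name)
```

Each path returned this way can then be passed to read_excel() without the temporary file getting in the way.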

Consider the following example to get a better understanding.

Suppose the workbook is open in Excel, so the temporary file exists, and we run the following code:

import pandas as pd

# Read an Excel file using read_excel()
print(pd.read_excel('employees.xlsx'))

If pandas cannot determine the file’s format, running the above code raises a ValueError with this message:

Excel file format cannot be determined, you must specify an engine manually.
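pandas works out the format by looking at the file’s contents rather than trusting its name. To check what a suspicious file really contains, you can inspect its first bytes yourself. The sketch below is a simplified illustration of that idea, not pandas’ actual implementation; the function name sniff_excel_format and its return values are assumptions made for this example.

```python
ZIP_SIGNATURE = b"PK\x03\x04"  # start of a ZIP archive, which is what .xlsx files are
OLE2_SIGNATURE = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # start of most legacy .xls files


def sniff_excel_format(path):
    """Return 'zip', 'ole2', or None depending on the first bytes of the file."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    if header.startswith(ZIP_SIGNATURE):
        return "zip"  # ZIP-based container, possibly .xlsx
    if header.startswith(OLE2_SIGNATURE):
        return "ole2"  # OLE2 container, possibly .xls
    return None  # neither signature: not a workbook this check recognises
```

A result of None for a file with an Excel extension suggests that it is not really a workbook, for instance a temporary ~$ file or a text file saved with the wrong extension.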

How to resolve this error?

Two ways to fix this error are saving the Excel file in a different format and manually specifying an engine. Below we will guide you to do that in detail.

Saving the Excel file in a different format

You can save an Excel worksheet as another file using the Save As command.

Navigate to File > Save As > Choose Browse.

In the Save As dialogue box, select the file format for the worksheet under Save as type, such as CSV (Comma delimited).

After saving successfully, you will have a CSV file such as employees.csv.

Instead of reading the Excel file with read_excel(), we will now read the CSV file with read_csv(), like this:

import pandas as pd

# Read a CSV file using read_csv()
print(pd.read_csv('employees.csv'))

Output:

   ID    Name  Gender
0   1   David    Male
1   2   Lucas    Male
2   3   Betty  Female
3   4  Rachel  Female

As you can see from the output, we can read the contents of the original Excel file without any errors.

Specifying the engine manually

Another way to fix this error is to specify the engine manually through the engine argument of read_excel(). This tells pandas which parser to use, so it no longer has to work out the file’s format by itself.

We can use two main packages in this case: xlrd and openpyxl. Since version 2.0, xlrd reads only the legacy .xls format, while openpyxl reads the newer .xlsx format. We recommend openpyxl for .xlsx files like the one in our example.
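As a rough sketch of how the choice between the two packages could be made from the file extension, consider the helper below. The names ENGINE_BY_SUFFIX, pick_engine and read_workbook are hypothetical, and the mapping is an assumption that covers only these two engines.

```python
from pathlib import Path

# Assumed mapping for illustration: xlrd for legacy .xls, openpyxl for .xlsx/.xlsm.
ENGINE_BY_SUFFIX = {
    ".xls": "xlrd",
    ".xlsx": "openpyxl",
    ".xlsm": "openpyxl",
}


def pick_engine(path):
    """Return the pandas engine name for `path`, based on its extension."""
    suffix = Path(path).suffix.lower()
    try:
        return ENGINE_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"No engine known for {suffix!r} files") from None


def read_workbook(path):
    """Read `path` with read_excel(), naming the engine explicitly."""
    import pandas as pd  # imported here so pick_engine() works without pandas

    return pd.read_excel(path, engine=pick_engine(path))
```

With this in place, read_workbook('employees.xlsx') would call read_excel() with engine='openpyxl', provided that package is installed.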

To install openpyxl, run the following command in your terminal:

pip install openpyxl

After installing openpyxl, pass engine='openpyxl' to read_excel() so that pandas uses it to parse the file. Let’s re-run our faulty code to make sure!

This time, we also keep the Excel file open so that the temporary file is present.

Then try executing the following code again:

import pandas as pd

# Read an Excel file using read_excel(), naming the engine explicitly
print(pd.read_excel('employees.xlsx', engine='openpyxl'))

Output:

   ID    Name  Gender
0   1   David    Male
1   2   Lucas    Male
2   3   Betty  Female
3   4  Rachel  Female

As you can see, with openpyxl installed and named as the engine, we can work with the Excel file without converting it to another format.

Summary

The leading cause of the error “Excel file format cannot be determined, you must specify an engine manually” is that Microsoft Excel automatically creates temporary files while a workbook is open. Installing openpyxl and passing engine='openpyxl' to read_excel() is the simplest and quickest way to fix it, and saving the workbook as a CSV file is an alternative. We hope this article helps you resolve the error.
