In Pandas, the equals() method is used to determine if two DataFrame objects contain the same data. It returns True if the DataFrames are exactly equal (i.e., they have the same shape, elements, and data types); otherwise, it returns False.
In this article, I will explain the Pandas DataFrame equals() method, covering its syntax, parameters, and usage, and show when it returns True (all elements in both objects are identical) and when it returns False.
Key Points –
- The equals() method checks if two DataFrame objects are equal in terms of shape, elements, and data types.
- Returns a boolean value: True if the DataFrames are equal, False otherwise.
- The method is strict in its comparison, meaning that even small differences in data types or NaN positions will result in False.
- It compares both the data and the structure of the DataFrames, including column names and indexes.
- String values and labels are compared case-sensitively, and it will return False for different data types (e.g., integers vs. floats) even if the numerical values are the same.
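The strictness described in the key points can be seen in a short sketch. The names df_int and df_float are illustrative and not part of the original examples, and the sketch assumes pandas' default dtype inference (integer lists become an integer column, float lists a float column).

```python
import pandas as pd

# Same numeric values, but one column is inferred as integer and the other as float
df_int = pd.DataFrame({"Column1": [5, 15, 20]})
df_float = pd.DataFrame({"Column1": [5.0, 15.0, 20.0]})

# Element-wise comparison sees equal values in every cell
values_match = bool((df_int == df_float).all().all())

# equals() also compares dtypes, so the int vs. float mismatch yields False
strict_match = df_int.equals(df_float)

# Casting to a common dtype makes the two DataFrames equal
cast_match = df_int.astype("float64").equals(df_float)

# Column labels are compared case-sensitively
label_match = df_int.equals(df_int.rename(columns={"Column1": "column1"}))

print(values_match, strict_match, cast_match, label_match)
# True False True False
```

If a dtype difference should not count as a mismatch, casting both DataFrames to a common dtype before calling equals() is one option.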
Pandas DataFrame equals() Introduction
Let’s look at the syntax of the DataFrame equals() method.
# Syntax of Pandas DataFrame equals()
DataFrame.equals(other)
Parameters of the DataFrame equals()
It takes a single parameter.
other – The DataFrame to be compared with the current DataFrame.
Return Value
It returns True if the two DataFrames are equal, i.e., have the same shape, elements, and data types. It returns False otherwise.
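To make the single boolean return value concrete, the sketch below contrasts equals() with the element-wise == operator. The variable names left, right, elementwise, and same are hypothetical and chosen only for illustration.

```python
import pandas as pd

left = pd.DataFrame({"Column1": [5, 15], "Column2": [4, 12]})
right = pd.DataFrame({"Column1": [5, 15], "Column2": [4, 99]})

# == compares cell by cell and returns a DataFrame of booleans
elementwise = left == right

# equals() collapses the whole comparison into a single boolean
same = left.equals(right)

# The single boolean can be used directly in a condition
if not same:
    print("DataFrames differ")
print(elementwise)
```

Because equals() gives one True/False answer, it suits conditions and assertions, while == is more helpful for seeing which cells differ.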
Usage of Pandas DataFrame equals() Method
The equals() method in Pandas is used to compare two DataFrame objects to check if they are exactly equal.
Now, let’s create two Pandas DataFrames from Python dictionaries, with the columns Column1 and Column2.
# Create DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame({'Column1':[5, 15, 20, 25], 'Column2': [4, 12, 17, 25]})
print("Create first DataFrame:\n",df)
df1 = pd.DataFrame({'Column1':[5, 15, 20, 25], 'Column2': [4, 12, 17, 25]})
print("Create Second DataFrame:\n",df1)
Running this code prints both DataFrames, which contain identical values.
To check if two pandas DataFrames are equal, you can use the equals() method.
# Check if the DataFrames are equal
df2 = df.equals(df1)
print("Do the two DataFrames contain the same elements:", df2)
In the above example, the equals() method is used to check if df and df1 are identical. The result is stored in the df2 variable and printed, showing True since the DataFrames are indeed identical.
Comparing DataFrames with Different Values Using DataFrame.equals()
When using the DataFrame.equals() method to compare DataFrames with different values, the method will return False.
# Create DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame({'Column1':[5, 15, 20, 25], 'Column2': [4, 12, 17, 25]})
df1 = pd.DataFrame({'Column1':[10, 15, 20, 25], 'Column2': [5, 10, 17, 20]})
# Check if the DataFrames are equal
df2 = df.equals(df1)
print("Do the two DataFrames contain the same elements:", df2)
# Output:
# Do the two DataFrames contain the same elements: False
In the above example, the equals() method detects the differences between the DataFrames and correctly returns False.
NaN Values in Different Positions
When working with DataFrames that contain NaN values, the equals() method in Pandas treats NaNs in the same location as equal to each other. This behavior is particularly useful when you need to compare DataFrames that might have missing data.
# Create DataFrame
import pandas as pd
import numpy as np
df = pd.DataFrame({'Column1': [5, np.nan, 10], 'Column2': [8, np.nan, 15]})
df1 = pd.DataFrame({'Column1': [5, 10, np.nan], 'Column2': [np.nan, 8, 15]})
# Check if the DataFrames are equal
df2 = df.equals(df1)
print("Do the two DataFrames contain the same elements:", df2)
# Output:
# Do the two DataFrames contain the same elements: False
In the above example, the NaN values sit in different positions in df and df1, so equals() returns False. The method treats NaNs as equal only if they are in the same locations, and it also returns False if other values in the DataFrames differ.
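For contrast, here is a minimal sketch of the case where the NaNs line up. The names df_a and df_b are illustrative; both DataFrames are built from the same values, with NaN in the same cells.

```python
import numpy as np
import pandas as pd

df_a = pd.DataFrame({"Column1": [5, np.nan, 10], "Column2": [8, np.nan, 15]})
df_b = pd.DataFrame({"Column1": [5, np.nan, 10], "Column2": [8, np.nan, 15]})

# NaNs occupy the same cells in both DataFrames, so equals() returns True
same_positions = df_a.equals(df_b)

# With ==, NaN never equals NaN, so the element-wise check fails on those cells
elementwise_all = bool((df_a == df_b).all().all())

print(same_positions, elementwise_all)
# True False
```

This difference is the main reason to prefer equals() over == followed by all() when the data may contain missing values.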
Frequently Asked Questions on Pandas DataFrame equals()
- The equals() method checks whether two DataFrames have the same shape and elements, including NaN values in the same locations.
- The equals() method treats NaN values as equal if they are in the same locations in both DataFrames.
- If the DataFrames have different shapes (i.e., different numbers of rows or columns), the equals() method will return False.
- The equals() method is specifically designed to compare DataFrames with other DataFrames. It will return False if you compare a DataFrame with a Series, list, or other data structures.
- The equals() method does not have an option to ignore the order of columns. You would need to reorder the columns of one DataFrame to match the other before using equals().
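The points above about column order, shape, and non-DataFrame inputs can be sketched as follows. The names df_swapped and df_longer are hypothetical, and selecting columns with df_swapped[df.columns] is just one way to reorder them.

```python
import pandas as pd

df = pd.DataFrame({"Column1": [5, 15], "Column2": [4, 12]})

# Same data, but the columns are in a different order
df_swapped = pd.DataFrame({"Column2": [4, 12], "Column1": [5, 15]})
order_sensitive = df.equals(df_swapped)

# Reorder the columns to match df before comparing
after_reorder = df.equals(df_swapped[df.columns])

# Different shape: one extra row
df_longer = pd.DataFrame({"Column1": [5, 15, 20], "Column2": [4, 12, 17]})
shape_mismatch = df.equals(df_longer)

# Comparing against a Series or a list returns False rather than raising
vs_series = df.equals(df["Column1"])
vs_list = df.equals([5, 15])

print(order_sensitive, after_reorder, shape_mismatch, vs_series, vs_list)
# False True False False False
```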
Conclusion
In this article, you have learned about the Pandas DataFrame equals() method, including its syntax, parameters, and usage, and how it returns True if the DataFrames are equal and False otherwise.
Happy Learning!!