  • Post last modified:July 29, 2024
Pandas DataFrame equals() Method

In Pandas, the equals() method is used to determine if two DataFrame objects contain the same data. It returns True if the DataFrames are exactly equal (i.e., they have the same shape, elements, and data types), otherwise it returns False.


In this article, I will explain the Pandas DataFrame equals() method, covering its syntax, parameters, and usage, and show when it returns True (all elements in both objects are identical) and when it returns False.

Key Points –

  • The equals() method checks if two DataFrame objects are equal in terms of shape, elements, and data types.
  • Returns a boolean value, True if the DataFrames are equal, False otherwise.
  • The method is strict in its comparison, meaning that even small differences in data types or NaN positions will result in False.
  • It compares both the data and the structure of the DataFrames, including column names and indexes.
  • String values and column labels are compared case-sensitively, and columns with different data types (e.g., integers vs. floats) are treated as unequal even if the numerical values are the same.
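
The strictness described in the key points can be sketched with a few small comparisons. The DataFrames and variable names below (ints, floats, lower, shifted) are made up purely for illustration.

```python
import pandas as pd

# Same numeric values, but an integer column vs. a float column
ints = pd.DataFrame({'A': [1, 2, 3]})
floats = pd.DataFrame({'A': [1.0, 2.0, 3.0]})
dtype_result = ints.equals(floats)
print(dtype_result)                    # False: the column dtypes differ
print((ints == floats).all().all())    # True: the values match element-wise

# Column labels are compared case-sensitively ('A' vs. 'a')
lower = pd.DataFrame({'a': [1, 2, 3]})
label_result = ints.equals(lower)
print(label_result)                    # False

# Same values, different index labels
shifted = pd.DataFrame({'A': [1, 2, 3]}, index=[10, 11, 12])
index_result = ints.equals(shifted)
print(index_result)                    # False
```

In this sketch, only the element-wise == comparison ignores the dtype difference; equals() reports False in all three cases.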

Pandas DataFrame equals() Introduction

Let’s look at the syntax of the DataFrame equals() method.


# Syntax of Pandas DataFrame equals()
DataFrame.equals(other)

Parameters of the DataFrame equals()

It accepts a single parameter.

  • other – The DataFrame to be compared with the current DataFrame.

Return Value

It returns True if the two DataFrames are equal, i.e., they have the same shape, elements, and column data types. Otherwise, it returns False.

Usage of Pandas DataFrame equals() Method

The equals() method in Pandas is used to compare two DataFrame objects to check if they are exactly equal.

Now, let’s create two Pandas DataFrames from Python dictionaries, each with the columns Column1 and Column2.


# Create DataFrame 
import pandas as pd
import numpy as np

df = pd.DataFrame({'Column1':[5, 15, 20, 25], 'Column2': [4, 12, 17, 25]})
print("Create first DataFrame:\n",df)

df1 = pd.DataFrame({'Column1':[5, 15, 20, 25], 'Column2': [4, 12, 17, 25]})
print("Create Second DataFrame:\n",df1)

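# Output:
# Create first DataFrame:
#     Column1  Column2
# 0        5        4
# 1       15       12
# 2       20       17
# 3       25       25
# Create Second DataFrame:
#     Column1  Column2
# 0        5        4
# 1       15       12
# 2       20       17
# 3       25       25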

To check if two pandas DataFrames are equal, you can use the equals() method.


# Check if the DataFrames are equal
df2 = df.equals(df1)
print("Do the two DataFrames contain the same elements:", df2)

In the above example, the equals() method is used to check if df and df1 are identical. The result is stored in the df2 variable and printed, showing True since the DataFrames are indeed identical.


Comparing DataFrames with Different Values Using DataFrame.equals()

When using the DataFrame.equals() method to compare DataFrames with different values, the method will return False.


# Create DataFrame 
import pandas as pd
import numpy as np

df = pd.DataFrame({'Column1':[5, 15, 20, 25], 'Column2': [4, 12, 17, 25]})
df1 = pd.DataFrame({'Column1':[10, 15, 20, 25], 'Column2': [5, 10, 17, 20]})

# Checking if the DataFrames are equal
df2 = df.equals(df1)
print("Do the two DataFrames contain the same elements:", df2)

# Output:
# Do the two DataFrames contain the same elements: False

In the above example, the equals() method detects the differing values in the two DataFrames and returns False.
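
Since equals() only reports that the DataFrames differ, not where, a follow-up check can help locate the differences. The sketch below goes slightly beyond equals() itself and uses an element-wise != comparison on the same data; the names diff_mask and diff_counts are illustrative, and the approach assumes both DataFrames share the same index and column labels and contain no NaN values.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [5, 15, 20, 25], 'Column2': [4, 12, 17, 25]})
df1 = pd.DataFrame({'Column1': [10, 15, 20, 25], 'Column2': [5, 10, 17, 20]})

# equals() gives a single yes/no answer
print(df.equals(df1))      # False

# Element-wise mask: True wherever the two DataFrames disagree
diff_mask = df != df1
print(diff_mask)

# Number of differing cells in each column
diff_counts = diff_mask.sum()
print(diff_counts)
```

Here Column1 differs in one row and Column2 in three rows.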

NaN Values in Different Positions

When working with DataFrames that contain NaN values, the equals() method in Pandas treats NaNs in the same location as equal to each other. This behavior is particularly useful when you need to compare DataFrames that might have missing data.


# Create DataFrame 
import pandas as pd
import numpy as np

df = pd.DataFrame({'Column1':[5, np.nan, 10], 'Column2': [8, np.nan, 15]})
df1 = pd.DataFrame({'Column1':[5, 10, np.nan], 'Column2': [np.nan, 8, 15]})

# Checking if the DataFrames are equal
df2 = df.equals(df1)
print("Do the two DataFrames contain the same elements:", df2)

# Output:
# Do the two DataFrames contain the same elements: False

In the above example, the NaN values sit in different positions in df and df1, so equals() returns False. The method treats NaNs as equal only if they are in the same locations, and it also returns False if other values in the DataFrames differ.
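
For contrast, here is a minimal sketch of the case where the NaNs line up. The second DataFrame, df_same, is an illustrative name and simply repeats the data of df.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'Column1': [5, np.nan, 10], 'Column2': [8, np.nan, 15]})
df_same = pd.DataFrame({'Column1': [5, np.nan, 10], 'Column2': [8, np.nan, 15]})

# NaNs in the same locations are treated as equal by equals()
same_nan_result = df.equals(df_same)
print(same_nan_result)       # True

# An element-wise == comparison does not treat NaN as equal to NaN
elementwise_result = (df == df_same).all().all()
print(elementwise_result)    # False
```

This is the main practical difference between equals() and an element-wise == check when missing data is present.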

Frequently Asked Questions on Pandas DataFrame equals()

What does the equals() method do in Pandas?

The equals() method checks whether two DataFrames have the same shape and elements, including NaN values in the same locations.

Does the equals() method consider NaN values equal?

The equals() method treats NaN values as equal if they are in the same locations in both DataFrames.

Can the equals() method compare DataFrames with different shapes?

If the DataFrames have different shapes (i.e., different numbers of rows or columns), the equals() method will return False.
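
A short sketch of the shape check, using made-up subsets (fewer_rows, fewer_cols) of a small DataFrame:

```python
import pandas as pd

df = pd.DataFrame({'Column1': [5, 15, 20], 'Column2': [4, 12, 17]})

# Subsets with a different number of rows or columns
fewer_rows = df.iloc[:2]
fewer_cols = df[['Column1']]
print(df.shape, fewer_rows.shape, fewer_cols.shape)   # (3, 2) (2, 2) (3, 1)

rows_result = df.equals(fewer_rows)
cols_result = df.equals(fewer_cols)
print(rows_result, cols_result)                       # False False
```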

Can I use equals() to compare a DataFrame with other data structures like Series or lists?

The equals() method is specifically designed to compare DataFrames with other DataFrames. It will return False if you compare a DataFrame with a Series, list, or other data structures.
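
As a rough illustration, the snippet below compares a DataFrame with a Series and a list that hold the same values. The names series and values are illustrative; converting the Series back to a DataFrame with to_frame() is one possible way to compare like with like.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [5, 15, 20]})
series = df['Column1']     # a Series with the same values
values = [5, 15, 20]       # a plain Python list

series_result = df.equals(series)
list_result = df.equals(values)
print(series_result, list_result)    # False False

# Converting the Series to a DataFrame first makes the comparison meaningful
frame_result = df.equals(series.to_frame())
print(frame_result)                  # True
```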

Is there a way to ignore the order of columns when comparing DataFrames?

The equals() method does not have an option to ignore the order of columns. You would need to reorder the columns of one DataFrame to match the other before using equals().
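
One way to do that reordering is sketched below. It assumes both DataFrames contain exactly the same set of column labels (otherwise the column selection raises a KeyError); the variable name aligned is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Column1': [5, 15, 20], 'Column2': [4, 12, 17]})
df1 = pd.DataFrame({'Column2': [4, 12, 17], 'Column1': [5, 15, 20]})

# Same data, different column order
unordered_result = df.equals(df1)
print(unordered_result)    # False

# Reorder df1's columns to match df before comparing
aligned = df1[df.columns]
aligned_result = df.equals(aligned)
print(aligned_result)      # True
```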

Conclusion

In this article, you have learned about the Pandas DataFrame equals() method, including its syntax, parameters, and usage, and how it returns True if the DataFrames are equal and False otherwise.

Happy Learning!!
