Pandas – Drop Infinite Values From DataFrame
Post last modified: March 27, 2024

You can remove infinite values from the rows and columns of a pandas DataFrame by combining the replace() and dropna() methods. NumPy represents positive infinity as np.inf and negative infinity as -np.inf; np is the conventional alias created by the statement import numpy as np.

In this article, I will explain how to drop infinite values from a pandas DataFrame. You can either replace the infinite values with NaN and then remove the NaN rows from the DataFrame, or use pd.set_option('mode.use_inf_as_na', True) so that pandas treats all infinite values as NaN.

1. Create a Pandas DataFrame With Sample Data

Let’s create a DataFrame with a few rows and columns, run some examples, and validate the results. The DataFrame has the columns Courses, Fee, Duration, and Discount, and each column contains infinite values.


# Create DataFrame
import pandas as pd
import numpy as np
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "pandas", np.inf, "Python", -np.inf],
    'Fee': [22000, 25000, 23000, np.inf, 26000, 25000, -np.inf, 24000],
    'Duration': ['30days', '50days', '55days', '40days', '60days', -np.inf, '55days', np.inf],
    'Discount': [1000, 2300, 1200, np.inf, 2500, -np.inf, 2000, 1500]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n",df)

This yields an eight-row DataFrame in which rows 3, 5, 6, and 7 each contain at least one infinite value.
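Before dropping anything, it can help to see where the infinite values are. The following is a minimal sketch, assuming a small hypothetical frame named sample with numeric columns only (it is not the article's DataFrame); it uses DataFrame.isin() to count infinite values per column and to flag the rows that contain them.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (hypothetical data, numeric columns only)
sample = pd.DataFrame({
    "Fee": [22000.0, np.inf, 23000.0, -np.inf],
    "Discount": [1000.0, 1200.0, np.inf, 1500.0],
})

# isin() marks cells equal to +inf or -inf; sum() counts them per column
inf_counts = sample.isin([np.inf, -np.inf]).sum()
print(inf_counts)

# Boolean Series flagging rows that contain at least one infinite value
rows_with_inf = sample.isin([np.inf, -np.inf]).any(axis=1)
print(rows_with_inf.tolist())
```

Here the Fee column holds two infinite values and Discount holds one, so three of the four rows are flagged.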

2. Pandas Drop Infinite Values

Use the df.replace() function to replace infinite values with NaN, and then use the pandas.DataFrame.dropna() method to remove the rows that contain NaN (including None) values. Together, these two steps drop the infinite values from the pandas DataFrame. The inplace=True parameter modifies the original DataFrame instead of returning a new one.


# Replace infinite values with NaN
df.replace([np.inf, -np.inf], np.nan, inplace=True)

# Drop rows with NaN
df.dropna(inplace=True)
print("DataFrame without infinite values:\n",df)

df.replace([np.inf, -np.inf], np.nan, inplace=True) replaces all np.inf and -np.inf values with NaN in the current DataFrame, and dropna() then removes the affected rows, leaving only rows 0, 1, 2, and 4.
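If you prefer not to modify the DataFrame in place, the same two steps can be wrapped in a small helper. The sketch below is only an illustration: drop_infinite_rows, its columns parameter, and the sample data are hypothetical names introduced here, not part of pandas. It relies on the subset parameter of dropna() to limit the check to chosen columns.

```python
import numpy as np
import pandas as pd

def drop_infinite_rows(frame, columns=None):
    """Return a new frame without rows holding inf/-inf (or NaN)
    in `columns`; all columns are checked when `columns` is None."""
    # replace() without inplace=True returns a copy, so `frame` is untouched
    cleaned = frame.replace([np.inf, -np.inf], np.nan)
    return cleaned.dropna(subset=columns)

sample = pd.DataFrame({
    "Fee": [22000.0, np.inf, 23000.0],
    "Discount": [1000.0, 1200.0, -np.inf],
})

all_checked = drop_infinite_rows(sample)           # keeps row 0 only
fee_checked = drop_infinite_rows(sample, ["Fee"])  # keeps rows 0 and 2
print(all_checked)
print(fee_checked)
```

Note that when only Fee is checked, row 2 survives but its Discount value is now NaN, because replace() runs on the whole frame before dropna() looks at the chosen columns.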

3. Using pandas.option_context() to Consider Infinite as NaN

You can use with pd.option_context('mode.use_inf_as_na', True): to treat all inf values as NaN within a block of code. In Python, the with statement defines the scope in which the option applies. If you want to treat all inf values as NaN for the entire program, use pd.set_option('mode.use_inf_as_na', True) instead.

In this example, option_context acts as a context manager that temporarily sets the 'mode.use_inf_as_na' option to True. The code inside the with block runs with this setting, and the previous value is restored when the block exits.

Note: In older pandas versions, the option was named use_inf_as_null. In pandas 2.1 and later, mode.use_inf_as_na is deprecated and emits a FutureWarning, so prefer the replace() and dropna() approach on recent versions.


# Using option_context 
# To consider infinite as NaN temporarily
with pd.option_context('mode.use_inf_as_na', True):
    df = df.dropna()
print("DataFrame without infinite values:\n",df)

# Alternatively, drop the rows with NaN or infinite values in place
with pd.option_context('mode.use_inf_as_na', True):
    df.dropna(inplace=True)
print(df)

Yields the same output as above.
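Because the availability of 'mode.use_inf_as_na' depends on the pandas version, code that has to run on several versions may want a fallback. The following is a rough sketch under that assumption: drop_inf_rows and sample are hypothetical names, and the function falls back to replace() plus dropna() when the option lookup fails with a KeyError (the pandas option error is a KeyError subclass).

```python
import warnings

import numpy as np
import pandas as pd

def drop_inf_rows(frame):
    """Drop rows with NaN or infinite values, with or without the option."""
    try:
        with warnings.catch_warnings():
            # Silence the deprecation warning raised on pandas 2.1+
            warnings.simplefilter("ignore", FutureWarning)
            with pd.option_context("mode.use_inf_as_na", True):
                return frame.dropna()
    except KeyError:
        # Option not registered in this pandas version: convert inf first
        return frame.replace([np.inf, -np.inf], np.nan).dropna()

sample = pd.DataFrame({
    "Fee": [22000.0, np.inf, 23000.0],
    "Discount": [1000.0, 1200.0, -np.inf],
})
result = drop_inf_rows(sample)
print(result)
```

Both branches should return the same rows, so the caller does not need to know which pandas version is installed.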

4. Using pandas replace() & dropna() To Drop Infinite Values

Use df.replace() to replace all infinite values with np.nan, and then chain pd.DataFrame.dropna(axis=0) to drop the affected rows. This drops all infinite values from the pandas DataFrame in a single expression.

In the example below, replace() first converts infinite values to NaN, and dropna() then removes the rows containing NaN. The resulting DataFrame (df) has no rows with infinite values.


# Replace infinite values with NaN, then drop the affected rows
df = df.replace([np.inf, -np.inf], np.nan).dropna(axis=0)
print("DataFrame without infinite values:\n",df)

Yields the same output as above.
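The axis parameter of dropna() decides whether rows or columns are removed. As a minimal sketch, assuming a small hypothetical frame named sample, the following contrasts axis=0 (drop rows) with axis=1 (drop columns) after infinite values have been replaced with NaN.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "Fee": [22000.0, 25000.0, 23000.0],
    "Discount": [1000.0, np.inf, 1200.0],
    "Tax": [100.0, 200.0, -np.inf],
})

# Convert infinite values to NaN once, then drop along either axis
cleaned = sample.replace([np.inf, -np.inf], np.nan)

rows_kept = cleaned.dropna(axis=0)  # removes rows 1 and 2
cols_kept = cleaned.dropna(axis=1)  # removes the Discount and Tax columns
print(rows_kept)
print(cols_kept)
```

Dropping columns keeps every row but discards any column that held an infinite value, which is usually only appropriate when a whole column is unreliable.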

5. Pandas Changing Option to Consider Infinite as NaN

You can use pd.set_option() to enable the pandas option that treats infinite values as NaN. Unlike option_context(), this setting applies to the entire pandas module for the rest of the program. Then use the pandas.DataFrame.dropna() method to drop the rows with infinite values.


# Changing option to consider infinite as nan
pd.set_option('mode.use_inf_as_na', True)
df.dropna(inplace=True)
print("DataFrame without infinite values:\n",df)

Yields the same output as above.

6. Using DataFrame.isin() to Create Filter

Use df.isin() to build a boolean filter that marks NaN and infinite cells, and then apply df = df[~df_filter] to mask those cells with NaN. Calling dropna() afterwards removes the rows that contain masked cells. The second version below chains the filtering step (df[~df_filter]) with .dropna(), making the code more concise while achieving the same result.


# Using DataFrame.isin() to Create Filter
df_filter = df.isin([np.nan, np.inf, -np.inf])

# Mask df with the filter
df = df[~df_filter]
df.dropna(inplace=True)
print("DataFrame without infinite values:\n",df)

# Alternatively, mask df with the filter and drop NaN values in one expression
df = df[~df_filter].dropna()
print("DataFrame without infinite values:\n", df)

Yields the same output as above.
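The filter above also removes rows that already contained NaN. If you want to drop only the rows with infinite values and keep genuine missing values, one option is to reduce the isin() result to a row mask. This is a sketch with a hypothetical frame named sample, not the article's original approach.

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "Fee": [22000.0, np.inf, np.nan, 24000.0],
    "Discount": [1000.0, 1200.0, 1500.0, -np.inf],
})

# True for rows where any cell is +inf or -inf (NaN is not matched)
has_inf = sample.isin([np.inf, -np.inf]).any(axis=1)

# Select the remaining rows; row 2 keeps its NaN value
only_inf_dropped = sample[~has_inf]
print(only_inf_dropped)
```

Because whole rows are selected with a boolean Series, no cells are masked, so the surviving values are left exactly as they were.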

7. Select Non-Null Rows Using DataFrame.replace()

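You can use df[df.replace([np.inf,-np.inf],np.nan).notnull().all(axis=1)] to replace positive and negative infinity with NaN and then select only the non-null rows. Here, all(axis=1) evaluates each row across its columns, so a row is kept only when every column in it is non-null.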


# Using replace method to select non-null rows
df = df[df.replace([np.inf, -np.inf], np.nan).notnull().all(axis=1)] 
print("DataFrame without infinite values:\n", df)

Yields the same output as above.
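A related way to select the clean rows is np.isfinite(), which is False for both infinite values and NaN. Note that this swaps in a different technique from the replace() call above, and it only works on numeric data, so the sketch below first narrows a hypothetical frame named sample to its numeric columns with select_dtypes().

```python
import numpy as np
import pandas as pd

sample = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [22000.0, np.inf, 23000.0],
    "Discount": [1000.0, 1200.0, -np.inf],
})

# np.isfinite() cannot handle strings, so check numeric columns only
numeric = sample.select_dtypes(include="number")

# True for rows whose numeric values are all finite (not inf, -inf, or NaN)
finite_rows = np.isfinite(numeric).all(axis=1)

result = sample[finite_rows]
print(result)
```

Only the first row has finite values in both Fee and Discount, so it is the only row kept; the Courses column is carried along unchanged.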

8. Complete Example of pandas Drop Infinite Values


import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas",np.inf,"Python",-np.inf],
    'Fee' :[22000,25000,23000,np.inf,26000,25000,-np.inf,24000],
    'Duration':['30days','50days','55days', '40days','60days',-np.inf,'55days',np.inf],
    'Discount':[1000,2300,1200,np.inf,2500,-np.inf,2000,1500]
                }
df = pd.DataFrame(technologies)
print(df)

# Display the data with infinite values replaced by NaN (df itself is unchanged)
print(df.replace([np.inf, -np.inf], np.nan))

# Replace infinite values with NaN in place
df.replace([np.inf, -np.inf], np.nan, inplace=True)
# Drop rows with NaN
df.dropna(inplace=True)
print(df)

# Use option_context to treat infinite as NaN temporarily
with pd.option_context('mode.use_inf_as_na', True):
    # Drop the rows with NaN or infinite values
    df.dropna(inplace=True)
print(df)

# Replace infinite values with NaN, then drop the affected rows
df = df.replace([np.inf, -np.inf], np.nan).dropna(axis=0)
print(df)

# Changing option to consider infinite as nan
pd.set_option('mode.use_inf_as_na', True)
df.dropna(inplace=True)
print(df)

# Using DataFrame.isin() to Create Filter
df_filter = df.isin([np.nan, np.inf, -np.inf])
# Mask df with the filter
df = df[~df_filter]
df.dropna(inplace=True)
print(df)

# Using replace method to select non-null rows
df = df[df.replace([np.inf, -np.inf], np.nan).notnull().all(axis=1)] 
print(df)

Frequently Asked Questions on Drop Infinite Values From DataFrame

How can I drop rows with infinite values from a pandas DataFrame?

Replace np.inf and -np.inf with NaN using replace(), and then call dropna() to remove the rows that contain NaN, for example df.replace([np.inf, -np.inf], np.nan).dropna().

Can I use pd.option_context() to treat infinite values as NaN temporarily?

Yes. Wrapping code in with pd.option_context('mode.use_inf_as_na', True): makes pandas treat infinite values as NaN inside the with block only; the previous setting is restored when the block exits.

Can I use DataFrame.isin() to create a filter for infinite values?

Yes. df.isin([np.inf, -np.inf]) returns a boolean DataFrame that is True wherever a cell holds an infinite value.

How can I use DataFrame.isin() to create a filter and drop rows with infinite values?

Build the filter with df.isin([np.nan, np.inf, -np.inf]), mask the matching cells with df[~df_filter], and then call dropna() to remove the rows that contain masked cells.

Conclusion

In this article, you have learned how to drop infinite values from a pandas DataFrame using the DataFrame.replace(), DataFrame.dropna(), and DataFrame.isin() methods. You have also learned how to replace all infinite values with NaN before dropping them.

Happy Learning !!


Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen @ LinkedIn and Medium
