You can remove infinite values from the rows and columns of a pandas DataFrame by using the replace() and dropna() methods. NumPy represents positive infinity as np.inf and negative infinity as -np.inf; the name np becomes available with the statement import numpy as np.
In this article, I will explain how to drop/remove infinite values from a pandas DataFrame. To remove infinite values, you can either replace them with NaN first and then remove the NaN entries from the DataFrame, or use pd.set_option('mode.use_inf_as_na', True) to treat all infinite values as NaN.
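Before removing anything, it can help to see where the infinite values are. The following is a minimal sketch, assuming a small all-numeric DataFrame named sample (hypothetical data, not the DataFrame used in the rest of this article). It uses np.isinf() to flag and count infinite values; np.isinf() only works on numeric data, so columns that contain strings would need to be excluded first.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for this illustration
sample = pd.DataFrame({
    "Fee": [22000.0, np.inf, 23000.0, -np.inf],
    "Discount": [1000.0, 2300.0, np.inf, 1500.0],
})

# np.isinf returns a boolean DataFrame with the same shape as the input
inf_mask = np.isinf(sample)

# Number of infinite values in each column
inf_counts = inf_mask.sum()
print(inf_counts)

# Rows that contain at least one infinite value
rows_with_inf = sample[inf_mask.any(axis=1)]
print(rows_with_inf)
```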
1. Create a Pandas DataFrame With Sample Data
Let’s create a DataFrame with a few rows and columns, run some examples, and validate the results. Our DataFrame contains the columns Courses, Fee, Duration, and Discount, each of which includes infinite values.
# Create DataFrame
import pandas as pd
import numpy as np
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "pandas", np.inf, "Python", -np.inf],
    'Fee': [22000, 25000, 23000, np.inf, 26000, 25000, -np.inf, 24000],
    'Duration': ['30day', '50days', '55days', '40days', '60days', -np.inf, '55days', np.inf],
    'Discount': [1000, 2300, 1200, np.inf, 2500, -np.inf, 2000, 1500]
}
df = pd.DataFrame(technologies)
print(df)
This yields the output below.
# Output:
   Courses      Fee Duration  Discount
0    Spark  22000.0    30day    1000.0
1  PySpark  25000.0   50days    2300.0
2   Hadoop  23000.0   55days    1200.0
3   Python      inf   40days       inf
4   pandas  26000.0   60days    2500.0
5      inf  25000.0     -inf      -inf
6   Python     -inf   55days    2000.0
7     -inf  24000.0      inf    1500.0
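Notice that Fee and Discount print with a decimal point even though they were entered as integers. Here is a minimal sketch of why, assuming a reduced two-column DataFrame named sample (illustrative only): np.inf is a float, so a numeric column that contains it is stored as float64, while a column that mixes strings with np.inf is stored with the generic object dtype.

```python
import numpy as np
import pandas as pd

# Illustrative data: a reduced version of the DataFrame above
sample = pd.DataFrame({
    "Courses": ["Spark", np.inf, "Hadoop"],   # strings mixed with a float
    "Fee": [22000, np.inf, 23000],            # integers mixed with a float
})

# Inspect the dtype pandas inferred for each column
print(sample.dtypes)

# Integers mixed with np.inf are upcast to float64
fee_is_float = sample["Fee"].dtype == np.float64
# Strings mixed with np.inf fall back to the object dtype
courses_is_object = sample["Courses"].dtype == object
print(fee_is_float, courses_is_object)
```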
2. pandas Drop Infinite Values
Use df.replace() to replace the infinite values with NaN, and then use the pandas.DataFrame.dropna() method to remove the rows that contain NaN or None values. The net effect is that rows with infinite values are dropped from the DataFrame. Passing inplace=True updates the existing DataFrame instead of returning a new one.
# Replace infinite values with NaN
df.replace([np.inf, -np.inf], np.nan, inplace=True)
# Drop rows with NaN
df.dropna(inplace=True)
print(df)
This yields the output below. df.replace([np.inf, -np.inf], np.nan, inplace=True) replaces all np.inf and -np.inf values with NaN in the current DataFrame.
# Output:
   Courses      Fee Duration  Discount
0    Spark  22000.0    30day    1000.0
1  PySpark  25000.0   50days    2300.0
2   Hadoop  23000.0   55days    1200.0
4   pandas  26000.0   60days    2500.0
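If only some columns matter, dropna() accepts a subset argument so that rows are dropped only when those columns are missing. The following is a minimal sketch, assuming illustrative data in a DataFrame named sample and assuming that Fee is the only column you care about. It also uses replace() without inplace=True, so the original DataFrame is left untouched.

```python
import numpy as np
import pandas as pd

# Illustrative data, not the article's full DataFrame
sample = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],
    "Fee": [22000.0, np.inf, 23000.0, 24000.0],
    "Discount": [1000.0, 2300.0, -np.inf, 1500.0],
})

# Convert infinities to NaN; this returns a new DataFrame
cleaned = sample.replace([np.inf, -np.inf], np.nan)

# Drop rows only when "Fee" is missing; a row whose "Discount"
# was infinite is kept, with NaN in that column
fee_only = cleaned.dropna(subset=["Fee"])
print(fee_only)
```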
3. Using pandas.option_context() to Consider Infinite as NaN
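You can use with pd.option_context('mode.use_inf_as_na', True): to treat all inf values as NaN within a block of code. In Python, the with statement defines the scope in which the option applies, and the previous setting is restored when the block exits. If you want to treat all inf values as NaN throughout the program, use pd.set_option('mode.use_inf_as_na', True) instead.
Note: In older pandas versions the option is named use_inf_as_null rather than use_inf_as_na. The use_inf_as_na option itself is deprecated as of pandas 2.1, where the recommendation is to convert inf entries to NaN beforehand.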
# Change the option context to treat infinite values as NaN
with pd.option_context('mode.use_inf_as_na', True):
    # Drop the rows with NaN or infinite values
    df.dropna(inplace=True)
print(df)
This yields the same output as above.
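Because this approach depends on a pandas setting that newer releases deprecate (as of pandas 2.1), you may prefer a version that does not touch any option. The following is a minimal sketch, assuming a hypothetical helper named drop_inf_rows (it is not part of pandas) and illustrative sample data:

```python
import numpy as np
import pandas as pd

def drop_inf_rows(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a new DataFrame without rows that contain NaN or +/-inf.

    Hypothetical helper for illustration; it is not a pandas API.
    """
    return frame.replace([np.inf, -np.inf], np.nan).dropna()

# Illustrative data with inf, -inf, and NaN in different rows
sample = pd.DataFrame({
    "Fee": [22000.0, np.inf, 23000.0, 24000.0],
    "Discount": [1000.0, 2300.0, np.nan, -np.inf],
})

result = drop_inf_rows(sample)
print(result)
```

The helper returns a new DataFrame and leaves its input unchanged, which matches the behavior of dropna() without inplace=True.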
4. Using pandas replace() & dropna() To Drop Infinite Values
Use df.replace() to replace all infinite values with np.nan, and then chain pd.DataFrame.dropna(axis=0) to drop the affected rows. This removes every row that contains an infinite value from the DataFrame.
# Replace infinite values with NaN, then drop the affected rows
df = df.replace([np.inf, -np.inf], np.nan).dropna(axis=0)
print(df)
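The same chain can drop columns instead of rows by passing axis=1 to dropna(). The following is a minimal sketch, assuming an illustrative all-numeric DataFrame named sample (the Tax column is made up for this example) in which only Discount holds an infinite value:

```python
import numpy as np
import pandas as pd

# Illustrative data: only "Discount" contains an infinite value
sample = pd.DataFrame({
    "Fee": [22000.0, 25000.0, 23000.0],
    "Discount": [1000.0, np.inf, 1200.0],
    "Tax": [100.0, 200.0, 300.0],
})

# axis=1 drops every column that contains at least one NaN
no_inf_columns = sample.replace([np.inf, -np.inf], np.nan).dropna(axis=1)
print(no_inf_columns)
```

All rows are kept here; only the Discount column is removed.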
5. Changing a pandas Option to Consider Infinite as NaN
You can also use pd.set_option() to turn on the pandas option that treats infinite values as NaN. Unlike option_context(), this setting applies to the entire pandas module for the rest of the session. Then use the pandas.DataFrame.dropna() method to drop the rows with infinite values.
# Change the option to treat infinite values as NaN
pd.set_option('mode.use_inf_as_na', True)
df.dropna(inplace=True)
print(df)
This yields the same output as above.
6. Using DataFrame.isin() to Create Filter
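Use DataFrame.isin() to build a boolean filter that flags NaN and infinite values, then apply df = df[~df_filter] to mask the flagged values as NaN so that dropna() can remove the affected rows.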
# Using DataFrame.isin() to Create Filter
df_filter = df.isin([np.nan, np.inf, -np.inf])
# Mask df with the filter
df = df[~df_filter]
df.dropna(inplace=True)
print(df)
This yields the same output as above.
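For numeric columns, a similar row filter can be built with np.isfinite(), which is False for NaN as well as for positive and negative infinity. This is a sketch of an alternative technique rather than the isin() approach above, and it assumes illustrative data in a DataFrame named sample. Non-numeric columns are excluded from the check with select_dtypes().

```python
import numpy as np
import pandas as pd

# Illustrative data: one string column and two numeric columns
sample = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],
    "Fee": [22000.0, np.inf, 23000.0, 24000.0],
    "Discount": [1000.0, 2300.0, np.nan, 1500.0],
})

# np.isfinite only accepts numeric data, so check numeric columns only
numeric = sample.select_dtypes(include="number")

# True for rows in which every numeric value is finite (not NaN, not +/-inf)
finite_rows = np.isfinite(numeric).all(axis=1)

# Keep the full rows, including the string column
filtered = sample[finite_rows]
print(filtered)
```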
7. Select Non-Null Rows Using DataFrame.replace()
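You can use df[df.replace([np.inf, -np.inf], np.nan).notnull().all(axis=1)] to replace positive and negative infinity with NaN and then select only the non-null rows. Here axis=1 makes all() evaluate each row across its columns, so a row is kept only when every one of its values is non-null.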
# Using replace method to select non-null rows
df = df[df.replace([np.inf, -np.inf], np.nan).notnull().all(axis=1)]
print(df)
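To see what this expression does step by step, the sketch below splits it into the boolean mask and the final selection. It assumes illustrative data in a DataFrame named sample; note that replace() returns a new object here, so sample itself still contains its infinite values.

```python
import numpy as np
import pandas as pd

# Illustrative data
sample = pd.DataFrame({
    "Fee": [22000.0, np.inf, 23000.0],
    "Discount": [1000.0, 2300.0, -np.inf],
})

# Step 1: a copy in which infinities are NaN
replaced = sample.replace([np.inf, -np.inf], np.nan)

# Step 2: True for rows in which every column is non-null
mask = replaced.notnull().all(axis=1)
print(mask.tolist())  # [True, False, False]

# Step 3: select the matching rows from the original DataFrame
selected = sample[mask]
print(selected)
```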
8. Complete Example of pandas Drop Infinite Values
import pandas as pd
import numpy as np
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "pandas", np.inf, "Python", -np.inf],
    'Fee': [22000, 25000, 23000, np.inf, 26000, 25000, -np.inf, 24000],
    'Duration': ['30day', '50days', '55days', '40days', '60days', -np.inf, '55days', np.inf],
    'Discount': [1000, 2300, 1200, np.inf, 2500, -np.inf, 2000, 1500]
}
df = pd.DataFrame(technologies)
print(df)
# Replace infinite values with NaN and display the result (df itself is unchanged)
print(df.replace([np.inf, -np.inf], np.nan))
# Replace infinite values with NaN in place
df.replace([np.inf, -np.inf], np.nan, inplace=True)
# Drop rows with NaN
df.dropna(inplace=True)
print(df)
# Change the option context to treat infinite values as NaN
with pd.option_context('mode.use_inf_as_na', True):
    # Drop the rows with NaN or infinite values
    df.dropna(inplace=True)
print(df)
# Replace infinite values with NaN, then drop the affected rows
df = df.replace([np.inf, -np.inf], np.nan).dropna(axis=0)
print(df)
# Changing option to consider infinite as nan
pd.set_option('mode.use_inf_as_na', True)
df.dropna(inplace=True)
print(df)
# Using DataFrame.isin() to Create Filter
df_filter = df.isin([np.nan, np.inf, -np.inf])
# Mask df with the filter
df = df[~df_filter]
df.dropna(inplace=True)
print(df)
# Using replace method to select non-null rows
df = df[df.replace([np.inf, -np.inf], np.nan).notnull().all(axis=1)]
print(df)
Conclusion
In this article, you have learned how to drop infinite values from a pandas DataFrame using the DataFrame.replace(), DataFrame.dropna(), and DataFrame.isin() methods. You have also learned how to replace all infinite values with NaN or any other specific value.
Happy Learning !!