  • Post category:Pandas
  • Post last modified:March 27, 2024
Pandas Replace Column Value in DataFrame

The Pandas library offers several ways to replace or update column values in a DataFrame. Changing column values is a routine part of cleaning and curating data: when working with data, we often have to edit or remove certain pieces of it, create new columns from existing ones, or modify existing columns. Pandas provides a wide range of methods for working with columns of all data types in your DataFrames.


Now, we will look specifically at replacing column values and changing part of the string (sub-strings) within columns in a DataFrame. 
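For the sub-string case, a minimal sketch is shown here. It assumes a small sample DataFrame (hypothetical data that mirrors the columns used later in this article) and uses Series.str.replace(), which changes part of each string rather than the whole cell value.

```python
import pandas as pd

# Hypothetical sample data, mirroring the columns used later in this article
df = pd.DataFrame({
    'Courses': ['Spark', 'PySpark', 'Python', 'Pandas'],
    'Duration': ['30days', '40days', '35days', '50days'],
})

# Series.str.replace() rewrites part of each string.
# regex=False treats 'days' as literal text rather than a pattern.
df['Duration'] = df['Duration'].str.replace('days', ' days', regex=False)
print(df)
```

By contrast, Series.replace() without regex=True matches the entire cell value, so it would not touch '30days' when asked to replace 'days'.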


Below are some approaches to replace column values in Pandas DataFrame.

1. Quick Examples of Replace Column Value on Pandas DataFrame

If you are in a hurry, below are some quick examples of replacing column values in a Pandas DataFrame.


# Quick examples of replacing column values in a pandas DataFrame

# Example 1: Replace a single value with a new value
# in an individual DataFrame column
df['Courses'] = df['Courses'].replace(['Spark'],'Pyspark')

# Example 2: Replace multiple values with a single new value
# in an individual DataFrame column (extend the list as needed)
df['Courses'] = df['Courses'].replace(['PySpark','Python'],'Spark')

# Example 3: Replace multiple values with multiple new values
# in an individual DataFrame column (both lists must have the same length)
df['Courses'] = df['Courses'].replace(['PySpark','Python'],['Spark','22000'])

# Example 4: Replace a single value with a new value
# across the entire DataFrame
df = df.replace(['PySpark'],'Spark')

Let’s create a Pandas DataFrame with a few rows and columns, then run these examples against it and check the results.


# Create a Pandas DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","Pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print("Create DataFrame:\n", df)

This prints a DataFrame with four rows (labeled r1 to r4) and the columns Courses, Fee, Duration, and Discount.

2. Replace Single Value with a New Value in Pandas DataFrame

To replace a single value in a Pandas DataFrame column, call the replace() method on that column with the value to find and the value to substitute, then assign the result back to the column. For instance, the example below replaces the value ‘Spark’ in the ‘Courses’ column with ‘Pyspark’. The resulting DataFrame (df) holds the updated value in that column.


# Replace values in pandas DataFrame
df = pd.DataFrame(technologies, columns= ['Courses','Fee'])
df['Courses'] = df['Courses'].replace(['Spark'],'Pyspark')
print("DataFrame after replacement:\n",df)

Notice that every ‘Spark’ value in the Courses column is replaced with ‘Pyspark’. replace() matches the whole cell value, so the existing ‘PySpark’ entry in the second row is left unchanged.

3. Replace Multiple Values with a New Value in DataFrame

Let’s see how to replace multiple values with a single new value in a DataFrame column. The example below replaces occurrences of ‘PySpark’ and ‘Python’ with ‘Spark’ in the ‘Courses’ column. Matching is case-sensitive, so the values must be spelled exactly as they appear in the data. The resulting DataFrame (df) holds the updated values in that column.


# Replace multiple values with a new value in DataFrame
df = pd.DataFrame(technologies, columns= ['Courses','Fee'])
df['Courses'] = df['Courses'].replace(['PySpark','Python'],'Spark')
print("DataFrame after replacement:\n",df)

Both the PySpark and Python courses are replaced with Spark.


# Output:
DataFrame after replacement:
    Courses    Fee
0    Spark  20000
1    Spark  25000
2    Spark  22000
3   Pandas  30000

4. Replace Multiple Values With Multiple New Values For a DataFrame

To replace multiple values with multiple new values in a single DataFrame column, pass replace() two lists of equal length: the values to find and their replacements, matched by position. For example:

  • Replace PySpark with Spark
  • Replace Python with 22000

# Replace multiple values with multiple new values
# (values are matched case-sensitively, so use 'PySpark' as it appears in the data)
df = pd.DataFrame(technologies, columns= ['Courses','Fee'])
df['Courses'] = df['Courses'].replace(['PySpark','Python'],['Spark','22000'])
print("DataFrame after replacement:\n",df)

We can see that ‘PySpark’ became ‘Spark’ and ‘Python’ became ‘22000’ in the first column.


# Output:
DataFrame after replacement:
    Courses    Fee
0    Spark  20000
1    Spark  25000
2    22000  22000
3   Pandas  30000

5. Replace Single Value With New Value on All Columns of DataFrame

So far, you have seen how to replace values in a single DataFrame column. Now we will look at how to replace a value across the entire DataFrame.

For example, the code below replaces the PySpark course with Spark in every column of the DataFrame where it appears.


# Replace single value with new value in entire DataFrame
df = pd.DataFrame(technologies, columns= ['Courses','Fee'])
df = df.replace(['PySpark'],'Spark')
print("DataFrame after replacement:\n",df)

After running the code, every cell equal to PySpark becomes Spark, whichever column it is in. In this DataFrame, only the Courses column contains that value.


# Output:
DataFrame after replacement:
    Courses    Fee
0    Spark  20000
1    Spark  25000
2   Python  22000
3   Pandas  30000

6. Replace Values on Multiple Columns of DataFrame

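To replace a value only in selected columns, select those columns with df.loc[], call the replace() method on the selection, and assign the result back to the same columns. replace() returns a new object, so without the assignment the original DataFrame stays unchanged.


# Replace values on multiple columns
df = pd.DataFrame(technologies, columns= ['Courses','Fee','Duration'])
cols = ['Fee', 'Duration']
df[cols] = df.loc[:, cols].replace(25000, 'Spark')
print("DataFrame after replacement:\n",df)

Yields below output.


# Output:
DataFrame after replacement:
    Courses    Fee  Duration
0    Spark  20000    30days
1  PySpark  Spark    40days
2   Python  22000    35days
3   Pandas  30000    50days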

Frequently Asked Questions on Replace Column Value in DataFrame

How can I replace a specific value in a column with a new value?

You can replace a specific value in a column with a new value using the replace() method in Pandas. For example, df['Column_Name'] = df['Column_Name'].replace('A', 'X') replaces the value ‘A’ with ‘X’ in the ‘Column_Name’ column and assigns the result back, so the DataFrame (df) holds the updated values in that column. Adjust the old and new values to suit your data.

How do I replace multiple values in a column?

To replace multiple values in a column, you can use the replace() method with a dictionary specifying the mapping of old values to new values.
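As a minimal sketch of the dictionary form, assuming a small sample column (the data and the replacement values here are hypothetical), each key is an old value and each dictionary value is its replacement:

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({'Courses': ['Spark', 'PySpark', 'Python', 'Pandas']})

# Keys are the values to find; dictionary values are their replacements
mapping = {'PySpark': 'Spark', 'Python': 'Py'}
df['Courses'] = df['Courses'].replace(mapping)
print(df)
```

Values that do not appear as keys in the dictionary, such as 'Pandas' here, are left as they are.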

Can I replace values in multiple columns simultaneously?

You can replace values in multiple columns in a single call by passing replace() a nested dictionary keyed by column name. For example, with replace_dict = {'Column1': {'A': 'X'}, 'Column3': {'Y': 'Z'}}, calling df.replace(replace_dict) replaces ‘A’ with ‘X’ in Column1 and ‘Y’ with ‘Z’ in Column3. You can extend the replacement to more columns by adding an entry for each one to the dictionary.
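A short sketch of per-column replacement with a nested dictionary follows. The sample data and the replacement values are assumptions made for illustration; the outer keys name the columns and the inner dictionaries map old values to new ones.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    'Courses': ['Spark', 'PySpark', 'Python', 'Pandas'],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
})

# Outer keys are column names; inner dicts map old value -> new value
replace_dict = {
    'Courses': {'PySpark': 'Spark'},
    'Duration': {'30days': '45days'},
}
df = df.replace(replace_dict)
print(df)
```

Columns that are not named in the dictionary, such as Fee here, are not searched at all.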

How to replace values based on a condition?

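To replace values in a column based on a condition, you can use boolean indexing along with the loc accessor in Pandas. For example, df.loc[df['Column_Name'] > 20, 'Column_Name'] = 'High' replaces values in the ‘Column_Name’ column with ‘High’ where the original values are greater than 20. Adjust the condition in the loc statement to match your criteria. Note that assigning a string into a numeric column changes its data type, and recent Pandas versions warn about this kind of assignment.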
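Here is a minimal sketch of condition-based replacement, assuming hypothetical sample data. To sidestep the mixed-type issue, the condition is evaluated on a numeric column (Fee) while the new value is written into a string column (Duration).

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({
    'Courses': ['Spark', 'PySpark', 'Python', 'Pandas'],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
})

# Boolean mask: True for rows where Fee is greater than 22000
mask = df['Fee'] > 22000

# Assign a new Duration only on the rows selected by the mask
df.loc[mask, 'Duration'] = '60days'
print(df)
```

Only the rows where the mask is True are modified; the remaining rows keep their original Duration.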

Does the replace() method modify the DataFrame in place?

By default, the replace() method in Pandas does not modify the DataFrame in place. Instead, it returns a new DataFrame with the specified replacements. If you want to modify the original DataFrame in place, you can use the inplace=True parameter.
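The difference can be sketched as follows, using hypothetical sample data. The first call leaves df untouched and returns a new DataFrame; the second call, with inplace=True, changes df itself.

```python
import pandas as pd

# Hypothetical sample data
df = pd.DataFrame({'Courses': ['Spark', 'PySpark', 'Python', 'Pandas']})

# Default behavior: a new DataFrame is returned, df is not changed
replaced = df.replace('PySpark', 'Spark')
before = df['Courses'].tolist()       # still contains 'PySpark'
print(before)
print(replaced['Courses'].tolist())

# With inplace=True the original DataFrame is modified directly
df.replace('PySpark', 'Spark', inplace=True)
print(df['Courses'].tolist())
```

Assigning the returned object back, as in df = df.replace(...), is the pattern used in the examples throughout this article.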

Can I use regular expressions for replacement?

You can use regular expressions for replacement in Pandas using the replace() method. Setting the regex parameter to True tells replace() to treat the value being searched for as a regular expression pattern rather than a literal value.
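As a rough sketch, assuming hypothetical sample data with inconsistent capitalization, a single pattern can cover both spellings of the same course name:

```python
import pandas as pd

# Hypothetical sample data with two spellings of the same course
df = pd.DataFrame({'Courses': ['Spark', 'PySpark', 'Pyspark', 'Pandas']})

# regex=True makes replace() treat the first argument as a pattern.
# ^ and $ anchor the match to the whole value; [Ss] accepts either case.
df['Courses'] = df['Courses'].replace(r'^Py[Ss]park$', 'Spark', regex=True)
print(df)
```

Without the ^ and $ anchors, a pattern replaces any matching part of the string, which is another way to change sub-strings within a column.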

Conclusion

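In this article, you have learned how to replace a single value with a new value, multiple values with a single new value, and multiple values with multiple new values, both in an individual column and across an entire Pandas DataFrame.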
