Pandas Replace Column value in DataFrame

  • Post category:Pandas
  • Post last modified:November 23, 2023

The pandas library offers several ways to replace or update column values in a DataFrame. Changing column values is a routine part of cleaning and curating data: we often have to edit or remove certain values, create new columns from existing ones, or modify existing columns. For all of this, pandas provides a wide range of methods that work with columns of every data type in your DataFrames.

Now, we will look specifically at replacing column values and changing part of the string (sub-strings) within columns in a DataFrame. 
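
As a quick illustration of the sub-string case, here is a minimal sketch that assumes a small, made-up Duration column (the name df_sub and its data are for illustration only). It contrasts Series.replace(), which by default matches whole cell values, with Series.str.replace(), which rewrites the matching part of each string.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
df_sub = pd.DataFrame({"Duration": ["30days", "40days"]})

# Series.replace() compares whole cell values by default,
# so the sub-string "days" matches nothing and the column is unchanged.
whole = df_sub["Duration"].replace("days", " days")

# Series.str.replace() rewrites the matching part of every string.
# regex=False treats the pattern as literal text.
partial = df_sub["Duration"].str.replace("days", " days", regex=False)

print(whole.tolist())
print(partial.tolist())
```

Use str.replace() when only part of a string should change, and replace() when a whole cell value should be swapped.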


Below are some approaches to replace column values in Pandas DataFrame.

1. Quick Examples of Replace Column Value on Pandas DataFrame

Below are some quick examples that replace/edit/update column values in a pandas DataFrame.


# Below are the quick examples

# Replace a single value with a new value for an individual DataFrame column.
df['Courses'] = df['Courses'].replace(['Spark'], 'PySpark')

# Replace multiple values with a new value for an individual DataFrame column.
# The first list may hold as many values as needed.
df['Courses'] = df['Courses'].replace(['PySpark', 'Python'], 'Spark')

# Replace multiple values with multiple new values for an individual DataFrame column.
# Both lists must have the same length.
df['Courses'] = df['Courses'].replace(['PySpark', 'Python'], ['Spark', '22000'])

# Replace a single value with a new value for an entire DataFrame.
df = df.replace(['PySpark'], 'Spark')

Now, we will run these examples with a sample DataFrame and explore the output.

Let’s create a Pandas DataFrame with a few rows and columns, execute these examples and validate results.


import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Python", "Pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000]
}
index_labels = ['r1', 'r2', 'r3', 'r4']
df = pd.DataFrame(technologies, index=index_labels)
print(df)

This yields the output below.


# Output:
    Courses    Fee Duration  Discount
r1    Spark  20000   30days      1000
r2  PySpark  25000   40days      2300
r3   Python  22000   35days      1200
r4   Pandas  30000   50days      2000

2. Replace Single Value with a New Value in Pandas DataFrame

To replace a value in a pandas DataFrame column, call the replace() method on that column and pass the value to look for and the value to put in its place. The example below replaces Spark with PySpark in the Courses column.


# Replace a single value in a pandas DataFrame column.
df = pd.DataFrame(technologies, columns=['Courses', 'Fee'])
df['Courses'] = df['Courses'].replace(['Spark'], 'PySpark')
print(df)

Notice that every Spark value in the Courses column is now PySpark.


# Output:
   Courses    Fee
0  PySpark  20000
1  PySpark  25000
2   Python  22000
3   Pandas  30000
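
Two details of replace() are easy to trip over: matching is case-sensitive, and the method returns a new object instead of changing the original in place. The sketch below shows both on a small stand-alone Series (the variable names are made up for this illustration).

```python
import pandas as pd

# Hypothetical stand-alone Series with the same course names as the sample data.
courses = pd.Series(["Spark", "PySpark", "Python", "Pandas"])

# Lowercase "spark" does not equal "Spark", so nothing is replaced.
no_match = courses.replace("spark", "PySpark")

# The exact value "Spark" matches and is replaced in the returned Series.
updated = courses.replace("Spark", "PySpark")

print(no_match.tolist())
print(updated.tolist())
print(courses.tolist())  # the original Series is left untouched
```

This is why the examples in this article assign the result back to the column.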

3. Replace Multiple Values with a New Value in DataFrame

Let’s see how to replace multiple values with one new value in a DataFrame column. In the example below, I am replacing the PySpark and Python courses with Spark in the Courses column.


# Replace multiple values with a new value in a DataFrame column.
df = pd.DataFrame(technologies, columns=['Courses', 'Fee'])
df['Courses'] = df['Courses'].replace(['PySpark', 'Python'], 'Spark')
print(df)

We can see that both the PySpark and Python courses were replaced with Spark.


# Output:
  Courses    Fee
0   Spark  20000
1   Spark  25000
2   Spark  22000
3  Pandas  30000
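
The same result can also be reached without replace(), by selecting the rows with a boolean mask. This is only a sketch of an alternative, not the method this article is built around; it assumes a fresh copy of the sample data stored under the made-up name df_mask.

```python
import pandas as pd

# Hypothetical copy of the sample data, limited to two columns.
df_mask = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "Pandas"],
    "Fee": [20000, 25000, 22000, 30000],
})

# Build a boolean mask marking the rows whose course should change.
mask = df_mask["Courses"].isin(["PySpark", "Python"])

# Assign the new value only to the masked rows of the Courses column.
df_mask.loc[mask, "Courses"] = "Spark"

print(df_mask)
```

The mask approach is handy when the condition depends on more than the value itself, for example on another column.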

4. Replace Multiple Values With Multiple New Values For a DataFrame Column

To replace multiple values with multiple new values in a single DataFrame column, pass two lists of the same length to replace(). Each value in the first list is replaced with the value at the same position in the second list. For example:

  • PySpark becomes Spark
  • Python becomes 22000 (as a string)

# Replace multiple values with multiple new values.
df = pd.DataFrame(technologies, columns=['Courses', 'Fee'])
df['Courses'] = df['Courses'].replace(['PySpark', 'Python'], ['Spark', '22000'])
print(df)

We can see that ‘PySpark’ became ‘Spark’ and ‘Python’ became ‘22000’ in the Courses column.


# Output:
  Courses    Fee
0   Spark  20000
1   Spark  25000
2   22000  22000
3  Pandas  30000
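
Paired lists can get hard to read as they grow. As a sketch of an equivalent form, replace() also accepts a dictionary that maps each old value to its new value. The Series and variable names below are made up for this illustration.

```python
import pandas as pd

# Hypothetical stand-alone Series with the sample course names.
courses = pd.Series(["Spark", "PySpark", "Python", "Pandas"])

# Paired lists: old values first, new values second.
by_lists = courses.replace(["PySpark", "Python"], ["Spark", "22000"])

# Dictionary: each key is an old value, each value is its replacement.
by_dict = courses.replace({"PySpark": "Spark", "Python": "22000"})

print(by_dict.tolist())
print(by_lists.equals(by_dict))
```

The dictionary keeps each old/new pair side by side, which makes a mismatch between the two lists impossible.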

5. Replace Single Value With New Value on All Columns of DataFrame

So far, you have seen how to replace values in a single DataFrame column. Now we will look at how to replace a value across the entire DataFrame.

For example, the code below replaces PySpark with Spark wherever it appears, in every column of the DataFrame.


# Replace a single value with a new value in the entire DataFrame.
df = pd.DataFrame(technologies, columns=['Courses', 'Fee'])
df = df.replace(['PySpark'], 'Spark')
print(df)

After running the code, we can see that PySpark became Spark across the DataFrame.


# Output:
  Courses    Fee
0   Spark  20000
1   Spark  25000
2  Python  22000
3  Pandas  30000
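
With the sample data the value only occurs in one column, so the effect of a frame-wide replace is hard to see. The sketch below assumes a made-up Prerequisite column (not part of the sample data) so that the same value appears in two columns.

```python
import pandas as pd

# Hypothetical data: "Spark" occurs in two different columns.
df_all = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python"],
    "Prerequisite": ["Python", "Spark", "Basics"],  # made-up column
    "Fee": [20000, 25000, 22000],
})

# DataFrame.replace() looks for the value in every column.
replaced = df_all.replace("Spark", "Apache Spark")

print(replaced)
```

Only cells whose whole value equals Spark change; PySpark stays as it is, and the numeric Fee column is unaffected.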

6. Replace Values on Multiple Columns of DataFrame

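To replace values in several columns at once, select those columns with df.loc[] and call replace() on the selection, then assign the result back to the same columns. The example below replaces 25000 with the string 'Spark' wherever it appears in the Fee or Duration column.


# Replace values on multiple columns.
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration'])
df[['Fee', 'Duration']] = df.loc[:, ['Fee', 'Duration']].replace(25000, 'Spark')
print(df)

This yields the output below.


# Output:
   Courses    Fee Duration
0    Spark  20000   30days
1  PySpark  Spark   40days
2   Python  22000   35days
3   Pandas  30000   50days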
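
When each column needs its own replacement, a nested dictionary is one way to express it: the outer keys name the columns and each inner dictionary maps old values to new ones. This is a minimal sketch; the replacement values 26000 and 45days are made up for illustration.

```python
import pandas as pd

# Hypothetical copy of the sample data with three columns.
df_multi = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "Pandas"],
    "Fee": [20000, 25000, 22000, 30000],
    "Duration": ["30days", "40days", "35days", "50days"],
})

# Outer keys are column names; inner dicts map old value -> new value.
result = df_multi.replace({
    "Fee": {25000: 26000},
    "Duration": {"40days": "45days"},
})

print(result)
```

Columns that are not named in the dictionary, such as Courses here, are left unchanged.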

7. Conclusion

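In this article, you have learned how to replace a single value, multiple values with one new value, and multiple values with multiple new values in a DataFrame column, as well as how to replace values across an entire DataFrame or across several columns.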
