pandas map() Function – Examples

Post last modified: March 27, 2024

The pandas Series.map() function substitutes each value in a Series with another value, which may be derived from a function, a dict, or another Series. Since each DataFrame column is a Series, you can call map() on a column and assign the result back to the DataFrame.

A pandas Series is a one-dimensional array-like object containing a sequence of values. Each value is associated with a label, and together the labels form the index. You can create a Series from an array-like object (e.g., a list) or from a dictionary.
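
As a minimal sketch of the two construction routes just mentioned (the variable names and index labels here are illustrative only and not part of the example used later):

```python
import pandas as pd

# From a list: pandas assigns a default integer index 0..n-1
fees = pd.Series([22000, 25000, 23000])

# From a dictionary: the keys become the index labels
durations = pd.Series({'Spark': '30days', 'PySpark': '50days'})

print(fees.index.tolist())       # index labels generated by pandas
print(durations.index.tolist())  # index labels taken from the dict keys
```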

pandas map() Key Points

  • Series.map() is a Series method. DataFrame has a separate element-wise DataFrame.map() (added in pandas 2.1 as the replacement for applymap()), which accepts only a callable.
  • map() accepts a dict, a Series, or a callable.
  • You can use it to operate on a specific column of a DataFrame, since each column in a DataFrame is a Series.
  • When passed a dictionary or Series, map() looks up each element among the keys (or index labels) of that dictionary or Series. Elements with no match become NaN in the output.
  • Series.map() operates on one element at a time.
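
The three kinds of argument listed above can be sketched side by side. This is a small illustration on a made-up Series, not the DataFrame used in the rest of the article; the names `lookup`, `by_dict`, `by_series`, and `by_func` are assumptions chosen for the example.

```python
import pandas as pd

s = pd.Series(['30days', '50days', '35days'])

# 1) dict: each value is looked up as a key; '35days' has no key, so it becomes NaN
by_dict = s.map({'30days': 30, '50days': 50})

# 2) Series: each value is looked up in the index of the mapping Series
lookup = pd.Series([30, 50], index=['30days', '50days'])
by_series = s.map(lookup)

# 3) callable: the function is called once per element
by_func = s.map(lambda x: int(x.replace('days', '')))

print(by_dict.tolist())
print(by_series.tolist())
print(by_func.tolist())
```

Note that the dict and Series variants leave NaN for the unmatched '35days', while the callable handles every element because it computes the result instead of looking it up.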

1. Syntax of pandas map()

The following is the syntax of the pandas map() function. This accepts arg and na_action as parameters and returns a Series.


# Syntax of Series.map()
Series.map(arg, na_action=None)

Following are the parameters:

  • arg – Accepts a function, dict, or Series.
  • na_action – Accepts 'ignore' or None. The default is None.

Let’s create a DataFrame and use it with the map() function to update the DataFrame column.


# Create a pandas DataFrame.
import pandas as pd
import numpy as np
technologies = {
    # Use np.nan; the np.NaN alias was removed in NumPy 2.0
    'Fee': [22000, 25000, 23000, np.nan, 26000],
    'Duration': ['30days', '50days', '30days', '35days', '40days']
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

This creates a DataFrame with five rows and two columns, Fee and Duration. The Fee value at index 3 is NaN.

2. Series.map() Example

Series.map() works on one column of a pandas DataFrame at a time. Every column in a DataFrame is a Series; for example, df['Fee'] returns a Series object. Let's see how to apply map() to one of the DataFrame columns and assign the result back to the DataFrame.


# Using Lambda Function
df['Fee'] = df['Fee'].map(lambda x: x - (x*10/100))
print("After applying map to specific column:\n", df)

This example subtracts 10% from each value in the Fee column. For instance, 22000 becomes 19800.0, and the NaN at index 3 stays NaN.

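You can also pass a named function instead of a lambda. The example below divides each Fee value by 100, so 19800.0 becomes 198.0. Wrapping the function in a lambda is unnecessary; passing it directly is equivalent.


# Using custom function
def fun1(x):
    return x/100
df['Fee'] = df['Fee'].map(fun1)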
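
For simple arithmetic like the 10% reduction shown earlier, the same result can usually be obtained without map() by operating on the whole Series at once. The following is a minimal sketch on a standalone Series (the names `fee`, `mapped`, and `vectorized` are illustrative assumptions) that compares the two approaches:

```python
import numpy as np
import pandas as pd

fee = pd.Series([22000, 25000, 23000, np.nan, 26000])

# Element-wise: the lambda is called once per value
mapped = fee.map(lambda x: x - (x * 10 / 100))

# Vectorized: the same arithmetic applied to the whole Series at once
vectorized = fee - (fee * 10 / 100)

# Series.equals() treats NaN values in the same position as equal
print(mapped.equals(vectorized))
```

map() remains the natural choice when the transformation is a lookup or a function that has no vectorized equivalent.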

3. Handling NaN by using na_action param

The na_action param controls how NaN values are handled. With the default, None, NaN values are passed to the mapping function, which may produce incorrect results. With 'ignore', NaN values are not passed to the function and are propagated to the output unchanged.


# Append the currency to the Fee
df['Fee'] = df['Fee'].map('{} RS'.format)
print(df)

Yields the output below. Notice that the Fee value at index 3 is 'nan RS', which doesn't make sense.


# Output:
       Fee Duration
0  198.0 RS   30days
1  225.0 RS   50days
2  207.0 RS   30days
3    nan RS   35days
4  234.0 RS   40days

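Now let's use na_action='ignore'. Run this in place of the previous statement, on the numeric Fee column, since running it after the previous statement would append ' RS' a second time. With 'ignore', the NaN value is skipped and stays NaN.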


# Use na_action param
df['Fee'] = df['Fee'].map('{} RS'.format, na_action='ignore')
print(df)

Yields the output below.


# Output:
        Fee Duration
0  198.0 RS   30days
1  225.0 RS   50days
2  207.0 RS   30days
3       NaN   35days
4  234.0 RS   40days
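
Besides avoiding values like 'nan RS', na_action='ignore' also helps when the mapping function would raise an error on NaN. The sketch below uses a made-up Series of strings (created with dtype=object so that the missing entry is a float NaN) and an illustrative lambda; neither is part of the article's DataFrame.

```python
import numpy as np
import pandas as pd

duration = pd.Series(['30days', np.nan, '40days'], dtype=object)

# Without na_action, NaN (a float) reaches the function and str methods fail
try:
    duration.map(lambda x: x.upper())
except AttributeError as exc:
    print('failed on NaN:', exc)

# With na_action='ignore', NaN is skipped and carried through to the result
upper = duration.map(lambda x: x.upper(), na_action='ignore')
print(upper.tolist())
```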

4. Using map() with Dictionary

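You can also pass a dictionary as the mapping. Each value in the column is looked up as a key in the dictionary; values with no matching key, such as '35days' here, become NaN.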


# Using Dictionary for mapping
dict_map = {'30days':'35 Days','50days':'55 Days',
            '40days':'45 Days'}
updateSer = df['Duration'].map(dict_map)
df['Duration'] = updateSer
print(df)

Yields below output.


# Output:
        Fee Duration
0  198.0 RS  35 Days
1  225.0 RS  55 Days
2  207.0 RS  35 Days
3       NaN      NaN
4  234.0 RS  45 Days
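
If the NaN at index 3 is not what you want, one common option is to fall back to the original value wherever the dictionary had no match. This is a sketch under that assumption, using a standalone Series named `duration` rather than the DataFrame column:

```python
import pandas as pd

duration = pd.Series(['30days', '50days', '35days', '40days'])
dict_map = {'30days': '35 Days', '50days': '55 Days', '40days': '45 Days'}

# Unmatched values become NaN after map(); fillna() restores the originals
kept = duration.map(dict_map).fillna(duration)
print(kept.tolist())
```

Series.replace(dict_map) is another option that leaves unmatched values untouched.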

5. Complete Example of pandas map() Function


# Create a pandas DataFrame.
import pandas as pd
import numpy as np
technologies = {
    # Use np.nan; the np.NaN alias was removed in NumPy 2.0
    'Fee': [22000, 25000, 23000, np.nan, 26000],
    'Duration': ['30days', '50days', '30days', '35days', '40days']
}
df = pd.DataFrame(technologies)
print(df)

# Using Lambda Function
df['Fee'] = df['Fee'].map(lambda x: x - (x*10/100))
print(df)

# Using custom function
def fun1(x):
    return x/100
ser = df['Fee'].map(fun1)
print(ser)

# Append the currency to the Fee; NaN becomes 'nan RS'
print(df['Fee'].map('{} RS'.format))

# Use na_action='ignore' so that NaN is left as NaN
df['Fee'] = df['Fee'].map('{} RS'.format, na_action='ignore')
print(df)

# Using Dictionary for mapping
dict_map = {'30days':'35 Days','50days':'55 Days',
            '40days':'45 Days'}
updateSer = df['Duration'].map(dict_map)
df['Duration'] = updateSer
print(df)

FAQ on Pandas map() Function

What is the map() function in Pandas?

Series.map() is a pandas Series method that substitutes each value in a Series using a function, a dict, or another Series, and returns a new Series. It is distinct from Python's built-in map() function.

How does the map() function work?

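Series.map() takes two parameters: arg (a function, dict, or Series) and an optional na_action. With a function, it calls the function on each element; with a dict or Series, it looks up each element as a key. It returns a new Series with the same index as the original.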

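What objects can the map() function be applied to?

Series.map() is called on a Series, which includes any single column of a DataFrame. An Index has its own Index.map() method, and DataFrame has an element-wise DataFrame.map() in pandas 2.1 and later. For plain Python iterables such as lists, use Python's built-in map() instead.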
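
The distinction between Python's built-in map() and the pandas methods can be sketched as follows. The helper `add_suffix` and the sample values are hypothetical and chosen only for illustration.

```python
import pandas as pd

def add_suffix(x):
    return f'{x} RS'

# Python's built-in map(): takes a function and an iterable, returns a lazy iterator
from_list = list(map(add_suffix, [198.0, 225.0]))

# Series.map(): a method called on the Series, returns a new Series
from_series = pd.Series([198.0, 225.0]).map(add_suffix)

# Index.map(): the same idea for index labels, returns a new Index
from_index = pd.Index(['a', 'b']).map(str.upper)

print(from_list)
print(from_series.tolist())
print(from_index.tolist())
```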

How can the map() function be used with custom functions?

The map() function can be used with custom functions. You can define your own functions and pass them as the first argument to map().

Conclusion

In this article, I have explained that map() is a Series method used to substitute each value in a Series with another value and return a new Series. Since each DataFrame column is a Series, you can call map() on a column and assign the result back to update the DataFrame.

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.