In Pandas, the Series.map() function replaces the values of a Series according to a specified mapping (a dictionary, a function, or another Series). It is a convenient method for element-wise transformations.
In this article, I will explain the Series.map() function: its syntax, parameters, and usage, and how it returns a new Series containing the transformed values by applying a function or mapping a dictionary.
Key Points –
- The map() method of a Pandas Series transforms values based on a mapping or function. It applies a function or a mapping dictionary to each element in the Series and creates a new Series with the transformed values.
- map() returns a new Series with the transformed values. The original Series remains unchanged. If you want to modify the original Series, you need to assign the result back to it.
- map() accepts a dictionary, a function, or another Series to define the mapping for transforming values.
- Custom functions, including lambda functions, can be used to define complex transformations. This makes map() a versatile tool for data manipulation, especially when dealing with categorical data, data cleaning, or creating derived features based on existing ones.
- The na_action parameter controls how NaN (Not a Number) values are handled. With the default None, NaN values are passed to the mapping like any other value; with ‘ignore’, they are propagated to the result without being passed to the mapping.
Syntax of pandas Series map()
Following is the syntax of the series map() function.
# Syntax of series map()
Series.map(arg, na_action=None)
Parameters of the Series map()
Following are the parameters of the map() function.
arg – It can be a dictionary, a function, or a Series. If arg is a dictionary, it is used to map values from the Series to new values. If arg is a function, it is applied to each element of the Series. If arg is a Series, each value is looked up in the index of that Series and replaced with the corresponding value.
na_action – This parameter specifies how NaN values are handled. It accepts None or ‘ignore’. The default is None, which means NaN values are not treated specially and are passed to the mapping like any other value. With ‘ignore’, NaN values are propagated to the result without being passed to the mapping.
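The Series form of arg is the least obvious of the three, so here is a minimal sketch of it. The `lookup` Series and its contents are assumptions made up for illustration; the point is that the lookup happens against the index of the Series passed as arg.

```python
import pandas as pd

# Sample Series whose values will be looked up
series = pd.Series({'A': 1, 'B': 2, 'C': 3, 'D': 4})

# Hypothetical lookup Series: its index holds the keys,
# its values hold the replacements
lookup = pd.Series(['Python', 'Spark', 'Pandas', 'Pyspark'], index=[1, 2, 3, 4])

# Each value of `series` is matched against the index of `lookup`
result = series.map(lookup)
print(result)
```

The result keeps the index labels A to D of the original Series; only the values change.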
Return Value
The map() function in Pandas Series returns a new Series with the transformed values. The original Series remains unchanged.
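A short sketch of what this means in practice, using sample data and a doubling lambda that are assumptions chosen only for illustration:

```python
import pandas as pd

# Sample data assumed for illustration
series = pd.Series({'A': 1, 'B': 2, 'C': 3, 'D': 4})
snapshot = series.copy()

# map() returns a new Series; `series` itself is not modified
doubled = series.map(lambda x: x * 2)
unchanged = series.equals(snapshot)
print("Original unchanged:", unchanged)

# To keep the transformed values under the original name,
# assign the result back
series = series.map(lambda x: x * 2)
print(series)
```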
Mapping the Pandas Series with a Dictionary
Let’s use the map() function to map the values in a Pandas Series using a dictionary.
First, let’s create a Pandas Series from a Python dictionary.
import pandas as pd
# Create a sample Series
data = {'A': 1, 'B': 2, 'C': 3, 'D': 4}
series = pd.Series(data)
print("Create Pandas Series:\n",series)
# Output:
# Create Pandas Series:
# A 1
# B 2
# C 3
# D 4
# dtype: int64
You can use the map() function to apply a mapping to a Pandas Series. First, create the original Series that has numeric values, and then create a dictionary whose keys match the values in the Series. Passing the dictionary to map() replaces each Series value with the dictionary value stored under the matching key. The resulting Series (result) contains the transformed values.
# Define a mapping
mapping = {1:'Python', 2:'Spark', 3:'Pandas', 4:'Pyspark'}
# Use map() to apply the mapping to the Series
result = series.map(mapping)
print("Mapped Series:\n", result)
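This yields a Series in which the values 1, 2, 3, and 4 are replaced with Python, Spark, Pandas, and Pyspark, while the index labels A to D stay the same.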
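One detail worth knowing with dictionary mappings: a Series value that has no matching key in the dictionary becomes NaN in the result. The sketch below assumes a deliberately incomplete dictionary (`partial_mapping`) and an arbitrary placeholder string, 'Unknown', both chosen only for illustration.

```python
import pandas as pd

series = pd.Series({'A': 1, 'B': 2, 'C': 3, 'D': 4})

# Incomplete mapping: there are no entries for 3 and 4
partial_mapping = {1: 'Python', 2: 'Spark'}

# Values without a matching key come back as NaN
result = series.map(partial_mapping)
print(result)

# One way to supply a default is to fill the gaps afterwards
filled = result.fillna('Unknown')
print(filled)
```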
Mapping the Series to Apply a Custom Function
Alternatively, you can use the map() function to transform the Pandas Series values by applying a custom function. For instance, custom_function below squares each element of the original Series and adds 10 to it. The map() function then applies this custom function to each element in the Series, resulting in a new Series with the transformed values.
# Define a custom function
def custom_function(x):
    return x ** 2 + 10
# Use map() to apply the custom function to the Series
result = series.map(custom_function)
print("Mapped series with custom function:\n", result)
# Output:
# Mapped series with custom function:
# A 11
# B 14
# C 19
# D 26
# dtype: int64
Handling NaN Values with na_action
The na_action parameter of the map() function controls how NaN (Not a Number) values are handled during the mapping process. It accepts two values: None (the default) and ‘ignore’.
import pandas as pd
# Create a sample Series
data = {'A': 1, 'B': 2, 'C': 3, 'D': 4, 'E': None}
series = pd.Series(data)
# Mapping with a dictionary and handling NaN values
mapping = {1:'Python', 2:'Spark', 3:'Pandas', 4:'Pyspark'}
result = series.map(mapping, na_action='ignore')
print("Mapped Series with 'ignore' option:\n", result)
# Output:
# Mapped Series with 'ignore' option:
# A Python
# B Spark
# C Pandas
# D Pyspark
# E NaN
# dtype: object
With na_action='ignore', the NaN values in the original Series are propagated to the result without being passed to the mapping.
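The effect of na_action is easier to see with a function than with a dictionary, because a function can react to NaN. The following is a minimal sketch; the `label` function and the sample values are assumptions made up for this illustration.

```python
import pandas as pd

series = pd.Series([1.0, 2.0, None])  # None is stored as NaN

# Hypothetical function that reports whether it received a missing value
def label(x):
    return 'missing' if pd.isna(x) else 'present'

# Default (na_action=None): NaN is passed to the function like any other value
default_result = series.map(label)

# na_action='ignore': NaN is propagated and the function is not called for it
ignored_result = series.map(label, na_action='ignore')

print(default_result)
print(ignored_result)
```

With the default, the last element becomes 'missing' because the function sees the NaN; with 'ignore', the last element stays NaN.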
Mapping Values Based on Substring Matching
If you want to map values based on substring matching, you can pass the map() function a lambda or a named function that performs the check.
import pandas as pd
# Create a sample Series
data = {'A': 'Python', 'B': 'Spark', 'C': 'Pandas', 'D': 'Pyspark'}
series = pd.Series(data)
# Define a mapping function based on substring matching
substring_mapping = lambda x: 'Courses' if 'Pandas' in x or 'Spark' in x else 'Other'
# Use map() to apply the substring mapping function to the Series
result = series.map(substring_mapping)
print("Mapped series based on substring matching:\n", result)
# Output:
# Mapped series based on substring matching:
# A Other
# B Courses
# C Courses
# D Other
# dtype: object
In the above example, the substring_mapping function checks if the substring ‘Pandas’ or ‘Spark’ is present in each element of the original Series. If either substring is found, the value is mapped to ‘Courses’; otherwise, it is mapped to ‘Other’. The check is case-sensitive, which is why ‘Pyspark’ (with a lowercase ‘s’) is mapped to ‘Other’. You can customize the logic inside the mapping function based on your specific substring matching requirements.
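If case should not matter, one possible variation is to use a regular expression inside the custom function. This is only a sketch: the `pattern` and `categorize` names are assumptions introduced here, and it relies on Python's standard re module rather than anything specific to map().

```python
import re
import pandas as pd

series = pd.Series({'A': 'Python', 'B': 'Spark', 'C': 'Pandas', 'D': 'Pyspark'})

# Case-insensitive pattern matching either substring
pattern = re.compile(r'pandas|spark', flags=re.IGNORECASE)

# Hypothetical helper that categorizes one string value
def categorize(value):
    return 'Courses' if pattern.search(value) else 'Other'

result = series.map(categorize)
print(result)
```

With this variation, ‘Pyspark’ is also mapped to ‘Courses’, because the match no longer depends on letter case.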
Mapping Values to Boolean based on a Condition
If you want to map values in a Pandas Series to boolean values based on a condition, you can use a custom function with the map() function.
import pandas as pd
# Create a sample Series
data = {'A': 10, 'B': 25, 'C': 8, 'D': 15}
series = pd.Series(data)
# Define a mapping function based on a condition
def condition_mapping(value):
    return value > 15
# Use map() to apply the condition mapping function to the Series
result = series.map(condition_mapping)
print("Mapped series based on condition:\n", result)
# Output:
# Mapped series based on condition:
# A False
# B True
# C False
# D False
# dtype: bool
In the above example, the condition_mapping function checks if each element in the original Series is greater than 15. If the condition is met, it returns True; otherwise, it returns False. The map() function is then used to apply this condition mapping function to each element in the Series, resulting in a new Series of boolean values based on the specified condition.
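For a simple comparison like this one, a plain comparison on the Series gives the same values without a custom function. The sketch below is a different technique from map(), shown only for comparison; the variable names are assumptions for illustration.

```python
import pandas as pd

series = pd.Series({'A': 10, 'B': 25, 'C': 8, 'D': 15})

# Element-wise mapping with a function, as in the example above
mapped = series.map(lambda value: value > 15)

# Comparison applied to the whole Series at once (not map())
compared = series > 15

# Both approaches produce the same boolean values
same_values = mapped.tolist() == compared.tolist()
print(same_values)
```

map() remains the better fit when the condition needs logic that cannot be written as a single comparison.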
Frequently Asked Questions on Pandas Series map() Function
The map() function in Pandas Series replaces values in the Series based on a specified mapping or function. It allows for element-wise transformations, where each element in the Series is replaced or modified based on the provided mapping.
The map() function supports various types of mappings, such as dictionaries, functions, or other Series. This flexibility allows users to perform a wide range of transformations on the values within a Series.
The na_action parameter in the map() function determines how NaN (Not a Number) values are handled during the mapping process. It accepts None or ‘ignore’. ‘ignore’ propagates NaN values to the result without passing them to the mapping, and None (the default) treats NaN values like any other value in the mapping.
The map() function can be used for conditional mapping. You can define a custom function that applies a condition to each element in the Series and returns a new value based on that condition. This allows for mapping values to boolean values or applying various conditions to modify the Series.
The map() function can be used for string operations by defining a custom function that performs string manipulations and applying it to the Series. For example, you can use string methods or regular expressions inside the custom function to modify or categorize string values in the Series.
The map() function does not modify the original Series in place. Instead, it returns a new Series with the transformed values. If you want to modify the original Series, assign the result back to it; map() has no inplace parameter.
Conclusion
In this article, I have explained the map() function: its syntax, parameters, and usage, and how it returns a new Series containing the transformed values by applying a function or mapping a dictionary.
Happy Learning!!