In Pandas, the str.get() function retrieves the character at a given position from each string in a Series. It also accepts negative positions, which count from the end of each string.
In this article, I will explain the series.str.get() function, including its syntax, parameters, and usage, and show how it returns a new Series containing the character at the specified position for each string.
Key Points –
- series.str.get() extracts a character from each string in a Series based on its index position.
- The index used in get() is zero-based: the first character has an index of 0, the second has an index of 1, and so on.
- You can use negative indices to extract characters from the end of the string: -1 refers to the last character, -2 to the second last, and so forth.
- get() takes a single position per call. To extract a range of characters, use slicing (series.str[start:stop]) or series.str.slice() instead.
- series.str.get() applies to every element of a Pandas Series in a single call, so no explicit Python loop is needed.
Series str.get() Introduction
Following is the syntax of the pandas series.str.get() function.
# Syntax of series.str.get() function
Series.str.get(i)
Parameters of series.str.get()
Following is the parameter of the series.str.get() function.
i – The position of the character to retrieve, given as an integer. If i is positive, it counts from the beginning of the string, and if it's negative, it counts from the end of the string. If the position is out of range for an element, the result for that element is NaN. To retrieve a substring rather than a single character, use slicing or series.str.slice().
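As a minimal sketch of what happens when a position does not exist in some strings, the example below uses a hypothetical Series of short and long words. Positions past the end of a string come back as missing values, and chaining fillna() is one way to substitute a fallback. The names words, third, and filled are illustrative only.

```python
import pandas as pd

# Hypothetical sample data: 'Go' has only two characters
words = pd.Series(['Go', 'Rust', 'Python'])

# Position 2 does not exist in 'Go', so that element is missing in the result
third = words.str.get(2)
print(third)

# One way to substitute a fallback value for the missing results
filled = third.fillna('?')
print(filled)
```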
Return Value
It returns a new Series containing the characters retrieved from the specified positions in each string of the original Series.
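The short sketch below illustrates the return value, assuming a hypothetical Series named tools with custom index labels. It suggests that the result is a separate Series that keeps the index labels of the input, while the original Series is left unchanged.

```python
import pandas as pd

# Hypothetical Series with custom string labels as its index
tools = pd.Series(['Spark', 'Hadoop'], index=['a', 'b'])

first = tools.str.get(0)
print(first)

# The result carries the same index labels as the input
print(first.index.tolist())

# The original Series is not modified
print(tools.tolist())
```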
Get the First Character of Each String
To get the first character of each string in a Pandas Series, you can use the str accessor along with the get() function.
Let's create a Series containing string elements from a Python list.
import pandas as pd
# Create a sample Series
series = pd.Series(['Spark', 'PySpark', 'Hadoop', 'Pandas'])
print("Original Series:\n",series)
This prints the four strings along with their default integer index, 0 through 3.
To get the first character of each string, pass the index 0 to the str.get() function. It returns a new Series whose elements are the first characters of the original strings.
# Get the first character of each string
result = series.str.get(0)
print("Getting the first character of each string:\n",result)
In the above example, str.get(0) extracts the first character from each string in the Series, resulting in a new Series containing those characters. For the sample Series, the result contains S, P, H, and P.
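As a small aside, indexing the str accessor with a single integer appears to give the same result as calling get() with that integer. The sketch below compares the two forms on the sample Series; the variable names by_get and by_index are illustrative.

```python
import pandas as pd

series = pd.Series(['Spark', 'PySpark', 'Hadoop', 'Pandas'])

# Two spellings of "character at position 0"
by_get = series.str.get(0)
by_index = series.str[0]

# Compare the two results element by element
print(by_get.equals(by_index))
```

Which form to use is mostly a matter of taste: the bracket form is shorter, while get() reads more explicitly in method chains.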
Get the Third Character of Each String
To get the third character of each string in a Pandas Series, you can again use the str accessor along with the get() function. Pass the zero-based position you want to str.get(); for the third character, that position is 2. It returns a new Series containing the character at that position in each string.
# Get the third character of each string
result = series.str.get(2)
print("Getting the third character of each string:\n",result)
# Output:
# Getting the third character of each string:
# 0 a
# 1 S
# 2 d
# 3 n
# dtype: object
Here, str.get(2) returns the third character of each string in the Series, because positions are zero-based. Finally, we print the resulting Series.
Get the Last Character of Each String
Alternatively, you can get characters counted from the end of each string using the str.get() function. To get the last character of each string in a Pandas Series, pass a negative index to get() through the str accessor.
# Get the last character of each string
result = series.str.get(-1)
print("Get the last character of each string:\n",result)
# Output:
# Get the last character of each string:
# 0 k
# 1 k
# 2 p
# 3 s
# dtype: object
Here, str.get(-1) returns the last character of each string in the Series by using a negative index. Finally, we print the resulting Series.
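Other negative positions work the same way. As a brief sketch on the same sample Series, the example below uses -2 to pick the second-to-last character; second_last is just an illustrative variable name.

```python
import pandas as pd

series = pd.Series(['Spark', 'PySpark', 'Hadoop', 'Pandas'])

# -2 counts two positions back from the end of each string
second_last = series.str.get(-2)
print(second_last)
```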
Get a Group of Characters from Each String
Similarly, to get a group of characters from each string in a Pandas Series, you can use slicing with the str accessor. Let's use slicing to get a selection of characters from each string within a Series.
# Get group of characters of each string
result = series.str[3:]
print("Get group of characters from each string:\n",result)
# Output:
# Get group of characters from each string:
# 0 rk
# 1 park
# 2 oop
# 3 das
# dtype: object
Here, the slicing syntax [3:] retrieves the characters from the fourth position to the end of each string in the Series. Finally, we print the resulting Series.
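Slices on the str accessor follow ordinary Python slice rules, so other forms should work too. The sketch below, using illustrative variable names, takes the first two and the last three characters of each string and checks that [3:] matches str.slice(start=3).

```python
import pandas as pd

series = pd.Series(['Spark', 'PySpark', 'Hadoop', 'Pandas'])

# First two characters of each string
head = series.str[:2]
print(head)

# Last three characters of each string
tail = series.str[-3:]
print(tail)

# Bracket slicing and str.slice() give the same result here
same = series.str[3:].equals(series.str.slice(start=3))
print(same)
```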
Extract Characters from a Specific Range
To extract characters from a specific range of indices for each string in a Pandas Series, you can use the str.slice() function.
# To extract characters from a specific range
# Use str.slice()
result = series.str.slice(start=1, stop=3)
print(result)
# Output:
# 0 pa
# 1 yS
# 2 ad
# 3 an
# dtype: object
In the above example, str.slice(start=1, stop=3) extracts characters from index 1 (inclusive) to index 3 (exclusive) for each string in the Series, resulting in a new Series containing the characters within that range.
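str.slice() also takes an optional step argument. As a hedged illustration on the same sample Series, the sketch below keeps every second character of each string; the variable name every_other is illustrative.

```python
import pandas as pd

series = pd.Series(['Spark', 'PySpark', 'Hadoop', 'Pandas'])

# step=2 keeps the characters at positions 0, 2, 4, ...
every_other = series.str.slice(step=2)
print(every_other)
```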
Frequently Asked Questions on Pandas series.str.get() Function
The series.str.get() function in Pandas is used to extract characters from strings within a Series based on their index positions.
series.str.get() supports negative indices, enabling extraction from the end of strings. For instance, -1 refers to the last character.
series.str.get() retrieves the single character at a specified position, while series.str.slice() extracts a substring defined by a start, a stop, and an optional step.
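One practical difference shows up when a position lies past the end of a string. The sketch below, built on a hypothetical two-element Series named short, suggests that get() marks such elements as missing while slice() returns an empty string for them.

```python
import pandas as pd

# Hypothetical data: 'ab' is too short to have a position 4
short = pd.Series(['ab', 'abcdef'])

# get() yields a missing value where the position does not exist
by_get = short.str.get(4)
print(by_get)

# slice() yields an empty string for the same element
by_slice = short.str.slice(4, 5)
print(by_slice)
```

This matters when later steps distinguish between missing values and empty strings, for example when calling isna() or dropna().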
If an index is out of bounds for a particular string, or if the string itself is missing (NaN), series.str.get() returns NaN for that element.
series.str.get() takes one position per call, so it cannot extract characters at several positions at once. To get characters at multiple positions, call it once per position and combine the results, or use series.str.slice() when the positions form a contiguous range.
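Because each call to get() works with a single position, one possible way to collect several positions is to call it once per position and combine the results. The sketch below is only one approach; the names positions, chars, and first_last are illustrative.

```python
import pandas as pd

series = pd.Series(['Spark', 'PySpark', 'Hadoop', 'Pandas'])

# Positions to extract: first and last character
positions = [0, -1]

# One column per position, each built by a separate get() call
chars = pd.DataFrame({pos: series.str.get(pos) for pos in positions})
print(chars)

# Alternatively, join the extracted characters into one string per row
first_last = series.str.get(0) + series.str.get(-1)
print(first_last)
```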
Conclusion
In this article, I have explained the series.str.get() function in Pandas, covering its syntax, parameters, and usage, and shown with examples how to extract an individual character from each string in a Series based on a specified position. I have also explained how to get a group of characters from each string in a Series using slicing and the str.slice() function.
Happy Learning!!