In Pandas, you can use the unique() function to get the unique elements of a Series. It returns an array containing the unique values in the order of their first occurrence in the original Series.
In this article, I will explain the unique() function, covering its syntax, parameters, and return value, and show how to get the unique values from a Pandas Series with multiple examples.
Key Points –
- The unique() function returns the distinct (unique) values present in a pandas Series, dropping any duplicate entries.
- The result of the unique() function is a NumPy array for standard data types; extension data types such as categorical return a pandas extension array instead.
- The unique values appear in the order of their first occurrence in the Series. The result is not sorted.
- The function can be applied to Series containing various data types, including integers, floats, strings, booleans, and datetime objects.
- The unique() function does not modify the original Series; it returns a new array with the unique values. If you want a de-duplicated Series instead of an array, you can use the drop_duplicates() method, which also returns a new Series unless you pass inplace=True.
Syntax of Pandas Series unique() Function
Following is the syntax of the pandas Series.unique() function.
# Syntax of series.unique() function
Series.unique()
Parameters of the Series unique()
The unique() function takes no parameters. You call it on the pandas Series whose unique values you want to get.
Return Value
It returns a NumPy array containing the unique values present in the Series (for extension data types such as categorical, it returns a pandas extension array). The order of the elements in the array corresponds to the order of their first occurrence in the original Series.
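The return type and ordering described above can be checked with a short sketch. The sample data and variable names are made up for illustration, and the categorical Series is an assumption added here only to show an extension data type; it is not part of the article's examples.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: duplicates appear out of sorted order
sample = pd.Series([3, 1, 3, 2, 1])
result = sample.unique()

# For a plain integer Series the result is a NumPy ndarray
print(isinstance(result, np.ndarray))   # True
# Values keep their order of first occurrence; they are not sorted
print(result.tolist())                  # [3, 1, 2]

# Assumption beyond the article: a categorical Series is used here only to
# show that extension dtypes return a pandas array type, not an ndarray
cat_result = pd.Series(['a', 'b', 'a'], dtype='category').unique()
print(type(cat_result).__name__)        # Categorical
```

If you need the values in sorted order, sort the result yourself, since unique() does not sort.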
Create Pandas Series
A Pandas Series can be created in several ways, for example from Python lists or dictionaries. The example below creates a Series from a list. To use Pandas, first import it using import pandas as pd.
import pandas as pd
# Create a pandas Series with integers
integer_series = pd.Series([1, 2, 2, 3, 4, 4, 5])
print("Original Series:\n",integer_series)
Yields below output.
# Output:
# Original Series:
# 0    1
# 1    2
# 2    2
# 3    3
# 4    4
# 5    4
# 6    5
# dtype: int64
Use unique() Function to Get Unique Integers
If you have a pandas Series containing integers and you want to get unique values from it, you can use the unique() function.
# Use the unique() function to get unique integers
unique_integers = integer_series.unique()
print("\nUnique Integers:\n",unique_integers)
In the above example, the unique() function is applied to a Series containing integers, and it returns a NumPy array with the unique integers. The order of the elements in the array corresponds to their order of first occurrence in the original Series. This example yields the below output.
# Output:
# Unique Integers:
# [1 2 3 4 5]
Use unique() Function to Get Unique Strings
You can also get the unique string values from a Pandas Series using this function. Call unique() on the Series, and it returns an array of the unique values in the order in which they first appear in the original Series.
import pandas as pd
# Create a pandas Series with strings
string_series = pd.Series(['Spark', 'Pandas', 'Spark', 'Pyspark', 'Pandas', 'Python'])
# Use the unique() function to get unique strings
unique_strings = string_series.unique()
print("Unique Strings:\n",unique_strings)
# Output:
# Unique Strings:
# ['Spark' 'Pandas' 'Pyspark' 'Python']
In the above example, string_series is a pandas Series containing strings, and unique_strings contains the unique strings present in it. The output displays the array of unique strings. The order of elements in the array corresponds to their order of first occurrence in the original Series.
Use unique() Function to Get Unique Dates
You can also get unique date values using the unique() function. Convert the date strings to datetime values with pd.to_datetime() and call unique() on the result; it returns the unique date values.
import pandas as pd
# Create a pandas Series with dates
date_strings = ['2024-01-01', '2024-01-02', '2024-01-01', '2024-01-03']
date_series = pd.to_datetime(date_strings)
# Use the unique() function to get unique dates
unique_dates = date_series.unique()
print("Unique Dates:\n",unique_dates)
# Output:
# Unique Dates:
# DatetimeIndex(['2024-01-01', '2024-01-02', '2024-01-03'], dtype='datetime64[ns]', freq=None)
In the above example, the pd.to_datetime() function converts the date strings to datetime values. Note that when it is given a list, pd.to_datetime() returns a DatetimeIndex rather than a Series, so despite its name date_series is a DatetimeIndex, and unique() returns a DatetimeIndex, as the output shows. To work with an actual Series, wrap the result in pd.Series() before calling unique(). In both cases, the unique dates keep their order of first occurrence.
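Because pd.to_datetime() returns a DatetimeIndex when given a list, the following sketch shows one way to run the same example on an actual Series. The exact container type returned by unique() for datetime data depends on the pandas version, so the sketch checks the values rather than the printed representation.

```python
import pandas as pd

date_strings = ['2024-01-01', '2024-01-02', '2024-01-01', '2024-01-03']

# Wrap the DatetimeIndex returned by pd.to_datetime() to get a Series
date_series = pd.Series(pd.to_datetime(date_strings))
print(type(date_series).__name__)    # Series

unique_dates = date_series.unique()

# The result holds the three distinct dates in first-occurrence order
print(len(unique_dates))             # 3
formatted = pd.Series(unique_dates).dt.strftime('%Y-%m-%d').tolist()
print(formatted)                     # ['2024-01-01', '2024-01-02', '2024-01-03']
```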
Use unique() Function to Get Unique Booleans
Similarly, the unique() function in pandas can also be used to get unique boolean values from a Series. Let’s create a Pandas Series with boolean values and call unique() on it to get the array of unique boolean values.
import pandas as pd
# Create a pandas Series with boolean values
bool_series = pd.Series([True, False, True, True, False])
# Use the unique() function to get unique booleans
unique_booleans = bool_series.unique()
print("Unique Booleans:\n",unique_booleans)
# Output:
# Unique Booleans:
# [ True False]
In this program, bool_series is a pandas Series containing boolean values, and unique_booleans contains the unique booleans present in it. The output displays the array of unique boolean values.
Frequently Asked Questions on Pandas Series Unique
The unique() function in pandas is used to return an array of unique elements from a Series. It provides the unique values present in the Series, preserving their original order of appearance.
The unique() function in pandas can be applied to Series containing values of various data types. It works with numeric, string, datetime, and boolean values, among others, and returns an array of the unique values in each case.
The unique() function in pandas handles NaN (Not a Number) values by including them in the result. When applied to a Series that contains NaN values, it returns an array of the unique values in which NaN appears once, no matter how many times it occurs in the Series.
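A minimal sketch of the NaN behavior, using made-up sample data. It also shows dropna(), a standard Series method, as one way to exclude NaN before calling unique().

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with repeated NaN values
nan_series = pd.Series([1.0, np.nan, 2.0, np.nan, 1.0])

unique_values = nan_series.unique()
print(len(unique_values))                   # 3 (1.0, NaN, 2.0)
print(int(pd.isna(unique_values).sum()))    # 1, NaN is kept once

# Drop NaN first if it should not count as a unique value
without_nan = nan_series.dropna().unique()
print(without_nan.tolist())                 # [1.0, 2.0]
```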
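The unique() function does not modify the original Series. It returns a new array with the unique values. If you want a de-duplicated Series rather than an array, you can use the drop_duplicates() method; it also returns a new Series by default and only changes the original when called with inplace=True.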
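The difference between the two calls can be sketched as follows, with hypothetical sample data. Unlike unique(), drop_duplicates() returns a Series and keeps the original index labels of the rows it retains.

```python
import pandas as pd

# Hypothetical sample data
s = pd.Series([1, 2, 2, 3])

values = s.unique()
print(values.tolist())           # [1, 2, 3]
print(len(s))                    # 4, the original Series is unchanged

# drop_duplicates() returns a new Series and keeps the original index labels
deduped = s.drop_duplicates()
print(deduped.tolist())          # [1, 2, 3]
print(deduped.index.tolist())    # [0, 1, 3]

# Only with inplace=True is the original Series itself changed
s.drop_duplicates(inplace=True)
print(len(s))                    # 3
```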
You can use the unique() function with a pandas Series containing datetime values. The function can be applied to a Series of various data types, including datetime objects.
To convert the result of unique() back to a pandas Series, you can use the pd.Series() constructor. First apply unique() to the original Series to get an array of unique values, then pass that array to pd.Series() to create a new Series from it.
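As a short sketch of this round trip (the sample data and variable names are hypothetical):

```python
import pandas as pd

# Hypothetical sample data
original = pd.Series(['Spark', 'Pandas', 'Spark', 'Python'])

# unique() returns an array; pd.Series() wraps it in a new Series
unique_series = pd.Series(original.unique())

print(unique_series.tolist())         # ['Spark', 'Pandas', 'Python']
# The new Series gets a fresh default index starting at 0
print(unique_series.index.tolist())   # [0, 1, 2]
```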
Conclusion
In this article, I have explained the unique() function in Pandas, which extracts the distinct values from a Series while keeping their order of first occurrence. It applies to various data types, including integers, strings, dates, and booleans, and returns the unique values as an array that you can use for further data exploration and analysis in Python.
Happy Learning!!