  • Post category:Pandas
  • Post last modified:March 27, 2024
Create a Set From a Series in Pandas

We can create a set from a pandas Series by using the set() function, optionally together with Series.unique(). A set is one of Python's built-in data types, alongside list, tuple, and dictionary. It stores an unordered collection of unique, hashable items, and those items may be of different types.

In this article, I will explain how to create a set from a pandas Series by using set() and Series.unique().

Key Points –

  • Use the pd.Series constructor to create a Series in Pandas.
  • Transform the Series into a set using the set() function to obtain a unique collection of elements.
  • Sets in Python are unordered, meaning they don’t have a specific order for elements.
  • Sets eliminate duplicate values, ensuring only distinct elements are retained.
  • The resulting set can be used for various set operations, such as union, intersection, and difference.
  • Ensure that the Series contains elements that can be hashed, as sets in Python require hashable elements.

Quick Examples of Creating a Set From a Series

If you are in a hurry, below are some quick examples of how to create a set object from a series in pandas.


# Quick examples of creating a set from a series

# Example 1: Using Series.unique() and set() together
# (ser is the Series created in the next section)
unique_values = ser.unique()   # NumPy array of unique values
ser2 = set(unique_values)
print(ser2)

# Example 2: using set() function to create Set
ser2 = set(ser)
print(ser2)

Create Pandas Series

A pandas Series is a one-dimensional, index-labeled data structure provided by the pandas library. It can hold any data type, such as strings, integers, floats, and other Python objects. When no index is given, pandas assigns default integer labels starting at 0, and we can use them to access each element.

Now, let’s create a pandas Series from a list of values.


import pandas as pd
# Create the Series
ser = pd.Series([20,25,15,10,5,20,30])
print(ser)

Yields below output.


# Output:
0    20
1    25
2    15
3    10
4     5
5    20
6    30
dtype: int64
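
As a small illustration of accessing elements through the index, the sketch below reads values by label and by position. The custom labels "a", "b", "c" are hypothetical and only show how a non-default index changes the lookup.

```python
import pandas as pd

ser = pd.Series([20, 25, 15, 10, 5, 20, 30])

# Label-based access: with the default index the labels are 0..6
first = ser.loc[0]

# Position-based access: -1 refers to the last position
last = ser.iloc[-1]

# A custom index (hypothetical labels) replaces the default integer labels
labeled = pd.Series([20, 25, 15], index=["a", "b", "c"])
b_value = labeled.loc["b"]

print(first, last, b_value)
```

Using .loc and .iloc explicitly keeps it clear whether a lookup is by label or by position.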

Create a Set from Pandas Series

A set stores only unique values. One approach is to remove duplicates from the pandas Series first with Series.unique(), which returns a NumPy array of the unique values in order of first appearance, and then pass that array to set().


# use series.unique() function
ser2 = ser.unique()
print(ser2)

# Output
# [20 25 15 10  5 30]

# Using Series.unique() and set() together
ser2 = set(ser.unique())
print(ser2)

# Output (on NumPy 2.x the elements may print as np.int64(5), ...)
# {5, 10, 15, 20, 25, 30}
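
To make the difference between the two results concrete, here is a minimal sketch comparing the array returned by Series.unique() with the set built from it. The use of tolist() to get plain Python integers is an optional extra, not something the approach above requires.

```python
import numpy as np
import pandas as pd

ser = pd.Series([20, 25, 15, 10, 5, 20, 30])

uniques = ser.unique()     # NumPy array, values in order of first appearance
as_set = set(uniques)      # Python set, no guaranteed order

print(isinstance(uniques, np.ndarray))  # True
print(uniques.tolist())                 # [20, 25, 15, 10, 5, 30]

# tolist() converts NumPy scalars to plain Python ints before building the set
plain_set = set(uniques.tolist())
print(sorted(plain_set))                # [5, 10, 15, 20, 25, 30]

# Both sets compare equal because equal integers hash the same
print(as_set == plain_set)              # True
```

If the order of first appearance matters, keep the array from unique() rather than converting it to a set.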

The set() function also removes duplicates on its own, so we can pass the Series to set() directly to get its unique values. For example:


# Using set() function to create a set
ser2 = set(ser)
print(ser2)

Yields below output.


# Output
{5, 10, 15, 20, 25, 30}

Complete Example


import pandas as pd

# Create the Series
ser = pd.Series([20,25,15,10,5,20,30])
print(ser)

# use series.unique() function
ser2 = ser.unique()
print(ser2)

# Using Series.unique() and set() together
ser2 = set(ser.unique())
print(ser2)

# Using set() function to create a set
ser2 = set(ser)
print(ser2)

Frequently Asked Questions on Creating a Set from a Series

What is the purpose of creating a set from a Pandas Series?

Creating a set from a Pandas Series is useful for obtaining a unique collection of elements. Sets automatically eliminate duplicates, allowing for efficient identification of unique values within the Series.

How are duplicates handled when creating a set from a Series?

Sets store only unique elements, automatically discarding duplicates. This ensures that the resulting set contains distinct values from the original Series.
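
A quick way to see this is to compare sizes before and after the conversion. The sketch below uses the same sample values as the article. Series.nunique() and Series.duplicated() are standard pandas methods, shown here only as a cross-check.

```python
import pandas as pd

ser = pd.Series([20, 25, 15, 10, 5, 20, 30])
unique_values = set(ser)

print(len(ser))            # 7 elements in the Series
print(len(unique_values))  # 6 distinct values in the set
print(ser.nunique())       # 6, pandas' own count of distinct values

# Values that occur more than once in the Series
repeated = set(ser[ser.duplicated()])
print(repeated)            # {20}
```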

Can sets created from a Pandas Series be used for mathematical set operations?

Sets support common set operations such as union, intersection, and difference. This allows for easy analysis and manipulation of data based on set principles.
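
As an illustration, the sketch below builds sets from two Series and applies the three operations. The second Series, ser_b, is made up for this example.

```python
import pandas as pd

ser_a = pd.Series([20, 25, 15, 10, 5, 20, 30])
ser_b = pd.Series([10, 20, 40, 50, 40])  # hypothetical second Series

set_a = set(ser_a)
set_b = set(ser_b)

common = set_a & set_b      # intersection: values in both
combined = set_a | set_b    # union: values in either
only_in_a = set_a - set_b   # difference: values in ser_a but not in ser_b

print(sorted(common))       # [10, 20]
print(sorted(combined))     # [5, 10, 15, 20, 25, 30, 40, 50]
print(sorted(only_in_a))    # [5, 15, 25, 30]
```

The results are sorted before printing only to get a stable display order.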

What happens if the original Series contains a mix of data types?

The conversion to a set is still possible as long as every element is hashable, for example numbers, strings, or tuples. Unhashable elements such as lists or dictionaries cause set() to raise a TypeError. The resulting set holds each distinct value once and does not preserve the original order.
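
The following sketch shows a mixed-type Series converting cleanly and a Series of lists failing. The sample values are invented for illustration, and converting lists to tuples is one possible workaround, not the only one.

```python
import pandas as pd

# Mixed hashable types convert without problems
mixed = pd.Series([1, "a", 2.5, "a", 1])
mixed_set = set(mixed)
print(len(mixed_set))   # 3 distinct values: 1, "a", 2.5

# Lists are unhashable, so set() raises TypeError
nested = pd.Series([[1, 2], [3, 4], [1, 2]])
try:
    set(nested)
    raised = False
except TypeError as exc:
    raised = True
    print(f"TypeError: {exc}")

# One possible workaround: convert each list to a hashable tuple first
tuple_set = set(nested.map(tuple))
print(sorted(tuple_set))  # [(1, 2), (3, 4)]
```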

How does the process of converting a Series to a set benefit data analysis?

The conversion provides a quick way to identify the distinct values in a Series that contains many duplicates, and membership checks against a set are fast on average. This simplifies tasks such as filtering and analysis based on distinct elements.
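
As one example of filtering based on distinct elements, the sketch below uses Series.isin() with a set of allowed values. The allowed set, including the value 99, is hypothetical.

```python
import pandas as pd

ser = pd.Series([20, 25, 15, 10, 5, 20, 30])
allowed = {10, 20, 99}  # hypothetical values of interest

# Keep only the rows whose value is in the allowed set
mask = ser.isin(allowed)
filtered = ser[mask]
print(filtered.tolist())   # [20, 10, 20]

# Allowed values that never occur in the Series
missing = allowed - set(ser)
print(missing)             # {99}
```

Note that isin() keeps duplicates in the filtered Series, while the set difference reports each missing value once.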

Can sets created from a Series be modified?

Sets are mutable, allowing for the addition or removal of elements. This can be useful for dynamically updating the set based on changing requirements.
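
A minimal sketch of modifying such a set follows. The added and removed values are arbitrary, and frozenset is mentioned only as the built-in immutable alternative.

```python
import pandas as pd

ser = pd.Series([20, 25, 15, 10, 5, 20, 30])
values = set(ser)

values.add(35)        # add a new element
values.remove(5)      # remove an element; raises KeyError if it is absent
values.discard(999)   # remove if present; no error when it is absent

print(sorted(values))  # [10, 15, 20, 25, 30, 35]

# The original Series is not affected by changes to the set
print(ser.tolist())    # [20, 25, 15, 10, 5, 20, 30]

# frozenset gives an immutable set when the values should not change
frozen = frozenset(ser)
```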

Conclusion

In this article, I have explained how to create a set from a pandas Series using the set() and Series.unique() functions, with examples.

Happy Learning !!


Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive, and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.