  • Post category:Pandas
  • Post last modified:March 27, 2024
Pandas Series Tutorial with Examples

1. Pandas Series Introduction

This is a beginner’s guide to the Python pandas Series. You will learn what a pandas Series is, its features and advantages, and how to use it, with sample examples.

Every sample example explained in this tutorial is tested in our development environment and is available for reference.

All pandas Series examples provided in this tutorial are basic, simple, and easy to practice for beginners who are enthusiastic to learn pandas and advance their career in Data Science, analytics, and Machine Learning.

Note: If you can’t find the pandas Series example you are looking for on this page, I recommend using the Search option in the menu bar. There are hundreds of pandas tutorials with sample code on this website that you can learn from.

2. What is Pandas Series

A pandas Series is a one-dimensional labeled array that can store values of any data type (integer, string, float, Python objects, etc.). The row labels of a Series are called the index, and a Series can have only one column of values. A list, tuple, NumPy array, or dictionary can easily be converted into a Series using the Series() constructor.
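As a quick illustration of the conversions mentioned above, here is a minimal sketch. The variable names and sample values are made up for this example and are not part of the original tutorial.

```python
import pandas as pd

# Hypothetical sample inputs, used only for illustration
from_list = pd.Series([1, 2, 3])
from_tuple = pd.Series(("a", "b", "c"))
from_dict = pd.Series({"x": 1.5, "y": 2.5})

print(from_tuple)

# List and tuple input get default integer labels
print(from_list.index.tolist())   # [0, 1, 2]

# Dict keys become the index labels
print(from_dict.index.tolist())   # ['x', 'y']
```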

3. Pandas Series vs DataFrame

  • As I explained above, a pandas Series is a one-dimensional labeled array that holds a single data type, whereas a DataFrame is a two-dimensional labeled data structure whose columns can have different types.
  • In a DataFrame, each column of data is represented as a pandas Series.
  • A DataFrame column always has a name/label. A Series is printed without a column header, although it can carry an optional name (its name attribute), which becomes the column label when the Series is placed in a DataFrame.
  • A DataFrame can be converted to a Series, and one or more Series can be combined into a DataFrame. Refer to the pandas DataFrame Tutorial for more details and examples on DataFrame.
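The relationship between the two structures can be checked with a short sketch like the one below. The DataFrame contents are hypothetical sample data.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Selecting one column returns a Series named after that column
fee = df["Fee"]
print(type(fee))
print(fee.name)            # Fee

# A Series has one dimension, a DataFrame has two
print(fee.ndim, df.ndim)   # 1 2
```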

4. pandas.Series() Constructor

Below is the syntax of pandas Series Constructor, which is used to create Pandas Series objects.


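# Pandas Series Constructor Syntax
pandas.Series(data, index, dtype, copy)
  • data: The values to store. It can be an ndarray, list, dict, or a scalar constant.
  • index: The row labels. Values must be hashable and the index must have the same length as data; duplicate labels are allowed. If no index is passed, a default integer index 0, 1, ..., n-1 is used.
  • dtype: The data type of the resulting Series. If not specified, it is inferred from data.
  • copy: Whether to copy the input data. It only affects Series or one-dimensional ndarray input.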
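The following sketch passes all four parameters at once so you can see how they interact. The values and labels are arbitrary and chosen only for this example.

```python
import numpy as np
import pandas as pd

values = np.array([1, 2, 3])

# data + custom index + explicit dtype; copy=True asks pandas to copy the input
s = pd.Series(data=values, index=["a", "b", "c"], dtype="float64", copy=True)
print(s)

# Without an index, pandas assigns the default labels 0..n-1
default = pd.Series(values)
print(default.index.tolist())  # [0, 1, 2]
```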

5. Create pandas Series

A pandas Series can be created in multiple ways: from an array, a list, a dict, or an existing DataFrame.

5.1 Create Series using array

To create a Series from an array, first import the NumPy module and build the array with its array() function. If the data is an ndarray, any index you pass must have the same length as the data. If no index is passed, the default is range(n).


# Create Series from array
import pandas as pd 
import numpy as np
data = np.array(['python','php','java'])
series = pd.Series(data)
print (series)

Yields below output. Notice that the values column doesn’t have a header. By default, the Series also adds an incremental sequence number as the index (the first column).


# Output:
0    python
1       php
2      java
dtype: object

Now, let’s see how to create a pandas Series with a custom index. To do so, use the index param, which takes a list of index values. Make sure the index list matches the data size.


# Create pandas Series with custom index
s2 = pd.Series(data=data, index=['r1', 'r2', 'r3'])
print(s2)

Yields below output.


# Output:
r1    python
r2       php
r3      java
dtype: object
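To see why the index list has to match the data size, here is a minimal sketch that deliberately passes too few labels. The two-label index is an intentional mistake made for illustration.

```python
import numpy as np
import pandas as pd

data = np.array(['python', 'php', 'java'])

error_message = None
try:
    # Three values but only two labels: pandas refuses to build the Series
    pd.Series(data, index=['r1', 'r2'])
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)
```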

5.2 Create Series using Dict

A dict can also be used as input. The dict keys become the index and the dict values become the Series values.


# Create a Series from a dict
data = {'Courses' :"pandas", 'Fees' : 20000, 'Duration' : "30days"}
s2 = pd.Series(data)
print (s2)

Yields below output.


# Output:
Courses     pandas
Fees         20000
Duration    30days
dtype: object

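Now let’s see what happens when you pass your own index while creating a Series from a dict. pandas looks up each index label among the dict keys, so labels that are not keys in the dict get NaN as their value.


# Pass a custom index while creating a Series from a dict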
data = {'Courses' :"pandas", 'Fees' : 20000, 'Duration' : "30days"}
s2 = pd.Series(data, index=['Courses','Course_Fee','Course_Duration'])
print (s2)
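The example above does not show its output, so here is a small self-contained sketch of the same call that checks which labels end up with missing values.

```python
import pandas as pd

data = {'Courses': "pandas", 'Fees': 20000, 'Duration': "30days"}
s2 = pd.Series(data, index=['Courses', 'Course_Fee', 'Course_Duration'])

# Only 'Courses' is a key in the dict; the other two labels have no match
print(s2)
print(s2.isna().tolist())  # [False, True, True]
```

If you want different labels and still keep the values, one option is to build the Series from the dict values, for example pd.Series(list(data.values()), index=[...]).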

5.3 Create Series using List

Below is an example of creating a Series from a list.


# Creating Series from List
data = ['python','php','java']
s2 = pd.Series(data, index=['r1', 'r2','r3'])
print(s2)

Yields below output.


# Output:
r1    python
r2       php
r3      java
dtype: object

5.4 Create Empty Series

Sometimes you need to create an empty Series. You can do so by calling the constructor without any data.


# Create empty Series
import pandas as pd
s = pd.Series()
print(s)

This prints an empty Series, that is, one with no values and no index labels.
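The default dtype of an empty Series has differed between pandas versions, so a cautious variant is to pass dtype explicitly. The choice of float64 below is just an assumption for this sketch.

```python
import pandas as pd

# Passing dtype explicitly keeps the result the same across pandas versions
s = pd.Series(dtype="float64")

print(s.empty)   # True
print(len(s))    # 0
print(s.dtype)   # float64
```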

6. Convert a Series into a DataFrame

To combine Series into a DataFrame, you can use pandas.concat(), pandas.merge(), or DataFrame.join(). Below I explain the concat() function. For the others, please refer to the tutorial on combining two pandas Series into a DataFrame.


# Convert series to dataframe
courses = pd.Series(["Spark","PySpark","Hadoop"], name='courses')
fees = pd.Series([22000,25000,23000], name='fees')
df=pd.concat([courses,fees],axis=1)
print(df)

Yields below output.


# Output:
   courses   fees
0    Spark  22000
1  PySpark  25000
2   Hadoop  23000
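When there is only one Series to convert, Series.to_frame() is another option. It is not covered in the text above, so treat this as a supplementary sketch rather than the tutorial's method.

```python
import pandas as pd

courses = pd.Series(["Spark", "PySpark", "Hadoop"], name='courses')

# The Series name becomes the column label of the new DataFrame
df_single = courses.to_frame()

print(df_single.columns.tolist())  # ['courses']
print(df_single.shape)             # (3, 1)
```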

7. Convert pandas DataFrame to Series

In this section of the pandas Series Tutorial, I will explain different ways to convert a DataFrame to a Series. As I explained in the beginning, every column in a DataFrame is a Series. Hence, we can convert single or multiple columns into Series.

  1. Single DataFrame column into a Series (from a single-column DataFrame)
  2. Specific DataFrame column into a Series (from a multi-column DataFrame)
  3. Single row in the DataFrame into a Series

7.1 Convert a single DataFrame column into a series:

Let’s create a DataFrame with a single column and then use DataFrame.squeeze() to convert it into a Series:


# Create DataFrame with single column
data =  ["Python","PHP","Java"]
df = pd.DataFrame(data, columns = ['Courses'])
my_series = df.squeeze()
print(my_series)
print (type(my_series))

The DataFrame will now get converted into a Series:


# Output:
0    Python
1       PHP
2      Java
Name: Courses, dtype: object
<class 'pandas.core.series.Series'>

7.2 Convert a specific DataFrame column into a series:

If you have a DataFrame with multiple columns, you can convert a specific column into a Series.

For example, suppose we have the following multiple-column DataFrame:


# Create DataFrame with multiple columns
import pandas as pd
data = {'Courses': ['Spark', 'PySpark', 'Python'],
        'Duration':['30 days', '40 days', '50 days'],
        'Fee':[20000, 25000, 26000]
        }
df = pd.DataFrame(data, columns = ['Courses', 'Duration', 'Fee'])
print(df)
print (type(df))

Let’s convert the Fee column into a Series.


# Pandas DataFrame column to series
# df['Fee'] already returns a Series, so squeeze() leaves it unchanged
my_series = df['Fee'].squeeze()
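The snippet above does not print anything, so the sketch below rebuilds the same DataFrame and inspects the result. The fee_frame variable is a name introduced here for illustration. It shows the case where squeeze() does real work: a one-column DataFrame selected with double brackets.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'PySpark', 'Python'],
    'Duration': ['30 days', '40 days', '50 days'],
    'Fee': [20000, 25000, 26000],
})

my_series = df['Fee'].squeeze()
print(my_series)
print(type(my_series))

# Double brackets return a one-column DataFrame, which squeeze() reduces to a Series
fee_frame = df[['Fee']]
print(type(fee_frame))
print(type(fee_frame.squeeze()))
```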

7.3 Convert DataFrame Row into a Series

Above, we have seen how to convert DataFrame columns into Series. Here, I will explain how to convert a row into a Series.


# Convert DataFrame row to series
my_series = df.iloc[2].squeeze()
print(my_series)
print (type(my_series))

Then, we can get the following series:


# Output:
Courses      Python
Duration    50 days
Fee           26000
Name: 2, dtype: object
<class 'pandas.core.series.Series'>

8. Merge DataFrame and Series

  1. Construct a DataFrame from the Series.
  2. After that, merge it with the existing DataFrame.
  3. To build the new DataFrame, repeat the Series values as rows ([s.values] * len(s)) and use the Series index as the column names. Then merge on the row index by setting both left_index and right_index to True.

# Syntax for merge with the DataFrame.
df.merge(pd.DataFrame(data = [s.values] * len(s), columns = s.index), left_index=True, right_index=True)
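Here is a runnable sketch of the steps above. The sample DataFrame and the Series s are hypothetical. Note one deliberate variation: the syntax above repeats the values len(s) times, while this sketch repeats them len(df) times, on the assumption that the goal is to attach the Series values to every row of the DataFrame.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame({'Courses': ['Spark', 'PySpark', 'Python'],
                   'Fee': [20000, 25000, 26000]})

# Hypothetical Series whose labels become the new column names
s = pd.Series([1000, 500], index=['Discount', 'Tax'])

# Repeat the values once per DataFrame row so every row finds a match
repeated = pd.DataFrame(data=[s.values] * len(df), columns=s.index)

# Merge on the row index of both DataFrames
merged = df.merge(repeated, left_index=True, right_index=True)
print(merged)
```

Because merge() performs an inner join by default, repeating the values fewer times than the DataFrame has rows would drop the unmatched rows from the result.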

9. Pandas Series Attributes:

  • T: Return the transpose, which is by definition self.
  • array: The ExtensionArray of the data backing this Series or Index.
  • at: Access a single value for a row/column label pair.
  • attrs: Dictionary of global attributes of this dataset.
  • axes: Return a list of the row axis labels.
  • dtype: Return the dtype object of the underlying data.
  • dtypes: Return the dtype object of the underlying data.
  • flags: Get the properties associated with this pandas object.
  • hasnans: Return True if there are any NaNs; enables various performance speedups.
  • iat: Access a single value for a row/column pair by integer position.
  • iloc: Purely integer-location based indexing for selection by position.
  • index: The index (axis labels) of the Series.
  • is_monotonic: Return boolean if values in the object are monotonically increasing (removed in pandas 2.0; use is_monotonic_increasing instead).
  • is_monotonic_decreasing: Return boolean if values in the object are monotonically decreasing.
  • is_monotonic_increasing: Return boolean if values in the object are monotonically increasing.
  • is_unique: Return boolean if values in the object are unique.
  • loc: Access a group of rows and columns by label(s) or a boolean array.
  • name: Return the name of the Series.
  • nbytes: Return the number of bytes in the underlying data.
  • ndim: Number of dimensions of the underlying data, by definition 1.
  • shape: Return a tuple of the shape of the underlying data.
  • size: Return the number of elements in the underlying data.
  • values: Return Series as ndarray or ndarray-like depending on the dtype.
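A few of these attributes in action, as a minimal sketch. The Series contents and its name are made up for this example.

```python
import pandas as pd

# Hypothetical sample Series, used only for illustration
s = pd.Series([10, 20, 30], index=['a', 'b', 'c'], name='fees')

print(s.index.tolist())          # ['a', 'b', 'c']
print(s.name)                    # fees
print(s.shape, s.size, s.ndim)   # (3,) 3 1
print(s.is_unique)               # True
print(s.hasnans)                 # False

# at/loc select by label, iat/iloc select by integer position
print(s.at['b'], s.iat[2])       # 20 30
print(s.loc['a':'b'].tolist())   # label slice includes both ends: [10, 20]
print(s.iloc[0:2].tolist())      # positional slice excludes the end: [10, 20]
```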

10. Pandas Series Methods:

  • abs(): Return a Series/DataFrame with absolute numeric value of each element.
  • add(other[, level, fill_value, axis]): Return addition of Series and other, element-wise (binary operator add).
  • add_prefix(prefix): Prefix labels with string prefix.
  • add_suffix(suffix): Suffix labels with string suffix.
  • agg([func, axis]): Aggregate using one or more operations over the specified axis.
  • aggregate([func, axis]): Aggregate using one or more operations over the specified axis.
  • align(other[, join, axis, level, copy, …]): Align two objects on their axes with the specified join method.
  • all([axis, bool_only, skipna, level]): Return whether all elements are True, potentially over an axis.
  • any([axis, bool_only, skipna, level]): Return whether any element is True, potentially over an axis.
  • append(to_append[, ignore_index, …]): Concatenate two or more Series (removed in pandas 2.0; use pandas.concat() instead).
  • apply(func[, convert_dtype, args]): Invoke function on values of Series.
  • argmax([axis, skipna]): Return int position of the largest value in the Series.
  • argmin([axis, skipna]): Return int position of the smallest value in the Series.
  • argsort([axis, kind, order]): Return the integer indices that would sort the Series values.
  • asfreq(freq[, method, how, normalize, …]): Convert time series to specified frequency.
  • asof(where[, subset]): Return the last row(s) without any NaNs before where.
  • astype(dtype[, copy, errors]): Cast a pandas object to a specified dtype.
  • at_time(time[, asof, axis]): Select values at particular time of day (e.g., 9:30AM).
  • autocorr([lag]): Compute the lag-N autocorrelation.
  • backfill([axis, inplace, limit, downcast]): Synonym for DataFrame.fillna() with method='bfill'.
  • between(left, right[, inclusive]): Return boolean Series equivalent to left <= series <= right.
  • between_time(start_time, end_time[, …]): Select values between particular times of the day (e.g., 9:00-9:30 AM).
  • bfill([axis, inplace, limit, downcast]): Synonym for DataFrame.fillna() with method='bfill'.
  • bool(): Return the bool of a single element Series or DataFrame.
  • cat: Alias of pandas.core.arrays.categorical.CategoricalAccessor.
  • clip([lower, upper, axis, inplace]): Trim values at input threshold(s).
  • combine(other, func[, fill_value]): Combine the Series with a Series or scalar according to func.
  • combine_first(other): Update null elements with value in the same location in ‘other’.
  • compare(other[, align_axis, keep_shape, …]): Compare to another Series and show the differences.
  • convert_dtypes([infer_objects, …]): Convert columns to best possible dtypes using dtypes supporting pd.NA.
  • copy([deep]): Make a copy of this object’s indices and data.
  • corr(other[, method, min_periods]): Compute correlation with other Series, excluding missing values.
  • count([level]): Return number of non-NA/null observations in the Series.
  • cov(other[, min_periods, ddof]): Compute covariance with Series, excluding missing values.
  • cummax([axis, skipna]): Return cumulative maximum over a DataFrame or Series axis.
  • cummin([axis, skipna]): Return cumulative minimum over a DataFrame or Series axis.
  • cumprod([axis, skipna]): Return cumulative product over a DataFrame or Series axis.
  • cumsum([axis, skipna]): Return cumulative sum over a DataFrame or Series axis.
  • describe([percentiles, include, exclude, …]): Generate descriptive statistics.
  • diff([periods]): First discrete difference of element.
  • div(other[, level, fill_value, axis]): Return floating division of Series and other, element-wise (binary operator truediv).
  • divide(other[, level, fill_value, axis]): Return floating division of Series and other, element-wise (binary operator truediv).
  • divmod(other[, level, fill_value, axis]): Return integer division and modulo of Series and other, element-wise (binary operator divmod).
  • dot(other): Compute the dot product between the Series and the columns of other.
  • drop([labels, axis, index, columns, level, …]): Return Series with specified index labels removed.
  • drop_duplicates([keep, inplace]): Return Series with duplicate values removed.
  • droplevel(level[, axis]): Return Series/DataFrame with requested index / column level(s) removed.
  • dropna([axis, inplace, how]): Return a new Series with missing values removed.
  • dt: Alias of pandas.core.indexes.accessors.CombinedDatetimelikeProperties.
  • duplicated([keep]): Indicate duplicate Series values.
  • eq(other[, level, fill_value, axis]): Return equal to of Series and other, element-wise (binary operator eq).
  • equals(other): Test whether two objects contain the same elements.
  • ewm([com, span, halflife, alpha, …]): Provide exponential weighted (EW) functions.
  • expanding([min_periods, center, axis, method]): Provide expanding transformations.
  • explode([ignore_index]): Transform each element of a list-like to a row.
  • factorize([sort, na_sentinel]): Encode the object as an enumerated type or categorical variable.
  • ffill([axis, inplace, limit, downcast]): Synonym for DataFrame.fillna() with method='ffill'.
  • fillna([value, method, axis, inplace, …]): Fill NA/NaN values using the specified method.
  • filter([items, like, regex, axis]): Subset the dataframe rows or columns according to the specified index labels.
  • first(offset): Select initial periods of time series data based on a date offset.
  • first_valid_index(): Return index for first non-NA value, or None if no non-NA value is found.
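To get a feel for a handful of these methods, here is a short sketch on a small Series that contains a missing value and a duplicate. The sample values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample Series with one NaN and one repeated value
s = pd.Series([3.0, -1.0, np.nan, 3.0])

print(s.abs().tolist())              # [3.0, 1.0, nan, 3.0]
print(s.count())                     # 3 (NaN is not counted)
print(s.fillna(0).tolist())          # [3.0, -1.0, 0.0, 3.0]
print(s.dropna().tolist())           # [3.0, -1.0, 3.0]
print(s.drop_duplicates().tolist())  # [3.0, -1.0, nan]
print(s.between(0, 5).tolist())      # [True, False, False, True]
print(s.cumsum().tolist())           # [3.0, 2.0, nan, 5.0]
print(s.first_valid_index())         # 0
```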

Conclusion

In this pandas Series tutorial, we have learned what a pandas Series is, how to create a pandas Series from different types of inputs, and how to convert a pandas Series to a DataFrame and vice versa, with working examples.

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen’s journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.