  • Post category:Pandas
  • Post last modified:July 24, 2024
Pandas DataFrame nunique() Method

In pandas, the nunique() method is used to count the number of unique values along a specified axis of a DataFrame. This method is useful for data exploration and understanding the distribution of unique values within your dataset.

In this article, I will explain the Pandas DataFrame nunique() method by using its syntax, parameters, and usage, and how to return a series with the count of unique values for each column or row.

Key Points –

  • Counts the number of unique values in each column or row of a DataFrame.
  • Accepts axis to choose the direction (0 or ‘index’ counts unique values in each column; 1 or ‘columns’ counts them in each row) and dropna to exclude or include NA/null values.
  • Commonly used for data exploration to understand the uniqueness and distribution of data.
  • Can handle missing values by either including or excluding them in the count, depending on the dropna parameter.

Pandas DataFrame nunique() Introduction

Let’s look at the syntax of the nunique() method.


# Syntax of Pandas DataFrame nunique()
DataFrame.nunique(axis=0, dropna=True)

Parameters of the DataFrame nunique()

Following are the parameters of the DataFrame nunique() function.

  • axis – {0 or ‘index’, 1 or ‘columns’}, default 0
    • 0 or 'index': Count unique values for each column.
    • 1 or 'columns': Count unique values for each row.
  • dropna – bool, default True
    • If True, NaN values are excluded from the count.
    • If False, NaN values are included in the count.

Return Value

It returns a Series with the count of unique values for each column or row, depending on the axis specified.
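As a quick illustration of the return value, here is a minimal sketch. The `sample` DataFrame is made up for this example and is not part of the article's data. It checks that the result is a Series labeled by column names for axis=0 and by row labels for axis=1.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
sample = pd.DataFrame({"x": [1, 1, 2], "y": ["a", "b", "c"]})

per_column = sample.nunique()      # axis=0: one entry per column
per_row = sample.nunique(axis=1)   # axis=1: one entry per row

print(type(per_column).__name__)   # Series
print(list(per_column.index))      # ['x', 'y']
print(list(per_row.index))         # [0, 1, 2]
print(per_column["x"])             # 2
```

Because the result is an ordinary Series, you can index, filter, or sort it like any other Series.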

Usage of Pandas DataFrame nunique() Method

The nunique() method in pandas is used to count the number of unique values along an axis of a DataFrame or in a Series.
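For comparison, calling nunique() on a Series returns a single integer rather than a Series. The sketch below uses a small made-up Series `s` for illustration.

```python
import pandas as pd

# Hypothetical Series, used only for illustration
s = pd.Series([5, 10, 5, 15])

# Series.nunique() returns one number: the count of distinct values
n_unique = s.nunique()
print(n_unique)  # 3
```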

To run some examples of the Pandas DataFrame nunique() method, let’s create a Pandas DataFrame using data from a Python dictionary.


# Create DataFrame 
import pandas as pd
import numpy as np

df = pd.DataFrame({'A':[5, 10, 5, 15], 
                   'B': [4, 6, 4, 2], 
                   'C': [3, 7, 5, 9]})
print("Original DataFrame:\n",df)

# Output:
# Original DataFrame:
#      A  B  C
# 0   5  4  3
# 1  10  6  7
# 2   5  4  5
# 3  15  2  9

To count the unique values for each column in the DataFrame, you can use the nunique() method.


# Counting unique values for each column
df2 = df.nunique()
print("Unique values for each column:\n", df2)

# Equivalent call with the default axis made explicit
df2 = df.nunique(axis=0)
print("Unique values for each column:\n", df2)

# Output (identical for both calls):
# Unique values for each column:
#  A    3
# B    3
# C    4
# dtype: int64

Counting Unique Values for Each Row

Alternatively, to count the unique values for each row in the DataFrame, you can use the nunique() method with axis=1.


# Counting unique values for each row
df2 = df.nunique(axis=1)
print("Unique values for each row:\n", df2)

# Output:
# Unique values for each row:
#  0    3
# 1    3
# 2    2
# 3    3
# dtype: int64
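One possible use of the row-wise count, sketched here as an illustration rather than a recommended recipe, is to find rows that contain a repeated value: a row has a repeat whenever its unique count is lower than the number of columns. The names `row_counts` and `rows_with_repeats` are chosen for this example.

```python
import pandas as pd

df = pd.DataFrame({'A': [5, 10, 5, 15],
                   'B': [4, 6, 4, 2],
                   'C': [3, 7, 5, 9]})

# Unique count per row
row_counts = df.nunique(axis=1)

# Keep rows whose unique count is below the number of columns
rows_with_repeats = df[row_counts < df.shape[1]]
print(list(rows_with_repeats.index))  # [2]
```

Only row 2 is selected, because its values (5, 4, 5) contain the value 5 twice.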

Including NaN Values in the Count

To include NaN values in the count of unique values for each column or row in a DataFrame, you can set the dropna parameter of the nunique() method to False.

Counting Unique Values for Each Column Including NaN

To count the unique values for each column in a DataFrame, including NaN values, you can use the nunique() method with the dropna=False parameter.


import pandas as pd
import numpy as np

# Creating the DataFrame with NaN values
df = pd.DataFrame({
    'A': [5, 10, 5, np.nan],
    'B': [4, 6, 4, 2],
    'C': [3, 7, np.nan, 9]
})

# Counting unique values for each column, including NaN values
df2 = df.nunique(axis=0, dropna=False)
print("Unique values for each column (including NaN):\n", df2)

# Output:
# Unique values for each column (including NaN):
#  A    3
# B    3
# C    4
# dtype: int64
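To see the effect of dropna more directly, the following sketch places both settings side by side for the same DataFrame. The `comparison` DataFrame and its column labels are names chosen for this illustration.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [5, 10, 5, np.nan],
    'B': [4, 6, 4, 2],
    'C': [3, 7, np.nan, 9]
})

# One column per dropna setting, indexed by the original column names
comparison = pd.DataFrame({
    'dropna=True': df.nunique(),
    'dropna=False': df.nunique(dropna=False),
})
print(comparison['dropna=True'].tolist())   # [2, 3, 3]
print(comparison['dropna=False'].tolist())  # [3, 3, 4]
```

Columns A and C each gain one in the dropna=False count because NaN is counted as one additional value, while column B, which has no NaN, is unchanged.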

Counting Unique Values for Each Row Including NaN

To count the unique values for each row in a DataFrame, including NaN values, you can use the nunique() method with axis=1 and dropna=False.


# Counting unique values for each row, including NaN values
df2 = df.nunique(axis=1, dropna=False)
print("Unique values for each row (including NaN):\n", df2)

# Output:
# Unique values for each row (including NaN):
#  0    3
# 1    3
# 2    3
# 3    3
# dtype: int64
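As a small sketch of how the two dropna settings relate for this numeric DataFrame, subtracting the default count from the dropna=False count gives 1 for rows that contain NaN and 0 otherwise. The variable name `extra` is chosen for this example, and `df.isna().any(axis=1)` is used only as a cross-check.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [5, 10, 5, np.nan],
    'B': [4, 6, 4, 2],
    'C': [3, 7, np.nan, 9]
})

# Difference between counting with and without NaN, per row
extra = df.nunique(axis=1, dropna=False) - df.nunique(axis=1)
print(extra.tolist())                  # [0, 0, 1, 1]

# Cross-check against a direct NaN test per row
print(df.isna().any(axis=1).tolist())  # [False, False, True, True]
```

In practice, `isna()` is the more direct way to locate missing values; the subtraction is shown only to make the dropna behavior concrete.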

Frequently Asked Questions on Pandas DataFrame nunique() Method

What does the nunique() method do in Pandas?

The nunique() method in Pandas is used to count the number of unique values along a specified axis (rows or columns) of a DataFrame.

How do you use nunique() to count unique values in each column?

You can use df.nunique(axis=0) to count unique values in each column. Since axis=0 is the default, df.nunique() gives the same result.

Can nunique() count unique values in each row?

Yes. By using df.nunique(axis=1), you can count unique values in each row of the DataFrame.

Does the nunique() method include NaN values by default?

By default, the nunique() method excludes NaN values (dropna=True). You can include NaN values in the count by setting dropna=False.

When should I use the nunique() method?

Use nunique() when you need to quickly understand the diversity and distribution of unique values within your dataset, which is useful for data exploration and initial data analysis tasks.
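As one example of this kind of exploration, the sketch below uses nunique() to spot columns that hold a single value and columns where every row is distinct. The `data` DataFrame and its column names are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical dataset, used only for illustration
data = pd.DataFrame({
    "country": ["US", "US", "US", "US"],
    "city": ["NYC", "LA", "NYC", "SF"],
    "order_id": [101, 102, 103, 104],
})

counts = data.nunique()

# Columns with a single distinct value carry no variation
constant_cols = counts[counts == 1].index.tolist()

# Columns where every row is distinct often behave like identifiers
id_like_cols = counts[counts == len(data)].index.tolist()

print(constant_cols)  # ['country']
print(id_like_cols)   # ['order_id']
```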

Conclusion

In this article, I have explained the Pandas DataFrame nunique() function by using its syntax, parameters, usage, and how to return a Pandas Series object with the count of unique values for each column or row, depending on the specified axis.

Happy Learning!!
