Polars DataFrame height – Explained by Examples

Last modified: April 3, 2025

In Polars, the height attribute of a DataFrame returns the number of rows it contains. It is a quick way to check the size of a DataFrame in terms of rows.


In this article, I will explain the Polars DataFrame height property, including its syntax, return value, and usage. Through detailed examples, I will show how it returns an integer representing the total number of rows in a DataFrame.

Key Points –

  • The height property returns the total number of rows in a Polars DataFrame.
  • It can be used to determine if a DataFrame is empty by checking if height == 0.
  • It is accessed as df.height without parentheses, because it is a property, not a method.
  • height is a fast, constant-time lookup that does not scan the DataFrame.
  • The height value counts every row, including rows that contain missing (null) values.
  • Unlike shape, which returns both rows and columns, height gives only the row count.
  • The height property can be checked after applying filter() or drop_nulls() to verify how many rows remain.

Syntax of Polars DataFrame height Property

The following is the syntax of the DataFrame height attribute.


# Syntax of height
property DataFrame.height: int

Return Value

The height property returns an integer representing the number of rows in the DataFrame.

Usage of Polars DataFrame height Attribute

The height property in Polars is used to get the number of rows in a DataFrame. It is an efficient and direct way to check the size of your dataset.

Now, let’s create a Polars DataFrame.


import polars as pl

technologies= {
    'Courses':["Spark","PySpark","Hadoop","Python","Pandas","PySpark","Java"],
    'Fee' :[22000, 25000, 23000, 24000, 26000, 30000, 35000],
    'Discount':[1000, 2300, 1000, 1200, 2500, 2000, 2200],
    'Duration':['35days','40days','65days','50days','60days','30days','45days']
          }

df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)

This creates a DataFrame with 7 rows and 4 columns: Courses, Fee, Discount, and Duration.

You can use the height property to get the number of rows in a Polars DataFrame.


# Get the number of rows
print("Number of rows in the DataFrame:", df.height)

Here,

  • The height attribute returns the total number of rows in the DataFrame.
  • In this case, it returns 7 because there are 7 rows in the DataFrame.

Alternatively, you can use the shape property, which returns both rows and columns as a tuple.


# shape is a (rows, columns) tuple; index 0 is the row count
num_rows = df.shape[0]
print("Number of rows:", num_rows)

# Output:
# Number of rows: 7

Conditional Execution Based on Row Count

The height attribute in Polars enables you to execute code conditionally based on the number of rows in a DataFrame. This is useful for handling large datasets, enforcing business rules, or performing validations by applying different logic depending on the DataFrame’s size.


# Conditional execution based on row count
if df.height == 0:
    print("The DataFrame is empty.")
elif df.height < 5:
    print("The DataFrame has a small number of rows:", df.height)
else:
    print("The DataFrame has a large number of rows:", df.height)
    
# Output:
# The DataFrame has a large number of rows: 7

Filtering Data and Checking New Row Count

You can filter a Polars DataFrame based on conditions and then use the height attribute to check the number of rows in the filtered DataFrame. This helps determine how many rows remain after applying the filter.


# Filtering: Select courses with Fee greater than 25,000
filtered_df = df.filter(pl.col("Fee") > 25000)

# Checking the number of rows after filtering
print("Number of rows before filtering:", df.height)
print("Number of rows after filtering:", filtered_df.height)

# Output:
# Number of rows before filtering: 7
# Number of rows after filtering: 3

Here,

  • df.height gives the number of rows in the original DataFrame, before filtering.
  • filtered_df.height gives the number of rows that remain after the filter is applied.
  • The original DataFrame has 7 rows, but after filtering for courses with a fee above 25,000, only 3 rows remain.

Checking If a DataFrame is Empty

You can determine if a DataFrame is empty (i.e., has zero rows) using the df.height property. If df.height returns 0, it indicates that the DataFrame contains no rows.


import polars as pl

# Creating an empty Polars DataFrame
df = pl.DataFrame({})

# Check if the DataFrame is empty
if df.height == 0:
    print("The DataFrame is empty!")
else:
    print("The DataFrame has data.")

# Output:
# The DataFrame is empty!

Here,

  • Use df.height == 0 for checking emptiness.

Comparing Two DataFrames by Height

We can compare the number of rows (height) of two DataFrames using the df.height property. This is useful when analyzing datasets of different sizes.


import polars as pl

# Creating two Polars DataFrames
df1 = pl.DataFrame({
    "A": [1, 3, 5]
})

df2 = pl.DataFrame({
    "A": [2, 4, 6, 8, 10]
})

# Comparing DataFrame heights
if df1.height < df2.height:
    print("df2 has more rows than df1")
else:
    print("df1 has at least as many rows as df2")

# Output:
# df2 has more rows than df1

Here,

  • Since df1.height is 3 and df2.height is 5, the condition df1.height < df2.height evaluates to True.

Polars DataFrame height Attribute with Missing Values

The height attribute returns the number of rows in a DataFrame, regardless of missing values (nulls). Even if some values are missing, the height remains the total row count.


import polars as pl

technologies= {
    'Courses':["Spark","PySpark",None,"Python","Pandas","PySpark","Java"],
    'Fee' :[22000, 25000, None, 24000, 26000, 30000, None],
    'Discount':[1000, 2300, 1000, 1200, None, 2000, 2200],
    'Duration':['35days',None,'65days','50days','60days',None,'45days']
          }

df = pl.DataFrame(technologies)

# Get the number of rows
print("Total number of rows:", df.height)

# Output:
# Total number of rows: 7

Counting Only Non-Missing Rows in a Column

You can count the number of non-missing (non-null) rows in a specific column using drop_nulls() or filter(). This is useful for checking data completeness before analysis. The example below applies drop_nulls() to a single column and then checks the height.


# Count non-null rows in the "Courses" column
non_null_rows = df.drop_nulls(subset=["Courses"]).height
print("Rows where 'Courses' is not null:", non_null_rows)

# Output:
# Rows where 'Courses' is not null: 6

Filtering Out Rows with Missing Values

You can remove rows with missing (null) values using drop_nulls(). This is useful when you need a clean dataset for analysis.


# Remove rows with any missing values
df_cleaned = df.drop_nulls()

# Print cleaned DataFrame
print(df_cleaned)
print("Rows after dropping missing values:", df_cleaned.height)

# Output:
# shape: (2, 4)
# ┌─────────┬───────┬──────────┬──────────┐
# │ Courses ┆ Fee   ┆ Discount ┆ Duration │
# │ ---     ┆ ---   ┆ ---      ┆ ---      │
# │ str     ┆ i64   ┆ i64      ┆ str      │
# ╞═════════╪═══════╪══════════╪══════════╡
# │ Spark   ┆ 22000 ┆ 1000     ┆ 35days   │
# │ Python  ┆ 24000 ┆ 1200     ┆ 50days   │
# └─────────┴───────┴──────────┴──────────┘
# Rows after dropping missing values: 2

Conclusion

In conclusion, the height attribute in Polars provides a quick and efficient way to determine the number of rows in a DataFrame. It is useful for conditionally executing code, checking if a DataFrame is empty, and analyzing the size of filtered data.

Happy Learning!!
