Polars DataFrame clear() Usage & Examples

  • Post category: Polars
  • Post last modified: March 27, 2025

The polars.DataFrame.clear() method returns a copy of a Polars DataFrame with all rows removed while preserving the column names and data types. This gives you an empty DataFrame with the same schema as the original.


In this article, I will explain the Polars DataFrame clear() function, covering its syntax, parameters, and usage. I will also demonstrate how it creates a new DataFrame that retains the same column structure and data types while containing either zero rows (by default) or n null-filled rows.

Key Points –

  • Returns a copy of a Polars DataFrame with all rows removed while preserving the column structure and data types.
  • Returns a new DataFrame instead of modifying the existing one (Polars follows an immutable paradigm).
  • Default behavior (clear()) produces zero rows, but the column names and data types remain intact.
  • Accepts an optional parameter n, which specifies the number of null-filled rows in the result.
  • If n=0 (default), the returned DataFrame is completely empty while maintaining its schema.
  • If n>0, the returned DataFrame has n rows in which every value is null; the original data is not retained.
  • Useful for getting an empty DataFrame with the same schema for future data insertion.
  • Convenient, as you do not have to redefine the schema by hand.

Polars DataFrame clear() Introduction

Let’s look at the syntax of the clear() function.


# Syntax of polars clear()
DataFrame.clear(n: int = 0) → DataFrame

Parameters of the Polars DataFrame clear()

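It accepts a single optional parameter.

  • n (int, default = 0): The number of null-filled rows in the returned DataFrame.
    • If n=0 (default), the result has no rows.
    • If n>0, the result has n rows, all filled with null values. n can be larger than the number of rows in the original DataFrame.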

Return Value

This function returns a new DataFrame with either zero rows (default) or n null-filled rows. The column structure and data types remain unchanged, and the original DataFrame is not modified.

Usage of Polars DataFrame clear() Method

The clear() method in Polars removes all rows from a DataFrame while preserving its schema (column names and data types). It returns a new empty DataFrame with the same structure as the original, but containing no data.

First, let’s create a Polars DataFrame.


import polars as pl

technologies= {
    'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
    'Fees' :[22000,25000,23000,24000,26000],
    'Duration':['30days','50days','35days', '40days','55days'],
    'Discount':[1000,2300,1000,1200,2500]
          }
df = pl.DataFrame(technologies)
print("Original DataFrame:\n", df)

This creates a DataFrame with 5 rows and 4 columns: Courses (str), Fees (i64), Duration (str), and Discount (i64).

Call clear() with no arguments to get an empty copy of the DataFrame that keeps its column structure and data types. The result is assigned to a new variable because the original DataFrame is left unchanged.


# Remove all rows
df2 = df.clear()
print("DataFrame after clear():\n", df2)

Here,

  • clear() removes all rows while keeping the column structure intact.
  • The shape becomes (0, N), where N is the number of columns.
  • Schema remains unchanged, so the DataFrame can accept new data.
  • Returns a new immutable DataFrame, leaving the original one unchanged.

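Returning 2 Null-Filled Rows (n=2)

Calling clear(n) with n greater than zero does not keep the first n rows of data. Instead, it returns a DataFrame with n rows in which every value is null, using the same schema as the original. The output below shows this: both rows contain only nulls.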


# Return 2 null-filled rows with the same schema
df2 = df.clear(n=2)
print("DataFrame after clear(n=2):\n", df2)

# Output:
# DataFrame after clear(n=2):
# shape: (2, 4)
┌─────────┬──────┬──────────┬──────────┐
│ Courses ┆ Fees ┆ Duration ┆ Discount │
│ ---     ┆ ---  ┆ ---      ┆ ---      │
│ str     ┆ i64  ┆ str      ┆ i64      │
╞═════════╪══════╪══════════╪══════════╡
│ null    ┆ null ┆ null     ┆ null     │
│ null    ┆ null ┆ null     ┆ null     │
└─────────┴──────┴──────────┴──────────┘

Here,

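  • clear(n) returns n null-filled rows; none of the original values are kept.
  • The schema (column structure) remains unchanged.
  • n can exceed the total number of rows in the original DataFrame; the result still has exactly n null rows.
  • If n=0, the returned DataFrame is completely empty (no rows).
  • To keep the first n rows of actual data, use head(n) instead.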

Using clear() with Conditional Operations

The clear(n) method does not accept a condition. It always discards the existing data and returns either an empty DataFrame or n null-filled rows.

You can still filter a DataFrame first and then call clear(n) on the result, but the filtered rows are discarded as well and only the schema carries over. If you want to keep the first n rows that match a condition, use filter() followed by head(n) instead.


# Filtering rows where 'Fees' > 24000
filtered_df = df.filter(pl.col('Fees') > 24000)
print("Filtered DataFrame (Fees > 24000):\n", filtered_df)

# Returning 2 null-filled rows with the filtered DataFrame's schema
df2 = filtered_df.clear(n=2)
print("DataFrame after clear(n=2) on filtered data:\n", df2)

# Output:
# Filtered DataFrame (Fees > 24000):
# shape: (2, 4)
┌─────────┬───────┬──────────┬──────────┐
│ Courses ┆ Fees  ┆ Duration ┆ Discount │
│ ---     ┆ ---   ┆ ---      ┆ ---      │
│ str     ┆ i64   ┆ str      ┆ i64      │
╞═════════╪═══════╪══════════╪══════════╡
│ PySpark ┆ 25000 ┆ 50days   ┆ 2300     │
│ Pandas  ┆ 26000 ┆ 55days   ┆ 2500     │
└─────────┴───────┴──────────┴──────────┘
# DataFrame after clear(n=2) on filtered data:
# shape: (2, 4)
┌─────────┬──────┬──────────┬──────────┐
│ Courses ┆ Fees ┆ Duration ┆ Discount │
│ ---     ┆ ---  ┆ ---      ┆ ---      │
│ str     ┆ i64  ┆ str      ┆ i64      │
╞═════════╪══════╪══════════╪══════════╡
│ null    ┆ null ┆ null     ┆ null     │
│ null    ┆ null ┆ null     ┆ null     │
└─────────┴──────┴──────────┴──────────┘

Here,

  • Filtering is done before applying clear(n).
  • clear(n=2) discards the filtered rows and returns 2 null-filled rows with the same schema.
  • To actually keep filtered data, use head(n) on the filtered DataFrame rather than clear(n).

Checking If a DataFrame is Empty After clear()

When you apply the clear() method to a DataFrame, the result has no rows but keeps the column structure. You can confirm that the result is empty by calling the is_empty() method on it.


# Clearing the DataFrame
df2 = df.clear()
print("DataFrame after clear():\n", df2)

# Checking if the DataFrame is empty
print("Is the DataFrame empty?", df2.is_empty())

# Output:
# DataFrame after clear():
# shape: (0, 4)
┌─────────┬──────┬──────────┬──────────┐
│ Courses ┆ Fees ┆ Duration ┆ Discount │
│ ---     ┆ ---  ┆ ---      ┆ ---      │
│ str     ┆ i64  ┆ str      ┆ i64      │
╞═════════╪══════╪══════════╪══════════╡
└─────────┴──────┴──────────┴──────────┘
# Is the DataFrame empty? True

Here,

  • clear() removes all rows while keeping the column structure intact.
  • is_empty() returns True if the DataFrame has zero rows.
  • A cleared DataFrame is structurally valid but contains no data.

Clear a DataFrame with Missing Values (Nulls)

You can use the clear() method to remove all rows while keeping the Polars DataFrame structure intact. If your DataFrame contains missing values, clear() still removes all rows, regardless of whether they have missing values or not. Note that Polars stores Python None as null, which is distinct from the floating-point NaN.


import polars as pl

technologies= {
    'Courses':["Spark","PySpark",None,"Python","Pandas"],
    'Fees' :[22000,None,23000,24000,26000],
    'Duration':['30days','50days','35days', None,'55days'],
    'Discount':[1000,2300,1000,1200,None]
          }
df = pl.DataFrame(technologies)

# Clearing the DataFrame (removes all rows)
df2 = df.clear()
print("DataFrame after clear():\n", df2)

# Returning 2 null-filled rows with the same schema
df2 = df.clear(n=2)
print("DataFrame after clear(n=2):\n", df2)

# Output:
# DataFrame after clear():
# shape: (0, 4)
┌─────────┬──────┬──────────┬──────────┐
│ Courses ┆ Fees ┆ Duration ┆ Discount │
│ ---     ┆ ---  ┆ ---      ┆ ---      │
│ str     ┆ i64  ┆ str      ┆ i64      │
╞═════════╪══════╪══════════╪══════════╡
└─────────┴──────┴──────────┴──────────┘
# DataFrame after clear(n=2):
# shape: (2, 4)
┌─────────┬──────┬──────────┬──────────┐
│ Courses ┆ Fees ┆ Duration ┆ Discount │
│ ---     ┆ ---  ┆ ---      ┆ ---      │
│ str     ┆ i64  ┆ str      ┆ i64      │
╞═════════╪══════╪══════════╪══════════╡
│ null    ┆ null ┆ null     ┆ null     │
│ null    ┆ null ┆ null     ┆ null     │
└─────────┴──────┴──────────┴──────────┘

Here,

  • clear() removes all rows while preserving column structure, even if there are nulls.
  • clear(n=2) returns 2 null-filled rows; the nulls in the output come from clear() itself, not from the original data.
  • If you want to remove only rows with nulls, use drop_nulls() instead.

Conclusion

In conclusion, the clear() method in Polars is a simple function that returns an empty copy of a DataFrame while preserving its column names and data types, or n null-filled rows when n is given. It is particularly useful when you need a fresh DataFrame without redefining its schema.

Happy Learning!!
