  • Post category: Pandas
  • Post last modified: June 11, 2024

In pandas, the Series.rolling() function provides rolling window calculations on Series data. It lets you compute statistics such as the rolling mean, rolling sum, or rolling standard deviation over a specified window size.

In this article, I will explain the Series.rolling() function, covering its syntax, parameters, and usage, and show how it returns a Rolling object. This Rolling object doesn’t hold the result of any calculation; it serves as a configuration for performing rolling window operations on the Series.

Key Points –

  • The rolling() function enables window-based calculations on pandas Series data, allowing you to compute statistics like rolling mean, rolling sum, rolling standard deviation, etc.
  • You can specify the size of the rolling window using the window parameter. This determines the number of consecutive observations used in each calculation.
  • The Rolling object it returns lets you apply aggregation functions (e.g., sum, mean, max) over each window, providing a convenient way to analyze trends over fixed periods.
  • By default, rolling() produces NaN for windows that do not contain enough observations to calculate the statistic. You can relax this requirement with the min_periods parameter.

Pandas Series.rolling() Introduction

Let’s look at the syntax of the Series.rolling() function.


# Syntax of Pandas Series.rolling() function
Series.rolling(window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None)

Parameters of the Series.rolling()

Following are the parameters of the Series.rolling() function.

  • window – The size of the moving window. This can be an integer (a fixed number of observations per window) or a time offset string (e.g., '2D' for 2 days), which requires a datetime-like index.
  • min_periods – Minimum number of observations in the window required to have a value; otherwise, the result is NaN. For an integer window, this defaults to the window size; for an offset window, it defaults to 1.
  • center – If False (the default), the result is labeled at the right edge of the window. If True, it is labeled at the center of the window.
  • win_type – The type of window weighting function, given as the name of a SciPy window (e.g., 'triang'). Default is None, which weights all observations equally.
  • on – For a DataFrame, a column label or index level on which to calculate the rolling window instead of the index. It does not apply to a Series.
  • axis – The axis to apply the rolling operation on. By default, it’s applied along axis 0 (the rows). For a Series this parameter is unused, and recent pandas releases (2.1 and later) deprecate the keyword.
  • closed – Which endpoints of the window are included: 'right', 'left', 'both', or 'neither'. The default, 'right', includes the current observation (the right edge) and excludes the left edge.
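The window and closed parameters are easiest to see on a datetime index. The following is a minimal sketch, assuming a small made-up Series (the dates and values are hypothetical) with one missing day, so that an offset window of '2D' does not always cover the same number of rows.

```python
import pandas as pd

# Hypothetical daily readings; 2024-01-04 is deliberately missing
idx = pd.to_datetime(
    ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-05", "2024-01-06"]
)
ts = pd.Series([2, 4, 3, 6, 9], index=idx)

# Offset window: each result covers the 2 days ending at that timestamp.
# With the default closed='right', the left edge is excluded.
sum_2d = ts.rolling(window="2D").sum()
print(sum_2d)

# closed='both' also includes an observation sitting exactly on the left edge
sum_2d_both = ts.rolling(window="2D", closed="both").sum()
print(sum_2d_both)
```

Because an offset window defaults to min_periods=1, the first row already has a value here instead of NaN.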

Return Value

The rolling() function returns a Rolling object, which encapsulates the rolling window calculation applied to the pandas Series. This Rolling object allows you to perform various operations like mean, sum, standard deviation, etc., over the rolling window.
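As a small illustration of this point, the sketch below stores the Rolling object in a variable (the name roller is only illustrative) and reuses it for two different aggregations. Nothing is computed until an aggregation method is called.

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# rolling() only configures the window; no statistic is computed yet
roller = data.rolling(window=2)
print(type(roller).__name__)

# The same Rolling object can be reused for several aggregations
rolling_sum = roller.sum()
rolling_min = roller.min()
print(rolling_sum)
print(rolling_min)
```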

Calculating the Rolling Sum

To calculate the rolling sum of your Pandas Series data, you can use the rolling() function with the sum() method.

Now, let’s create a Pandas Series from a Python list.


import pandas as pd

# Create Pandas Series
data = pd.Series([2, 4, 3, 6, 9])
print("Original Series:\n",data)

Running this code prints the Series with its default integer index (0 through 4) and the values 2, 4, 3, 6, and 9.

Here, the rolling sum is calculated using the rolling() function with a window size of 2 and then applying .sum() to it.


# Calculate the rolling sum with a window size of 2
rolling_sum = data.rolling(window=2).sum()
print("Rolling Sum:\n", rolling_sum)

In the above example, the rolling sum is calculated over a window size of 2. NaN is returned for the first element (index 0) because there is no preceding value to complete the window. The remaining results are 6.0, 7.0, 9.0, and 15.0, each being the sum of the current element and the one before it.

Calculating the Rolling Mean

Alternatively, to calculate the rolling mean for your Pandas Series data, you can use the rolling() function with the mean() method.


# Calculate the rolling mean with a window size of 2
rolling_mean = data.rolling(window=2).mean()
print("Rolling Mean:\n", rolling_mean)

# Output:
# Rolling Mean:
#  0    NaN
# 1    3.0
# 2    3.5
# 3    4.5
# 4    7.5
# dtype: float64

In this example, the rolling mean is calculated over a window size of 2, so NaN is returned for the first element (index 0) since there’s not enough data to compute the mean for it. For the subsequent elements, the rolling mean is calculated based on the current element and the preceding one.
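To make the window mechanics concrete, the following sketch rebuilds the same rolling mean with an explicit loop over slices. It is only an illustration of what each window contains, not how pandas computes the result internally; the names window and manual_mean are illustrative.

```python
import math
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])
window = 2

# Rebuild the rolling mean by hand: each result averages the current
# element and the (window - 1) elements before it
manual_mean = []
for i in range(len(data)):
    if i < window - 1:
        manual_mean.append(math.nan)  # not enough observations yet
    else:
        chunk = data.iloc[i - window + 1 : i + 1]
        manual_mean.append(float(chunk.mean()))

print(manual_mean)
print(data.rolling(window=window).mean().tolist())
```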

Calculating the Rolling Mean with a Centered Window

To calculate the rolling mean with a centered window for your Pandas Series data, use the rolling() function with the mean() method and set the center parameter to True.


# Calculate the rolling mean with a centered window of size 3
rolling_mean = data.rolling(window=3, center=True).mean()
print("Rolling mean with centered window:\n", rolling_mean)

# Output:
# Rolling mean with centered window:
#  0         NaN
# 1    3.000000
# 2    4.333333
# 3    6.000000
# 4         NaN
# dtype: float64

In the above example, the rolling mean is calculated over a centered window of size 3, so NaN values are returned for the first and last elements (indices 0 and 4) since there are not enough elements to form a complete window around them.
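If the NaN values at the edges are not wanted, one option is to combine center=True with min_periods. The sketch below assumes that averaging a partial window at each end is acceptable for your analysis, which is not always the case.

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# Centered window of size 3, but a single observation is enough
# to produce a value, so the edges use the two values available
centered_partial = data.rolling(window=3, center=True, min_periods=1).mean()
print(centered_partial)
```

Here the first result averages 2 and 4, and the last averages 6 and 9, while the interior values match the fully centered calculation.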

Applying a Custom Function to Calculate the Rolling Maximum

To apply a custom function, such as calculating the rolling maximum, you can use the rolling().apply() method.


# Define a custom function 
# To calculate rolling maximum
def custom_max(x):
    return x.max()

# Calculate the rolling maximum with a window size of 2 
# Using the custom function
rolling_max = data.rolling(window=2).apply(custom_max)
print("Rolling Maximum:\n", rolling_max)

# Output:
# Rolling Maximum:
#  0    NaN
# 1    4.0
# 2    4.0
# 3    6.0
# 4    9.0
# dtype: float64

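In this example, apply() calls custom_max once per window of size 2 and stores the returned maximum as the result for that position. For common statistics such as the maximum, the built-in Rolling methods (for example, .max()) give the same result and are usually faster; apply() is most useful when no built-in method exists.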
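For comparison, the sketch below uses the built-in max() method and then applies a statistic that has no built-in equivalent, the range (maximum minus minimum) of each window. Passing raw=True hands each window to the function as a NumPy array rather than a Series; the name window_range is illustrative.

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# Built-in rolling maximum, equivalent to applying a custom max function
builtin_max = data.rolling(window=2).max()
print(builtin_max)

# Custom statistic: range of each window (max minus min).
# raw=True passes each window as a NumPy array instead of a Series.
window_range = data.rolling(window=2).apply(lambda w: w.max() - w.min(), raw=True)
print(window_range)
```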

Using a Rolling Window with a Minimum Number of Observations

You can use the min_periods parameter to specify the minimum number of observations required in the rolling window for a valid computation.


# Calculate the rolling mean with a window size of 3 and minimum 2 observations
rolling_mean_min_obs = data.rolling(window=3, min_periods=2).mean()
print("Rolling mean with minimum observations:\n", rolling_mean_min_obs)

# Output:
# Rolling mean with minimum observations:
#  0         NaN
# 1    3.000000
# 2    3.000000
# 3    4.333333
# 4    6.000000
# dtype: float64

In this example, the window size is 3, but only 2 observations are required for a valid result. NaN is returned for the first element (index 0) because its window holds a single observation. Index 1 averages the two values available so far (2 and 4), and from index 2 onward each result uses a full window of three values.
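The min_periods parameter also matters when the data itself contains missing values, because only non-NaN observations count toward the minimum. The sketch below assumes a made-up Series (the name with_gap is hypothetical) with a NaN at index 1.

```python
import pandas as pd

# Hypothetical Series with one missing value at index 1
with_gap = pd.Series([2, float("nan"), 3, 6, 9])

# Default min_periods (equal to the window size): any window that
# touches the NaN has fewer than 3 valid observations and yields NaN
strict = with_gap.rolling(window=3).mean()
print(strict)

# min_periods=2: windows with at least two valid observations get a value
relaxed = with_gap.rolling(window=3, min_periods=2).mean()
print(relaxed)
```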

Calculating the Rolling Standard Deviation

Similarly, to calculate the rolling standard deviation for your Pandas Series data, you can use the rolling() function with the std() method.


# Calculate the rolling standard deviation with a window size of 2
rolling_std = data.rolling(window=2).std()
print("Rolling standard deviation:\n", rolling_std)

# Output:
# Rolling standard deviation:
#  0         NaN
# 1    1.414214
# 2    0.707107
# 3    2.121320
# 4    2.121320
# dtype: float64

In the above example, the rolling standard deviation is calculated over a window size of 2. NaN is returned for the first element (index 0) since there’s not enough data to compute the standard deviation for it. For the subsequent elements, the rolling standard deviation is calculated based on the current element and the preceding one.
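Note that std() computes the sample standard deviation by default (ddof=1). If the population standard deviation is what you need, the ddof argument can be set to 0, as in this short sketch.

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# Sample standard deviation (default, ddof=1)
sample_std = data.rolling(window=2).std()

# Population standard deviation (ddof=0)
population_std = data.rolling(window=2).std(ddof=0)

print(sample_std)
print(population_std)
```

For a window of two values, the population version is simply half the absolute difference between them.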

FAQ on Pandas Series.rolling() Function

What is pandas.Series.rolling()?

The pandas.Series.rolling() function provides rolling window calculations, which allow you to perform operations on a fixed-size sliding window over the data. This is particularly useful for time series analysis, such as calculating moving averages, rolling sums, etc.

How do I specify the window size for the rolling operation?

The window size is specified using the window parameter, which can be an integer (number of observations) or an offset string (e.g., '2D' for two days if the index is datetime-based).

What does the min_periods parameter do?

The min_periods parameter sets the minimum number of observations required within the window to produce a result. If the number of observations in the window is less than min_periods, the result will be NaN.

How can I center the rolling window?

You can center the rolling window by setting the center parameter to True. By default, center is False, which means the window is right-aligned.

What is the return value of the pandas.Series.rolling() function?

The function returns a Rolling object. To obtain the final computed results, you need to chain additional methods like .mean(), .sum(), etc., to this Rolling object.
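When several statistics are needed for the same window, they can also be requested in one call with agg(). The following is a minimal sketch; the variable name summary is illustrative.

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# Compute several rolling statistics at once; the result is a DataFrame
# with one column per aggregation
summary = data.rolling(window=2).agg(["mean", "sum", "max"])
print(summary)
```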

Conclusion

In this article, you have learned that the Pandas Series.rolling() function is a powerful tool for performing rolling window calculations, particularly on time-series data. It allows you to compute statistics such as the mean, sum, and standard deviation over a specified window size. You have also seen, with examples, how the window, min_periods, and center parameters and the apply() method change the result.

Happy Learning!!