In pandas, the rolling() function provides rolling window calculations on Series data. It lets you compute statistics such as the rolling mean, rolling sum, and rolling standard deviation over a specified window size.
In this article, I will explain the Series.rolling() function: its syntax, its parameters, its usage, and how it returns a Rolling object. This Rolling object doesn’t hold the result of any calculation; it serves as a configuration for performing rolling window operations on the Series.
Key Points –
- The rolling() function enables window-based calculations on pandas Series data, allowing you to compute statistics like rolling mean, rolling sum, rolling standard deviation, etc.
- You can specify the size of the rolling window using the window parameter. This determines the number of consecutive observations used in each calculation.
- Aggregation functions (e.g., sum, mean, max) can be applied over the rolling window, providing a convenient way to analyze trends over fixed time periods.
- By default, rolling() produces NaN for windows that do not have enough observations to calculate the statistic. You can adjust this behavior using the min_periods parameter.
Pandas Series.rolling() Introduction
Let’s look at the syntax of the Series.rolling() function.
# Syntax of Pandas Series.rolling() function
Series.rolling(window, min_periods=None, center=False, win_type=None, on=None, axis=0, closed=None)
Parameters of the Series.rolling()
Following are the parameters of the Series.rolling() function.
window – The size of the moving window. This can be an integer (the number of observations used for calculating the statistic) or an offset (a time offset string, e.g., ‘2D’ for 2 days, if the Series has a datetime index).
min_periods – Minimum number of observations in the window required to have a value; otherwise, the result is NaN. For an integer window, this defaults to the size of the window; for an offset window, it defaults to 1.
center – If False, the window is right-aligned, so each result is labeled at the right edge of its window. If True, each result is labeled at the center of its window.
win_type – The type of window function. Default is None.
on – For a DataFrame, a column label or index level on which to perform the rolling operation.
axis – The axis to apply the rolling operation on. By default, it’s applied along axis 0 (the rows). For a Series this parameter is unused, and it is deprecated as of pandas 2.1.
closed – Which endpoints of the window are included: ‘right’, ‘left’, ‘both’, or ‘neither’. The default, None, behaves like ‘right’: the window includes the current observation and excludes the left endpoint of the interval.
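The offset form of window and the closed parameter are easier to see on a datetime-indexed Series. The following is a minimal sketch; the dates and the names idx, ts, sum_2d, and sum_2d_left are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical daily data, used only for illustration
idx = pd.date_range("2024-01-01", periods=5, freq="D")
ts = pd.Series([2, 4, 3, 6, 9], index=idx)

# Offset window: each window covers the 2 days ending at the current timestamp.
# With an offset window, min_periods defaults to 1, so the first row is not NaN.
sum_2d = ts.rolling(window="2D").sum()
print(sum_2d.tolist())       # [2.0, 6.0, 7.0, 9.0, 15.0]

# closed="left" leaves the current row out of its own window,
# so the first row has no observations at all and becomes NaN.
sum_2d_left = ts.rolling(window="2D", closed="left").sum()
print(sum_2d_left.tolist())  # [nan, 2.0, 6.0, 7.0, 9.0]
```

With daily data, a ‘2D’ window closed on the right holds the current day and the day before it, which is why the totals match an integer window of 2 except for the first row.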
Return Value
The rolling() function returns a Rolling object, which holds the window configuration for the pandas Series rather than a computed result. You then call methods on this Rolling object, such as mean(), sum(), or std(), to compute the statistic over the rolling window.
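As a small sketch of this behavior, the snippet below stores the Rolling object in a variable (the name roller is an illustrative choice) and reuses it for two different aggregations.

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# rolling() on its own computes nothing; it returns a Rolling object
roller = data.rolling(window=2)
print(type(roller).__name__)  # Rolling

# The same Rolling object can be reused for several aggregations
rolling_sum = roller.sum()
rolling_max = roller.max()
print(rolling_sum.tolist())   # [nan, 6.0, 7.0, 9.0, 15.0]
print(rolling_max.tolist())   # [nan, 4.0, 4.0, 6.0, 9.0]
```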
Calculating the Rolling Sum
To calculate the rolling sum of your Pandas Series data, you can use the rolling() function with the sum() method.
Now, let’s create a Pandas Series from a Python list.
import pandas as pd
# Create Pandas Series
data = pd.Series([2, 4, 3, 6, 9])
print("Original Series:\n",data)
# Output:
# Original Series:
# 0 2
# 1 4
# 2 3
# 3 6
# 4 9
# dtype: int64
Here, the rolling sum is calculated by calling rolling() with a window size of 2 and then applying .sum() to it.
# Calculate the rolling sum with a window size of 2
rolling_sum = data.rolling(window=2).sum()
print("Rolling Sum:\n", rolling_sum)
In the above example, the rolling sum is calculated over a window size of 2, and NaN is returned for the first element (index 0) since there’s not enough data to compute the sum for it. This example yields the below output.
# Output:
# Rolling Sum:
# 0 NaN
# 1 6.0
# 2 7.0
# 3 9.0
# 4 15.0
# dtype: float64
Calculating the Rolling Mean
Similarly, to calculate the rolling mean for your Pandas Series data, you can use the rolling() function with the mean() method.
# Calculate the rolling mean with a window size of 2
rolling_mean = data.rolling(window=2).mean()
print("Rolling Mean:\n", rolling_mean)
# Output:
# Rolling Mean:
# 0 NaN
# 1 3.0
# 2 3.5
# 3 4.5
# 4 7.5
# dtype: float64
In this example, the rolling mean is calculated over a window size of 2, so NaN is returned for the first element (index 0) since there’s not enough data to compute the mean for it. For the subsequent elements, the rolling mean is calculated based on the current element and the preceding one.
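One way to check the "current element and the preceding one" description is to rebuild the same result by hand with shift(). This is only a verification sketch; manual_mean is an illustrative name, and rolling() remains the practical choice.

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

rolling_mean = data.rolling(window=2).mean()

# Average each element with the one before it; shift(1) moves values down
# one row, so the first row has no predecessor and becomes NaN.
manual_mean = (data + data.shift(1)) / 2

# Raises an AssertionError if the two Series differ
pd.testing.assert_series_equal(rolling_mean, manual_mean)
print(manual_mean.tolist())  # [nan, 3.0, 3.5, 4.5, 7.5]
```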
Calculating the Rolling Mean with a Centered Window
To calculate the rolling mean with a centered window for your Pandas Series data, use the rolling() function with the mean() method and set the center parameter to True.
# Calculate the rolling mean with a centered window of size 3
rolling_mean = data.rolling(window=3, center=True).mean()
print("Rolling mean with centered window:\n", rolling_mean)
# Output:
# Rolling mean with centered window:
# 0 NaN
# 1 3.000000
# 2 4.333333
# 3 6.000000
# 4 NaN
# dtype: float64
In the above example, the rolling mean is calculated over a centered window of size 3, so NaN values are returned for the first and last elements (indices 0 and 4) since there are not enough elements to form a complete window around them.
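For an odd window size, a centered window gives the same values as the default right-aligned window, only labeled at the middle of the window instead of at its end. The sketch below illustrates this for a window of 3; the names trailing and shifted are illustrative.

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

centered = data.rolling(window=3, center=True).mean()
trailing = data.rolling(window=3).mean()

# For window=3 the center sits one position before the right edge,
# so moving the right-aligned result up by one row lines the two up.
shifted = trailing.shift(-1)

# Raises an AssertionError if the two Series differ
pd.testing.assert_series_equal(centered, shifted)
```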
Applying a Custom Function to Calculate the Rolling Maximum
To apply a custom function, such as one that calculates the rolling maximum, you can use the rolling().apply() method.
# Define a custom function
# To calculate rolling maximum
def custom_max(x):
    return x.max()
# Calculate the rolling maximum with a window size of 2
# Using the custom function
rolling_max = data.rolling(window=2).apply(custom_max)
print("Rolling Maximum:\n", rolling_max)
# Output:
# Rolling Maximum:
# 0 NaN
# 1 4.0
# 2 4.0
# 3 6.0
# 4 9.0
# dtype: float64
In this example, the custom function custom_max is applied to calculate the rolling maximum over a window size of 2. The rolling maximum is then computed based on the maximum value within each window.
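For a plain maximum, the built-in max() method on the Rolling object returns the same values, so apply() is mainly useful for statistics that have no built-in method. The sketch below compares the two and adds a hypothetical window_range function as an example of such a statistic. It passes raw=True so that each window arrives as a NumPy array, which is usually faster than the default Series.

```python
import numpy as np
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# The built-in method and apply() give the same rolling maximum
built_in = data.rolling(window=2).max()
via_apply = data.rolling(window=2).apply(np.max, raw=True)
pd.testing.assert_series_equal(built_in, via_apply)

# Hypothetical custom statistic: the spread (max - min) within each window
def window_range(values):
    return values.max() - values.min()

rolling_range = data.rolling(window=2).apply(window_range, raw=True)
print(rolling_range.tolist())  # [nan, 2.0, 1.0, 3.0, 3.0]
```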
Using a Rolling Window with a Minimum Number of Observations
You can use the min_periods parameter to specify the minimum number of observations required in the rolling window for a valid computation.
# Calculate the rolling mean with a window size of 3 and minimum 2 observations
rolling_mean_min_obs = data.rolling(window=3, min_periods=2).mean()
print("Rolling mean with minimum observations:\n", rolling_mean_min_obs)
# Output:
# Rolling mean with minimum observations:
# 0 NaN
# 1 3.000000
# 2 3.000000
# 3 4.333333
# 4 6.000000
# dtype: float64
In this example, the rolling mean is calculated with a window size of 3, but only 2 observations are required in each window for a valid computation. As a result, NaN is returned only for the first element (index 0), whose window holds a single observation. At index 1 the mean is computed from the two available observations, and from index 2 onward it is computed from full windows of three.
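Lowering min_periods further shows the effect more clearly. As a small sketch, setting min_periods=1 removes the leading NaN entirely, because a single observation is then enough to produce a value (the name partial_mean is illustrative).

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# With min_periods=1, every window that holds at least one value yields a result
partial_mean = data.rolling(window=3, min_periods=1).mean()
print(partial_mean)

# Index 0 averages [2], index 1 averages [2, 4], index 2 onward use 3 values
```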
Calculating the Rolling Standard Deviation
Similarly, to calculate the rolling standard deviation for your Pandas Series data, you can use the rolling() function with the std() method.
# Calculate the rolling standard deviation with a window size of 2
rolling_std = data.rolling(window=2).std()
print("Rolling standard deviation:\n", rolling_std)
# Output:
# Rolling standard deviation:
# 0 NaN
# 1 1.414214
# 2 0.707107
# 3 2.121320
# 4 2.121320
# dtype: float64
In the above example, the rolling standard deviation is calculated over a window size of 2. NaN is returned for the first element (index 0) since there’s not enough data to compute the standard deviation for it. For the subsequent elements, the rolling standard deviation is calculated based on the current element and the preceding one.
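The values above are sample standard deviations, since std() uses ddof=1 by default. As a short sketch, passing ddof=0 gives the population standard deviation instead (the variable names are illustrative).

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# Default: sample standard deviation (divides by n - 1)
sample_std = data.rolling(window=2).std()

# ddof=0: population standard deviation (divides by n)
population_std = data.rolling(window=2).std(ddof=0)

print(sample_std.round(6).tolist())      # [nan, 1.414214, 0.707107, 2.12132, 2.12132]
print(population_std.round(6).tolist())  # [nan, 1.0, 0.5, 1.5, 1.5]
```

For a window of two values, the population figure is simply half the absolute difference between them, which makes the results easy to check by hand.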
FAQ on Pandas Series.rolling() Function
The pandas.Series.rolling() function provides rolling window calculations, which allow you to perform operations on a fixed-size sliding window over the data. This is particularly useful for time series analysis, such as calculating moving averages, rolling sums, etc.
The window size is specified using the window parameter, which can be an integer (number of observations) or an offset string (e.g., ‘2D’ for two days if the index is datetime-based).
The min_periods parameter sets the minimum number of observations required within the window to produce a result. If the number of observations in the window is less than min_periods, the result will be NaN.
You can center the rolling window by setting the center parameter to True. By default, center is False, which means the window is right-aligned.
The function returns a Rolling object. To obtain the final computed results, you need to chain additional methods like .mean(), .sum(), etc., to this Rolling object.
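Besides chaining a single method, several statistics can be requested at once with agg() on the Rolling object. The following is a minimal sketch; for a Series, passing a list of function names returns a DataFrame with one column per statistic.

```python
import pandas as pd

data = pd.Series([2, 4, 3, 6, 9])

# Compute the rolling sum and the rolling mean in a single call
summary = data.rolling(window=2).agg(["sum", "mean"])
print(summary)

# Each statistic is available as a column of the resulting DataFrame
print(summary["sum"].tolist())   # [nan, 6.0, 7.0, 9.0, 15.0]
print(summary["mean"].tolist())  # [nan, 3.0, 3.5, 4.5, 7.5]
```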
Conclusion
In this article, you have learned that the Pandas Series.rolling() function is a powerful tool for performing rolling window calculations on time-series data. It allows for the computation of various statistics, such as mean, sum, standard deviation, and more, over a specified window size, with parameters such as min_periods and center controlling how each window is formed.
Happy Learning!!