  • Post category:Pandas
  • Post last modified:July 8, 2024

In Pandas, the shift() function is used to shift the values in a DataFrame or Series along a particular axis. This is useful for creating lagged versions of data, which is often needed in time series analysis.

In this article, I will explain the Pandas DataFrame shift() function by using its syntax, parameters, usage, and how to return a Series or DataFrame with data shifted by the designated number of periods.

Key Points –

  • The shift() function is used to shift the values in a DataFrame or Series by a specified number of periods along the desired axis, often used for time-series data manipulation.
  • When working with time series data, the freq parameter can be used to shift the index by a specific frequency (e.g., days, months), enhancing its utility in time-based analyses.
  • By default, shift() introduces NaN values in the positions vacated by the shift, but these can be replaced with a specified fill_value.
  • The function returns a DataFrame or Series with the same shape as the caller, but with the data shifted. Integer columns are upcast to float when NaN values are introduced.

Pandas DataFrame shift() Introduction

Following is the syntax of the Pandas DataFrame shift() function.


# Syntax of Pandas dataframe shift()
DataFrame.shift(periods=1, freq=None, axis=0, fill_value=None)

Parameters of the DataFrame shift()

Following are the parameters of the DataFrame shift() function.

  • periods – int. Number of periods to shift. Positive values shift data downward or to the right, and negative values shift data upward or to the left. Default is 1.
  • freq – DateOffset, timedelta, or str. Optional. Frequency increment by which to shift the index. When freq is given, the index values are shifted and the data is not realigned. It applies to time series data with a datetime-like index such as a DatetimeIndex.
  • axis – {0 or ‘index’, 1 or ‘columns’}. Axis along which to shift. With 0 or ‘index’, values move down or up the rows; with 1 or ‘columns’, values move right or left across the columns. Default is 0.
  • fill_value – scalar, optional. The scalar value to use for newly introduced missing values. If it is not supplied, the fill depends on the dtype: NaN for numeric data and NaT for datetime-like data.

Return Value

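It returns a new DataFrame or Series with the same shape as the caller, but with the values shifted. The data type is not always preserved: integer columns are upcast to float when NaN values are introduced.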
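The return value can be checked with a minimal sketch like the one below. The DataFrame df_demo and its column names are hypothetical and are not part of the article's examples.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
df_demo = pd.DataFrame({"a": [1, 2, 3], "b": [10, 20, 30]})

# Default shift: one row downward, NaN enters the first row
shifted = df_demo.shift()

# Shape and index are unchanged
print(shifted.shape == df_demo.shape)
print(shifted.index.equals(df_demo.index))

# Integer columns become float because NaN cannot be stored in an int64 column
print(df_demo.dtypes.tolist(), shifted.dtypes.tolist())
```

The original df_demo is left untouched, since shift() returns a new object.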

Usage of Pandas DataFrame shift() Function

The shift() function in Pandas is used to shift the values in a DataFrame or Series by a specified number of periods along a particular axis. It is particularly useful in time series analysis, data preprocessing, and creating lagged variables.
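As a rough illustration of lagged variables, the sketch below builds two lag columns from a single series. The readings data and the column names y, y_lag1, and y_lag2 are assumptions made for this example.

```python
import pandas as pd

# Hypothetical daily readings
readings = pd.DataFrame({"y": [10, 12, 15, 11, 14]})

# Add one column per lag: the value 1 row back and 2 rows back
for lag in (1, 2):
    readings[f"y_lag{lag}"] = readings["y"].shift(lag)

# Rows without a full history contain NaN and are often dropped before modelling
features = readings.dropna()
print(features)
```

Dropping the incomplete rows is one possible choice; depending on the analysis, filling them may be more appropriate.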

Now, let’s create a Pandas DataFrame from a Python dictionary with two columns, Column1 and Column2.


# Create DataFrame 
import pandas as pd
import numpy as np

data = {'Column1': [2, 4, 6, 8, 10],
        'Column2': [3, 5, 7, 9, 11]}
df = pd.DataFrame(data)
print("Original DataFrame:\n",df)

Running this yields the following output.

# Output:
# Original DataFrame:
#     Column1  Column2
# 0        2        3
# 1        4        5
# 2        6        7
# 3        8        9
# 4       10       11

Shifting Rows Downward by 1 Period (Default)

To shift rows downward by 1 period using the default settings of the shift() function in Pandas, you can simply call the shift() method on your DataFrame or Series without any additional arguments.


# Shifting rows downward by 1 period
df2 = df.shift()
print("Shifted DataFrame:\n", df2)

Here,

  • The shift() function is called on the DataFrame df.
  • By default, shift() shifts the rows downward by 1 period (periods=1).
  • The top row is filled with NaN values because there are no preceding values to fill these positions.
  • All other rows are shifted down by one position, preserving their original order.
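
This yields the following output.

# Output:
# Shifted DataFrame:
#     Column1  Column2
# 0      NaN      NaN
# 1      2.0      3.0
# 2      4.0      5.0
# 3      6.0      7.0
# 4      8.0      9.0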

Shifting Rows Upward by 2 Periods

Alternatively, to shift rows upward by 2 periods in a DataFrame, you can use the shift() function with the periods parameter set to -2.


# Shifting rows upward by 2 periods
df2 = df.shift(periods=-2)
print("Shifted DataFrame:\n", df2)

# Output:
# Shifted DataFrame:
#     Column1  Column2
# 0      6.0      7.0
# 1      8.0      9.0
# 2     10.0     11.0
# 3      NaN      NaN
# 4      NaN      NaN

Here,

  • The periods parameter is set to -2 to shift rows upward by two periods.
  • The last two rows are filled with NaN values because there are no succeeding values to fill these positions.
  • All other rows are shifted up by two positions, preserving their original order.
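A negative shift is often used to place a later row's value beside the current one. The sketch below reuses the Column1 values from the sample DataFrame; the column names Next1 and StepToNext are hypothetical.

```python
import pandas as pd

# Same Column1 values as the article's sample DataFrame
df_lead = pd.DataFrame({"Column1": [2, 4, 6, 8, 10]})

# "Next1" (hypothetical name) holds the value one row ahead
df_lead["Next1"] = df_lead["Column1"].shift(-1)

# Change from the current row to the next row; NaN in the last row
df_lead["StepToNext"] = df_lead["Next1"] - df_lead["Column1"]
print(df_lead)
```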

Shifting Columns to the Right

To shift columns to the right in a DataFrame, you can use the shift() function with the axis parameter set to 1.


# Shifting columns to the right by 1 period
df2 = df.shift(periods=1, axis=1)
print("Shifted DataFrame:\n", df2)

# Output:
# Shifted DataFrame:
#     Column1  Column2
# 0      NaN      2.0
# 1      NaN      4.0
# 2      NaN      6.0
# 3      NaN      8.0
# 4      NaN     10.0

Here,

  • The periods parameter is set to 1 to shift by one period.
  • The axis parameter is set to 1 to indicate that the shift should be along the columns.
  • The first column (Column1) is filled with NaN values because there are no preceding columns to fill these positions.
  • All other columns are shifted to the right by one position, preserving their original order.
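Shifting along the columns is mostly useful when each column represents a period. The following sketch assumes a hypothetical wide table with one column per quarter and computes the change from the previous quarter.

```python
import pandas as pd

# Hypothetical wide table: one row per store, one column per quarter
wide = pd.DataFrame(
    {"Q1": [100, 80], "Q2": [110, 85], "Q3": [125, 70]},
    index=["store_a", "store_b"],
)

# Each column now holds the previous quarter's values; Q1 has no predecessor
previous = wide.shift(periods=1, axis=1)

# Quarter-over-quarter change; the Q1 column is all NaN
change = wide - previous
print(change)
```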

Shifting with a Fill Value

When using the shift() function in Pandas, you can specify a fill_value parameter to replace missing values that are introduced as a result of shifting.


# Shifting rows downward by 1 period 
# With a fill value of 5
df2 = df.shift(fill_value=5)
print("Shifted DataFrame:\n", df2)

# Equivalent call with periods=1 stated explicitly
df2 = df.shift(periods=1, fill_value=5)
print("Shifted DataFrame:\n", df2)

# Output:
# Shifted DataFrame:
#     Column1  Column2
# 0        5        5
# 1        2        3
# 2        4        5
# 3        6        7
# 4        8        9

Here,

  • The shift() function is called on the DataFrame df.
  • By default, shift() shifts rows downward by 1 period (periods=1).
  • The fill_value=5 parameter ensures that any NaN values introduced by the shift operation are replaced with 5.
  • As a result, the first row, which would otherwise contain NaN, is filled with 5, and the columns keep their integer values as the output shows.
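Passing fill_value is not quite the same as calling fillna() after the shift. The sketch below, which uses a hypothetical df_fill DataFrame, compares the two approaches: the values match, but the resulting data types differ.

```python
import pandas as pd

# Hypothetical integer data for this comparison
df_fill = pd.DataFrame({"Column1": [2, 4, 6], "Column2": [3, 5, 7]})

# Gap filled during the shift: no NaN is ever introduced
at_shift = df_fill.shift(fill_value=0)

# NaN introduced first (columns become float), then replaced
after_shift = df_fill.shift().fillna(0)

print(at_shift.dtypes.tolist())     # integer columns
print(after_shift.dtypes.tolist())  # float columns
```

If integer columns need to stay integers, supplying fill_value directly avoids a later conversion step.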

Shifting a Time Series with Frequency

Similarly, when working with time series data, the shift() function can be used with the freq parameter to shift the index by a specific frequency increment. This is particularly useful for aligning time series data or creating lagged features with a clear temporal interpretation.


import pandas as pd

# Creating a sample time series DataFrame
dates = pd.date_range('20220202', periods=5)
df_time = pd.DataFrame({'Value': [1, 2, 3, 4, 5]}, index=dates)

# Shifting the time series by 1 day
df2 = df_time.shift(periods=1, freq='D')
print("Shifting the time series:\n", df2)

# Output:
# Shifting the time series:
#              Value
# 2022-02-03      1
# 2022-02-04      2
# 2022-02-05      3
# 2022-02-06      4
# 2022-02-07      5

Here,

  • A sample time series DataFrame df_time is created with a date range as the index.
  • The shift() function is called on the DataFrame df_time.
  • The periods parameter is set to 1 and the freq parameter is set to 'D', so the index is shifted by one day.
  • Because freq is given, the index moves while the values are not realigned, so no NaN values are introduced.
  • As a result, each row’s index is moved one day forward, preserving the values in their original order.
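To make the difference concrete, the sketch below shifts the same kind of time series with and without freq. The variable names ts, by_position, and by_freq are chosen for this illustration.

```python
import pandas as pd

dates = pd.date_range("2022-02-02", periods=5)
ts = pd.DataFrame({"Value": [1, 2, 3, 4, 5]}, index=dates)

# Without freq: the index is unchanged, values move down, NaN enters the first row
by_position = ts.shift(periods=1)

# With freq: the index moves forward one day, values stay attached to their rows
by_freq = ts.shift(periods=1, freq="D")

print(by_position)
print(by_freq)
```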

FAQ on Pandas DataFrame shift() Function

What does the shift() function do in Pandas?

The shift() function in Pandas shifts the values in a DataFrame or Series by a specified number of periods along a given axis. This can be useful for creating lagged versions of data, aligning data, and calculating differences between consecutive data points.
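For differences between consecutive data points, subtracting the shifted Series from the original gives the same result as the built-in diff() method. A minimal sketch with a hypothetical Series s:

```python
import pandas as pd

# Hypothetical values
s = pd.Series([3, 8, 6, 10])

# Difference from the previous row, written with shift()
manual = s - s.shift()

# The same calculation using the dedicated method
built_in = s.diff()

print(manual.equals(built_in))
```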

How do I shift columns to the right by 1 period in a DataFrame?

To shift columns to the right by 1 period in a DataFrame, you can use the shift() function with the axis parameter set to 1.

What happens to the missing values introduced by the shift operation?

By default, shift() introduces NaN values in the positions vacated by the shift. You can replace these missing values with a specified scalar using the fill_value parameter.

How do I shift rows downward by 1 period in a DataFrame?

To shift rows downward by 1 period in a DataFrame, you can use the shift() function with its default parameters.

Can I use shift() with a MultiIndex DataFrame?

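You can use shift() with a MultiIndex DataFrame. The function moves values by row position and does not take the index levels into account, so values can cross from one group into the next. To shift within each group, combine groupby() on the relevant index level with shift().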
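The sketch below compares a plain shift with a per-group shift on a MultiIndex DataFrame. The index levels group and year and the data are hypothetical.

```python
import pandas as pd

# Hypothetical two-level index: group x year
idx = pd.MultiIndex.from_product(
    [["A", "B"], [2021, 2022, 2023]], names=["group", "year"]
)
df_mi = pd.DataFrame({"value": [1, 2, 3, 10, 20, 30]}, index=idx)

# Plain shift is positional: B/2021 receives the value from A/2023
plain = df_mi.shift(1)

# Grouped shift restarts in every group: B/2021 becomes NaN
per_group = df_mi.groupby(level="group").shift(1)

print(plain)
print(per_group)
```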

Conclusion

In this article, you have explored the Pandas DataFrame shift() function, including its syntax, parameters, and usage. You also learned how to return a DataFrame or Series with the same shape as the original, but with values shifted based on the specified parameters.

Happy Learning!!
