pyspark.pandas.Series.shift

Series.shift(periods=1, fill_value=None)

Shift Series/Index by desired number of periods.

Note

The current implementation of shift uses Spark’s Window without a partition specification. This moves all data into a single partition on a single machine and can cause serious performance degradation. Avoid this method with very large datasets.

Parameters
periods : int

Number of periods to shift. Can be positive or negative.

fill_value : object, optional

The scalar value to use for newly introduced missing values. The default depends on the dtype of self. For numeric data, np.nan is used.

Returns
Copy of input Series/Index, shifted.
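
To make the roles of periods and fill_value concrete, here is a minimal plain-Python sketch of a positional shift. It is only a model of the behaviour described above, not the pyspark implementation: shift_values is a hypothetical helper, lists stand in for a Series, and None stands in for the missing-value marker (a real numeric Series would use NaN instead).

```python
def shift_values(values, periods=1, fill_value=None):
    """Plain-Python model of a positional shift (hypothetical helper,
    not part of pyspark.pandas)."""
    n = len(values)
    if periods == 0:
        return list(values)
    # Shifting by more than the length leaves only fill values.
    k = min(abs(periods), n)
    padding = [fill_value] * k
    if periods > 0:
        # Positive periods: values move toward the end,
        # so the first k positions are filled.
        return padding + list(values[:n - k])
    # Negative periods: values move toward the start,
    # so the last k positions are filled.
    return list(values[k:]) + padding


col2 = [13, 23, 18, 33, 48]
forward = shift_values(col2, periods=3, fill_value=0)
backward = shift_values(col2, periods=-2, fill_value=0)
print(forward)   # [0, 0, 0, 13, 23]
print(backward)  # [18, 33, 48, 0, 0]
```

The result always has the same length as the input; only the positions of the values change, and the vacated positions receive fill_value.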

Examples

>>> df = ps.DataFrame({'Col1': [10, 20, 15, 30, 45],
...                    'Col2': [13, 23, 18, 33, 48],
...                    'Col3': [17, 27, 22, 37, 52]},
...                   columns=['Col1', 'Col2', 'Col3'])
>>> df.Col1.shift(periods=3)
0     NaN
1     NaN
2     NaN
3    10.0
4    20.0
Name: Col1, dtype: float64
>>> df.Col2.shift(periods=3, fill_value=0)
0     0
1     0
2     0
3    13
4    23
Name: Col2, dtype: int64
>>> df.index.shift(periods=3, fill_value=0)
Index([0, 0, 0, 0, 1], dtype='int64')