pyspark.pandas.Series.shift

Series.shift(periods=1, fill_value=None)

Shift Series/Index by desired number of periods.

Note

The current implementation of shift uses Spark’s Window without a partition specification. This moves all data into a single partition on a single machine and can cause serious performance degradation. Avoid this method with very large datasets.

Parameters

periods : int
    Number of periods to shift. Can be positive or negative.
fill_value : object, optional
    The scalar value to use for newly introduced missing values. The default depends on the dtype of self. For numeric data, np.nan is used.

Returns

Copy of input Series/Index, shifted.

Examples

>>> df = ps.DataFrame({'Col1': [10, 20, 15, 30, 45],
...                    'Col2': [13, 23, 18, 33, 48],
...                    'Col3': [17, 27, 22, 37, 52]},
...                   columns=['Col1', 'Col2', 'Col3'])

>>> df.Col1.shift(periods=3)
0     NaN
1     NaN
2     NaN
3    10.0
4    20.0
Name: Col1, dtype: float64

>>> df.Col2.shift(periods=3, fill_value=0)
0     0
1     0
2     0
3    13
4    23
Name: Col2, dtype: int64

>>> df.index.shift(periods=3, fill_value=0)
Index([0, 0, 0, 0, 1], dtype='int64')
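
The way periods and fill_value interact, including negative shifts, can be mimicked on a plain Python list. The sketch below is only an illustration of the semantics: `shift_list` is a hypothetical helper, not part of pyspark.pandas, and it does not reproduce the Window-based implementation or the dtype changes seen above (missing slots default to None here, whereas a numeric Series uses np.nan).

```python
def shift_list(values, periods=1, fill_value=None):
    """Hypothetical helper: shift a list by `periods` positions.

    Positive periods push values toward the end; negative periods
    pull them toward the start. Vacated slots receive `fill_value`.
    """
    n = len(values)
    if periods == 0:
        return list(values)
    if abs(periods) >= n:
        # Every original value is shifted out of range.
        return [fill_value] * n
    if periods > 0:
        return [fill_value] * periods + list(values[:n - periods])
    return list(values[-periods:]) + [fill_value] * (-periods)


col2 = [13, 23, 18, 33, 48]
print(shift_list(col2, periods=3, fill_value=0))   # [0, 0, 0, 13, 23]
print(shift_list(col2, periods=-2, fill_value=0))  # [18, 33, 48, 0, 0]
```

The first call matches the values of the `df.Col2.shift(periods=3, fill_value=0)` example; the second shows that a negative periods value shifts in the opposite direction and fills at the end.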