pyspark.pandas.Series.shift
- Series.shift(periods=1, fill_value=None)
Shift Series/Index by desired number of periods.
Note
The current implementation of shift uses Spark’s Window without a partition specification. This moves all data into a single partition on a single machine and could cause serious performance degradation. Avoid this method with very large datasets.
- Parameters
- periods : int
Number of periods to shift. Can be positive or negative.
- fill_value : object, optional
The scalar value to use for newly introduced missing values. The default depends on the dtype of self. For numeric data, np.nan is used.
- Returns
- Copy of input Series/Index, shifted.
Examples
>>> df = ps.DataFrame({'Col1': [10, 20, 15, 30, 45],
...                    'Col2': [13, 23, 18, 33, 48],
...                    'Col3': [17, 27, 22, 37, 52]},
...                   columns=['Col1', 'Col2', 'Col3'])

>>> df.Col1.shift(periods=3)
0     NaN
1     NaN
2     NaN
3    10.0
4    20.0
Name: Col1, dtype: float64

>>> df.Col2.shift(periods=3, fill_value=0)
0     0
1     0
2     0
3    13
4    23
Name: Col2, dtype: int64

>>> df.index.shift(periods=3, fill_value=0)
Index([0, 0, 0, 0, 1], dtype='int64')
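The examples above only use a positive periods value. As a rough illustration of how the sign of periods changes the result, here is a minimal pure-Python sketch of the shifting semantics on a plain list. It is not the pyspark implementation: shift_values is a hypothetical helper introduced only for this illustration, and None stands in for the dtype-dependent missing value (for example np.nan for numeric data).

```python
from typing import Any, List, Optional


def shift_values(values: List[Any], periods: int = 1,
                 fill_value: Optional[Any] = None) -> List[Any]:
    """Hypothetical helper: shift a list by `periods` positions.

    Positive `periods` moves values toward the end, negative toward the
    start. Vacated positions receive `fill_value` (None stands in for the
    dtype-dependent missing value).
    """
    n = len(values)
    if periods == 0:
        return list(values)
    if abs(periods) >= n:
        # Every original value is shifted out of range.
        return [fill_value] * n
    if periods > 0:
        # Pad at the front, drop the last `periods` values.
        return [fill_value] * periods + list(values[:n - periods])
    # Negative shift: drop the first `-periods` values, pad at the end.
    return list(values[-periods:]) + [fill_value] * (-periods)


col2 = [13, 23, 18, 33, 48]
print(shift_values(col2, periods=3, fill_value=0))   # [0, 0, 0, 13, 23]
print(shift_values(col2, periods=-2, fill_value=0))  # [18, 33, 48, 0, 0]
print(shift_values(col2, periods=1))                 # [None, 13, 23, 18, 33]
```

The first call reproduces the values of the Col2 example; the second shows that a negative periods fills the tail of the result instead of the head.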