pyspark.pandas.Series.map
Series.map(arg: Union[Dict, Callable]) → pyspark.pandas.series.Series

Map values of Series according to input correspondence.
Used for substituting each value in a Series with another value, which may be derived from a function or a dict.

Note
Make sure the dictionary is not huge: a very large dictionary can degrade performance or throw an OutOfMemoryError because of the huge expression it generates within Spark. In that case, consider passing a function instead (a function-based sketch is shown at the end of the Examples).
Parameters
    arg : function or dict
        Mapping correspondence.

Returns
    Series
        Same index as caller.
See also
Series.apply
For applying more complex functions on a Series.
DataFrame.applymap
Apply a function elementwise on a whole DataFrame.
Notes
When
arg
is a dictionary, values in Series that are not in the dictionary (as keys) are converted toNone
. However, if the dictionary is adict
subclass that defines__missing__
(i.e. provides a method for default values), then this default is used rather thanNone
.Examples
>>> import pyspark.pandas as ps
>>> s = ps.Series(['cat', 'dog', None, 'rabbit'])
>>> s
0       cat
1       dog
2      None
3    rabbit
dtype: object
map accepts a dict. Values that are not found in the dict are converted to None, unless the dict has a default value (e.g. defaultdict):

>>> s.map({'cat': 'kitten', 'dog': 'puppy'})
0    kitten
1     puppy
2      None
3      None
dtype: object
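As a hedged illustration of the __missing__ behavior described in Notes, a dict subclass with a default value (such as collections.defaultdict) supplies that default instead of None; the output below is what the documented behavior would be expected to produce:

>>> from collections import defaultdict
>>> d = defaultdict(lambda: 'unknown', {'cat': 'kitten', 'dog': 'puppy'})
>>> s.map(d)
0     kitten
1      puppy
2    unknown
3    unknown
dtype: object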
It also accepts a function:
>>> def format(x) -> str:
...     return 'I am a {}'.format(x)

>>> s.map(format)
0       I am a cat
1       I am a dog
2      I am a None
3    I am a rabbit
dtype: object
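Finally, a minimal sketch of the Note's advice to prefer a function over a huge dict: wrapping the lookup in a function keeps the mapping out of the large Spark expression that a dict argument would otherwise generate. The colors mapping and the 'unknown' fallback below are illustrative, not part of the API:

>>> colors = {'cat': 'black', 'dog': 'brown', 'rabbit': 'white'}
>>> def lookup(x) -> str:
...     return colors.get(x, 'unknown')
>>> s.map(lookup)
0      black
1      brown
2    unknown
3      white
dtype: object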