pyspark.pandas.DataFrame.idxmin
DataFrame.idxmin(axis: Union[int, str] = 0) → Series
Return index of first occurrence of minimum over requested axis. NA/null values are excluded.
Note
This API collects all rows that contain a minimum value using to_pandas(), on the assumption that the number of such rows is usually small.
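The strategy described in the note can be sketched in plain Python. This is only an illustration of the idea under stated assumptions, not the library's actual code: the function name `idxmin_via_small_collect` is hypothetical, the columns are ordinary lists rather than Spark columns, and missing values are not handled.

```python
def idxmin_via_small_collect(index, columns):
    """Illustrative only: find the first label holding each column's minimum.

    `index` is a list of row labels; `columns` maps column names to lists of values.
    """
    # Step 1 (a distributed aggregation in a real engine): minimum of each column.
    mins = {name: min(values) for name, values in columns.items()}

    # Step 2 (a distributed filter in a real engine): keep only the rows in which
    # at least one column equals its minimum. This subset is assumed to be small.
    candidates = [
        (label, {name: values[i] for name, values in columns.items()})
        for i, label in enumerate(index)
        if any(values[i] == mins[name] for name, values in columns.items())
    ]

    # Step 3 (local work on the collected subset): first matching label per column.
    return {
        name: next(label for label, row in candidates if row[name] == mins[name])
        for name in columns
    }


data = {
    "a": [1, 2, 3, 2],
    "b": [4.0, 2.0, 3.0, 1.0],
    "c": [300, 200, 400, 200],
}
result = idxmin_via_small_collect([0, 1, 2, 3], data)
print(result)
```

Only the rows labelled 0, 1 and 3 survive the filter in step 2, so the final local search runs over three rows instead of the whole frame.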
- Parameters
  - axis: 0 or ‘index’
    Can only be set to 0 now.
- Returns
  - Series
Examples
>>> psdf = ps.DataFrame({'a': [1, 2, 3, 2],
...                      'b': [4.0, 2.0, 3.0, 1.0],
...                      'c': [300, 200, 400, 200]})
>>> psdf
   a    b    c
0  1  4.0  300
1  2  2.0  200
2  3  3.0  400
3  2  1.0  200
>>> psdf.idxmin()
a    0
b    3
c    1
dtype: int64
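Column 'c' reaches its minimum of 200 at both index 1 and index 3, and the result reports 1 because the first occurrence is the one returned. The following plain-Python sketch illustrates that tie-breaking rule together with the exclusion of missing values. The helper `first_min_label` is a hypothetical name used for illustration and is not part of pyspark.pandas.

```python
import math


def first_min_label(labels, values):
    """Return the label of the first minimum in `values`, skipping None and NaN."""
    best_label, best_value = None, None
    for label, value in zip(labels, values):
        # Treat None and float NaN as missing, so they never become the minimum.
        if value is None or (isinstance(value, float) and math.isnan(value)):
            continue
        # The strict "<" keeps the earliest label when the minimum is tied.
        if best_value is None or value < best_value:
            best_label, best_value = label, value
    return best_label


labels = [0, 1, 2, 3]
tied = first_min_label(labels, [300, 200, 400, 200])  # same values as column 'c'
with_missing = first_min_label(labels, [float("nan"), 2.0, None, 1.0])
print(tied, with_missing)
```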
For Multi-column Index
>>> psdf = ps.DataFrame({'a': [1, 2, 3, 2],
...                      'b': [4.0, 2.0, 3.0, 1.0],
...                      'c': [300, 200, 400, 200]})
>>> psdf.columns = pd.MultiIndex.from_tuples([('a', 'x'), ('b', 'y'), ('c', 'z')])
>>> psdf
   a    b    c
   x    y    z
0  1  4.0  300
1  2  2.0  200
2  3  3.0  400
3  2  1.0  200
>>> psdf.idxmin()
a  x    0
b  y    3
c  z    1
dtype: int64