pyspark.pandas.groupby.GroupBy.sem

GroupBy.sem(ddof=1)

Compute standard error of the mean of groups, excluding missing values.

New in version 3.4.0.

Parameters

ddof : int, default 1
    Delta Degrees of Freedom. The divisor used in calculations is N - ddof, where N represents the number of non-missing elements in each group.
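As a rough illustration of how ddof enters the calculation, the sketch below computes a standard error of the mean in plain Python. It assumes the conventional definition (standard deviation with divisor N - ddof, divided by the square root of N) and is not the library's implementation; the helper name `sem` and its NaN handling are illustrative choices.

```python
import math


def sem(values, ddof=1):
    """Standard error of the mean of `values`, skipping None/NaN entries.

    Hypothetical helper for illustration only; not part of pyspark.pandas.
    """
    # Drop missing values first, mirroring "excluding missing values".
    data = [
        float(v)
        for v in values
        if v is not None and not (isinstance(v, float) and math.isnan(v))
    ]
    n = len(data)
    # If N - ddof is not positive the divisor is undefined, so report NaN.
    if n - ddof <= 0:
        return float("nan")
    mean = sum(data) / n
    # Variance with divisor N - ddof, as described for the ddof parameter.
    variance = sum((x - mean) ** 2 for x in data) / (n - ddof)
    # Standard error: standard deviation divided by sqrt(N).
    return math.sqrt(variance) / math.sqrt(n)


print(round(sem([3, 3, 4]), 6))  # 0.333333
print(round(sem([3, 4]), 6))     # 0.5
print(sem([None, 3]))            # nan
```

Under these assumptions, a group with a single non-missing value yields NaN when ddof=1, because the divisor N - ddof is zero.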

Examples

>>> import pyspark.pandas as ps
>>> df = ps.DataFrame({"A": [1, 2, 1, 1], "B": [True, False, False, True],
...                    "C": [3, None, 3, 4], "D": ["a", "b", "b", "a"]})

>>> df.groupby("A").sem()
          B         C
A
1  0.333333  0.333333
2       NaN       NaN

>>> df.groupby("D").sem(ddof=1)
     A    B    C
D
a  0.0  0.0  0.5
b  0.5  0.0  NaN

>>> df.B.groupby(df.A).sem()
A
1    0.333333
2         NaN
Name: B, dtype: float64