plot.box
Make a box plot of the Series columns.
Additional keyword arguments are documented in pyspark.pandas.Series.plot().
The precision argument is used by pandas-on-Spark to compute approximate statistics for building a boxplot. Use smaller values to get more precise statistics (matplotlib-only).
plotly.graph_objs.Figure
Return a custom object when backend!=plotly. Return an ndarray when subplots=True (matplotlib-only).
Notes
There are behavior differences between pandas-on-Spark and pandas.
pandas-on-Spark computes approximate statistics, so expect differences between pandas and pandas-on-Spark boxplots, especially in the 1st and 3rd quartiles.
The whis argument is only supported as a single number.
pandas-on-Spark doesn’t support the following arguments (matplotlib-only):
bootstrap argument is not supported
autorange argument is not supported
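To make the whis note concrete, here is a minimal sketch in plain Python of how a single-number whis conventionally sets the whisker limits of a box plot (the usual Tukey rule: Q1 - whis * IQR and Q3 + whis * IQR). It is not the pandas-on-Spark implementation: the helper name box_stats is hypothetical, and it uses exact quartiles from the standard library, whereas pandas-on-Spark computes approximate ones.

```python
import statistics


def box_stats(values, whis=1.5):
    """Hypothetical helper: exact box-plot statistics for a list of numbers."""
    data = sorted(values)
    # Exact quartiles (assumption: linear interpolation, "inclusive" method).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # A single-number whis scales the IQR to get the fences.
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr
    # Whiskers end at the most extreme points that lie inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
    }


stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
print(stats)
```

With whis=1.5 the value 100 falls outside the upper fence and is reported as an outlier. Because pandas-on-Spark approximates the quartiles, the fences it derives may differ slightly from an exact calculation like this one.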
Examples
Draw a box plot from a DataFrame with four columns of randomly generated data.
For Series:
>>> import numpy as np
>>> import pyspark.pandas as ps
>>> data = np.random.randn(25, 4)
>>> df = ps.DataFrame(data, columns=list('ABCD'))
>>> df['A'].plot.box()
This function is not supported for the DataFrame type.