DataFrame.dropna
Returns a new DataFrame omitting rows with null values. DataFrame.dropna() and DataFrameNaFunctions.drop() are aliases of each other.
New in version 1.3.1.
Changed in version 3.4.0: Supports Spark Connect.
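Parameters
how : str, optional
    'any' or 'all'. Defaults to 'any'. If 'any', drop a row if it contains any nulls. If 'all', drop a row only if all its values are null.
thresh : int, optional
    Defaults to None. If specified, drop rows that have fewer than thresh non-null values. This overrides the how parameter.
subset : str, tuple or list, optional
    Column names to consider. If omitted, all columns are considered.
Returns
DataFrame
    A new DataFrame without the rows that met the drop condition.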
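The way how, thresh, and subset interact can be sketched as a small pure-Python model. This is an illustration only, not Spark's implementation: the function name dropna_model and the list-of-dicts input are hypothetical, only None is treated as null, and any special handling Spark applies to NaN values is not modeled.

```python
from typing import Any, Dict, List, Optional, Sequence

Record = Dict[str, Any]


def dropna_model(
    rows: List[Record],
    how: str = "any",
    thresh: Optional[int] = None,
    subset: Optional[Sequence[str]] = None,
) -> List[Record]:
    """Hypothetical model of the row filter described above."""
    if how not in ("any", "all"):
        raise ValueError("how must be 'any' or 'all'")

    kept = []
    for row in rows:
        # Only the columns named in subset are inspected (all columns by default).
        cols = list(subset) if subset is not None else list(row.keys())
        non_null = sum(1 for c in cols if row[c] is not None)

        if thresh is not None:
            # thresh takes precedence over how.
            required = thresh
        elif how == "any":
            # Any null drops the row, so every inspected column must be non-null.
            required = len(cols)
        else:
            # how == "all": one non-null value is enough to keep the row.
            required = 1

        if non_null >= required:
            kept.append(row)
    return kept


rows = [
    {"age": 10, "height": 80, "name": "Alice"},
    {"age": 5, "height": None, "name": "Bob"},
    {"age": None, "height": None, "name": "Tom"},
    {"age": None, "height": None, "name": None},
]

any_names = [r["name"] for r in dropna_model(rows)]
all_names = [r["name"] for r in dropna_model(rows, how="all")]
thresh_names = [r["name"] for r in dropna_model(rows, thresh=2)]
subset_names = [r["name"] for r in dropna_model(rows, subset=["name"])]
override_names = [r["name"] for r in dropna_model(rows, how="any", thresh=1)]
```

In this model, how='any' behaves like a threshold equal to the number of inspected columns and how='all' behaves like a threshold of 1, which is why passing thresh makes how irrelevant.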
Examples
>>> from pyspark.sql import Row
>>> df = spark.createDataFrame([
...     Row(age=10, height=80, name="Alice"),
...     Row(age=5, height=None, name="Bob"),
...     Row(age=None, height=None, name="Tom"),
...     Row(age=None, height=None, name=None),
... ])
>>> df.na.drop().show()
+---+------+-----+
|age|height| name|
+---+------+-----+
| 10|    80|Alice|
+---+------+-----+