pyspark.sql.functions.first_value
pyspark.sql.functions.first_value(col: ColumnOrName, ignoreNulls: Union[bool, pyspark.sql.column.Column, None] = None) → pyspark.sql.column.Column
Returns the first value of col for a group of rows. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
New in version 3.5.0.
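- Parameters
col : Column or str
column to take the first value from.
ignoreNulls : Column or bool, optional
if true, nulls are skipped and the first non-null value is returned.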
- Returns
Column
some value of col for a group of rows.
Examples
>>> import pyspark.sql.functions as sf
>>> spark.createDataFrame(
...     [(None, 1), ("a", 2), ("a", 3), ("b", 8), ("b", 2)], ["a", "b"]
... ).select(sf.first_value('a'), sf.first_value('b')).show()
+--------------+--------------+
|first_value(a)|first_value(b)|
+--------------+--------------+
|          NULL|             1|
+--------------+--------------+

>>> import pyspark.sql.functions as sf
>>> spark.createDataFrame(
...     [(None, 1), ("a", 2), ("a", 3), ("b", 8), ("b", 2)], ["a", "b"]
... ).select(sf.first_value('a', True), sf.first_value('b', True)).show()
+--------------+--------------+
|first_value(a)|first_value(b)|
+--------------+--------------+
|             a|             1|
+--------------+--------------+
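
The null-handling rule can also be sketched in plain Python, without a Spark session. This is only an illustrative model of the documented behavior for a single group of rows, not Spark's implementation; the helper name first_value_model and its ignore_nulls parameter are hypothetical, and None stands in for SQL NULL.

```python
from typing import Iterable, Optional, TypeVar

T = TypeVar("T")


def first_value_model(values: Iterable[Optional[T]], ignore_nulls: bool = False) -> Optional[T]:
    """Hypothetical pure-Python model of first_value over one group of rows."""
    for v in values:
        # Without ignore_nulls the very first row wins, even if it is None.
        # With ignore_nulls, None entries are skipped.
        if not ignore_nulls or v is not None:
            return v
    # Empty group, or every value was None while ignoring nulls.
    return None


# Same data as the examples above, one list per column.
col_a = [None, "a", "a", "b", "b"]
col_b = [1, 2, 3, 8, 2]

print(first_value_model(col_a))        # None
print(first_value_model(col_a, True))  # a
print(first_value_model(col_b))        # 1
```

The model reads the rows in list order. In Spark the result depends on the order in which rows reach the aggregate, so the sketch mirrors the examples above only under the assumption that the rows are seen in the order they were created.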