pyspark.sql.functions.first_value
pyspark.sql.functions.first_value(col, ignoreNulls=None)
Returns the first value of col for a group of rows. It will return the first non-null value it sees when ignoreNulls is set to true. If all values are null, then null is returned.
New in version 3.5.0.
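- Parameters
col : Column or str
target column to work on.
ignoreNulls : Column or bool, optional
if the first value is null, look for the first non-null value.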
- Returns
Column
the first value of col for a group of rows (the first non-null value when ignoreNulls is true).
Examples
>>> import pyspark.sql.functions as sf
>>> spark.createDataFrame(
...     [(None, 1), ("a", 2), ("a", 3), ("b", 8), ("b", 2)], ["a", "b"]
... ).select(sf.first_value('a'), sf.first_value('b')).show()
+--------------+--------------+
|first_value(a)|first_value(b)|
+--------------+--------------+
|          NULL|             1|
+--------------+--------------+
>>> import pyspark.sql.functions as sf
>>> spark.createDataFrame(
...     [(None, 1), ("a", 2), ("a", 3), ("b", 8), ("b", 2)], ["a", "b"]
... ).select(sf.first_value('a', True), sf.first_value('b', True)).show()
+--------------+--------------+
|first_value(a)|first_value(b)|
+--------------+--------------+
|             a|             1|
+--------------+--------------+
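The ignoreNulls behaviour described above can be modelled in plain Python. The sketch below is not Spark's implementation: `first_value_model` is a hypothetical helper that walks one group of rows in order, using the same column values as the examples. It assumes the rows arrive in a fixed order, which Spark itself may not guarantee after a shuffle.

```python
from typing import Any, Iterable, Optional


def first_value_model(values: Iterable[Any], ignore_nulls: bool = False) -> Optional[Any]:
    """Hypothetical pure-Python model of first_value over one group of rows."""
    for value in values:
        if ignore_nulls and value is None:
            continue  # skip nulls until a non-null value appears
        return value  # first value seen (may be None when ignore_nulls is False)
    return None  # empty group, or every value was null


# Same data as the examples: columns "a" and "b" of the five rows.
column_a = [None, "a", "a", "b", "b"]
column_b = [1, 2, 3, 8, 2]

print(first_value_model(column_a))               # None (shown as NULL by Spark)
print(first_value_model(column_a, True))         # a
print(first_value_model(column_b))               # 1
print(first_value_model([None, None], True))     # None: all values are null
```

The model makes the two cases explicit: with ignoreNulls unset, a leading null is returned as-is; with ignoreNulls set to true, nulls are skipped and null is returned only if nothing else is found.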