pyspark.sql.functions.min_by
Aggregate function: returns the value of the first argument from the row in which the second argument, ord, is smallest.
New in version 3.3.0.
Changed in version 3.4.0: Supports Spark Connect.
Parameters
col : Column or str
    target column to compute on.
ord : Column or str
    column to be minimized.
Returns
Column
    value associated with the minimum value of ord.
Examples
>>> from pyspark.sql.functions import min_by
>>> df = spark.createDataFrame([
...     ("Java", 2012, 20000), ("dotNET", 2012, 5000),
...     ("dotNET", 2013, 48000), ("Java", 2013, 30000)],
...     schema=("course", "year", "earnings"))
>>> df.groupby("course").agg(min_by("year", "earnings")).show()
+------+----------------------+
|course|min_by(year, earnings)|
+------+----------------------+
|  Java|                  2012|
|dotNET|                  2012|
+------+----------------------+
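To make the semantics concrete, the grouped aggregation in the example above can be mimicked in plain Python without a Spark session. This is only an illustrative sketch, not how Spark implements the function: the helper name `min_by_per_group` is hypothetical, and the sketch assumes there are no nulls and no ties in the ordering column (on a tie, Python's `min` keeps the first pair it sees, which is a property of this sketch and not a statement about Spark).

```python
from collections import defaultdict

# Same rows as the example: (course, year, earnings)
rows = [
    ("Java", 2012, 20000), ("dotNET", 2012, 5000),
    ("dotNET", 2013, 48000), ("Java", 2013, 30000),
]


def min_by_per_group(rows):
    """Hypothetical helper: for each course, return the year whose earnings are lowest."""
    groups = defaultdict(list)
    for course, year, earnings in rows:
        # Keep (value, ordering key) pairs per group
        groups[course].append((year, earnings))
    # Pick the pair with the smallest ordering key, then keep only its value
    return {
        course: min(pairs, key=lambda pair: pair[1])[0]
        for course, pairs in groups.items()
    }


result = min_by_per_group(rows)
print(result)  # {'Java': 2012, 'dotNET': 2012}
```

The result matches the table above: for each course, the year returned is the one with the lowest earnings, not the lowest year.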