pyspark.sql.functions.window_time¶
-
pyspark.sql.functions.
window_time
(windowColumn: ColumnOrName) → pyspark.sql.column.Column[source]¶ Computes the event time from a window column. The column window values are produced by window aggregating operators and are of type STRUCT<start: TIMESTAMP, end: TIMESTAMP> where start is inclusive and end is exclusive. The event time of records produced by window aggregating operators can be computed as
window_time(window)
and arewindow.end - lit(1).alias("microsecond")
(as microsecond is the minimal supported event time precision). The window column must be one produced by a window aggregating operator.New in version 3.4.0.
- Parameters
- windowColumn
Column
The window column of a window aggregate records.
- windowColumn
- Returns
Column
the column for computed results.
Notes
Supports Spark Connect.
Examples
>>> import datetime >>> df = spark.createDataFrame( ... [(datetime.datetime(2016, 3, 11, 9, 0, 7), 1)], ... ).toDF("date", "val")
Group the data into 5 second time windows and aggregate as sum.
>>> w = df.groupBy(window("date", "5 seconds")).agg(sum("val").alias("sum"))
Extract the window event time using the window_time function.
>>> w.select( ... w.window.end.cast("string").alias("end"), ... window_time(w.window).cast("string").alias("window_time"), ... "sum" ... ).collect() [Row(end='2016-03-11 09:00:10', window_time='2016-03-11 09:00:09.999999', sum=1)]