pyspark.sql.streaming.DataStreamWriter.trigger
DataStreamWriter.trigger(*, processingTime: Optional[str] = None, once: Optional[bool] = None, continuous: Optional[str] = None, availableNow: Optional[bool] = None) → pyspark.sql.streaming.readwriter.DataStreamWriter

Set the trigger for the stream query. If this is not set, the query runs as fast as possible, which is equivalent to setting the trigger to processingTime='0 seconds'.

New in version 2.0.0.
Changed in version 3.5.0: Supports Spark Connect.
Parameters
- processingTime : str, optional
  a processing time interval as a string, e.g. ‘5 seconds’, ‘1 minute’. Set a trigger that runs a microbatch query periodically based on the processing time. Only one trigger can be set.
- once : bool, optional
  if set to True, set a trigger that processes only one batch of data in a streaming query and then terminates the query. Only one trigger can be set.
- continuous : str, optional
  a time interval as a string, e.g. ‘5 seconds’, ‘1 minute’. Set a trigger that runs a continuous query with a given checkpoint interval. Only one trigger can be set.
- availableNow : bool, optional
  if set to True, set a trigger that processes all available data in multiple batches and then terminates the query. Only one trigger can be set (see the sketch following this list).
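As a quick illustration of the single-trigger constraint (a sketch, not part of the upstream docstring; it assumes an active SparkSession bound to spark, as in the Examples below, and that the raised error is a ValueError or a subclass of it, which holds in recent Spark releases):
>>> df = spark.readStream.format("rate").load()
>>> try:  # at least one trigger option must be supplied
...     df.writeStream.trigger()
... except ValueError:
...     print("no trigger provided")
no trigger provided
>>> try:  # and at most one may be supplied per call
...     df.writeStream.trigger(processingTime='5 seconds', once=True)
... except ValueError:
...     print("only one trigger can be set")
only one trigger can be set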
Notes
This API is evolving.
Examples
>>> df = spark.readStream.format("rate").load()
Trigger the query for execution every 5 seconds
>>> df.writeStream.trigger(processingTime='5 seconds')
<...streaming.readwriter.DataStreamWriter object ...>
Trigger the query for execution in continuous mode with a 5-second checkpoint interval
>>> df.writeStream.trigger(continuous='5 seconds')
<...streaming.readwriter.DataStreamWriter object ...>
Trigger the query for reading all available data with multiple batches
>>> df.writeStream.trigger(availableNow=True)
<...streaming.readwriter.DataStreamWriter object ...>
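Putting it together, a minimal end-to-end sketch (an illustration under assumptions, not from the upstream docstring): drain the rate source's backlog to the console sink and let the query stop on its own. The variable name query is ours, and the console sink simply prints each microbatch.
>>> query = df.writeStream.format("console").trigger(availableNow=True).start()  # doctest: +SKIP
>>> query.awaitTermination()  # doctest: +SKIP
Because availableNow processes the data available at start time and then terminates the query, awaitTermination() returns once the backlog is drained; with a processingTime or continuous trigger the query keeps running until stopped explicitly.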