pyspark.sql.DataFrame.toLocalIterator

DataFrame.toLocalIterator(prefetchPartitions: bool = False) → Iterator[pyspark.sql.types.Row]

Returns an iterator that contains all of the rows in this DataFrame. The iterator will consume as much memory as the largest partition in this DataFrame. With prefetch it may consume up to the memory of the two largest partitions.

New in version 2.0.0.

Changed in version 3.4.0: Supports Spark Connect.
Parameters
----------
prefetchPartitions : bool, optional
    Whether Spark should pre-fetch the next partition before it is needed.

    Changed in version 3.4.0: This argument does not take effect for Spark Connect.
Returns
-------
Iterator
    Iterator of rows.
Examples
--------
>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
>>> list(df.toLocalIterator())
[Row(age=14, name='Tom'), Row(age=23, name='Alice'), Row(age=16, name='Bob')]