pyspark.sql.DataFrame.tail
DataFrame.tail(num: int) → List[pyspark.sql.types.Row]

Returns the last num rows as a list of Row.

Running tail requires moving data into the application’s driver process, and doing so with a very large num can crash the driver process with OutOfMemoryError.

New in version 3.0.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- num : int
Number of records to return. Returns this many records, or all records if the DataFrame contains fewer than this number.
- Returns
- list
List of rows
Examples
>>> df = spark.createDataFrame(
...     [(14, "Tom"), (23, "Alice"), (16, "Bob")], ["age", "name"])
>>> df.tail(2)
[Row(age=23, name='Alice'), Row(age=16, name='Bob')]
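
Because tail moves the requested rows into the driver process, it can help to cap num before calling it. The following is a minimal sketch of such a guard, not part of PySpark: the helper name safe_tail and the default cap MAX_TAIL_ROWS are hypothetical, and a sensible cap depends on row width and available driver memory.

```python
from typing import Any, List

# Hypothetical default cap (an assumption, not a PySpark setting).
# Choose a value that fits the driver's memory and the width of the rows.
MAX_TAIL_ROWS = 10_000


def safe_tail(df: Any, num: int, max_rows: int = MAX_TAIL_ROWS) -> List[Any]:
    """Call df.tail(num) only when num does not exceed a caller-chosen cap.

    df is expected to be a pyspark.sql.DataFrame, but any object exposing
    tail(num) works, since the helper only forwards the call.
    """
    if num > max_rows:
        raise ValueError(
            f"Refusing to collect {num} rows to the driver; cap is {max_rows}."
        )
    # Forward to DataFrame.tail, which collects the rows into the driver.
    return df.tail(num)
```

With the df from the example above, safe_tail(df, 2) simply forwards to df.tail(2), while a request above the cap raises ValueError before any data is moved to the driver.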