pyspark.sql.DataFrame.head

DataFrame.head(n: Optional[int] = None) → Union[pyspark.sql.types.Row, None, List[pyspark.sql.types.Row]]

Returns the first n rows.

New in version 1.3.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters

n : int, optional
    Number of rows to return. Defaults to None, in which case only the first row is returned.

Returns

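If n is supplied (including n=1), return a list of Row with at most n elements; the list is shorter if the DataFrame has fewer rows.
If n is omitted, return a single Row, or None if the DataFrame is empty.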

Notes

This method should only be used if the result is expected to be small, as all of the returned rows are loaded into the driver’s memory.

Examples

>>> df = spark.createDataFrame([
...     (2, "Alice"), (5, "Bob")], schema=["age", "name"])
>>> df.head()
Row(age=2, name='Alice')
>>> df.head(1)
[Row(age=2, name='Alice')]
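
The two outputs above differ in shape: head() yields a bare Row, while head(1) yields a one-element list. The following is a minimal plain-Python sketch of that rule, assuming n is non-negative. The helper name head_like is hypothetical and is not part of PySpark; it only models the return shape on an ordinary sequence and does not touch Spark.

```python
from typing import List, Optional, Sequence, TypeVar, Union

T = TypeVar("T")


def head_like(rows: Sequence[T], n: Optional[int] = None) -> Union[T, None, List[T]]:
    """Hypothetical model of the return shape of head(); not PySpark code.

    Assumes n is None or a non-negative integer.
    """
    if n is None:
        # No argument: a single item, or None when there is nothing to return.
        return rows[0] if rows else None
    # Explicit n (including 1): always a list, possibly shorter than n.
    return list(rows[:n])


rows = [(2, "Alice"), (5, "Bob")]
print(head_like(rows))     # a single item
print(head_like(rows, 1))  # a one-element list
print(head_like(rows, 5))  # both items; fewer than the 5 requested
print(head_like([]))       # None for an empty input
```

Because the return type depends on whether n is passed, code that always wants a list should pass n explicitly, and code that calls head() with no argument should be prepared to handle None.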
