spark.repartition
Returns a new DataFrame partitioned by the given partitioning expressions. The resulting DataFrame is hash partitioned.
Parameters
num_partitions : int
    The target number of partitions.
Examples
>>> psdf = ps.DataFrame({"age": [5, 5, 2, 2],
...         "name": ["Bob", "Bob", "Alice", "Alice"]}).set_index("age")
>>> psdf.sort_index()
      name
age
2    Alice
2    Alice
5      Bob
5      Bob
>>> new_psdf = psdf.spark.repartition(7)
>>> new_psdf.to_spark().rdd.getNumPartitions()
7
>>> new_psdf.sort_index()
      name
age
2    Alice
2    Alice
5      Bob
5      Bob
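As a rough illustration of what "hash partitioned" means in general, the following pure-Python sketch assigns rows to buckets by hashing a key. It is a conceptual model only and not Spark's implementation: the helper `hash_partition` and its arguments are hypothetical names introduced here, and Python's built-in `hash` stands in for whatever hash function an engine actually uses.

```python
from collections import defaultdict


def hash_partition(rows, key, num_partitions):
    """Hypothetical helper: bucket rows by hashing one field (illustrative only)."""
    partitions = defaultdict(list)
    for row in rows:
        # Equal keys hash to the same value, so they land in the same bucket.
        partitions[hash(row[key]) % num_partitions].append(row)
    return dict(partitions)


rows = [
    {"age": 5, "name": "Bob"},
    {"age": 5, "name": "Bob"},
    {"age": 2, "name": "Alice"},
    {"age": 2, "name": "Alice"},
]

buckets = hash_partition(rows, key="age", num_partitions=7)
for index in sorted(buckets):
    print(index, [row["name"] for row in buckets[index]])
```

In this sketch only two of the seven buckets receive rows, because the data has just two distinct keys. A target partition count larger than the number of distinct keys therefore leaves some buckets empty, and the row contents are unchanged, which is why the sorted output in the example above is identical before and after repartitioning.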