pyspark.SparkContext.range
SparkContext.range(start: int, end: Optional[int] = None, step: int = 1, numSlices: Optional[int] = None) → pyspark.rdd.RDD[int]
Create a new RDD of int containing the elements from start to end (exclusive), with consecutive elements differing by step. Can be called the same way as Python’s built-in range() function. If called with a single argument, the argument is interpreted as end, and start is set to 0.
New in version 1.5.0.
Parameters
- start : int
  the start value
- end : int, optional
  the end value (exclusive)
- step : int, optional, default 1
  the incremental step
- numSlices : int, optional
  the number of partitions of the new RDD
Returns
- RDD
  An RDD of int
Examples
>>> sc.range(5).collect()
[0, 1, 2, 3, 4]
>>> sc.range(2, 4).collect()
[2, 3]
>>> sc.range(1, 7, 2).collect()
[1, 3, 5]
Generate an RDD with a negative step
>>> sc.range(5, 0, -1).collect()
[5, 4, 3, 2, 1]
>>> sc.range(0, 5, -1).collect()
[]
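The argument handling described above mirrors Python’s built-in range(). The following is a minimal pure-Python sketch of that behavior, useful for predicting which elements an RDD would contain without starting Spark. The helper name range_elements is hypothetical and is not part of PySpark; it illustrates only the documented semantics, not how PySpark is implemented.

```python
from typing import List, Optional


def range_elements(start: int, end: Optional[int] = None, step: int = 1) -> List[int]:
    """Hypothetical helper: list the values sc.range() is documented to produce."""
    if end is None:
        # Single-argument form: the argument is interpreted as end, start is 0.
        start, end = 0, start
    # The built-in range() already excludes end and supports negative steps.
    return list(range(start, end, step))


print(range_elements(5))         # [0, 1, 2, 3, 4]
print(range_elements(1, 7, 2))   # [1, 3, 5]
print(range_elements(5, 0, -1))  # [5, 4, 3, 2, 1]
print(range_elements(0, 5, -1))  # []
```

The last call returns an empty list because a negative step can never move from 0 toward 5, which matches the empty RDD in the example above.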
Control the number of partitions
>>> sc.range(5, numSlices=1).getNumPartitions()
1
>>> sc.range(5, numSlices=10).getNumPartitions()
10