pyspark.SparkContext.broadcast
SparkContext.broadcast(value: T) → pyspark.broadcast.Broadcast[T]

Broadcast a read-only variable to the cluster, returning a Broadcast object for reading it in distributed functions. The variable is sent to each node only once.

New in version 0.7.0.
- Parameters
- value : T
Value to broadcast to the Spark nodes.
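- Returns
- Broadcast[T]
A Broadcast object whose value attribute gives read-only access to the broadcast value.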
Examples
>>> mapping = {1: 10001, 2: 10002}
>>> bc = sc.broadcast(mapping)

>>> rdd = sc.range(5)
>>> rdd2 = rdd.map(lambda i: bc.value[i] if i in bc.value else -1)
>>> rdd2.collect()
[-1, 10001, 10002, -1, -1]

>>> bc.destroy()
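
The same pattern can be wrapped in a small helper so that the broadcast is always released, even if the job fails. This is only a minimal sketch: `translate_ids` and its parameters are hypothetical names, not part of PySpark, and `sc` is assumed to be an already running SparkContext.

```python
def translate_ids(sc, mapping, n, default=-1):
    """Map each integer in range(n) through `mapping` on the cluster.

    Hypothetical helper for illustration only. `sc` is assumed to be an
    active pyspark.SparkContext and `mapping` a picklable dict.
    """
    # Ship the dict to the cluster once; tasks read it through bc.value.
    bc = sc.broadcast(mapping)
    try:
        # dict.get handles missing keys with a single lookup.
        return sc.range(n).map(lambda i: bc.value.get(i, default)).collect()
    finally:
        # Release the broadcast data whether or not the job succeeded.
        bc.destroy()
```

With the mapping from the example above, `translate_ids(sc, {1: 10001, 2: 10002}, 5)` should return the same list, `[-1, 10001, 10002, -1, -1]`. Collecting inside the `try` block matters here, because the broadcast can no longer be read once `destroy()` has been called.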