pyspark.SparkConf
Configuration for a Spark application. Used to set various Spark parameters as key-value pairs.
Most of the time, you would create a SparkConf object with SparkConf(), which will load values from spark.* Java system properties as well. In this case, any parameters you set directly on the SparkConf object take priority over system properties.
For unit tests, you can also call SparkConf(False) to skip loading external settings and get the same configuration no matter what the system properties are.
All setter methods in this class support chaining. For example, you can write conf.setMaster("local").setAppName("My app").
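For instance, a brief illustrative sketch of chained setters combined with SparkConf(False); the master URL "local[2]" and the app name below are placeholder values chosen for illustration, not required settings:

>>> conf = SparkConf(False)                 # skip spark.* system properties
>>> conf.contains("spark.master")
False
>>> conf.setMaster("local[2]").setAppName("Chained app")
<pyspark.conf.SparkConf object at ...>
>>> conf.get("spark.master")
'local[2]'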
Parameters
loadDefaults : bool
    Whether to load values from Java system properties (True by default).
_jvm : py4j.java_gateway.JVMView
    Internal parameter used to pass a handle to the Java VM; does not need to be set by users.
_jconf : py4j.java_gateway.JavaObject
    Optionally pass in an existing SparkConf handle to use its parameters.
Notes
Once a SparkConf object is passed to Spark, it is cloned and can no longer be modified by the user.
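For instance (an illustrative sketch, assuming a local Spark installation; the application names are placeholders), changes made to the SparkConf object after the context is created are not picked up by that context:

>>> conf = SparkConf().setMaster("local").setAppName("My app")
>>> sc = SparkContext(conf=conf)
>>> conf.set("spark.app.name", "Another name")   # modifies only this Python object
<pyspark.conf.SparkConf object at ...>
>>> sc.appName                                   # the context keeps its cloned value
'My app'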
Examples
>>> from pyspark.conf import SparkConf
>>> from pyspark.context import SparkContext
>>> conf = SparkConf()
>>> conf.setMaster("local").setAppName("My app")
<pyspark.conf.SparkConf object at ...>
>>> conf.get("spark.master")
'local'
>>> conf.get("spark.app.name")
'My app'
>>> sc = SparkContext(conf=conf)
>>> sc.master
'local'
>>> sc.appName
'My app'
>>> sc.sparkHome is None
True
>>> conf = SparkConf(loadDefaults=False)
>>> conf.setSparkHome("/path")
<pyspark.conf.SparkConf object at ...>
>>> conf.get("spark.home")
'/path'
>>> conf.setExecutorEnv("VAR1", "value1")
<pyspark.conf.SparkConf object at ...>
>>> conf.setExecutorEnv(pairs=[("VAR3", "value3"), ("VAR4", "value4")])
<pyspark.conf.SparkConf object at ...>
>>> conf.get("spark.executorEnv.VAR1")
'value1'
>>> print(conf.toDebugString())
spark.executorEnv.VAR1=value1
spark.executorEnv.VAR3=value3
spark.executorEnv.VAR4=value4
spark.home=/path
>>> for p in sorted(conf.getAll(), key=lambda p: p[0]):
...     print(p)
('spark.executorEnv.VAR1', 'value1')
('spark.executorEnv.VAR3', 'value3')
('spark.executorEnv.VAR4', 'value4')
('spark.home', '/path')
>>> conf._jconf.setExecutorEnv("VAR5", "value5")
JavaObject id...
>>> print(conf.toDebugString())
spark.executorEnv.VAR1=value1
spark.executorEnv.VAR3=value3
spark.executorEnv.VAR4=value4
spark.executorEnv.VAR5=value5
spark.home=/path
Methods
contains(key)
    Does this configuration contain a given key?
get(key[, defaultValue])
    Get the configured value for some key, or return a default otherwise.
getAll()
    Get all values as a list of key-value pairs.
set(key, value)
    Set a configuration property.
setAll(pairs)
    Set multiple parameters, passed as a list of key-value pairs.
setAppName(value)
    Set application name.
setExecutorEnv([key, value, pairs])
    Set an environment variable to be passed to executors.
setIfMissing(key, value)
    Set a configuration property, if not already set.
setMaster(value)
    Set master URL to connect to.
setSparkHome(value)
    Set path where Spark is installed on worker nodes.
toDebugString()
    Returns a printable version of the configuration, as a list of key=value pairs, one per line.
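As a brief usage sketch of several of the methods above (the configuration keys are standard Spark settings, but the values are placeholders chosen for illustration):

>>> conf = SparkConf(loadDefaults=False)
>>> conf.setAll([("spark.executor.memory", "1g"), ("spark.app.name", "Methods demo")])
<pyspark.conf.SparkConf object at ...>
>>> conf.setIfMissing("spark.app.name", "Ignored")   # already set, so this has no effect
<pyspark.conf.SparkConf object at ...>
>>> conf.contains("spark.master")
False
>>> conf.get("spark.master", "local[*]")             # fall back to the supplied default
'local[*]'
>>> sorted(conf.getAll())
[('spark.app.name', 'Methods demo'), ('spark.executor.memory', '1g')]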