pyspark.sql.DataFrameReader.option

DataFrameReader.option(key, value)

Adds an input option for the underlying data source.

New in version 1.5.0.

Changed in version 3.4.0: Supports Spark Connect.

Parameters
key : str

The key for the option to set.

value

The value for the option to set.

Examples

>>> spark.read.option("key", "value")
<...readwriter.DataFrameReader object ...>

Specify the ‘nullValue’ option when reading a CSV file.

>>> import tempfile
>>> with tempfile.TemporaryDirectory(prefix="option") as d:
...     # Write a DataFrame into a CSV file
...     df = spark.createDataFrame([{"age": 100, "name": "Hyukjin Kwon"}])
...     df.write.mode("overwrite").format("csv").save(d)
...
...     # Read the CSV file as a DataFrame with 'nullValue' option set to 'Hyukjin Kwon'.
...     spark.read.schema(df.schema).option(
...         "nullValue", "Hyukjin Kwon").format('csv').load(d).show()
+---+----+
|age|name|
+---+----+
|100|NULL|
+---+----+
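
Because option() returns a DataFrameReader, calls can be chained to set several options before loading. The sketch below is one possible way to apply a dictionary of options by calling option() once per entry. The helper names apply_options and read_csv and the CSV_OPTIONS values are assumptions made for illustration and are not part of PySpark; "header", "sep" and "nullValue" are options of the built-in CSV data source.

```python
from typing import Any, Mapping


def apply_options(reader: Any, options: Mapping[str, Any]) -> Any:
    """Call reader.option(key, value) once per entry and return the result.

    `reader` is expected to be a pyspark.sql.DataFrameReader (for example
    `spark.read`); any object with a chainable `option` method also works.
    """
    for key, value in options.items():
        # option() returns a DataFrameReader, so rebind and keep chaining.
        reader = reader.option(key, value)
    return reader


# Hypothetical settings for a semicolon-delimited CSV file with a header row.
CSV_OPTIONS = {"header": True, "sep": ";", "nullValue": "N/A"}


def read_csv(spark: Any, path: str) -> Any:
    """Read `path` as CSV with CSV_OPTIONS applied; `spark` is a SparkSession."""
    return apply_options(spark.read, CSV_OPTIONS).format("csv").load(path)
```

With an active session, read_csv(spark, d) would return a DataFrame for the files under d. When all options are known up front, DataFrameReader.options accepts them as keyword arguments in a single call, for example spark.read.options(**CSV_OPTIONS).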