pyspark.sql.DataFrameWriter

class pyspark.sql.DataFrameWriter(df: DataFrame)

Interface used to write a DataFrame to external storage systems (e.g. file systems, key-value stores). Use DataFrame.write to access this.

New in version 1.4.0.

Changed in version 3.4.0: Supports Spark Connect.
Methods

bucketBy(numBuckets, col, *cols)
    Buckets the output by the given columns.
csv(path[, mode, compression, sep, quote, …])
    Saves the content of the DataFrame in CSV format at the specified path.
format(source)
    Specifies the underlying output data source.
insertInto(tableName[, overwrite])
    Inserts the content of the DataFrame into the specified table.
jdbc(url, table[, mode, properties])
    Saves the content of the DataFrame to an external database table via JDBC.
json(path[, mode, compression, dateFormat, …])
    Saves the content of the DataFrame in JSON format (JSON Lines text format or newline-delimited JSON) at the specified path.
mode(saveMode)
    Specifies the behavior when data or table already exists.
option(key, value)
    Adds an output option for the underlying data source.
options(**options)
    Adds output options for the underlying data source.
orc(path[, mode, partitionBy, compression])
    Saves the content of the DataFrame in ORC format at the specified path.
parquet(path[, mode, partitionBy, compression])
    Saves the content of the DataFrame in Parquet format at the specified path.
partitionBy(*cols)
    Partitions the output by the given columns on the file system.
save([path, format, mode, partitionBy])
    Saves the contents of the DataFrame to a data source.
saveAsTable(name[, format, mode, partitionBy])
    Saves the content of the DataFrame as the specified table.
sortBy(col, *cols)
    Sorts the output in each bucket by the given columns on the file system.
text(path[, compression, lineSep])
    Saves the content of the DataFrame in a text file at the specified path.