pyspark.sql.datasource.DataSource
- class pyspark.sql.datasource.DataSource(options)
A base class for data sources.
This class represents a custom data source that can be read from, written to, or both. It provides methods that create the readers and writers used to read and write data. Any subclass must implement at least one of DataSource.reader() or DataSource.writer() to make the data source readable, writable, or both.
After implementing this interface, you can load data from your data source using spark.read.format(...).load() and save data using df.write.format(...).save().
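As an illustration of the subclassing contract described above, here is a minimal sketch of a read-only data source. The names CounterDataSource, CounterReader, the "counter" format name, and the "count" option are hypothetical and chosen only for this example. The fallback stand-in classes are likewise an assumption of the sketch, included so that it can be run without a Spark installation; they are not part of PySpark.

```python
try:
    # Real base classes, available when a PySpark release that ships
    # pyspark.sql.datasource is installed.
    from pyspark.sql.datasource import DataSource, DataSourceReader
except ImportError:
    # Stand-ins (an assumption of this sketch, not part of PySpark) so the
    # classes below can still be exercised without Spark.
    class DataSource:
        def __init__(self, options):
            self.options = options

    class DataSourceReader:
        pass


class CounterReader(DataSourceReader):
    """Hypothetical reader yielding the integers 0..count-1 with their squares."""

    def __init__(self, options):
        # Option values are treated as strings here, so convert explicitly.
        self.count = int(options.get("count", "3"))

    def read(self, partition):
        # Each yielded tuple is one row matching the schema declared below.
        for i in range(self.count):
            yield (i, i * i)


class CounterDataSource(DataSource):
    """Hypothetical read-only data source: only reader() is implemented."""

    @classmethod
    def name(cls):
        # Format name, as passed to spark.read.format(...).
        return "counter"

    def schema(self):
        # Schema expressed as a DDL string.
        return "id INT, squared INT"

    def reader(self, schema):
        return CounterReader(self.options)


# The reader can be exercised directly, without a Spark session.
source = CounterDataSource({"count": "4"})
rows = list(source.reader(source.schema()).read(None))
```

With a running Spark session, a class like this would typically be registered with spark.dataSource.register(CounterDataSource) and then loaded with spark.read.format("counter").option("count", "4").load(). Because writer() is not implemented, this data source is readable but not writable.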
Methods
name(): Returns a string representing the format name of this data source.
reader(schema): Returns a DataSourceReader instance for reading data.
schema(): Returns the schema of the data source.
simpleStreamReader(schema): Returns a SimpleDataSourceStreamReader instance for reading data.
streamReader(schema): Returns a DataSourceStreamReader instance for reading streaming data.
streamWriter(schema, overwrite): Returns a DataSourceStreamWriter instance for writing data into a streaming sink.
writer(schema, overwrite): Returns a DataSourceWriter instance for writing data.
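To show how writer(schema, overwrite) fits together with a DataSourceWriter, here is a minimal sketch of a write-only data source that only counts the rows it receives. CountingDataSource, CountingWriter, RowCountMessage, and the "rowcount" format name are hypothetical. The fallback stand-in classes are an assumption of the sketch, included so that it can be run without a Spark installation.

```python
from dataclasses import dataclass

try:
    # Real base classes, available when a PySpark release that ships
    # pyspark.sql.datasource is installed.
    from pyspark.sql.datasource import (
        DataSource,
        DataSourceWriter,
        WriterCommitMessage,
    )
except ImportError:
    # Stand-ins (an assumption of this sketch, not part of PySpark).
    class DataSource:
        def __init__(self, options):
            self.options = options

    class DataSourceWriter:
        pass

    class WriterCommitMessage:
        pass


@dataclass
class RowCountMessage(WriterCommitMessage):
    """Hypothetical commit message: number of rows written by one task."""
    count: int


class CountingWriter(DataSourceWriter):
    """Hypothetical writer that discards rows and records how many it saw."""

    def __init__(self, overwrite):
        self.overwrite = overwrite
        self.total = None

    def write(self, iterator):
        # Consume one partition's rows and report the count.
        return RowCountMessage(count=sum(1 for _ in iterator))

    def commit(self, messages):
        # Combine the per-partition messages after all writes succeed.
        self.total = sum(m.count for m in messages)

    def abort(self, messages):
        # Nothing was persisted, so there is nothing to clean up.
        self.total = None


class CountingDataSource(DataSource):
    """Hypothetical write-only data source: only writer() is implemented."""

    @classmethod
    def name(cls):
        return "rowcount"

    def writer(self, schema, overwrite):
        # The schema is ignored in this sketch.
        return CountingWriter(overwrite)


# Exercise the writer directly, simulating two partitions.
sink = CountingDataSource({})
writer = sink.writer("id INT", overwrite=False)
messages = [writer.write(iter([(1,), (2,)])), writer.write(iter([(3,)]))]
writer.commit(messages)
```

In this direct run the same writer object handles write() and commit(). Under Spark, the write tasks and the final commit may run in different processes, so a real writer should pass everything it needs through the commit messages rather than through instance state.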