pyspark.sql.functions.tuple_sketch_theta_integer
- pyspark.sql.functions.tuple_sketch_theta_integer(col)
Returns the theta value from a DataSketches TupleSketch with integer summaries.
New in version 4.2.0.
- Parameters
- col
Column or column name. The column containing a binary TupleSketch representation.
- Returns
Column. The theta value (between 0.0 and 1.0).
Examples
>>> from pyspark.sql import functions as sf
>>> df = spark.createDataFrame([(1, 10), (2, 20)], ["key", "value"])
>>> df.agg(sf.tuple_sketch_theta_integer(
...     sf.tuple_sketch_agg_integer("key", "value"))).show()
+-------------------------------------------------------------------------+
|tuple_sketch_theta_integer(tuple_sketch_agg_integer(key, value, 12, sum))|
+-------------------------------------------------------------------------+
|                                                                      1.0|
+-------------------------------------------------------------------------+
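The example above returns 1.0 because the sketch holds only two distinct keys. As a rough illustration of what theta represents, the following pure-Python sketch models a simplified theta-style sampler. It is not the DataSketches implementation and does not use Spark; `ToyThetaSketch`, `unit_hash`, and the parameter `nominal_entries` are hypothetical names introduced here. The assumption is that theta acts as a sampling threshold on hashed keys: it stays at 1.0 while the sketch retains every key, and drops below 1.0 once the number of distinct keys exceeds the sketch capacity.

```python
import hashlib


def unit_hash(key):
    """Map a key to a pseudo-uniform float in [0, 1) (hypothetical helper)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class ToyThetaSketch:
    """Simplified theta-style sampler, for illustration only."""

    def __init__(self, nominal_entries):
        self.k = nominal_entries   # maximum number of retained hashes
        self.theta = 1.0           # sampling threshold, starts at 1.0
        self.retained = set()      # hashes currently below theta

    def update(self, key):
        h = unit_hash(key)
        if h < self.theta:
            self.retained.add(h)
            if len(self.retained) > self.k:
                # Over capacity: lower theta to the largest retained hash
                # and drop that hash, leaving exactly k entries.
                self.theta = max(self.retained)
                self.retained.discard(self.theta)

    def estimate(self):
        # Retained count scaled up by the sampling fraction.
        return len(self.retained) / self.theta


# Two distinct keys, far below capacity: nothing is sampled away.
small = ToyThetaSketch(nominal_entries=4096)
for key in (1, 2):
    small.update(key)

# Many more distinct keys than capacity: theta drops below 1.0.
big = ToyThetaSketch(nominal_entries=1024)
for key in range(20000):
    big.update(key)

print(small.theta, small.estimate())
print(big.theta, big.estimate())
```

In this toy model, a theta of 1.0 means the distinct count is exact, while a smaller theta means the count is an estimate scaled by 1/theta. The capacity of 4096 used for the small sketch assumes that the 12 shown in the output column name is the base-2 logarithm of the nominal entries; that reading is an assumption, not something stated on this page.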