pyspark.sql.functions.zip_with#
- pyspark.sql.functions.zip_with(left, right, f)[source]#
Merge two given arrays, element-wise, into a single array using a function. If one array is shorter, nulls are appended at the end to match the length of the longer array, before applying the function.
New in version 3.1.0.
Changed in version 3.4.0: Supports Spark Connect.
- Parameters
- left
Column
or str name of the first column or expression
- right
Column
or str name of the second column or expression
- ffunction
a binary function
(x1: Column, x2: Column) -> Column...
Can use methods ofColumn
, functions defined inpyspark.sql.functions
and ScalaUserDefinedFunctions
. PythonUserDefinedFunctions
are not supported (SPARK-27052).
- left
- Returns
Column
array of calculated values derived by applying given function to each pair of arguments.
Examples
>>> df = spark.createDataFrame([(1, [1, 3, 5, 8], [0, 2, 4, 6])], ("id", "xs", "ys")) >>> df.select(zip_with("xs", "ys", lambda x, y: x ** y).alias("powers")).show(truncate=False) +---------------------------+ |powers | +---------------------------+ |[1.0, 9.0, 625.0, 262144.0]| +---------------------------+
>>> df = spark.createDataFrame([(1, ["foo", "bar"], [1, 2, 3])], ("id", "xs", "ys")) >>> df.select(zip_with("xs", "ys", lambda x, y: concat_ws("_", x, y)).alias("xs_ys")).show() +-----------------+ | xs_ys| +-----------------+ |[foo_1, bar_2, 3]| +-----------------+