pyspark.ml.linalg.
DenseVector
A dense vector represented by a value array. We use numpy array for storage and arithmetics will be delegated to the underlying numpy array.
Examples
>>> v = Vectors.dense([1.0, 2.0]) >>> u = Vectors.dense([3.0, 4.0]) >>> v + u DenseVector([4.0, 6.0]) >>> 2 - v DenseVector([1.0, 0.0]) >>> v / 2 DenseVector([0.5, 1.0]) >>> v * u DenseVector([3.0, 8.0]) >>> u / v DenseVector([3.0, 2.0]) >>> u % 2 DenseVector([1.0, 0.0]) >>> -v DenseVector([-1.0, -2.0])
Methods
dot(other)
dot
Compute the dot product of two Vectors.
norm(p)
norm
Calculates the norm of a DenseVector.
numNonzeros()
numNonzeros
Number of nonzero elements.
squared_distance(other)
squared_distance
Squared distance of two Vectors.
toArray()
toArray
Returns the underlying numpy.ndarray
Attributes
values
Methods Documentation
Compute the dot product of two Vectors. We support (Numpy array, list, SparseVector, or SciPy sparse) and a target NumPy array that is either 1- or 2-dimensional. Equivalent to calling numpy.dot of the two vectors.
>>> dense = DenseVector(array.array('d', [1., 2.])) >>> dense.dot(dense) 5.0 >>> dense.dot(SparseVector(2, [0, 1], [2., 1.])) 4.0 >>> dense.dot(range(1, 3)) 5.0 >>> dense.dot(np.array(range(1, 3))) 5.0 >>> dense.dot([1.,]) Traceback (most recent call last): ... AssertionError: dimension mismatch >>> dense.dot(np.reshape([1., 2., 3., 4.], (2, 2), order='F')) array([ 5., 11.]) >>> dense.dot(np.reshape([1., 2., 3.], (3, 1), order='F')) Traceback (most recent call last): ... AssertionError: dimension mismatch
>>> a = DenseVector([0, -1, 2, -3]) >>> a.norm(2) 3.7... >>> a.norm(1) 6.0
Number of nonzero elements. This scans all active values and count non zeros
>>> dense1 = DenseVector(array.array('d', [1., 2.])) >>> dense1.squared_distance(dense1) 0.0 >>> dense2 = np.array([2., 1.]) >>> dense1.squared_distance(dense2) 2.0 >>> dense3 = [2., 1.] >>> dense1.squared_distance(dense3) 2.0 >>> sparse1 = SparseVector(2, [0, 1], [2., 1.]) >>> dense1.squared_distance(sparse1) 2.0 >>> dense1.squared_distance([1.,]) Traceback (most recent call last): ... AssertionError: dimension mismatch >>> dense1.squared_distance(SparseVector(1, [0,], [1.,])) Traceback (most recent call last): ... AssertionError: dimension mismatch
Attributes Documentation