LogisticRegressionWithLBFGS
class pyspark.mllib.classification.LogisticRegressionWithLBFGS
Train a classification model for Multinomial/Binary Logistic Regression using Limited-memory BFGS.
Standard feature scaling and L2 regularization are used by default.
New in version 1.2.0.
Methods
train(data[, iterations, initialWeights, …]): Train a logistic regression model on the given data.
Methods Documentation
classmethod train(data: pyspark.rdd.RDD[pyspark.mllib.regression.LabeledPoint], iterations: int = 100, initialWeights: Optional[VectorLike] = None, regParam: float = 0.0, regType: str = 'l2', intercept: bool = False, corrections: int = 10, tolerance: float = 1e-06, validateData: bool = True, numClasses: int = 2) → pyspark.mllib.classification.LogisticRegressionModel
Train a logistic regression model on the given data.
New in version 1.2.0.
Parameters
- data : pyspark.RDD
  The training data, an RDD of pyspark.mllib.regression.LabeledPoint.
- iterations : int, optional
  The number of iterations. (default: 100)
- initialWeights : pyspark.mllib.linalg.Vector or convertible, optional
  The initial weights. (default: None)
- regParam : float, optional
  The regularizer parameter. (default: 0.0)
- regType : str, optional
  The type of regularizer used for training the model. Supported values:
  “l1” for using L1 regularization
  “l2” for using L2 regularization (default)
  None for no regularization
- intercept : bool, optional
  Whether to use the augmented representation for training data (i.e., whether bias features are activated). (default: False)
- corrections : int, optional
  The number of corrections used in the LBFGS update. If a known updater is used for binary classification, it calls the ml implementation and this parameter will have no effect. (default: 10)
- tolerance : float, optional
  The convergence tolerance of iterations for L-BFGS. (default: 1e-6)
- validateData : bool, optional
  Boolean parameter which indicates if the algorithm should validate data before training. (default: True)
- numClasses : int, optional
  The number of classes (i.e., outcomes) a label can take in Multinomial Logistic Regression. (default: 2)
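To make the roles of regParam and regType concrete, the following pure-Python sketch evaluates a binary logistic loss with an optional penalty term. It is an illustration only, not Spark's implementation: the function name logistic_loss, the 0.5 factor on the L2 term, and the choice to leave the intercept unpenalized are all assumptions of this sketch.

```python
import math


def logistic_loss(weights, intercept, points, reg_param=0.0, reg_type="l2"):
    """Mean binary log-loss over (label, features) pairs plus an optional penalty.

    Hypothetical helper for illustration; labels are 0.0 or 1.0.
    """
    total = 0.0
    for label, features in points:
        margin = sum(w * x for w, x in zip(weights, features)) + intercept
        # Flip the sign of the margin for label 0 so one formula covers both classes.
        signed = margin if label == 1.0 else -margin
        total += math.log1p(math.exp(-signed))
    loss = total / len(points)

    # Penalty on the weights only (assumption: the intercept is not penalized).
    if reg_type == "l2":
        loss += 0.5 * reg_param * sum(w * w for w in weights)
    elif reg_type == "l1":
        loss += reg_param * sum(abs(w) for w in weights)
    # reg_type None adds no penalty.
    return loss


points = [(0.0, [0.0, 1.0]), (1.0, [1.0, 0.0])]
unregularized = logistic_loss([1.0, -1.0], 0.0, points)
with_l2 = logistic_loss([1.0, -1.0], 0.0, points, reg_param=0.1, reg_type="l2")
print(unregularized, with_l2)
```

With regParam left at 0.0 the penalty term vanishes whichever regType is selected, so a nonzero regParam is needed for the regularizer to influence the fit.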
Examples
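>>> from pyspark.mllib.classification import LogisticRegressionWithLBFGS
>>> from pyspark.mllib.regression import LabeledPoint
>>> data = [
...     LabeledPoint(0.0, [0.0, 1.0]),
...     LabeledPoint(1.0, [1.0, 0.0]),
... ]
>>> lrm = LogisticRegressionWithLBFGS.train(sc.parallelize(data), iterations=10)
>>> lrm.predict([1.0, 0.0])
1
>>> lrm.predict([0.0, 1.0])
0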
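The predictions in the example can be read as thresholding a sigmoid of the linear margin. The sketch below shows that idea in plain Python without Spark. The function predict_binary, the hand-picked weights, and the 0.5 threshold are assumptions for illustration, not values that training would necessarily produce.

```python
import math


def predict_binary(weights, intercept, features, threshold=0.5):
    """Return 1 if the sigmoid of the linear margin exceeds the threshold, else 0."""
    margin = sum(w * x for w, x in zip(weights, features)) + intercept
    probability = 1.0 / (1.0 + math.exp(-margin))
    return 1 if probability > threshold else 0


# Hypothetical weights chosen by hand so the first feature favors class 1.
weights = [2.0, -2.0]
print(predict_binary(weights, 0.0, [1.0, 0.0]))
print(predict_binary(weights, 0.0, [0.0, 1.0]))
```

Raising the threshold makes the sketch more conservative about predicting class 1 for the same weights.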