BinaryClassificationMetrics

class pyspark.mllib.evaluation.BinaryClassificationMetrics(scoreAndLabels: pyspark.rdd.RDD[Tuple[float, float]])

    Evaluator for binary classification.

    New in version 1.4.0.
Parameters

    scoreAndLabels : pyspark.RDD
        an RDD of (score, label) or (score, label, weight) tuples.
Examples

>>> scoreAndLabels = sc.parallelize([
...     (0.1, 0.0), (0.1, 1.0), (0.4, 0.0), (0.6, 0.0), (0.6, 1.0), (0.6, 1.0), (0.8, 1.0)], 2)
>>> metrics = BinaryClassificationMetrics(scoreAndLabels)
>>> metrics.areaUnderROC
0.70...
>>> metrics.areaUnderPR
0.83...
>>> metrics.unpersist()
>>> scoreAndLabelsWithOptWeight = sc.parallelize([
...     (0.1, 0.0, 1.0), (0.1, 1.0, 0.4), (0.4, 0.0, 0.2), (0.6, 0.0, 0.6), (0.6, 1.0, 0.9),
...     (0.6, 1.0, 0.5), (0.8, 1.0, 0.7)], 2)
>>> metrics = BinaryClassificationMetrics(scoreAndLabelsWithOptWeight)
>>> metrics.areaUnderROC
0.79...
>>> metrics.areaUnderPR
0.88...
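In practice, the score/label pairs are usually built from a trained model's raw predictions. A minimal sketch of that pattern, assuming an existing SparkContext sc as in the doctests above and using LogisticRegressionWithLBFGS with a tiny made-up dataset purely for illustration:

from pyspark.mllib.classification import LogisticRegressionWithLBFGS
from pyspark.mllib.regression import LabeledPoint
from pyspark.mllib.evaluation import BinaryClassificationMetrics

# Toy data; in real use this would be a held-out evaluation set.
data = sc.parallelize([
    LabeledPoint(0.0, [0.0, 1.0]),
    LabeledPoint(0.0, [0.1, 0.9]),
    LabeledPoint(1.0, [1.0, 0.0]),
    LabeledPoint(1.0, [0.9, 0.2]),
])

model = LogisticRegressionWithLBFGS.train(data)
model.clearThreshold()  # predict() now returns raw scores instead of 0/1 labels

# Pair each raw score with the true label, then evaluate.
scoreAndLabels = data.map(lambda lp: (float(model.predict(lp.features)), lp.label))
metrics = BinaryClassificationMetrics(scoreAndLabels)
print(metrics.areaUnderROC, metrics.areaUnderPR)
metrics.unpersist()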
Methods

    call(name, *a)
        Call method of java_model
    unpersist()
        Unpersists intermediate RDDs used in the computation.
Attributes

    areaUnderPR
        Computes the area under the precision-recall curve.
    areaUnderROC
        Computes the area under the receiver operating characteristic (ROC) curve.
Methods Documentation

call(name: str, *a: Any) → Any

    Call method of java_model

unpersist() → None

    Unpersists intermediate RDDs used in the computation.

    New in version 1.4.0.
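Since the metric values are backed by intermediate RDDs that are cached when first computed, a typical ordering (assuming a metrics object as constructed in the Examples above) is to read the values first and release the cache afterwards:

auc_roc = metrics.areaUnderROC  # first access triggers and caches the underlying computation
auc_pr = metrics.areaUnderPR    # reuses the cached intermediate results
metrics.unpersist()             # release the cached RDDs once the values have been read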
Attributes Documentation

areaUnderPR

    Computes the area under the precision-recall curve.

    New in version 1.4.0.

areaUnderROC

    Computes the area under the receiver operating characteristic (ROC) curve.

    New in version 1.4.0.
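For intuition, the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. A small hand check of that equivalence on the unweighted data from the Examples section (plain Python, illustrative only and not part of this class's API):

scores_and_labels = [(0.1, 0.0), (0.1, 1.0), (0.4, 0.0), (0.6, 0.0),
                     (0.6, 1.0), (0.6, 1.0), (0.8, 1.0)]
pos = [s for s, lbl in scores_and_labels if lbl == 1.0]
neg = [s for s, lbl in scores_and_labels if lbl == 0.0]
wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
print(wins / (len(pos) * len(neg)))  # ~0.708, matching the areaUnderROC value above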