Unachievable Region in Precision-Recall Space and Its Effect on Empirical Evaluation

Date
2012-05-30
Author
Santos Costa, Vitor
Page, David
Davis, Jesse
Boyd, Kendrick
Publisher
University of Wisconsin-Madison Department of Computer Sciences
Abstract
Precision-recall (PR) curves and the areas under them are widely used to summarize machine learning results, especially for data sets exhibiting class skew. They are often used analogously to ROC curves and the area under ROC curves. It is known that PR curves vary as class skew changes. What was not recognized before this paper is that there is a region of PR space that is completely unachievable, and the size of this region depends only on the skew. This paper precisely characterizes the size of that region and discusses its implications for empirical evaluation methodology in machine learning.
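The abstract's claim can be illustrated with a short sketch. It assumes the standard counting argument and is not taken from this record's text. With a fraction `skew` of examples positive, false positives can never exceed the number of negatives, so precision at a given recall is bounded below by skew * recall / (skew * recall + 1 - skew). Integrating that bound over recall from 0 to 1 gives the area of the unachievable region. The function names below are hypothetical.

```python
import math


def min_precision(recall: float, skew: float) -> float:
    """Lowest precision attainable at `recall` when a fraction `skew`
    (0 < skew < 1) of the examples is positive.

    The worst case labels every negative as positive, so
    precision >= skew * recall / (skew * recall + (1 - skew)).
    """
    return skew * recall / (skew * recall + (1.0 - skew))


def unachievable_area(skew: float) -> float:
    """Area under the minimum-precision curve, i.e. the part of PR space
    that no classifier can reach at this skew (closed-form integral of
    min_precision over recall in [0, 1])."""
    return 1.0 + (1.0 - skew) * math.log(1.0 - skew) / skew


def unachievable_area_numeric(skew: float, steps: int = 100_000) -> float:
    """Midpoint-rule check of the closed form above."""
    width = 1.0 / steps
    return sum(min_precision((i + 0.5) * width, skew) for i in range(steps)) * width


if __name__ == "__main__":
    for skew in (0.01, 0.1, 0.5):
        print(skew, unachievable_area(skew), unachievable_area_numeric(skew))
```

Under these assumptions the region depends only on the skew and grows with it. This suggests that areas under PR curves from data sets with different class proportions are not directly comparable.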
Subject
F1 score
precision-recall curves
Permanent Link
http://digital.library.wisc.edu/1793/61736
Citation
TR1772