A Dimm View of Misleading Metrics and Irrelevant Research (Accuracy and F1)

eDiscovery expert Dr. Bill Dimm explains why some performance metrics don’t give an accurate view of performance for eDiscovery purposes, and why that makes a lot of research utilizing such metrics irrelevant for eDiscovery.

Extract from an article by Dr. Bill Dimm

If one algorithm achieved 98.2% accuracy while another achieved 98.6% on the same task, would you be surprised to find that the first algorithm required ten times as much document review as the second to reach 75% recall? This article explains why some performance metrics don’t give an accurate view of performance for eDiscovery purposes, and why that makes a lot of research utilizing such metrics irrelevant for eDiscovery.

The key performance metrics for eDiscovery are precision and recall. Recall, R, is the percentage of all relevant documents that have been found. High recall is critical to defensibility. Precision, P, is the percentage of documents predicted to be relevant that actually are relevant. High precision is desirable to avoid wasting time reviewing non-relevant documents (if documents will be reviewed to confirm relevance and check for privilege before production). In other words, precision is related to cost.
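To make these definitions concrete, here is a minimal Python sketch that computes precision, recall, and accuracy from raw counts. The function name and the document counts are hypothetical, chosen only for illustration and not taken from the article; accuracy is computed in the standard way, as the fraction of all documents classified correctly.

```python
def precision_recall_accuracy(tp, fp, fn, tn):
    """Return (precision, recall, accuracy) as fractions.

    tp: relevant documents predicted relevant
    fp: non-relevant documents predicted relevant
    fn: relevant documents predicted non-relevant
    tn: non-relevant documents predicted non-relevant
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, accuracy


# Hypothetical collection: 100,000 documents, 1,000 of them relevant.
# Hypothetical prediction: 500 documents flagged, 250 of them truly relevant.
p, r, a = precision_recall_accuracy(tp=250, fp=250, fn=750, tn=98_750)
print(f"precision={p:.1%} recall={r:.1%} accuracy={a:.1%}")
# prints: precision=50.0% recall=25.0% accuracy=99.0%
```

In this made-up scenario, three quarters of the relevant documents are missed, yet accuracy is still 99% because the large pool of correctly rejected non-relevant documents dominates the calculation.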

Source: ComplexDiscovery