Evaluation of Machine-Learning Protocols for Technology-Assisted Review in Electronic Discovery

By Gordon V. Cormack & Maura R. Grossman

To appear in the proceedings of SIGIR 2014: The 37th Annual ACM SIGIR Conference on Research and Development in Information Retrieval

Abstract: Using a novel evaluation toolkit that simulates a human reviewer in the loop, we compare the effectiveness of three machine-learning protocols for technology-assisted review as used in document review for discovery in legal proceedings. Our comparison addresses a central question in the deployment of technology-assisted review: Should training documents be selected at random, or should they be selected using one or more non-random methods, such as keyword search or active learning? On eight review tasks — four derived from the TREC […]

