Bill Dimm on the TAR vs. Keyword Search Challenge

A post shared by Rob Robinson (@complexdiscovery) on

Extract from article by Bill Dimm

During my presentation at the NorCal eDiscovery & IG Retreat I challenged the audience to create keyword searches that would work better than technology-assisted review (predictive coding) for two topics.  Half of the room was tasked with finding articles about biology (science-oriented articles, excluding medical treatment) and the other half searched for articles about current law (excluding proposed laws or politics).  I ran one of the searches against TAR in Clustify live during the presentation (Clustify’s “shadow tags” feature allows a full document review to be simulated in a few minutes using documents that were pre-categorized by human reviewers), but couldn’t do the rest due to time constraints.

TAR beat keyword search across the board for both tasks.  The top 3,000 documents returned by TAR achieved higher recall than the top 6,000 documents for any keyword search.  In other words, if documents will be reviewed before production, TAR achieves better results (higher recall) with half as much document review compared to any of the keyword searches.  The top 6,000 documents returned by TAR achieved higher recall than all of the documents matching any individual keyword search, even when the keyword search returned 27,000 documents.

Additional Reading: