Extract from article by Ralph Losey
The stop decision is the most difficult decision in predictive coding. It must be made in every type of predictive coding method, not just our Predictive Coding 4.0. Many of the scientists attending TREC 2015 were discussing this decision process. There was no agreement on criteria for the stop decision, except that all seemed to agree it is a complex issue that cannot be resolved by random sampling alone. The prevalence of relevant documents in most projects is too low for that.
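To illustrate why low prevalence is a problem for sampling alone, here is a rough sketch in Python. The prevalence and sample-size figures are hypothetical, not taken from any project, and the function name `sample_margin` is made up for illustration. It uses the textbook normal approximation for a sampled proportion, which is itself only approximate when the expected number of relevant hits is small.

```python
import math


def sample_margin(prevalence, sample_size, z=1.96):
    """Return (expected_hits, margin, relative_margin) for a simple random sample.

    prevalence  -- assumed fraction of relevant documents (hypothetical input)
    sample_size -- number of documents drawn at random
    z           -- z-score for the confidence level (1.96 is roughly 95%)
    """
    # Number of relevant documents we expect to see in the sample.
    expected_hits = prevalence * sample_size
    # Normal-approximation margin of error for the sampled proportion.
    margin = z * math.sqrt(prevalence * (1 - prevalence) / sample_size)
    # Margin expressed as a fraction of the quantity being estimated.
    relative_margin = margin / prevalence
    return expected_hits, margin, relative_margin


if __name__ == "__main__":
    # Hypothetical comparison: a low-prevalence and a high-prevalence collection.
    for prevalence in (0.01, 0.30):
        hits, margin, relative = sample_margin(prevalence, sample_size=1500)
        print(f"prevalence={prevalence:.2f} expected_hits={hits:.0f} "
              f"margin=+/-{margin:.4f} relative={relative:.0%}")
```

Under these assumed numbers, a 1,500-document sample at 1% prevalence is expected to contain only about 15 relevant documents, and the margin of error is roughly half the size of the prevalence estimate itself. At 30% prevalence the same sample gives a far tighter relative margin.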
The e-Discovery Team grapples with the stop decision in every project, although in most it is a fairly simple decision because no more relevant documents are surfacing in the higher rankings. Still, in some projects, it can be tricky. That is where experience is especially helpful. We do not want to quit too soon and miss important relevant information. On the other hand, we do not want to waste time looking at uninteresting documents.
In most projects, we know it is about time to stop when the stratification of document ranking has stabilized. The training has stabilized when you see very few new documents predicted relevant that have not already been human-reviewed and coded as relevant. You essentially run out of documents for step six review. Put another way, your step six no longer uncovers new relevant documents.
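A minimal sketch of this stabilization check, in Python. It assumes each document carries a predicted probability of relevance. The 0.5 cutoff, the `max_new` tolerance, and all function names are hypothetical choices for illustration, not settings from the e-Discovery Team's software or method.

```python
def unreviewed_predicted_relevant(scores, reviewed_ids, threshold=0.5):
    """List documents ranked as probably relevant that no human has reviewed yet.

    scores       -- dict mapping document id to predicted probability of relevance
    reviewed_ids -- set of document ids already human-reviewed and coded
    threshold    -- hypothetical cutoff for "predicted relevant"
    """
    return sorted(
        doc_id
        for doc_id, score in scores.items()
        if score >= threshold and doc_id not in reviewed_ids
    )


def training_has_stabilized(scores, reviewed_ids, threshold=0.5, max_new=0):
    """Return True when at most max_new unreviewed documents are predicted relevant."""
    remaining = unreviewed_predicted_relevant(scores, reviewed_ids, threshold)
    return len(remaining) <= max_new


if __name__ == "__main__":
    # Hypothetical ranking after a training round.
    scores = {"doc-a": 0.92, "doc-b": 0.81, "doc-c": 0.17, "doc-d": 0.74}
    reviewed = {"doc-a", "doc-b"}
    print(unreviewed_predicted_relevant(scores, reviewed))
    print(training_has_stabilized(scores, reviewed))
```

A check like this only flags that the ranking has stopped producing new candidates for review. It does not replace the judgment the stop decision calls for in the trickier projects.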