Predictive Coding and Providers: A 120-Second Survey


Who Does What and With Whom?

Challenge: As awareness and use of predictive coding, a technology-assisted review feature, grow in the legal arena, it is increasingly difficult to determine the origin and approach of the predictive coding technologies that leading eDiscovery providers use in their offerings.

Goal: To identify the ability, origin and approach of leading eDiscovery providers in relation to the technology-assisted review feature of predictive coding.

Scope: Using a baseline listing (1) of fifty leading eDiscovery providers, aggregated from mentions in formal industry reports published between August 2011 and December 2012, and a general listing (2) of technology-assisted review providers, determine the ability of leading providers to deliver a predictive coding feature and identify the development approach, technology integration, machine learning approach and sampling approach of the predictive coding features offered.

Approach:  To provide a short, 120-Second Survey that contains questions designed to identify and define the predictive coding capability of leading eDiscovery vendors.  The survey is non-comprehensive by design and is aimed at creating a starting point for deeper differentiation discussions on provider predictive coding offerings.

Participation: Representatives of leading eDiscovery providers are encouraged to complete the short 120-Second Survey on behalf of their organizations. Survey results (excluding responder contact information) will be aggregated and published on the ComplexDiscovery website for use by the eDiscovery community.

The 120-Second Survey.

Provider Predictive Coding Background Information

Company / Firm (required)

Responder First and Last Name (required)

Responder Title / Role with Company (required)

Responder Email (required)

Does your Company offer Predictive Coding?


Predictive Coding: An industry-specific term generally used to describe a Technology-Assisted Review process involving the use of a Machine Learning Algorithm to distinguish Relevant from Non-Relevant Documents, based on a Subject Matter Expert's Coding of a Training Set of Documents. (3)

If you offer predictive coding technology, please continue with the following section.

Provider Predictive Coding Considerations

Feature Nomenclature

Name of Predictive Coding Feature

  • Name of Predictive Coding Feature: The generic and/or brand name used to identify the predictive coding feature within your eDiscovery portfolio of offerings.

Technology Development

Developed Internally / Licensed Externally / Hybrid Development

  • Developed Internally: Proprietary development of predictive coding technology.
  • Licensed Externally: Predictive coding technology licensed entirely from an external software developer.
  • Hybrid Development: Combination of proprietary and licensed predictive coding technology.

If you license any elements of predictive coding technology, who is your technology/development partner?

Content Analyst / Equivio / OrcaTec / Recommind / Other (Share In Comment Section)

Offering Integration

Architectural Integration / Process Integration / Other (Share In Comment Section)

  • Architecturally Integrated: Predictive coding technology is integrated into the software architecture of the core eDiscovery platform (application integration).
  • Process Integrated: Predictive coding technology is available as an eDiscovery offering, but is not integrated into the software architecture of the core eDiscovery platform (process integration).

Machine Learning Approach (4)

Supervised Learning / Active Learning / Other (Share In Comment Section)

  • Supervised Learning: Human chooses a set of documents (training set) and feeds the documents into the system. The system learns the difference between responsive and non-responsive documents and classifies remaining documents accordingly.
  • Active Learning: System chooses a set of documents and feeds the documents to humans. Humans make a decision about the documents and then the system applies the learning provided by the humans against the rest of the documents in the collection.
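The two definitions above differ mainly in who selects the documents that get coded. The following is a minimal Python sketch of that difference, not any provider's actual implementation. The term-weight scorer, the function names (train, score, supervised_round, active_round), the sample documents, and the "least certain first" selection rule are all illustrative assumptions; real predictive coding products use far more sophisticated models.

```python
from collections import Counter

def train(labeled):
    """Build per-term weights from (text, is_responsive) pairs (toy model)."""
    weights = Counter()
    for text, is_responsive in labeled:
        for term in set(text.lower().split()):
            weights[term] += 1 if is_responsive else -1
    return weights

def score(weights, text):
    """Positive leans responsive, negative leans non-responsive, 0 is unsure."""
    return sum(weights[term] for term in set(text.lower().split()))

def supervised_round(human_chosen, unlabeled):
    """Supervised: a human picked and coded the training set up front."""
    weights = train(human_chosen)
    return {doc: score(weights, doc) > 0 for doc in unlabeled}

def active_round(seed_labeled, unlabeled, ask_human, batch_size=2):
    """Active: the system picks the documents it wants a human to code."""
    weights = train(seed_labeled)
    # Assumed selection rule: ask about the documents scored closest to zero.
    chosen = sorted(unlabeled, key=lambda d: abs(score(weights, d)))[:batch_size]
    labeled = list(seed_labeled) + [(doc, ask_human(doc)) for doc in chosen]
    weights = train(labeled)
    remaining = [doc for doc in unlabeled if doc not in chosen]
    return chosen, {doc: score(weights, doc) > 0 for doc in remaining}

# Hypothetical data: two coded documents and four uncoded ones.
seed = [("merger pricing agreement", True), ("lunch menu friday", False)]
unlabeled = [
    "pricing agreement draft",
    "friday lunch order",
    "quarterly forecast memo",
    "merger forecast",
]

supervised_predictions = supervised_round(seed, unlabeled)
# Stand-in for the human reviewer: anything mentioning "merger" is responsive.
chosen, active_predictions = active_round(seed, unlabeled, lambda d: "merger" in d)
```

In this sketch the supervised round classifies all four uncoded documents from the human-chosen training set alone, while the active round first routes the two documents the model is least sure about to the reviewer and only then classifies the rest.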

Sampling Approach (5)

Random / Judgmental / Other (Share In Comment Section)

  • Random Sampling: A statistical sampling approach that gives each document an equal chance of being chosen for inclusion within a sample.
  • Judgmental Sampling: A sampling approach that draws in part from subjective factors when determining inclusion within a sample.
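The contrast between the two sampling approaches can be sketched in a few lines of Python. This is an illustration under stated assumptions only: the function names, the document list, and the reviewer's keyword rule are hypothetical, and the judgmental version simply stands in for whatever subjective criteria a reviewer might apply.

```python
import random

def random_sample(documents, k, seed=None):
    """Random sampling: every document has an equal chance of selection."""
    rng = random.Random(seed)  # seed is optional, used here for repeatability
    return rng.sample(documents, k)

def judgmental_sample(documents, k, reviewer_prefers):
    """Judgmental sampling: inclusion depends on a reviewer's subjective rule."""
    preferred = [doc for doc in documents if reviewer_prefers(doc)]
    return preferred[:k]

# Hypothetical collection of document identifiers.
documents = [
    "contract_2011.doc",
    "lunch_invite.msg",
    "pricing_memo.doc",
    "contract_2012.doc",
    "newsletter.msg",
]

random_pick = random_sample(documents, 3, seed=1)
# Assumed subjective rule: the reviewer believes contracts matter most.
judgmental_pick = judgmental_sample(documents, 2, lambda d: "contract" in d)
```

Because the judgmental sample is drawn only from documents the reviewer already considers interesting, it cannot be used on its own to estimate proportions across the whole collection, which is the usual reason for drawing a random sample instead.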

Additional Clarifications and Comments

Please share any clarifications or comments that may help explain your answers.


(1) Fifty Top Providers: A Short eDiscovery List (December 2012), ComplexDiscovery.

(2) Got Technology-Assisted Review? A Short List of Providers and Terms (January 2013), ComplexDiscovery.

(3) The Grossman-Cormack Glossary of Technology-Assisted Review (2013 Fed. Cts. L. Rev. 7) by Maura Grossman and Gordon Cormack. EDRM.

(4) Is Technology Assisted Review Supporting Attorneys or Replacing Them? (June 2012) by Maura Grossman, Gordon Cormack, Mary Mack, Johannes Scholtes. ZyLAB.

(5) See Maura Grossman and Gordon Cormack. Glossary. January 2013. EDRM.


Current Responders and Results:

Updated 2/22/13 to remove “Statistical” from Sampling Approach “header” and from Judgmental Sampling definition.
