Content Assessment: Considering Discovery Intelligence? HaystackID® and Next-Generation Discovery

Information - 95%
Insight - 90%
Relevance - 95%
Objectivity - 85%
Authority - 85%



A short percentage-based assessment of the qualitative benefit of the recent overview highlighting HaystackID's Discovery Intelligence and next-generation discovery.

Editor’s Note: From time to time, ComplexDiscovery highlights publicly available or privately purchasable announcements, content updates, and research from cyber, data, and legal discovery providers, research organizations, and ComplexDiscovery community members. While ComplexDiscovery regularly highlights this information, it does not assume any responsibility for content assertions.

To submit recommendations for consideration and inclusion in ComplexDiscovery’s cyber, data, and legal discovery-centric service, product, or research announcements, contact us today.

Provider Overview*

HaystackID® Discovery Intelligence: Powering Next-Generation Discovery

HaystackID Discovery Intelligence (HDI) transforms how cybersecurity, information governance, and eDiscovery professionals consider the practice and process of discovering electronically stored information. By synergistically harnessing the potential of artificial intelligence, the precision of data science, the power of machine learning, and the practicality of expertly trained and managed reviewers, HDI delivers insight and intelligence that allows you to reach decision points faster and more economically than previously possible. The combined benefit of proprietary technology and processes used with optimized and tuned multimodal workflows for specific discovery objectives provides you with secure, defensible, and flexible capabilities to address the most complex and time-sensitive discovery challenges.

The Four Synergistic Elements of Discovery Intelligence

  1. Potential: Proprietary artificial intelligence technologies and processes to deliver on the potential of AI.
  2. Precision: Proprietary data science approaches and algorithms to extract knowledge and insight.
  3. Power: Proven application of proprietary machine learning workflows and analytics to maximize results.
  4. Practicality: Proprietary review sourcing, selection, and support systems that power next-generation reviews with the Discovery Intelligence synergy.

Whether supporting proactive exploratory analysis to gain insight into information early or concentrated on specific discovery objectives to drive intelligent decisions in audits, investigations, and litigation, the potential, precision, power, and practicality of HaystackID Discovery Intelligence provide you with next-generation discovery for today’s challenges and opportunities.

The Potential of Artificial Intelligence

According to the Electronic Discovery Reference Model (EDRM) (1), artificial intelligence (AI) refers to the capability of machines to mimic aspects of human intelligence such as problem-solving, reasoning, discovering meaning, generalizing, predicting, or learning from experience. AI includes both unsupervised and supervised machine learning and encompasses many other processes, such as natural language processing (NLP). In the context of HaystackID Discovery Intelligence, AI describes the automated discovery processes used to classify, categorize, summarize, predict, and provide insight and intelligence regarding ESI. This application of AI by HaystackID is accomplished through the expert application of proprietary statistical, rule-based, and algorithmic means. The potential of AI is realized in HaystackID Discovery Intelligence in areas ranging from detecting sensitive data (e.g., from PII and PHI to SSNs and License Plate Numbers) and identifying entities to support cyber breach responses as part of ReviewRight® Protect™ services to the evaluation and classification of legal document reviewers as part of HaystackID ReviewRight Match® services. Three examples of AI-enabled services available as part of HaystackID Discovery Intelligence include:

  • [Fact Sheet] ReviewRight Protect
    ReviewRight Protect is a combination of advanced data detection technologies and processes, extensive legal and regulatory compliance expertise, and proven notification and reporting procedures that harness the power of the world’s leading legal discovery and review services and orients them directly on the detection, identification, review, and notification of sensitive data-related breaches and disclosures. Download ReviewRight Protect Fact Sheet [PDF]
  • [Overview] Protect Analytics
    Protect Analytics is an exclusive set of technologies and processes that allow client data set analysis for sensitive information ranging from PII and PHI to data breach code anomalies. Enabled by a collection of proprietary workflows and proven tools, Protect Analytics can proactively or reactively help determine sensitive data concentrations, locations, and relationships to inform notification lists, exposure assessments, and discovery targeting.
  • [Fact Sheet] ReviewRight Match®
    ReviewRight Match sets the standard for document review staffing, enablement, and support by applying a combination of proprietary technologies, innovative evaluation tools, and proven protocols that allow for rapid and comprehensive sourcing, testing, and qualification of reviewers. This ReviewRight process enables us to present to clients the legal review experts and review experts with the appropriate domain expertise, proper language skills, and necessary responsibility experience to accomplish the specific review tasks required for discovery success. Download ReviewRight Match Fact Sheet [PDF]

From the point of data creation to the time of ESI review, HaystackID Discovery Intelligence helps you realize the potential of AI in all of your discovery efforts across the information lifecycle and litigation continuum.

The Precision of Data Science

Data science can be defined as the process of using algorithms, methods, and systems to extract knowledge and insights from structured and unstructured data. (2) As part of HaystackID Discovery Intelligence, HaystackID’s team of data science authorities has the expertise and experience to support the challenging and complex discovery tasks associated with the complete data science lifecycle. This lifecycle includes capturing, maintaining, processing, analyzing, and communicating intelligence to reach decision points and inform decision-makers.

The Five Stages of the Data Science Lifecycle (3)

  1. Capture (data acquisition, data entry, signal reception, data extraction)
  2. Maintain (data warehousing, data cleansing, data staging, data processing, data architecture)
  3. Process (data mining, clustering/classification, data modeling, data summarization)
  4. Analyze (exploratory/confirmatory, predictive analysis, regression, text mining, qualitative analysis)
  5. Communicate (data reporting, data visualization, business intelligence, decision making)

Whether detecting and identifying sensitive data or creating cutting-edge visualizations to help translate intelligence into action, HaystackID Discovery Intelligence provides you with industry-leading data science experts and expertise with conceptual and practical experience with:

  • Algorithms and Models
  • Analytics
  • Coding
  • Databases
  • Frameworks and Libraries
  • Knowledge of Key Business Domains

From data scientists to machine learning engineers and from data architects to application engineers, HaystackID Discovery Intelligence experts help you realize the precision of data science in all of your cybersecurity, information governance, and legal discovery efforts.

The Power of Machine Learning

According to IBM (4), machine learning is a branch of AI and computer science that focuses on using data and algorithms to imitate the way humans learn, gradually improving its accuracy. In HaystackID Discovery Intelligence, HaystackID leverages predictive coding, a term used to describe a technology-assisted review (TAR) process that uses a machine learning algorithm to distinguish relevant from non-relevant documents based on a subject matter expert’s coding of a training set of documents. (5)

HaystackID Discovery Intelligence predictive coding is based on active learning. Active learning is a process, typically iterative, whereby an algorithm is used to select documents that should be reviewed for training based on a strategy to help the classification algorithm learn efficiently. (6) HaystackID Discovery Intelligence typically leverages a Continuous Active Learning® (CAL®) predictive coding protocol. In CAL, the TAR method developed, used, and advocated by Maura R. Grossman and Gordon V. Cormack, after the initial training set, the learner repeatedly selects the next-most-likely-to-be-relevant documents (that have not yet been considered) for review, coding, and training, and continues to do so until it can no longer find any more relevant documents. There is generally no second review because, by the time the learner stops learning, all documents deemed relevant by the learner have already been identified and manually reviewed. (7) Two examples of machine learning-enabled successes delivered by HaystackID as part of HaystackID Discovery Intelligence include:

  • [Case Study] A Look at How Workflow, Analytics, and CAL® Can Make a Difference in Cost Savings
    In this case study, a global manufacturer approached HaystackID to assist with processing, early case assessment (ECA), hosting, and managed review for a mid-sized, time-sensitive litigation matter. The client collected and securely transmitted approximately 650 GB of data, comprising several million documents, to HaystackID for processing and hosting. HaystackID, leveraging HaystackID Discovery Intelligence, worked collaboratively to create workflow efficiencies that yielded considerable cost savings. The combined team developed a workflow that used structured analytics and TAR (CAL) to deliver substantial and documented cost savings in the high six figures while completing the project. (8) Download Case Study [PDF]
  • [Case Study] Speed, Slack, and Specialization – A Precision Approach to Second Requests
    During the COVID-constrained summer of 2020, HaystackID supported a leading global finance platform company and its internationally recognized outside counsel in responding to a Department of Justice (DOJ) Second Request based on a proposed acquisition of a highly regulated company. As part of the Second Request, HaystackID collected and evaluated 18 TB of data, including critical Slack messages and file stores, from onsite and remote locations and from more than 17 types of data stores. Within 106 days, HaystackID completed approximately 300 collections and developed custom tools and processes. These custom tools and processes included innovative Slack-specific communications heat maps, predictive coding processes (CAL), and private message privilege identification to enable a compliant response to a complex investigation request, ultimately enabling completion of the proposed acquisition. (9) Download Case Study [PDF]

From carefully considered identification of relevant documents to proactive support of information governance and data disposition tasks, HaystackID Discovery Intelligence helps you realize the power of machine learning in all of your discovery efforts.
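
The CAL loop described above (train, rank the unreviewed documents, review the top-ranked ones, retrain, and stop when no more relevant documents turn up) can be illustrated with a small sketch. This is not HaystackID's implementation or the Grossman and Cormack reference method. The word-count scoring model, the `batch_size` and `stop_after` parameters, the stopping rule (a fixed number of consecutive batches with no relevant documents), and the sample documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    return text.lower().split()


def train(labeled):
    """Fit a tiny naive Bayes-style model: per-term log-odds of relevance.

    labeled is a list of (text, is_relevant) pairs coded by a reviewer.
    """
    rel, non = Counter(), Counter()
    for text, is_relevant in labeled:
        (rel if is_relevant else non).update(tokenize(text))
    vocab = set(rel) | set(non)
    rel_total = sum(rel.values()) + len(vocab)  # add-one smoothing
    non_total = sum(non.values()) + len(vocab)
    return {
        t: math.log((rel[t] + 1) / rel_total) - math.log((non[t] + 1) / non_total)
        for t in vocab
    }


def score(model, text):
    """Higher score means more likely to be relevant; unseen terms count as 0."""
    return sum(model.get(t, 0.0) for t in tokenize(text))


def continuous_active_learning(documents, seed_labels, review,
                               batch_size=1, stop_after=2):
    """Run a CAL-style loop.

    documents:   dict of doc_id -> text
    seed_labels: dict of doc_id -> bool (the initial training set)
    review:      callable doc_id -> bool, standing in for the human reviewer
    stop_after:  assumed stopping rule, the number of consecutive batches
                 with no relevant documents before the loop ends
    """
    labels = dict(seed_labels)
    empty_batches = 0
    while empty_batches < stop_after:
        unreviewed = [d for d in documents if d not in labels]
        if not unreviewed:
            break
        # Retrain on everything coded so far, then rank what is left.
        model = train([(documents[d], labels[d]) for d in labels])
        ranked = sorted(unreviewed,
                        key=lambda d: score(model, documents[d]),
                        reverse=True)
        found = 0
        for d in ranked[:batch_size]:
            labels[d] = review(d)  # human coding decision
            found += labels[d]
        empty_batches = 0 if found else empty_batches + 1
    return labels


if __name__ == "__main__":
    # Hypothetical toy collection; truth plays the role of the reviewer.
    docs = {
        1: "merger pricing agreement draft",
        2: "lunch menu friday",
        3: "pricing agreement signed merger",
        4: "office party friday lunch",
        5: "merger agreement terms pricing",
        6: "parking lot notice",
    }
    truth = {1: True, 2: False, 3: True, 4: False, 5: True, 6: False}
    coded = continuous_active_learning(docs, {1: True, 2: False}, truth.get)
    print(sorted(d for d, relevant in coded.items() if relevant))
```

Because the most likely relevant documents are always reviewed next, every document the model surfaces as relevant has already been coded by a person when the loop ends, which is why the text notes that there is generally no second review.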

The Practicality of Expertly Trained and Managed Reviewers

While innovative technologies and techniques delivered by AI, data science, and machine learning are fundamental to discovery success, at the end of every process and practice, there is a requirement for human intervention or evaluation that can span the spectrum from document and source code review to deliverable authentication, validation, and verification. As the industry’s most experienced and proficient provider of secure document review services with more than 2,000 virtual projects completed and more than 20,000 trained reviewers supporting onsite and remote reviews, HaystackID’s ReviewRight offerings provide the expert and experienced human touch powering HaystackID Discovery Intelligence. HaystackID Discovery Intelligence managed review practicality in action includes:

  • [Case Study] Antitrust Agency Request Support: An Integrated Approach to Second Requests
    In this high-velocity requirement to support six separate cannabis-related Second Requests within four months, HaystackID mobilized the complete review component of its HaystackID Discovery Intelligence capability. This capability allowed HaystackID to qualify and source domain-experienced reviewers and managers, enabling rolling reviews with teams approaching 100 reviewers and cases as large as 600,000 documents. This capability also helped HaystackID complete review tasks, including specific privilege reviews and sensitive data (PII) reviews, on time, within budget, and in compliance with antitrust agency requirements. (10) Download Case Study [PDF]
  • [Case Study] Shaping Outcomes with eDiscovery: A Complex Defense of Intellectual Property
    In this matter between two technology-enabled manufacturing firms, HaystackID became a decisive contributor based on its aggressive yet controlled delivery of HaystackID Discovery Intelligence, including ReviewRight services to help its client find, understand, and learn from evidence. HaystackID Discovery Intelligence efforts collected 10 TB of ESI, reduced ESI from 10 TB to 430 GB through culling and processing, and further reduced ESI volume from 430 GB to 120 GB through review-centric analytics and TAR. Review of the remaining 120 GB of ESI, about 550,000 documents, was completed within 90 days of initial collection through a combination of on-premises and remote reviews leveraging approximately 200 qualified expert reviewers. These efforts resulted in the identification of data-centric metadata changes that, when viewed within the context of all available ESI, substantiated the validity of the complaint of HaystackID’s client, providing the proverbial smoking gun of evidence necessary for the successful defense of intellectual property. (11) Download Case Study [PDF]
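
To put the volume figures from the intellectual property case study in proportion, the short calculation below works out the reduction at each stage. It uses only the numbers quoted above. The decimal conversion of 1 TB to 1,000 GB and the helper name `reduction_pct` are assumptions for illustration.

```python
GB_PER_TB = 1000  # assumption: decimal units; the case study does not specify


def reduction_pct(before_gb, after_gb):
    """Percentage by which a data volume shrank between two stages."""
    return 100.0 * (before_gb - after_gb) / before_gb


# Stage names paraphrase the case study; volumes are the quoted figures.
stages = [
    ("collected", 10 * GB_PER_TB),
    ("after culling and processing", 430),
    ("after review-centric analytics and TAR", 120),
]

for (prev_name, prev_gb), (name, gb) in zip(stages, stages[1:]):
    print(f"{name}: {gb} GB, {reduction_pct(prev_gb, gb):.1f}% smaller than {prev_name}")

overall = reduction_pct(stages[0][1], stages[-1][1])
print(f"overall reduction before review: {overall:.1f}%")
```

Under the decimal-unit assumption, culling and processing removed roughly 95.7% of the collected volume, analytics and TAR removed about 72.1% of what remained, and about 1.2% of the original volume reached reviewers.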

Core ReviewRight Offerings

From proactively certifying and sourcing domain expert reviewers and review managers to responsively supporting complex analytics and TAR tasks, HaystackID Discovery Intelligence helps you practically integrate the human touch into all of your discovery efforts.

HaystackID Discovery Intelligence Synergy in Action

One example of the synergy provided by the combined application of HaystackID Discovery Intelligence artificial intelligence, data science, machine learning, and expert reviewers is manifested in HaystackID’s Protect Analytics offering and the potential, precision, power, and practicality it provides.

Protect Analytics is an exclusive set of technologies and processes that allows client data set analysis for sensitive information ranging from PII and PHI to data breach code anomalies. Enabled by a collection of proprietary workflows and proven tools, Protect Analytics can proactively or reactively help determine sensitive data concentrations, locations, and relationships to inform notification lists, exposure assessments, and discovery targeting.

A Proprietary Approach for Synergistic Sensitive Data Analysis

Key processes supporting data set analysis for sensitive information include:

  • Static Model Analysis: The analysis and evaluation of a data set object structure through the use of class diagrams, domain models, context diagrams, and data flow diagrams that allow for the precise depiction of the logical aspects of data components.
  • Context-Sensitive (Markovian) Analysis: The analysis and evaluation of current data variable behavior to predict the future behavior of those variables.
  • Regular Expression (Regex) Analysis: The analysis and evaluation of data through the use of strings of text organized to create patterns that allow for matching, locating, and considering sensitive data.
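
As a rough illustration of the regex analysis described in the last item, the sketch below scans text for two kinds of sensitive data. The patterns, the `PATTERNS` table, and the `find_sensitive` helper are hypothetical examples and are not HaystackID's proprietary rules. Production patterns would need validation logic and far broader coverage.

```python
import re

# Illustrative patterns only (assumptions): a US Social Security number in
# NNN-NN-NNNN form and a simple email address.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def find_sensitive(text):
    """Return (label, matched_text, offset) tuples in order of appearance."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(), match.start()))
    return sorted(hits, key=lambda hit: hit[2])


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com, SSN 123-45-6789."
    for label, value, offset in find_sensitive(sample):
        print(f"{label} at offset {offset}: {value}")
```

Recording the offset of each match alongside its label is one simple way to support the location and concentration reporting mentioned elsewhere in this overview.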

The proprietary Protect Analytics approach to data set analysis, based on the combination and integration of these three data set processes, informs HaystackID data science, cybersecurity, and eDiscovery experts in their use of investigation, discovery, and visual analysis tools to translate analysis into actionable information for downstream decisions and deliverables.
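
The overview does not say which Markovian model Protect Analytics uses. As a generic illustration of the idea of predicting a variable's future behavior from its current behavior, the sketch below fits a first-order Markov chain to an observed sequence of states and predicts the most likely next state. The function names and the sample event sequence are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(sequence):
    """Estimate first-order transition probabilities from observed states."""
    counts = defaultdict(Counter)
    for current, following in zip(sequence, sequence[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
        for state, nexts in counts.items()
    }


def predict_next(transitions, state):
    """Most likely next state, or None if the state was never observed."""
    options = transitions.get(state)
    if not options:
        return None
    return max(options, key=options.get)


if __name__ == "__main__":
    # Hypothetical sequence of events observed for one data variable.
    events = ["login", "read", "read", "export", "login", "read", "export"]
    model = fit_transitions(events)
    print(predict_next(model, "read"))
```

A first-order chain conditions only on the current state. Models that look further back, or that weight states by context, follow the same pattern with a richer definition of state.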

Cutting-Edge Command of Traditional and AI-Enabled Platforms

Key platforms supporting data-driven actions on sensitive information include:

  • Investigation Platforms: This group of platforms enables data extraction, correlation, and contextualization to translate data information into actionable insight.
  • Discovery Platforms: This group of platforms combines processing and review capabilities powered by artificial intelligence (e.g., machine learning) and expert intelligence (e.g., data scientists) to translate insight into actionable information.
  • Intelligence Platforms: This group of platforms unifies, visualizes, and reports on data from analysis, investigation, and discovery activities, helping decision-makers translate data points into decision points.

The proprietary workflows of Protect Analytics used with powerful and precise investigation, discovery, and intelligence platforms help HaystackID and client experts make data-driven decisions regarding sensitive information-centric deliverables.

Comprehensive Answers for Decisive Deliverables

After analysis with proprietary processes and data-driven task accomplishment with powerful and precise investigation, discovery, and intelligence platforms, HaystackID is authoritatively prepared to meet client requirements in delivering sensitive data-centric deliverables ranging from notifications to assessments.

Key sensitive data-centric deliverables include:

  • Notification Lists: From GDPR compliance to data breach notifications, HaystackID can help you determine regulatory and reporting requirements and prepare notification lists for use in notifying law enforcement agencies, business partners, and impacted individuals.
  • Impact Assessments: From Data Protection Impact Assessment (DPIA) to Privacy Impact Assessments (PIA), HaystackID can help you identify and minimize risks related to personal data processing.
  • Discovery Targeting Recommendations: From identification and location to concentration and density, HaystackID can help inform eDiscovery audits, investigations, and litigation of sensitive data by providing details, descriptions, and disposition of sensitive data in discovery review sets.

Whether responsively employed in support of mandated reporting requirements or proactively leveraged for exploratory analysis of discovery review sets, HaystackID Protect Analytics can help you quickly, comprehensively, and confidently plan, prepare, and present notifications, assessments, and recommendations to meet the most stringent requirements and deadlines.

About HaystackID®

HaystackID is a specialized eDiscovery services firm that helps corporations and law firms securely find, understand, and learn from data when facing complex, data-intensive investigations and litigation. HaystackID mobilizes industry-leading cyber discovery services, enterprise solutions, and legal discovery offerings to serve more than 500 of the world’s leading corporations and law firms in North America and Europe. Serving nearly half of the Fortune 100, HaystackID is an alternative cyber and legal services provider that combines expertise and technical excellence with a culture of white-glove customer service. In addition to consistently being ranked by Chambers USA, the company was recently named a worldwide leader in eDiscovery services by IDC MarketScape and a representative vendor in the 2021 Gartner Market Guide for E-Discovery Solutions. Further, HaystackID has achieved SOC 2 Type II attestation in the five trust service areas of security, availability, processing integrity, confidentiality, and privacy. For more information about its suite of services, including programs and solutions for unique legal enterprise needs, go to


(1) EDRM – The Use of Artificial Intelligence in eDiscovery (2021).
(2) IBM – Data Science (2021).
(3) Berkeley School of Information – What is Data Science? (2021).
(4) IBM – Machine Learning (2021).
(5) Predictive Coding Technologies and Protocols (2021).
(6) Ibid.
(7) Grossman, M. and Cormack, G. (2016). Continuous Active Learning for TAR. [ebook] Practical Law. Available at: [Accessed 27 October 2021].
(8) Rubinger, A. and Tschannen, C. (2021) A Look at How Workflow, Analytics, and CAL® Can Make A Difference in Cost Savings. HaystackID. Available at
(9) Sarlo, M. (2021). Speed, Slack, and Specialization – A Precision Approach to Second Requests. HaystackID. Available at
(10) Sarlo, M. (2020). Antitrust Agency Request Support: An Integrated Approach to Second Requests. HaystackID. Available at
(11) Sarlo, M. (2018). Shaping Outcomes with eDiscovery: A Complex Defense of Intellectual Property. HaystackID. Available at

Read the original article.

*Shared with permission.

Source: ComplexDiscovery

