Content Assessment: Magic and Hallucinations? Considering ChatGPT in eDiscovery
Information: 92%
Insight: 93%
Relevance: 93%
Objectivity: 90%
Authority: 94%
Overall: 92% (Excellent)
A short, percentage-based assessment of the qualitative benefit of the recent article addressing ChatGPT-related questions from the ChatGPT 101 webcast, hosted by EDRM and presented by Dr. William Webber and John Tredennick.
Editor’s Note: From time to time, ComplexDiscovery highlights publicly available or privately purchasable announcements, content updates, and research from cyber, data, and legal discovery providers, research organizations, and ComplexDiscovery community members. While ComplexDiscovery regularly highlights this information, it does not assume any responsibility for content assertions.
Contact us today to submit recommendations for consideration and inclusion in ComplexDiscovery’s data and legal discovery-centric service, product, or research announcements.
Background Note: In a recent industry webcast hosted by EDRM, Dr. William Webber and John Tredennick delved into the fascinating world of ChatGPT. During the webcast, they explored the potential integration of GPT into eDiscovery platforms. In a post-webcast article, they also commented on early and announced uses of the technology in eDiscovery, touching on the concepts of defensibility and hallucinations. Shared with permission below is their detailed written response to one of the webcast questions, a question and answer that may be worth considering for eDiscovery service and software providers weighing where technologies like ChatGPT fit in eDiscovery.
Industry Backgrounder
ChatGPT 101 – Question Extract*
Dr. William Webber and John Tredennick, Merlin Search Technologies
Question: Can you comment on early usage of GPT in eDiscovery platforms? Does the Hallucination factor limit its use in the Enterprise where defensibility is critical?
Answer: One vendor announced an early beta use just before LegalWeek and provided a video demo. The vendor did not directly reveal what approach they were using to support question answering on an eDiscovery corpus (such as the Enron corpus used in the demo). Thus, we could not tell whether they were training the model directly on the eDiscovery corpus to create a “private” model, or whether they were taking Bing’s “search-then-synthesize” approach: using the question to search for documents, then having GPT read those documents and answer the question based on that reading.
The fact that the responses list “support documents by relevance” (and our doubts about the effectiveness of the former approach) suggests to us that the vendor is using the latter method.
We think the search-then-synthesize method is a valid approach to search and question answering on eDiscovery collections and indeed have developed a prototype ourselves. In this case, “hallucinations” are less of an issue, because GPT is directly being presented with the documents (sources) on which to base its answers, and the user is able to check those answers against the sources.
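As a rough illustration of the search-then-synthesize pattern described above, the sketch below ranks documents against a question and then asks a language model to answer only from the top-ranked sources, which the user can then check. The keyword scorer and the `llm` callable are hypothetical stand-ins for a real search engine and a real GPT API call; this is not Merlin’s or any vendor’s actual implementation.

```python
def score(query: str, doc: str) -> int:
    """Toy relevance score: count of query terms that appear in the document.
    A production system would use a real search index instead."""
    return len(set(query.lower().split()) & set(doc.lower().split()))


def search_then_synthesize(query: str, corpus: list[str], llm, k: int = 3):
    """Search first, then synthesize: rank the corpus against the query,
    and ask the model to answer ONLY from the top-k retrieved documents.
    Returning the sources lets the user verify the answer against them,
    which is what keeps hallucination in check."""
    ranked = sorted(corpus, key=lambda d: score(query, d), reverse=True)
    sources = ranked[:k]
    context = "\n\n".join(f"[{i + 1}] {d}" for i, d in enumerate(sources))
    prompt = (
        "Answer the question using ONLY the documents below. "
        "If they do not contain the answer, say so.\n\n"
        f"{context}\n\nQuestion: {query}"
    )
    return llm(prompt), sources


# Usage with a stand-in model; a real call would go to a GPT-style API.
corpus = [
    "Melanie Smith approved the Q3 forecast.",
    "Lunch menu for Friday.",
    "The audit covered trading desks.",
]
answer, sources = search_then_synthesize(
    "Who approved the Q3 forecast?", corpus, lambda prompt: "stub answer"
)
```

Because fresh sources are fetched for every question, a follow-up question triggers a new search rather than leaving the model to improvise from conversation memory, which is the failure mode described next.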
Hallucinations could occur if follow-up questions were answered without fetching new search results. For instance, if “Melanie Smith” is mentioned in a document and you ask “Who is Melanie Smith?”, there might not be enough information in the documents to answer that question accurately. In that case, GPT might hallucinate an answer, making assumptions that are not justified by the sources.
We note that in the demo on YouTube, each follow-up question is accompanied by a “supporting documents” list, suggesting that search results are updated as the conversation continues.
Question answering, however, does not solve the core eDiscovery task of document production, even if it might be useful in early case analysis or in the interrogation of a production (on the receiving party’s side). Rather, question answering provides a convenient summary of top-ranking search results, an alternative to snippets or simply reading the documents.
For production, we have to find substantially all documents that are responsive to an issue in a production request. We (Merlin) are proposing and have prototyped an approach to the review task in which GPT (or another LLM) directly reviews documents for relevance to a description of the issue for production (and in which the subject-matter expert is supported by the system in interactively developing and refining the issue description based on a sample of the review results). Initial experiments suggest that this is a promising approach.
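A minimal sketch of the review pattern just described: the model judges each document against a plain-language issue description and only documents flagged responsive are kept. The prompt wording and the `llm` callable are illustrative assumptions, not Merlin’s prototype; in practice the issue description would be refined interactively by a subject-matter expert based on sampled results.

```python
ISSUE_PROMPT = (
    "You are reviewing documents for an eDiscovery production.\n"
    "Issue description: {issue}\n\n"
    "Document:\n{doc}\n\n"
    "Reply with exactly RESPONSIVE or NOT_RESPONSIVE."
)


def review_corpus(issue: str, corpus: list[str], llm) -> list[str]:
    """Ask the model to judge each document against the issue description,
    returning the subset it flags as responsive. Unlike question answering,
    this addresses the production task: every document gets a decision."""
    responsive = []
    for doc in corpus:
        verdict = llm(ISSUE_PROMPT.format(issue=issue, doc=doc)).strip()
        if verdict == "RESPONSIVE":
            responsive.append(doc)
    return responsive


# Usage with a stand-in model that mimics an LLM's responsiveness call.
fake_llm = lambda p: "RESPONSIVE" if "swap" in p else "NOT_RESPONSIVE"
hits = review_corpus(
    "Off-book energy deals",
    ["Email about the gas swap", "Lunch menu for Friday"],
    fake_llm,
)
```

The design point is that each relevance decision is made per document against an explicit issue description, so the reviewer can audit any individual call, which bears directly on defensibility.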
By the way, the demo at issue was based on the Enron collection, and the nature of this collection merits care in interpreting the apparent success of the vendor’s integration of GPT for question answering.
First of all, the Enron case is well known, and there is a lot of information about it in news articles and on the web, which will have been scraped for GPT’s pre-training data. GPT, therefore, would be able to answer many questions about the Enron case without having to look at any documents from the Enron corpus.
Second, Enron’s actions and fortunes were highly topical even when Enron was still operating, and the Enron email corpus contains many news articles about Enron and the current affairs it was involved in, forwarded from one Enron employee to another. Such articles provide a concentrated and digested source of information about Enron’s public activities, which makes question-answering for GPT much easier (in many cases, GPT will simply need to summarize the contents of a news article returned by the search).
In contrast, inferring this sort of information from a heterogeneous set of emails that, though perhaps relevant to an issue, do not directly and explicitly describe that issue, would be a more challenging task.
We don’t say the above to criticize the vendor. It is almost impossible to find a realistic eDiscovery corpus that can be demoed publicly yet does not contain publicly known information. But legal professionals should seek to test these methods on private data before reaching conclusions about their effectiveness.
Read the original post from Merlin.
About Merlin
Merlin, a cloud technology company, specializes in developing AI-powered search software for a variety of applications, including investigations, eDiscovery, Early Case Assessment (ECA), and any scenario requiring search and review of vast document or email collections.
Over the past three years, Merlin has focused on creating the next generation of search, often referred to as Search 2.0. The objective was to combine intelligent machine learning algorithms with keywords, enhancing the efficiency and accuracy of discovering relevant documents while reducing costs and moving beyond traditional, often insufficient, keyword-based methods.
The outcome is Sherlock Integrated Search, the premier software that effortlessly combines keyword and algorithmic search within a comprehensive investigation and eDiscovery platform.
Although Merlin is a new company, its team boasts over two decades of experience in building, hosting, and supporting a top international eDiscovery software platform utilized by numerous major corporations and law firms worldwide.
*Shared with permission.
Additional Reading
Source: ComplexDiscovery