Mon. Mar 18th, 2024

An extract from an article by eDiscovery expert Bill Dimm

This was by far the most significant iteration of the ongoing exercise where I challenge an audience to produce a keyword search that works better than technology-assisted review (also known as predictive coding or supervised machine learning).  There were far more participants than previous rounds, and a structural change in the challenge allowed participants to get immediate feedback on the performance of their queries so they could iteratively improve them.  A total of 1,924 queries were submitted by 42 participants (an average of 45.8 queries per person) and higher recall levels were achieved than in any prior version of the challenge, but the audience still couldn’t beat TAR.

In previous versions of the experiment, the audience submitted search queries on paper or through a web form using their phones, and I evaluated a few of them live on stage to see whether the audience was able to achieve higher recall than TAR.  Because the number of live evaluations was so small, the audience had very little opportunity to use the results to improve their queries.  In the latest iteration, participants each had their own computer in the lab at the 2019 Ipro Tech Show, and the web form evaluated the query and gave the user feedback on the recall achieved immediately.  Furthermore, it displayed the relevance and important keywords for each of the top 100 documents matching the query, so participants could quickly discover useful new search terms to tweak their queries.  This gave participants a significant advantage over a normal e-discovery scenario, since they could try an unlimited number of queries without incurring any cost to make relevance determinations on the retrieved documents in order to decide which keywords would improve the queries.  The number of participants was significantly larger than any of the previous iterations, and they had a full 20 minutes to try as many queries as they wanted.  It was the best chance an audience has ever had of beating TAR. They failed.

Read the complete article at TAR vs. Keyword Search Challenge, Round 6 (Instant Feedback)

Additional Reading

Source: ComplexDiscovery

 

Generative Artificial Intelligence and Large Language Model Use

ComplexDiscovery OÜ recognizes the value of GAI and LLM tools in streamlining content creation processes and enhancing the overall quality of its research, writing, and editing efforts. To this end, ComplexDiscovery OÜ regularly employs GAI tools, including ChatGPT, Claude, Midjourney, and DALL-E, to assist, augment, and accelerate the development and publication of both new and revised content in posts and pages published (initiated in late 2022).

ComplexDiscovery also provides a ChatGPT-powered AI article assistant for its users. This feature leverages LLM capabilities to generate relevant and valuable insights related to specific page and post content published on ComplexDiscovery.com. By offering this AI-driven service, ComplexDiscovery OÜ aims to create a more interactive and engaging experience for its users, while highlighting the importance of responsible and ethical use of GAI and LLM technologies.

 

Have a Request?

If you have information or offering requests that you would like to ask us about, please let us know, and we will make our response to you a priority.

ComplexDiscovery OÜ is a highly recognized digital publication focused on providing detailed insights into the fields of cybersecurity, information governance, and eDiscovery. Based in Estonia, a hub for digital innovation, ComplexDiscovery OÜ upholds rigorous standards in journalistic integrity, delivering nuanced analyses of global trends, technology advancements, and the eDiscovery sector. The publication expertly connects intricate legal technology issues with the broader narrative of international business and current events, offering its readership invaluable insights for informed decision-making.

For the latest in law, technology, and business, visit ComplexDiscovery.com.