The surge in data volumes has transformed eDiscovery from a formidable challenge into an overwhelming obstacle. The scope of eDiscovery, which now spans emails, instant messages, social media, and an ever-growing array of digital formats, demands more than just manpower; it requires smart, scalable solutions.
Enter Large Language Models (LLMs), a major advancement in the field of Generative Artificial Intelligence (GenAI) that promises to change the way we conduct eDiscovery reviews. This technology has become widely available to the general public through OpenAI’s ChatGPT, Google’s Gemini and Microsoft Copilot. Now everyone is wondering how it can be applied to eDiscovery.
One of the world’s leading eDiscovery software companies, Relativity, announced in December 2023 that it is releasing a GenAI-powered solution for eDiscovery in 2024: Relativity aiR for Review. This solution is intended to make reviews faster, easier and more accurate.
In this blog, we focus on the idea behind Relativity aiR, the use of language models for review, and how this can improve Technology-Assisted Review (TAR).
Predictive coding
TAR, including predictive coding, is the set of technologies used in eDiscovery tools up to now to support document review by presenting reviewers with relevant data faster. It can also help evaluate the performance of reviewers and estimate whether most substantive documents have been reviewed. This improves both the quality and the speed of the review.
Predictive coding uses a set of labeled documents to “train” a model. The model can then be used to classify documents automatically and/or to prioritize potentially relevant documents, which reduces human effort during review. A downside of this technology is that the model needs to be trained to “understand” the context of a case and its associated labels, and this training has to be conducted separately for each investigation. There is also the risk that the training set is biased, insufficiently comprehensive or lacks examples of all relevant typologies. A further limitation is that predictive coding does not explain why it classified or prioritized a document the way it did, which makes it a partially black-box approach.
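To make the train-then-prioritize idea concrete, here is a minimal sketch in Python. It is not how any particular eDiscovery product implements predictive coding: it assumes a simple Naive Bayes word-count model, and the function names and example documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Deliberately naive: lowercase and split on whitespace.
    return text.lower().split()


def train(labeled_docs):
    """Build word counts per label from (text, is_relevant) pairs coded by reviewers.

    Assumes the training set contains at least one relevant and one
    non-relevant document.
    """
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in labeled_docs:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def relevance_score(model, text):
    """Log-odds that a document is relevant; higher means more likely relevant."""
    word_counts, doc_counts, vocab = model
    score = math.log(doc_counts[True] / doc_counts[False])
    for label, sign in ((True, 1), (False, -1)):
        # Add-one smoothing so unseen words do not produce log(0).
        total = sum(word_counts[label].values()) + len(vocab)
        for word in tokenize(text):
            if word in vocab:
                score += sign * math.log((word_counts[label][word] + 1) / total)
    return score


def prioritize(model, unreviewed):
    """Order unreviewed documents so the most likely relevant come first."""
    return sorted(unreviewed, key=lambda text: relevance_score(model, text), reverse=True)


# Hypothetical training set coded by human reviewers for one investigation.
training_set = [
    ("please delete the invoices before the audit", True),
    ("move the payment off the books", True),
    ("lunch menu for friday", False),
    ("team meeting moved to monday", False),
]
unreviewed = [
    "friday lunch is cancelled",
    "delete the payment records before the audit",
]

model = train(training_set)
ranked = prioritize(model, unreviewed)
print(ranked[0])
```

The sketch also shows the limitations described above: the model only knows the words that appear in its training set, it has to be rebuilt for every investigation, and the score it returns comes without any explanation.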
Pre-trained models
GenAI and LLMs open up a whole new realm of possibilities for TAR. The technology has a more sophisticated understanding of natural language and context, which can improve the quality of classification. It is also broader in scope, which allows for summarizing documents, explaining the reasoning behind a choice and identifying additional patterns of behavior. However, the most important property of cutting-edge LLMs is that they are (mostly) general and pre-trained. This means that the model is able to review documents without case-specific training. It is still necessary to tell the model what you are looking for, but a broad description may suffice. For example, it is possible to ask the system to search for communication indicating aggressive or intimidating behavior. The model can therefore perform reviews with little or no training.
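As an illustration of reviewing with a description instead of a training set, the sketch below builds a review prompt and parses a structured answer. It makes no claim about how Relativity aiR works internally. All names are hypothetical, the JSON answer format is an assumption, and the model call is a stub so that no specific vendor API is implied.

```python
import json


def build_review_prompt(criteria, document_text):
    """Combine a broad description of what to look for with one document."""
    return (
        "You are assisting with a document review.\n"
        f"Review criteria: {criteria}\n"
        "Answer with JSON containing the keys 'relevant' (true or false) "
        "and 'rationale' (one or two sentences).\n"
        f"Document:\n{document_text}"
    )


def parse_review_response(raw):
    """Turn the model's JSON answer into a decision plus its stated rationale."""
    data = json.loads(raw)
    return {"relevant": bool(data["relevant"]), "rationale": str(data["rationale"])}


def review_document(ask_model, criteria, document_text):
    """ask_model is any callable that takes a prompt string and returns text."""
    prompt = build_review_prompt(criteria, document_text)
    return parse_review_response(ask_model(prompt))


def stub_model(prompt):
    # Stand-in for a real LLM call (assumption); always returns a fixed answer.
    return json.dumps(
        {"relevant": True, "rationale": "The sender threatens the recipient."}
    )


criteria = "communication indicating aggressive or intimidating behavior"
result = review_document(stub_model, criteria, "Do it today or you will regret it.")
print(result["relevant"], "-", result["rationale"])
```

Asking for a rationale alongside the decision reflects the point made above: unlike predictive coding, the model can be asked to explain its choice, which a human reviewer can then check.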
This means that a first review pass in an investigation can be performed by an AI model, whose results then only need to be validated by a second (human) reviewer. This has the potential to deliver quicker, cheaper and more consistent results than earlier review technologies.
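How the human validation step is organized depends on the investigation. One possible approach, assumed here purely for illustration, is to have the second reviewer check a random sample of the AI decisions and track how often they are overturned. The function names and document IDs below are hypothetical.

```python
import random


def draw_validation_sample(ai_decisions, sample_size, seed=0):
    """Pick document IDs for the human reviewer; a fixed seed keeps it reproducible."""
    rng = random.Random(seed)
    doc_ids = sorted(ai_decisions)
    return rng.sample(doc_ids, min(sample_size, len(doc_ids)))


def overturn_rate(ai_decisions, human_decisions):
    """Share of human-checked documents where the reviewer disagreed with the AI."""
    checked = [doc_id for doc_id in human_decisions if doc_id in ai_decisions]
    if not checked:
        raise ValueError("no overlapping documents to compare")
    overturned = sum(ai_decisions[d] != human_decisions[d] for d in checked)
    return overturned / len(checked)


# Hypothetical first-pass decisions from the AI model (True = relevant).
ai_decisions = {"DOC-1": True, "DOC-2": False, "DOC-3": True, "DOC-4": False}
sample = draw_validation_sample(ai_decisions, sample_size=2)

# Hypothetical second-pass decisions entered by the human reviewer.
human_decisions = {"DOC-1": True, "DOC-2": True}
print(overturn_rate(ai_decisions, human_decisions))
```

A high overturn rate on the sample would be a signal to refine the review description or to extend the human review before relying on the first pass.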
As with all new technologies, it is important not only to focus on the benefits but also to be aware of limitations and concerns. The use of LLMs involves processing large amounts of potentially sensitive data, which raises concerns about data privacy. As with predictive coding, there is also a risk of bias, as most models are trained on real-world data. This bias can, for example, relate to cultural differences and the cultural dimension of text. Will a model be able to take such context into account as well as a human reviewer can? How can we employ such models in a responsible way? Finally, the complexity of the technology, as well as its resource requirements, might make it less accessible.
Relativity has been one of the first to announce implementing GenAI within their application, with Relativity aiR. I am sure more eDiscovery providers will soon follow this development, and I personally can't wait to apply this new technology in our Digital Investigations.
Empower your forensic consulting endeavors with AI-driven precision. Together, we can fortify defenses and uphold integrity in an increasingly digital world.