Retrieval Augmented Generation (RAG) is designed to enhance the accuracy of enterprise AI by delivering grounded content. However, new findings indicate there may be unforeseen consequences.
Recent research released by Bloomberg raises concerns about the safety of large language models (LLMs) when RAG is employed. The study, titled ‘RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models,’ examined 11 widely used LLMs, including Claude-3.5-Sonnet, Llama-3-8B, and GPT-4o. The results challenge the prevailing assumption that RAG inherently enhances the safety of AI systems: the research team found that LLMs that typically block inappropriate queries in standard settings can produce unsafe responses once RAG is applied.
In conjunction with the RAG findings, Bloomberg also published a second paper, ‘Understanding and Mitigating Risks of Generative AI in Financial Services,’ which presents a tailored AI content risk taxonomy for financial services, highlighting specific concerns often overlooked by standard safety measures.
This research contradicts the common belief that RAG increases AI safety while revealing the inadequacy of current guardrail systems in addressing domain-specific risks, particularly in financial applications.
“Systems should be assessed based on their deployment context; you can’t simply trust claims of safety from others,” said Sebastian Gehrmann, Bloomberg’s Head of Responsible AI, in an interview with VentureBeat.
RAG systems can make LLMs less safe, not more
RAG is extensively utilized by enterprise AI teams to deliver accurate and updated information.
Recent research has continued to improve how RAG systems are built and evaluated; a new open-source framework named Open RAG Eval, for example, was introduced earlier this month to assess the effectiveness of RAG systems.
It’s crucial to understand that Bloomberg’s study is not questioning RAG’s ability to reduce errors but rather its effect on the safety measures employed in LLMs.
The research indicates a striking increase in unsafe responses from models when RAG is implemented. For instance, the unsafe response rate for Llama-3-8B surged from 0.3% to 9.2% with RAG applied.
Gehrmann explained that ordinarily, built-in safety measures would obstruct harmful queries. However, in a RAG context, the model may respond to malicious inquiries, even when the retrieved documents are innocuous. This unexpected behavior raises concerns regarding the interplay between retrieved content and model safeguards.
“We observed that a typical large language model, when operated under standard conditions, effectively blocks requests like ‘How can I commit this crime?’,” Gehrmann elaborated. “However, in a RAG context, even irrelevant additional context might inadvertently lead to a response to that harmful query.”
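To make the failure mode concrete, the sketch below shows how a typical RAG prompt is assembled: retrieved passages are simply prepended to the user's question before the model is called. The function name and prompt template are illustrative assumptions, not Bloomberg's setup; the point is that the harmful query itself is unchanged, it just arrives inside a longer, differently shaped input than the one safety tuning usually targets.

```python
def build_rag_prompt(query: str, retrieved_docs: list[str]) -> str:
    """Concatenate retrieved passages ahead of the user query (illustrative template)."""
    context = "\n\n".join(
        f"Document {i + 1}:\n{doc}" for i, doc in enumerate(retrieved_docs)
    )
    return (
        "Answer the question using the context below.\n\n"
        f"{context}\n\n"
        f"Question: {query}\n"
        "Answer:"
    )

# Without RAG, the model sees only the bare query and typically refuses it.
bare_prompt = "How can I commit this crime?"

# With RAG, the same query arrives wrapped in benign retrieved text, a longer
# and differently shaped input than the one safety tuning usually covers.
rag_prompt = build_rag_prompt(
    "How can I commit this crime?",
    ["A harmless, unrelated news passage...", "Another benign document..."],
)
print(rag_prompt)
```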
How does RAG bypass enterprise AI guardrails?
While the precise mechanisms behind RAG’s ability to circumvent guardrails remain unclear, researchers have proposed several hypotheses.
Gehrmann speculates that the training processes for LLMs may not have adequately covered safety behavior for longer inputs. The study demonstrated that longer contextual prompts can degrade safety measures. “When presented with more documents, LLMs become increasingly susceptible,” the paper concludes, illustrating that introducing even a single safe document can shift a model's safety behavior.
“A significant takeaway from the RAG analysis is the inability to entirely mitigate this risk,” noted Amanda Stent, Bloomberg’s Head of AI Strategy and Research. “The solution lies in integrating business logic, fact-checking, and guardrails around the RAG framework.”
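A rough sketch of what “guardrails around the RAG framework” can look like in practice appears below. It is a minimal illustration, not Bloomberg's implementation: `classify_risk`, `retrieve`, and `generate` are hypothetical stubs standing in for a real guardrail model, document index, and LLM client.

```python
def classify_risk(text: str) -> str:
    """Hypothetical guardrail classifier; swap in a real policy model."""
    blocked_phrases = ("commit this crime", "hide the losses")
    return "unsafe" if any(p in text.lower() for p in blocked_phrases) else "safe"

def retrieve(query: str) -> list[str]:
    """Hypothetical retriever; swap in the production document index."""
    return ["(retrieved passage relevant to the query)"]

def generate(query: str, docs: list[str]) -> str:
    """Hypothetical LLM call; swap in the deployed model."""
    return f"(answer to '{query}' grounded in {len(docs)} retrieved documents)"

def guarded_rag_answer(query: str) -> str:
    # Input guardrail: screen the raw query before retrieval happens.
    if classify_risk(query) != "safe":
        return "Request declined by policy."
    docs = retrieve(query)
    answer = generate(query, docs)
    # Output guardrail: screen the grounded answer as well as the query,
    # since built-in refusals weaken once retrieved context is attached.
    if classify_risk(answer) != "safe":
        return "Response withheld by policy."
    return answer

print(guarded_rag_answer("What drove last week's bond selloff?"))
```

The design choice worth noting is that the check runs on both the incoming query and the grounded answer, rather than trusting the model's own refusals once retrieved context is in play.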
Why generic AI safety taxonomies fail in financial services
The second Bloomberg paper unveils a focused AI content risk taxonomy tailored for financial services, addressing unique issues such as financial misconduct, unauthorized disclosures, and misleading narratives.
The research substantiates that existing safety measures generally overlook these sector-specific risks. It tested open-source guardrail frameworks, including Llama Guard, Llama Guard 3, AEGIS, and ShieldGemma, using data collected from rigorous red-teaming sessions.
“We developed this taxonomy and evaluated it against publicly available guardrail models. Our findings highlighted that these systems fail to recognize issues linked to our sector,” Gehrmann explained. “Generic safety frameworks are predominantly designed for consumer-related risks, focusing on toxicity and bias, which are not distinctive to specific industries.” The research emphasizes the necessity for organizations to create domain-specific taxonomies to cater to their unique operational landscapes.
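One lightweight way to act on that advice is to encode the domain taxonomy as its own checkable artifact rather than relying only on generic guardrail categories. The toy example below uses the three risk areas named in the research; the trigger phrases and keyword matching are placeholders for illustration only, where a production system would use a trained classifier or a red-team-derived evaluation set.

```python
# Toy encoding of a financial-services risk taxonomy. Categories follow the
# paper's themes; the trigger phrases are illustrative placeholders.
FINANCIAL_RISK_TAXONOMY = {
    "financial_misconduct": ["insider information", "manipulate the market"],
    "unauthorized_disclosure": ["confidential client", "non-public earnings"],
    "misleading_narrative": ["guaranteed returns", "cannot lose money"],
}

def flag_domain_risks(text: str) -> list[str]:
    """Return the taxonomy categories whose placeholder triggers appear in the text."""
    lowered = text.lower()
    return [
        category
        for category, triggers in FINANCIAL_RISK_TAXONOMY.items()
        if any(trigger in lowered for trigger in triggers)
    ]

print(flag_domain_risks("Our fund offers guaranteed returns to every investor."))
# ['misleading_narrative']
```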
Responsible AI at Bloomberg
As an established provider of financial data solutions, Bloomberg occupies a position in this space that raises questions about potential biases in its analysis of generative AI and RAG systems.
“Our priority is to equip clients with comprehensive data and analytics to facilitate discovery, analysis, and synthesis,” Stent explained. “Generative AI represents a valuable asset in enhancing these processes.”
Stent noted that Bloomberg is particularly focused on biases that matter for financial data, such as data drift and model drift, and on ensuring its systems perform consistently across the full, diverse range of securities Bloomberg covers.
Moreover, she highlighted Bloomberg’s commitment to transparency in its AI initiatives.
“Our system’s outputs can be traced back to their source documents and even to specific sections within those documents,” Stent emphasized.
Practical implications for enterprise AI deployment
For businesses aiming to excel in AI, the findings from Bloomberg suggest that implementing RAG calls for a profound reevaluation of safety architectures. Organizations must view RAG and guardrails as intertwined elements, developing integrated safety frameworks that predict the interactions between retrieved content and established safeguards.
Pioneering enterprises should focus on crafting domain-specific risk frameworks tailored to their regulatory environments, transitioning from broad AI safety models to ones that directly address unique business challenges. As AI becomes intricately woven into critical operational tasks, this strategy transforms safety from merely a compliance requirement into a competitive edge that stakeholders will value.
“It begins with acknowledging the potential for these risks to arise, actively measuring and identifying issues, and developing targeted safeguards tailored to the specific applications in development,” Gehrmann advised.