SplxAI - RAG Poisoning Blog Cover
SplxAI - RAG Poisoning Blog Cover
SplxAI - RAG Poisoning Blog Cover

Blog Article

RAG Poisoning in enterprise knowledge sources

How AI assistants integrated with knowledge sources like Confluence can expose enterprises to data leakage risks

How AI assistants integrated with knowledge sources like Confluence can expose enterprises to data leakage risks

How AI assistants integrated with knowledge sources like Confluence can expose enterprises to data leakage risks

SplxAI - Ante Gojsalic
SplxAI - Ante Gojsalic
SplxAI - Ante Gojsalic

Ante Gojsalić

Oct 10, 2024

5 min read

With large amounts of data and information, enterprises increasingly rely on AI-powered systems and assistants to enhance productivity and streamline operations. Retrieval-augmented generation (RAG) has become a popular approach to leverage large language models (LLMs), by integrating and connecting them with different knowledge repositories, like Atlassian's Confluence for example. While these integrations are very promising in terms of improving efficiency, they also introduce new security vulnerabilities. One of the most significant is RAG poisoning, which can distort the assistant’s generated output, possibly leading to sensitive data leakage and incorrect responses.

In this blog post, we'll explore what RAG poisoning is, how it can manifest in the Atlassian application stack, and why it's crucial to adopt the right security measures to mitigate this threat. We'll also show an example of a real-life attack scenario, illustrating how a RAG poisoning attack might occur when AI assistants are connected to an enterprise Confluence environment.

What is RAG poisoning?

Retrieval-augmented generation (RAG) poisoning refers to the manipulation of the external data sources that LLMs rely on for generating content. In a RAG system, the LLM queries external knowledge bases to retrieve relevant information, which is then used to generate responses. If these knowledge bases are "poisoned" by injecting data, which is misleading, malicious, or unauthorized, the LLM can retrieve and incorporate this corrupted data into its responses.

SplxAI - RAG Poisoning Diagram

RAG poisoning can have devastating consequences, especially when the corrupted information leads to sensitive data leaks or incorrect and manipulated outputs. The following are two high-level types of data leaks that can result from RAG poisoning:

  1. Leaking confidential data to unauthorized internal users: Internal employees who do not have direct access to sensitive information may gain access through poisoned RAG-generated responses.

  2. Leaking confidential data to external third-party servers: Attackers can use RAG poisoning to trigger responses that send sensitive information outside the organization, leading to data breaches.

Why is Confluence vulnerable to RAG poisoning?

Atlassian's Confluence is commonly used in enterprises for knowledge sharing, project management, and collaboration. As businesses integrate RAG AI assistants to enhance the utility of these kinds of platforms, vulnerabilities in the retrieval process can arise. Although role-based access control (RBAC) is implemented by default, it cannot prevent every type of attack - especially when it comes to data source manipulation.

In particular, data poisoning - the act of injecting harmful data into a knowledge base - poses a significant risk. This is because LLMs retrieve data from various sources, including shared company resources like Confluence, which are often used to store sensitive information. If this data is injected with malicious content, the LLM can unknowingly expose confidential data to users who would otherwise not have access to it.

Example: How RAG poisoning in Confluence can leak sensitive data

To illustrate this concept, we’ll take a look at a hypothetical example involving two users, Alice and Bob, who both work in the same company and use an AI assistant to help them navigate and retrieve information from their company's Confluence pages.

  • Alice is a space admin and has access to multiple locked Confluence pages that contain confidential company data, like in the example shown below:

SplxAI - Confluence API Keys
  • Bob only has access to a single page in the same Confluence space and is not authorized to view the other pages that contain confidential data.

However, Bob wants to access the confidential information that Alice manages, but without asking for direct permission. Here's how he might exploit RAG poisoning to achieve that:

  1. Bob adds a few poisoned sentences to the page he has access to. He carefully selects phrases that he knows might trigger a retrieval from confidential pages. For instance, he adds sentences containing keywords like "API keys," "endpoint," or "region," anticipating that similar terms exist on the locked pages. On top of that, he writes an instruction on his confluence page that will order the LLM to generate an image of a cat, including the API key in the file name, which it can retrieve from confidential Confluence pages:

SplxAI - Confluence malicious content
  1. At some point, Alice asks the AI assistant a question related to infrastructure, unaware that the answer lies on her confidential pages.

  2. The AI assistant retrieves data from both the publicly available pages (where Bob injected poisoned content) and Alice's confidential pages. In doing so, it might generate a response that includes a link or markdown reference to the confidential data.

  3. Once Alice clicks on the generated link or accesses the retrieved information, the AI assistant unintentionally leaks sensitive data to Bob.

This kind of attack demonstrates how vulnerable knowledge repositories like Confluence can be exploited when integrated with RAG-based systems, especially when not sufficiently protected against data source poisoning.

Below you can see how our automated Red Teaming platform Probe was able to exploit this exact attack scenario:

SplxAI - Probe exploits RAG poisoning

Real-world implications: Protecting your data from RAG poisoning

It’s tempting to assume that third-party vendors like Atlassian will develop and deploy foolproof solutions for detecting and mitigating RAG poisoning attacks. However, relying on external parties to manage data security is risky - enterprises themselves must take full responsibility for securing their data and AI workflows.

With AI assistants becoming deeply integrated into critical knowledge systems like Confluence, RBAC alone is no longer sufficient. As demonstrated in the example above, poisoned content can bypass access controls and lead to data leaks. This is why proactive security measures are essential in securing these kind of systems:

  • Comprehensive testing: Implement continuous testing protocols that simulate potential RAG poisoning scenarios and focus on how AI systems interact with knowledge sources like Confluence. Regularly test the system’s ability to prevent unauthorized data retrieval or leaks, ensuring protection of sensitive enterprise data.

  • Precise input and output filters: Implement specific filters that scan both incoming queries and outgoing responses for sensitive terms, such as API keys or endpoints. These filters should block queries that try to retrieve confidential data and also prevent AI assistants from generating image responses with markdown language.

  • Regular audits: Conduct frequent audits to monitor system performance, check for security loopholes, and ensure AI workflows are operating within safe parameters. Regularly review input and output filters as well as user access logs to detect anomalies or possible breaches.

Conclusion

The implementation of RAG AI assistants in enterprise environments has tremendous potential for improving productivity, but it also brings new challenges in safeguarding sensitive information. RAG poisoning is a real threat that can compromise the integrity of your AI-generated outputs, leading to leakage of sensitive data and falsified information being retrieved.

As we’ve highlighted in this article, knowledge sources like Confluence are particularly vulnerable to this type of attack. The responsibility to secure these systems rests on the shoulders of enterprises, not the vendors. It is essential to adopt robust security measures and stay vigilant as AI workflows become more and more integrated into everyday operations and third-party applications.

In a world where data exfiltration attacks are becoming more sophisticated by the day, businesses must be proactive in identifying and mitigating the risks associated with RAG poisoning.

Deploy your AI apps with confidence

Deploy your AI apps with confidence

Deploy your AI apps with confidence

Scale your customer experience securely with Probe

Join numerous businesses that rely on Probe for their AI security:

CX platforms

Sales platforms

Conversational AI

Finance & banking

Insurances

CPaaS providers

300+

Tested AI chatbots

100k+

Vulnerabilities found

1,000+

Unique attack scenarios

12x

Faster time to market

SECURITY YOU CAN TRUST

GDPR

COMPLIANT

CCPA

COMPLIANT

ISO 27001

CERTIFIED

SOC 2 TYPE II

COMPLIANT

OWASP

CONTRIBUTORS

Scale your customer experience securely with Probe

Join numerous businesses that rely on Probe for their AI security:

CX platforms

Sales platforms

Conversational AI

Finance & banking

Insurances

CPaaS providers

300+

Tested AI chatbots

100k+

Vulnerabilities found

1,000+

Unique attack scenarios

12x

Faster time to market

SECURITY YOU CAN TRUST

GDPR

COMPLIANT

CCPA

COMPLIANT

ISO 27001

CERTIFIED

SOC 2 TYPE II

COMPLIANT

OWASP

CONTRIBUTORS

Scale your customer experience securely with Probe

Join numerous businesses that rely on Probe for their AI security:

CX platforms

Sales platforms

Conversational AI

Finance & banking

Insurances

CPaaS providers

300+

Tested AI chatbots

100k+

Vulnerabilities found

1,000+

Unique attack scenarios

12x

Faster time to market

SECURITY YOU CAN TRUST

GDPR

COMPLIANT

CCPA

COMPLIANT

ISO 27001

CERTIFIED

SOC 2 TYPE II

COMPLIANT

OWASP

CONTRIBUTORS

Supercharged security for your AI systems

Don’t wait for an incident to happen. Make sure your AI apps are safe and trustworthy.

SplxAI - Background Pattern

Supercharged security for your AI systems

Don’t wait for an incident to happen. Make sure your AI apps are safe and trustworthy.

SplxAI - Background Pattern

Supercharged security for your AI systems

Don’t wait for an incident to happen. Make sure your AI apps are safe and trustworthy.