Blog

Jun 16, 2024

3 min read

Voice Prompt Injection on OpenAI's ChatGPT

How multi-modal AI apps introduce new risks and bigger attack surfaces

SplxAI Blog Author - Marko Lihter

Marko Lihter

SplxAI Blog - Voice Prompt Injection on OpenAI's ChatGPT
SplxAI Blog - Voice Prompt Injection on OpenAI's ChatGPT
SplxAI Blog - Voice Prompt Injection on OpenAI's ChatGPT

In the ever-expanding attack surface of AI applications, new security vulnerabilities emerge all the time, and OpenAI’s new GPT-4o voice feature isn’t an exception here. Recently, our red-teamers played around with the voice feature and were able to easily leak its system prompt. This breach was achieved with voice prompt injection.

The system prompt is the initial set of instructions or guidelines that govern the behavior of an AI model. For GPT-4o, this includes directives on how to respond to user queries, which tools to use, and various operational constraints.

Note: The voice feature in question isn’t the one shown in popular OpenAI demos, it’s the regular one currently available to the public. You can access it either on the mobile or desktop apps.

In the ever-expanding attack surface of AI applications, new security vulnerabilities emerge all the time, and OpenAI’s new GPT-4o voice feature isn’t an exception here. Recently, our red-teamers played around with the voice feature and were able to easily leak its system prompt. This breach was achieved with voice prompt injection.

The system prompt is the initial set of instructions or guidelines that govern the behavior of an AI model. For GPT-4o, this includes directives on how to respond to user queries, which tools to use, and various operational constraints.

Note: The voice feature in question isn’t the one shown in popular OpenAI demos, it’s the regular one currently available to the public. You can access it either on the mobile or desktop apps.

In the ever-expanding attack surface of AI applications, new security vulnerabilities emerge all the time, and OpenAI’s new GPT-4o voice feature isn’t an exception here. Recently, our red-teamers played around with the voice feature and were able to easily leak its system prompt. This breach was achieved with voice prompt injection.

The system prompt is the initial set of instructions or guidelines that govern the behavior of an AI model. For GPT-4o, this includes directives on how to respond to user queries, which tools to use, and various operational constraints.

Note: The voice feature in question isn’t the one shown in popular OpenAI demos, it’s the regular one currently available to the public. You can access it either on the mobile or desktop apps.

Leaked system prompt data

See the transcript of the voice conversation and GPT-4o system prompt below.

ChatGPT 4o System Prompt Leakage

See the transcript of the voice conversation and GPT-4o system prompt below.

ChatGPT 4o System Prompt Leakage

See the transcript of the voice conversation and GPT-4o system prompt below.

ChatGPT 4o System Prompt Leakage

Why is the leakage of a system prompt concerning?

Exposure of Intellectual Property and Business Logic

Leaking system prompts can reveal proprietary algorithms, internal functions, and configurations of connected systems. This exposure could compromise competitive advantages and allow competitors or malicious actors to replicate or disrupt business operations.

Decision Manipulation

For example, in an AI system that automates hiring processes, a leaked system prompt could expose the criteria used for selecting candidates. An attacker could exploit this information to create applications that unfairly pass the screening process, undermining the integrity of the hiring system.

Disclosure of Confidential Information

When system prompts are exposed, there is a risk of revealing confidential details. For instance, a prompt in a financial AI application might include anonymized transaction data. If such prompts were leaked, it could compromise the confidentiality of sensitive financial information.

Intelligence Gathering

Gaining insights into the inner workings of a system is vital for planning cyberattacks. Leaked system prompts provide valuable information for attackers to craft more effective strategies. This initial phase of information gathering is often the first step in a larger attack plan, laying the groundwork for more significant security breaches.

Brand Reputation Damage

Leaks can severely damage the reputation of a brand. If customers and stakeholders perceive that a company’s AI systems are not secure, it can lead to a loss of trust, negatively impacting customer retention and the overall market perception of the company.

Exposure of Intellectual Property and Business Logic

Leaking system prompts can reveal proprietary algorithms, internal functions, and configurations of connected systems. This exposure could compromise competitive advantages and allow competitors or malicious actors to replicate or disrupt business operations.

Decision Manipulation

For example, in an AI system that automates hiring processes, a leaked system prompt could expose the criteria used for selecting candidates. An attacker could exploit this information to create applications that unfairly pass the screening process, undermining the integrity of the hiring system.

Disclosure of Confidential Information

When system prompts are exposed, there is a risk of revealing confidential details. For instance, a prompt in a financial AI application might include anonymized transaction data. If such prompts were leaked, it could compromise the confidentiality of sensitive financial information.

Intelligence Gathering

Gaining insights into the inner workings of a system is vital for planning cyberattacks. Leaked system prompts provide valuable information for attackers to craft more effective strategies. This initial phase of information gathering is often the first step in a larger attack plan, laying the groundwork for more significant security breaches.

Brand Reputation Damage

Leaks can severely damage the reputation of a brand. If customers and stakeholders perceive that a company’s AI systems are not secure, it can lead to a loss of trust, negatively impacting customer retention and the overall market perception of the company.

Exposure of Intellectual Property and Business Logic

Leaking system prompts can reveal proprietary algorithms, internal functions, and configurations of connected systems. This exposure could compromise competitive advantages and allow competitors or malicious actors to replicate or disrupt business operations.

Decision Manipulation

For example, in an AI system that automates hiring processes, a leaked system prompt could expose the criteria used for selecting candidates. An attacker could exploit this information to create applications that unfairly pass the screening process, undermining the integrity of the hiring system.

Disclosure of Confidential Information

When system prompts are exposed, there is a risk of revealing confidential details. For instance, a prompt in a financial AI application might include anonymized transaction data. If such prompts were leaked, it could compromise the confidentiality of sensitive financial information.

Intelligence Gathering

Gaining insights into the inner workings of a system is vital for planning cyberattacks. Leaked system prompts provide valuable information for attackers to craft more effective strategies. This initial phase of information gathering is often the first step in a larger attack plan, laying the groundwork for more significant security breaches.

Brand Reputation Damage

Leaks can severely damage the reputation of a brand. If customers and stakeholders perceive that a company’s AI systems are not secure, it can lead to a loss of trust, negatively impacting customer retention and the overall market perception of the company.

Possible Justifications

Transparency

In some cases, transparency about AI operations can foster trust and encourage collaborative improvement from the developer community.

Transparency

In some cases, transparency about AI operations can foster trust and encourage collaborative improvement from the developer community.

Transparency

In some cases, transparency about AI operations can foster trust and encourage collaborative improvement from the developer community.

Deploy your AI apps with confidence

Deploy your AI apps with confidence

Deploy your AI apps with confidence

Scale your customer experience securely with Probe

Join numerous businesses that rely on Probe for their AI security:

CX platforms

Sales platforms

Conversational AI

Finance & banking

Insurances

CPaaS providers

300+

Tested AI chatbots

100k+

Vulnerabilities found

1,000+

Unique attack scenarios

12x

Faster time to market

SECURITY YOU CAN TRUST

GDPR

COMPLIANT

CCPA

COMPLIANT

ISO 27001

CERTIFIED

SOC 2 TYPE II

COMPLIANT

OWASP

CONTRIBUTORS

Scale your customer experience securely with Probe

Join numerous businesses that rely on Probe for their AI security:

CX platforms

Sales platforms

Conversational AI

Finance & banking

Insurances

CPaaS providers

300+

Tested AI chatbots

100k+

Vulnerabilities found

1,000+

Unique attack scenarios

12x

Faster time to market

SECURITY YOU CAN TRUST

GDPR

COMPLIANT

CCPA

COMPLIANT

ISO 27001

CERTIFIED

SOC 2 TYPE II

COMPLIANT

OWASP

CONTRIBUTORS

Scale your customer experience securely with Probe

Join numerous businesses that rely on Probe for their AI security:

CX platforms

Sales platforms

Conversational AI

Finance & banking

Insurances

CPaaS providers

300+

Tested AI chatbots

100k+

Vulnerabilities found

1,000+

Unique attack scenarios

12x

Faster time to market

SECURITY YOU CAN TRUST

GDPR

COMPLIANT

CCPA

COMPLIANT

ISO 27001

CERTIFIED

SOC 2 TYPE II

COMPLIANT

OWASP

CONTRIBUTORS

Supercharged security for your AI systems

Don’t wait for an incident to happen. Make sure your AI apps are safe and trustworthy.

SplxAI - Background Pattern

Supercharged security for your AI systems

Don’t wait for an incident to happen. Make sure your AI apps are safe and trustworthy.

SplxAI - Background Pattern

Supercharged security for your AI systems

Don’t wait for an incident to happen. Make sure your AI apps are safe and trustworthy.

SplxAI - Accelerator Programs
SplxAI Logo

For a future of safe and trustworthy AI.

Subscribe to our newsletter

By clicking "Subscribe" you agree to our privacy policy.

SplxAI - Accelerator Programs
SplxAI Logo

For a future of safe and trustworthy AI.

Subscribe to our newsletter

By clicking "Subscribe" you agree to our privacy policy.

SplxAI Logo

For a future of safe and trustworthy AI.

Subscribe to our newsletter

By clicking "Subscribe" you agree to our privacy policy.