Description:

  • When an adversary crafts malicious user prompts as generative AI inputs that cause the AI system to act in unintended ways. These “prompt injections” are often designed to cause the model to bypass its original instructions and follow the adversary’s instructions instead.

Impact:

  • The impact of a successful prompt injection attack can vary greatly, depending on the context. Some prompt injection attacks attempt to cause the system to disclose confidential and/or sensitive information. For example, prompt extraction attacks aim to divulge the system prompt or other information in an LLMs context that would nominally be hidden from a user.

Applies to which types of AI models? Generative AI specifically

Which AI security requirements function against this threat? [?]
Discussed in which authoritative sources? [?]
Discussed in which commercial sources? [?]
Additional information