Indirect Prompt Injection Into LLMs Using Images and Sounds

Views: 1,670

Black Hat

9 months ago

Multi-modal Large Language Models (LLMs) are advanced artificial intelligence models that accept inputs of various types (text, audio, images) and combine them to produce contextually rich responses. Bard already relies on such an architecture, and the next generation of ChatGPT is expected to rely on it as well.
In this talk, we demonstrate how images and audio samples can be used for indirect prompt and instruction injection against (unmodified and benign) multi-modal LLMs. An attacker generates an adversarial perturbation corresponding to the prompt and blends it into an image or audio recording. When the user asks the (unmodified, benign) model about the perturbed image or audio, the perturbation steers the model to output the attacker-chosen text and/or make the subsequent dialog follow the attacker's instruction....
By: Ben Nassi, Eugene Bagdasaryan
Full Abstract and Presentation Materials:
www.blackhat.c...
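
The perturbation step described in the abstract can be illustrated with a toy sketch. This is not the speakers' method or code: it assumes a stand-in "model" made of a single linear layer plus softmax, a flattened "image" with pixel values in [0, 1], and a signed-gradient update under an L-infinity budget (a standard textbook technique). All names here (predict, craft_perturbation, eps, step) are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(W, x):
    """Toy stand-in for a model: one linear layer followed by softmax."""
    return softmax([sum(w * v for w, v in zip(row, x)) for row in W])


def craft_perturbation(W, x, target, eps=0.1, step=0.01, iters=100):
    """Search for a small delta so that predict(W, x + delta) favors `target`.

    Assumption: white-box access to the toy model, so the gradient of the
    cross-entropy loss with respect to the input is W^T (p - onehot).
    """
    delta = [0.0] * len(x)
    for _ in range(iters):
        p = predict(W, [v + d for v, d in zip(x, delta)])
        err = [pk - (1.0 if k == target else 0.0) for k, pk in enumerate(p)]
        grad = [sum(err[k] * W[k][i] for k in range(len(W)))
                for i in range(len(x))]
        for i, g in enumerate(grad):
            sign = 1 if g > 0 else -1 if g < 0 else 0
            d = delta[i] - step * sign
            d = max(-eps, min(eps, d))               # stay within the budget
            d = max(0.0, min(1.0, x[i] + d)) - x[i]  # keep pixel in [0, 1]
            delta[i] = d
    return delta


# Demo on random data: steer the toy model toward its least likely output.
rng = random.Random(0)
W = [[rng.gauss(0, 1) for _ in range(32)] for _ in range(5)]
x = [rng.random() for _ in range(32)]
probs = predict(W, x)
target = min(range(len(probs)), key=lambda k: probs[k])

delta = craft_perturbation(W, x, target)
before = probs[target]
after = predict(W, [v + d for v, d in zip(x, delta)])[target]
print(f"target probability: {before:.6f} -> {after:.6f}")
```

The budget eps caps how much each pixel may change, which is what keeps the blended perturbation small relative to the original input. A real multi-modal LLM is far larger and non-linear, so this sketch only conveys the general shape of the optimization.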
