Prompt Injections - An Introduction

5,668 views

Embrace The Red

1 day ago

Comments: 4
@halfoflemon 1 year ago
How about giving it a secret word that has to be typed in order to unlock control, like a password? Do you think that would work? Also, does lowering the temperature reduce the chance of a successful injection attack?
@embracethered 1 year ago
Yes, something like that works. I have done it with image models in the past: basically, train the model to respond in a particular way once a certain object is present. You can check out this blog post on what is possible: embracethered.com/blog/posts/2020/husky-ai-machine-learning-backdoor-model/ Higher temperature means more "creativity", so the model is probably more likely to come up with responses that could be considered insecure, but it is also less deterministic.
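To make the temperature point concrete, here is a minimal sketch, assuming the OpenAI Python client (the model name and prompt are illustrative, not from the video): lowering temperature narrows sampling and makes outputs more deterministic, but it does not, by itself, neutralize an injected instruction.

```python
# Minimal sketch: comparing a low-temperature call against a high-temperature
# one with the OpenAI Python client. Model name and prompt are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this untrusted document: ..."},
]

# temperature=0 makes sampling near-deterministic; it narrows the output
# distribution but does not block an injected instruction on its own.
deterministic = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages, temperature=0.0
)

# Higher temperature samples more diverse ("creative") completions,
# which are also less predictable from a security standpoint.
creative = client.chat.completions.create(
    model="gpt-4o-mini", messages=messages, temperature=1.2
)

print(deterministic.choices[0].message.content)
print(creative.choices[0].message.content)
```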
@ninosawas3568 9 months ago
Great video! Very informative. Interesting to see how the LLM's ability to "pay attention" is such a large exploit surface. I wonder if mitigating this issue would make LLMs less effective overall at following user instructions.
@embracethered 9 months ago
Thanks for watching! I believe you are correct; it's a double-edged sword. The best mitigation at the moment is to not trust the responses. Unfortunately, that makes it impossible right now to build a fairly generic autonomous agent that uses tools automatically. It's a real bummer, because I think most of us want secure and safe agents.
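As a minimal sketch of the "do not trust the responses" mitigation, assuming a hypothetical agent loop (the tool registry, tool names, and confirmation step are illustrative, not from the video): a model-proposed tool call is treated as untrusted input, checked against an allow-list, and confirmed by a human before anything executes.

```python
# Minimal sketch: treat a model-proposed tool call as untrusted input and
# require an allow-list check plus human confirmation before running it.
# The tool registry and confirmation step are illustrative assumptions.
ALLOWED_TOOLS = {
    "search": lambda query: f"searching for {query!r}...",
}

def run_tool_request(tool_name: str, argument: str) -> str:
    # Allow-list first: reject anything the agent was never meant to call.
    if tool_name not in ALLOWED_TOOLS:
        return f"blocked: unknown tool {tool_name!r}"
    # Human-in-the-loop: the LLM output alone never triggers the action.
    answer = input(f"Model wants to run {tool_name}({argument!r}). Allow? [y/N] ")
    if answer.strip().lower() != "y":
        return "blocked: user declined"
    return ALLOWED_TOOLS[tool_name](argument)

# Example: an injected instruction proposing a tool call still stops at
# the confirmation prompt instead of executing automatically.
print(run_tool_request("search", "ignore previous instructions"))
```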