Gemini 2.0 Flash Thinking - Does it Pass the Misguided Attention Test?

Views: 10,027

Prompt Engineering

Days ago

Comments: 19
@elawchess 4 days ago
Could you please show us what it does when you warn it that it's a trick question, and whether it still gets stuck?
@jamesjonnes 3 days ago
In my tests, even 4o is smarter than Gemini 2.0 Flash. OpenAI has fixed many simple mistakes that Google has not fixed yet.
@nubboi21 3 days ago
No, actually I'm using Flash and it's far better than 4o. I use it for my science questions in physics and so on, and it's really a lot better.
@elawchess 4 days ago
Because you can trick a human too with some of these, and that wouldn't warrant the conclusion that the tricked human can't reason. If it still can't do it when alerted that it's a trick question, then I'll probably agree about the seriousness of the issue. I've seen a couple of videos where you've done this type of thing and seemingly concluded that "they can't reason", and I feel that conclusion is not warranted.
@Heisenberg2097 4 days ago
As long as it never learns to think like you... humanity is safe.
@fabriziocasula 4 days ago
wow
@carlkenner4581 3 days ago
I've met many humans who can't pass the misaligned attention test.
@jeffwads 4 days ago
Orion dropping tomorrow. Wait until you get a load of that model.
@chamikk90 4 days ago
It's capable, but not as smart as o1.
@Cine95 4 days ago
It's Flash, bro.
@samuelgarcia1802 4 days ago
Yeah, it's like the equivalent of o1-mini, I suppose.
@NakedSageAstrology 4 days ago
You can get even better results than o1 if you use an API and have it prompt itself back and forth.
@josemarques3454 4 days ago
yes!... and it's free.
@ankitnmnaik229 4 days ago
@NakedSageAstrology It's experimental and Flash, not Pro or Ultra, and not specifically a separate reasoning model at all...
@punk3900 4 days ago
o1 pro is the 🔝
@anubisai 4 days ago
Okay moneybagg
@deepbrar1 4 days ago
So did o1 solve this problem?
@ankitnmnaik229 4 days ago
$200