The New Gemini Experimental: Can it Pass the Reasoning Tests?

2,973 views

Prompt Engineering

Days ago

Comments
@XShollaj 11 days ago
This seems like the best model for coding, together with Sonnet 3.5.
@tornmap4385 9 days ago
o1 pro 10x better
@KeyserTheRedBeard 10 days ago
Impressive video, Prompt Engineering. Looking forward to your next upload! I clicked the thumbs up icon on your content. Keep up the fantastic work. Your analysis on the reasoning limitations of the Gemini experimental model was enlightening. Considering the challenges you've highlighted, what strategies do you think could be implemented to enhance the model's ability to handle modified prompts more effectively?
@augmentos 11 days ago
Good video, but just a tip: you gotta use unique problems. These are all in the trending data sets already. It tells us almost nothing, even though it actually got it wrong.
@anubisai 11 days ago
Their name is Astra.
@DeepThinker193 11 days ago
I'll just go ahead and say it. LLMs are basically a pattern-repeating database. That's all they are. Ask one something as simple as "give me an item that starts and ends with the same letter" and it goes haywire because that pattern is not in its database. Go ahead, use the question in an LLM and watch how it can't answer. Also ask it to count the letters in a really long word like "khjkshkjhdlkhdfklshfjkhfsj" and it will not be accurate.
@jeffwads 11 days ago
That prompt is more fit for a visual mode. We are pattern-matching engines as well, if you want to play that stochastic parrot game.
@Wolgorboy 11 days ago
The real reason LLMs fail tests involving letters is that they get tokens as input, not letters. Most of the time it's impossible for an LLM to figure out the letters in a token.
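The point about tokens can be sketched with a toy tokenizer. This is only an illustration: the vocabulary, the token IDs, and the function name `toy_tokenize` are made up for this example and do not come from any real model's tokenizer. It shows that a word like "radar" can reach the model as a short list of opaque IDs, with no individual letters visible.

```python
# Toy illustration only: this vocabulary and these IDs are invented,
# not taken from any real model's tokenizer.
TOY_VOCAB = {"rad": 103, "ar": 104, "straw": 101, "berry": 102}


def toy_tokenize(text, vocab=TOY_VOCAB):
    """Greedy longest-match split of text into token IDs.

    Characters not covered by the vocabulary fall back to a
    per-character ID (the negated Unicode code point).
    """
    ids = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring starting at i first.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            # No vocabulary entry matched: emit a single-character fallback.
            ids.append(-ord(text[i]))
            i += 1
    return ids


word = "radar"
print(toy_tokenize(word))  # the model-side view: a list of integer IDs
print(list(word))          # the string-side view: individual letters
```

In this toy setup, "radar" becomes two IDs, so the fact that it starts and ends with "r" is not directly present in the input the model receives.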
@Sindigo-ic6xq 11 days ago
Let me think of some words that begin and end with the same letter. Here's one: "radar" - it starts and ends with 'r'! Would you like me to share some more examples?
@Sindigo-ic6xq 11 days ago
you can cope haha, both are from Claude btw
Let me count each letter explicitly: k(1) h(2) j(3) k(4) s(5) h(6) k(7) j(8) h(9) d(10) l(11) k(12) h(13) d(14) f(15) k(16) l(17) s(18) h(19) f(20) j(21) k(22) h(23) f(24) s(25) j(26)
The word has 26 letters.
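Both claims in this exchange are easy to check outside of a model. The following is a minimal sketch in plain Python; the helper names `same_first_and_last` and `numbered_letters` are hypothetical, chosen just for this example.

```python
def same_first_and_last(word):
    """Return True if a non-empty word starts and ends with the same letter."""
    return len(word) > 0 and word[0].lower() == word[-1].lower()


def numbered_letters(word):
    """Pair each character with its 1-based position, like k(1) h(2) ..."""
    return [(ch, i) for i, ch in enumerate(word, start=1)]


word = "khjkshkjhdlkhdfklshfjkhfsj"

print(len(word))                     # 26
print(numbered_letters(word)[-1])    # ('j', 26)
print(same_first_and_last("radar"))  # True
```

The length agrees with the count of 26 quoted in the comment, and "radar" passes the same-letter check.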
@haroldpierre1726 11 days ago
Deep down inside, we all know that is what is going on. But an LLM's ability to do so is so much better than a human's in CERTAIN contexts that it appears to be intelligent. That ability is useful in situations where it exceeds the capabilities of humans, and counterproductive in situations where it performs far worse than humans.
@nguyenanhnguyen7658 11 days ago
Nothing works. It was a hundred-billion show. The tech is just not there, no matter what you do. Maybe next time.
@Joris-KarlHuysmans 11 days ago
Why name your channel "Prompt Engineering"? Do you know what engineering means? Do you know what an (actual) engineering degree involves? All the problem solving, mathematical and abstraction capabilities required? Suddenly "prompt engineering" is a thing, that's hilarious. In that regard, I'm a Spotify Playlists Engineer since I curate top music playlists, and ohh, wait, I'm also a Film Engineer since my Letterboxd profile is impeccable, and ohh wait, I'm also a Sleep Engineer since I like to improve my sleep patterns. I'm excited to share I'm also a Spreadsheet Engineer since I took an Excel crash course a couple of weeks ago.
@nocturnomedieval 11 days ago
You are hilarious. You just need to accept that the world is filled with hype and buzzwords. A role or channel called "instruction crafter" would get less public recognition and media attention. It is not the channel creator's fault.
@OMorales-pe4xq 11 days ago
Anything Google sucks.