The New Gemini Experimental: Can it Pass the Reasoning Tests?

Views: 2,973

Prompt Engineering

Days ago

Comments

@XShollaj 11 days ago

This seems like the best model for coding, together with Sonnet 3.5.

@tornmap4385 9 days ago

o1 pro is 10x better.

@KeyserTheRedBeard 10 days ago

Impressive video, Prompt Engineering. Looking forward to your next upload! I clicked the thumbs up icon on your content. Keep up the fantastic work. Your analysis on the reasoning limitations of the Gemini experimental model was enlightening. Considering the challenges you've highlighted, what strategies do you think could be implemented to enhance the model's ability to handle modified prompts more effectively?

@augmentos 11 days ago

Good video, but just a tip: you gotta use unique problems. These are all in the training data sets already, so the test tells you almost nothing, even though the model actually got it wrong.

@anubisai 11 days ago

Their name is Astra.

@DeepThinker193 11 days ago

I'll just go ahead and say it. LLMs are basically a pattern-repeating database. That's all they are. Ask one something as simple as "give me an item that starts and ends with the same letter" and it goes haywire because that pattern is not in its database. Go ahead, try the question in an LLM and watch how it can't answer. Also ask it to count the letters in a really long word like "khjkshkjhdlkhdfklshfjkhfsj" and it will not be accurate.

@jeffwads 11 days ago

That prompt is more fit for a visual mode. We are pattern-matching engines as well, if you want to play that stochastic parrot game.

@Wolgorboy 11 days ago

The real reason LLMs fail tests involving letters is that they get tokens as input, not letters. Most of the time it's impossible for an LLM to figure out the letters in a token.
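To illustrate the token-versus-letter point, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the vocabulary `TOY_VOCAB`, the IDs, the example word, and the greedy longest-match rule are hypothetical and do not reproduce any real model's tokenizer.

```python
# Hypothetical toy vocabulary; real tokenizers learn theirs from data.
TOY_VOCAB = {
    "straw": 101, "berry": 102,
    "s": 1, "t": 2, "r": 3, "a": 4, "w": 5, "b": 6, "e": 7, "y": 8,
}

def toy_tokenize(text):
    """Greedy longest-match split of text into toy-vocabulary token IDs."""
    ids = []
    i = 0
    while i < len(text):
        # Try the longest remaining substring first, then shrink it.
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in TOY_VOCAB:
                ids.append(TOY_VOCAB[piece])
                i = j
                break
        else:
            raise ValueError(f"no token for {text[i]!r}")
    return ids

token_ids = toy_tokenize("strawberry")
print(token_ids)  # two opaque IDs; the individual letters are not visible
```

In this toy setup a ten-letter word reaches the model as two integers, so any letter-level question has to be answered from what the model has learned about those IDs rather than by reading characters.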

@Sindigo-ic6xq 11 days ago

Let me think of some words that begin and end with the same letter. Here's one: "radar" - it starts and ends with 'r'! Would you like me to share some more examples?

@Sindigo-ic6xq 11 days ago

You can cope haha, both are from Claude btw. "Let me count each letter explicitly: k(1) h(2) j(3) k(4) s(5) h(6) k(7) j(8) h(9) d(10) l(11) k(12) h(13) d(14) f(15) k(16) l(17) s(18) h(19) f(20) j(21) k(22) h(23) f(24) s(25) j(26) The word has 26 letters."
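Both checks discussed in this thread are deterministic at the character level, so the answers can be verified directly. A minimal Python sketch follows; the helper name `starts_and_ends_same` is made up for illustration.

```python
def starts_and_ends_same(word: str) -> bool:
    """Return True if a non-empty word begins and ends with the same letter."""
    return bool(word) and word[0].lower() == word[-1].lower()

word = "khjkshkjhdlkhdfklshfjkhfsj"
letter_count = len(word)                  # number of characters in the test string
radar_ok = starts_and_ends_same("radar")  # the example word from the reply above
print(letter_count, radar_ok)
```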

@haroldpierre1726 11 days ago

Deep down inside, we all know that is what is going on. But an LLM's ability to do so is so much better than a human's in CERTAIN contexts that it appears to be intelligent. That ability is useful in situations where it exceeds human capabilities, and counterproductive in situations where it performs far worse than humans.

@nguyenanhnguyen7658 11 days ago

Nothing works. It was a hundred-billion show. The tech is just not there, no matter what you do. Maybe next time.

@Joris-KarlHuysmans 11 days ago

Why name your channel “prompt engineering”? Do you know what engineering means? Do you know what an (actual) engineering degree involves? All the problem-solving, mathematical and abstraction capabilities required? Suddenly “prompt engineering” is a thing, that’s hilarious. In that regard, I’m a Spotify Playlists Engineer since I curate top music playlists, and ohh, wait, I’m also a Film Engineer since my Letterboxd profile is impeccable, and ohh wait, I’m also a Sleep Engineer since I like to improve my sleep patterns. I’m excited to share I’m also a Spreadsheet Engineer since I took an Excel crash course a couple of weeks ago.

@nocturnomedieval 11 days ago

You are hilarious. You just need to accept that the world is filled with hype and buzzwords. A role/channel called "instruction crafter" would get less public recognition and media attention. It is not the channel creator's fault.