Gemini 2.0 Flash in Action: How Multi-Modal AI is Changing Everything

Views: 7,022

Prompt Engineering

days ago

Comments: 17
@unclecode 20 days ago
Turning an image into an open book is amazing. What we have today with language models is the natural progression of what started with RNNs, then transformers. Over time, things improved: better architectures, scaling laws, larger datasets, and now we have these sophisticated language models. It’s a gradual evolution. But for images, it’s a completely different story. This isn’t an extension of conventional image processing techniques like classification, object detection, or segmentation. It’s something entirely new. The process essentially transforms an image into text, enabling us to dig in, ask questions, or extract information natively, using the same model that processes text and audio. Everything becomes a text sequence. What’s fascinating is that it bypasses all the classical image processing methods: no need for specialized data preparation, binarization, or other traditional steps. It’s a totally different solution to the problem, redefining how we process and understand images. This shift is what truly amazes me: it’s not just an improvement, but a fundamental change.
@engineerprompt 20 days ago
I agree! Having worked with CNNs in the early days, making a cat vs. dog classifier without handwritten features felt like magic. This is a whole new level. A single model that can understand different modalities unlocks applications that were not possible before.
@Joethegamer5 19 days ago
I tested it with 6 languages with the Talk to Gemini feature. It can seamlessly switch between languages and although the accent in some languages is not perfect, it works insanely well!
@DearGeorge3 20 days ago
Very useful!
@AbhishekMane-s9q 17 days ago
Can you make a video on how to use a Gemini 2.0 API key for our own text-to-speech and speech-to-text conversation?
@JordanC-f5j 20 days ago
Would be nice to see how it compares to Sonnet 3.5. Gemini seems to score higher on various benchmarks but it'd be nice to see real problem solving in different fields and how closely it follows the instructions.
@engineerprompt 20 days ago
working on it :)
@tollington9414 20 days ago
Very interesting
@_abdul 20 days ago
They Cooked, And this time it's Tasty.
@CollinParan 20 days ago
Multimodal V2LMs are the way
@engineerprompt 20 days ago
agree
@MrAhsan99 20 days ago
Is it just me, or has this guy's voice changed?
@SingularityReacts505 20 days ago
Just tried ChatGPT with vision. It's so much better than this garbage it's not even funny. OpenAI stays winning.
@sillybilly346 20 days ago
Bot deployed by OpenAI?
@SingularityReacts505 20 days ago
@sillybilly346 No, it's just advanced voice mode with vision. It's f****** insane.
@shylvari 20 days ago
Gemini > ChatGPT now
@techtalks8481 20 days ago
Ya, we can hope 😂😂