GPT-4o, AI overviews and our multimodal future

Views: 17,528

IBM Technology

1 day ago

Subscribe for AI updates → ibm.biz/BdmGsx
Learn more about artificial intelligence → ibm.biz/BdmGsR
In Episode 3 of Mixture of Experts, host Tim Hwang is joined by Shobhit Varshney, Chris Hay, and Bryan Casey for the OpenAI vs. Google showdown. Shobhit analyzes the showcase demos released by OpenAI and Google. Chris breaks down latency and cost in relation to GPT-4 and Gemini 1.5 Flash. Finally, after years of people proclaiming the death of search, Bryan answers the big question: are LLMs forcing the death of Google search?
The Rise of Multimodality: 3:13
Collapsing Latency & Cost: 16:54
LLMs Eat Google Search: 30:12
The opinions expressed in this podcast are solely those of the participants and do not necessarily reflect the views of IBM or any other organization or entity.

Comments: 20
@ThoughtfulAl 1 month ago
I took a photo of this podcast in my KZbin browser, and asked GPT-4o to analyze it. It took only 5 seconds and here is the analysis: The image shows a virtual panel discussion hosted by IBM, as indicated by the "think | IBM" branding at the top left corner. There are four participants visible, each in their own video feed, suggesting a remote conferencing setup. The title of the discussion, visible at the bottom, is "AI models and our multimodal future." This indicates the conversation likely centers around advancements in artificial intelligence, particularly focusing on multimodal AI, which involves systems that can process and understand multiple types of data simultaneously (e.g., text, images, audio). Other details such as the video being liked 22 times, options to share, download, and access more content, and the KZbin browser tabs at the top, suggest this is a live-streamed or recorded event available for viewing online.
@Charles-Darwin 1 month ago
I submitted a link to an 800pg paper and it only took a second or two to summarize and discuss it.
@gaylenwoof 28 days ago
Given the latency differences between quick/shallow answer vs. longer pause/logically deeper answer, I hope systems will have a built-in default option to pick the depth of answer you want. I'm imagining a few keywords to trigger a deeper-thinking mode with the implicit assumption that you are willing to tolerate a longer pause (and/or a possible charge to your account?) - just as you would expect with a human. I assume that if I say something like "take some time to think about this and give me the best response you can" then the AI will, indeed, take some time to give the most deeply intelligent response. Am I thinking about this plausibly? Or am I being naive?
@RomaineGangaram 23 days ago
Beautiful show first time here. Interesting and informative guests. You have a very comprehensive discussion panel going
@GNARGNARHEAD 1 month ago
some great perspective, thanks
@Superteastain 1 month ago
Let's be super clear about Google advertising. The way to think about paid online advertising is a comparison to what organized crime used to call protection. So in my best Tony Soprano voice I'll now sell you some online advertising. "Hey, I hear you want to sell some service X", "You know the guy down the street sells service X, and he's doing alright", "You know he donates to the.... well... let's say he pays to play", "Are you going to pay to play?" "You know you've got a nice place here, I'd hate to see it burn down..." That's essentially it. You'll hear talk about the success of Google because of amazing search results. This is incorrect. When Google launched, most of us were on 56 kbps dial-up connections. The search competition had pages full of crud that took a lifetime to load. Google didn't have any crud, so it loaded quickly. That was it. Now Google just serves up a load of trash while it makes hundreds of millions of dollars serving up that crud.
@MartinBroadhurst 1 month ago
8:46 My take on this is that the extended context window is how it takes root in the enterprise. These huge context windows, 1-2M tokens, are enough to provide the context required. That bumps up the inference cost, sure, but it's how you give the context without relying on RAG.
@Charles-Darwin 1 month ago
I think 'o' actually stands for orbitofrontal cortex. It could be a smaller/narrow receptive input network that doesn't really retain or memorize beyond simple and critical pathways, and a much larger network that assesses the weighted inputs, for a bottom-up, top-down approach. Because of this, I think 4o is a double-ended model whose two parts work together to distill and assess input. This region of the brain is multimodal, but just as in our organic builds, vision is the primary input, and the other modalities also largely map to visual representations (hear a garbage truck outside, visualize what that truck looks like in your head). This region is also extremely low latency by necessity, as visual input needs near-automatic responses (driving a car, walking). All things considered, I think this is the analogue of our orbitofrontal cortex, and perhaps the applicability extends much farther and wider than theorized prior to implementing the solution. Shy of having the equivalent biological need to survive, I think this is AGI and we've only seen the baby brother. I don't think we'll get the whole enchilada this year or the next, but rather what they've been saying: an agentic version of Jr to do the bidding of paid subscribers. This kind of breakthrough aligns with the primary scientists (and the alignment-conscientious) taking their leave as the management has turned on the primary objective, allocating infrastructure resources to press forward with the model's expansion over creating safety for it.
@twitter47 29 days ago
Artificial intelligence is a blessing for mankind, but it has disadvantages too.
@lighteningrod36 1 month ago
Parts geolocation has been around for ten years, with maintenance scheduling, etc.
@DrJanpha 1 month ago
Modes of media, fragmented... here and there. What is more useful is smartphones' functions, like doing banking and ordering products... this is what is called "multifaceted integration".
@dianedean4170 1 month ago
🎉❤😊When law enforcement and security footage are translated from image to text...the jury and judge will be more confident in their determination of reasonable doubt. The accuracy of information will be more reliable and hopefully justice will be served.🎉❤😅
@markmuller7962 1 month ago
Unfortunately the Google demo was pre-generated, and they provided a tiny, pathetic little disclaimer in the bottom right corner of the screen. Besides, if it goes like the last presentation, everything AI that Google announces either doesn't work at all, never arrives, or arrives 6 months later in a scrapped form... So this podcast putting the OpenAI and Google presentations on the same plane is borderline bizarre and probably unprofessional too.
@flashmo7 22 days ago
The environmental metric should be first... unless you'd like to sunbathe at 80 degrees Celsius on an average summer day some day...
@bamh1re318 1 month ago
Google search, for me, has been garbage. Northern Light used to be way better for search, but not for ad sales.
@AlgoNudger 1 month ago
from Generative to Objective-Driven. 🙃
@TOMESHTI 1 month ago
M U L T I M O D A L I T Y 4 O
@Hordebarraged 1 month ago
Imagine that incredibly annoying, ingratiating female 4.o voice dealing with a police officer if you were pulled over- or any authority / face-to-face response scenario....would the authority figure punch you first or simply tear your iphone in two?
@Bronco541 1 month ago
I didn't find it the least bit annoying.