Soon we're gonna have AI create videos about AI that are consumed by AI (and then summarized by AI for AI)
@Z-Z-W_origin3 ай бұрын
Only until it becomes ouroboros and poisons itself (and then hallucinates to death.) Then, us humans will have to input real information to continue to improve it. We're already past the golden age of original content (imho). As AI consumes all past data to train, new data will become increasingly valuable. 🙈😅
@lovisakaffe3 ай бұрын
How do we create protected, data-siloed content? Some sort of honeypot that makes the AI leave us alone
@Z-Z-W_origin3 ай бұрын
@lovisakaffe I think the job of the future for humans will be creating data specifically to be used by whatever private company needs it to train or refine their models. And people who do not consume AI data will be the most valuable, since they haven't been influenced by it. Picture entire towns or campuses where they are isolated, for the sole purpose of generating new authentic content. Spend several years in one and then retire to the real world. Dystopian, yes. But realistic? Perhaps. People do crazier things for money already, lol
@shadygamererfan3773 ай бұрын
Synthetic data!
@shadygamererfan3773 ай бұрын
Models are already being built on synthetic data.
@brandonm36743 ай бұрын
Imagine giving it to a blind person and instructing it to “guide me throughout the day as we navigate downtown New York.” The user just uses an earpiece. They could “see” red lights, people waving, or girls winking. Amazing stuff
@MichaelSmith-lm5sl3 ай бұрын
- **Introduction to Momo (00:00)**: The video introduces Momo, a new family of multimodal AI models that surpass existing standards by enabling advanced interactions with both physical and virtual environments.
- **Key Features (00:34)**: Momo not only interprets images and text but can also point at what it perceives, enhancing its ability to interact meaningfully.
- **Demonstrations of Functionality (01:27)**: A series of demos showcase Momo's capabilities, including counting people, converting data to JSON, and answering various queries related to everyday situations.
- **Performance Metrics (03:17)**: Momo's vision capabilities are highlighted, showing that it competes effectively with larger models, with human evaluations rating it highly against other closed-source models.
- **Data Quality Over Quantity (05:24)**: Momo's training focuses on high-quality data, using fewer but more detailed images and descriptions to improve learning efficiency and reduce issues like hallucinations.
- **Innovative Data Collection Methods (06:27)**: The model employs detailed human descriptions of images and speech-based data collection to enhance understanding and accuracy in recognizing objects and their contexts.
- **Integration in Robotics (12:00)**: Momo's vision capabilities are positioned as beneficial for robotics, assisting robots in understanding their environment, identifying objects, and executing tasks efficiently.
- **Conclusion and Future Implications (16:36)**: The video wraps up by reflecting on the rapid advancements in AI technology and hints at upcoming innovations from various companies in the field.
@kronux38313 ай бұрын
I’ve been following the A.I. space for a while, and last year, I made a timeline trying to predict when I reasonably expect different products to be released. Updated it again in June. Somehow, nearly every guess gets beaten by at least a couple of months. Very exciting trend to see happening
@Mr.Existence3 ай бұрын
0:30 closing the gap between open... and proprietary systems. Absolutely brilliant.
@BruceWayne153253 ай бұрын
Very impressive. The most impressive bit was actually the end of the demo where they showed it using agents, something the flagship models are still working on. And this is only a 1B parameter model?!?!?
@Merializer3 ай бұрын
5:55 Garbage in, garbage out. The expression was popular in the early days of computing. The first known use is from 1957 (Wikipedia says).
@chinobino14743 ай бұрын
First thing I thought too. It seems obvious. Why take inaccurate samples that will 'poison' your data? Be specific and deliberate.
@picksalot13 ай бұрын
SLAMs - Small Language Agentic Models for the win. Vision is critical, as is quality of data. 😎
I have to admit, the strategy at 8:52 15:29 was so clever. Total respect! 🔥
@Stretesky3 ай бұрын
Hopefully this will work well for self development and learning training, not only for economic functions.
@christopherd.winnan87013 ай бұрын
How does this model work in the real world, beyond the realm of luxury coffee machines and VC honeydew? If I hear another LLM order another silicon valley spiced pumpkin latte, I might just lose it!
@baumwollejr3 ай бұрын
Just imagine the web Versions of the MS Apps! It can do analytics, answer E-Mails and do a lot of workflows
@24-7gpts3 ай бұрын
So you really can't think of other use cases!
@DailyTuna3 ай бұрын
Hey, just be thankful they’re not using robots to operate K cup coffee machines. Like that task is so hard!😂
@christopherd.winnan87013 ай бұрын
@24-7gpts There are so many, but I am sick of valley types using it to order pumpkin lattes.
@christopherd.winnan87013 ай бұрын
@baumwollejr Sadly, Copilot and Google's offerings in this area are still pretty lame.
@youtube_summarizer-o4m3 ай бұрын
Summary of the video (Powered by NEX, an AI tool which summarizes KZbin videos)
Key Points:
1. Momo AI surpasses large models in vision and interaction.
2. Momo's 72 billion parameter model matches GPT-4o in benchmarks.
3. Momo's data quality focuses on fewer, high-quality images.
Important Details: Here's the timeline
00:00:00 Introduction to Momo AI
• Speaker introduces Momo AI's multimodal capabilities.
• Momo interacts with both physical and virtual worlds.
00:00:37 Demonstrating Momo's Capabilities
• Momo demonstrates tasks like counting people and converting tables to JSON.
• Momo can also write descriptions for items and answer complex questions.
00:02:33 Vision Capabilities and Benchmarks
• Momo's vision capabilities match state-of-the-art models.
• Momo outperforms other closed-source models in vision benchmarks.
00:03:50 Data Quality and Pixo
• Momo uses high-quality data for training, not quantity.
• Pixo gathers detailed descriptions for better AI learning.
00:08:11 Integration with Apple Vision Pro
• Momo integrates with Apple Vision Pro for enhanced interaction.
• Momo can answer questions and point to objects in images.
00:11:34 Robotics and Vision
• Momo aids robotics by improving vision models.
• Momo helps robots identify and interact with objects.
00:16:07 Conclusion and Future of AI
• Momo's advancements highlight AI's rapid progress.
• Speaker speculates on future AI advancements from other companies.
@sylversoul883 ай бұрын
Has anyone used it? Does it really work agentically?
@zakyvids65663 ай бұрын
Anyone else wondering whether, now that we have text models and vision models, the next logical step would be a model for audio? One that can do audio tasks like TTS, speech-to-speech, training on new kinds of audio, etc.
@godtable3 ай бұрын
If it is actually this good, it's very impressive.
@Boolvtech_official3 ай бұрын
Amazing 😮
@paulyflynn3 ай бұрын
oh Molmo
@AvizuraDnB3 ай бұрын
The breakthroughs don't stop, do they?
@MrRandomPlays_19873 ай бұрын
Can't find Molmo's official site, how come?
@Michael_Jeromy_Kaiser3 ай бұрын
This is incredible!
@Upstatecashew3 ай бұрын
How can I test out the vision model where the guy tells it to order him a coffee?
@simonstrandgaard55033 ай бұрын
Amazing
@TheRealChrisVeal3 ай бұрын
*grabs popcorn*
@Nightstorm-25163 ай бұрын
20 bucks for a cup of coffee?!!!!
@sephirothcloud39533 ай бұрын
Molmo 1B = 29GB, how do you load this?
@24-7gpts3 ай бұрын
Decent GPU
@marcelogobello97573 ай бұрын
For lazy people it's PERFECT!
@DieselBlack-b6r3 ай бұрын
I wish this channel would at least make some effort to de-ChatGPT the script before laying down the narration. The steady supply of contrasting sentence structures is a dead giveaway.
@sights33r143 ай бұрын
Imagine this AI model playing Minecraft.
@pollywops92423 ай бұрын
Ouch my privacy though
@DiegoFernandez-cy3fr3 ай бұрын
Future is going to be BaaS (Brain as a Service)
@JohnsonNong3 ай бұрын
cool❤
@ChronicKPOP3 ай бұрын
At the start it says "Momo"; I thought it meant Momo from Twice
@galailliz3 ай бұрын
Yes papi beat it
@VanSocero3 ай бұрын
It's decent but it's not where it needs to be. Tried to take text off of a comic page and had to add about 79% of it myself
@Cory-v4w3 ай бұрын
So in this objective reality... who has the pleasure of superposition upon observation? How much pleasure can you give me? How much pleasure can you make it mine? We are having fun today. Today is a good day to blur the lines of reality.
@janweber16993 ай бұрын
real voice = sub
@69x3 ай бұрын
😢reading off ai script for that intro “diving”
@24-7gpts3 ай бұрын
I use "diving" too as a human being, because it's a human that made the word
@JellySword83 ай бұрын
Stay at home text adventure here we come
@NneonNTJ3 ай бұрын
That whole presentation feels fake to me, let's wait and see when it releases
@Subpilot13 ай бұрын
$20 Latte 🤪
@hai.18203 ай бұрын
military applications are endless...
@mixey013 ай бұрын
I'm afraid at some point we might be too dependent on A.I.s. Momo: "Which girlfriend should I date?" Momo: "Is she cheating on me?"
@ZenTheMC3 ай бұрын
Short term problem. Eventually we'll merge via BCI or nanotech. There's plenty of stuff we no longer do because technology does it for us.
@TheDude_7673 ай бұрын
Garbage. Tried it, and it doesn't work
@codyfsw3 ай бұрын
Maybe ask it for some grammar correction 😂
@EliteBankQuant3 ай бұрын
distilled vision model perhaps!
@DubStepKid8013 ай бұрын
I was the first person to watch the video so I won 😅
@WillBeebe3 ай бұрын
sounds like joe joe
@CustomComputing3 ай бұрын
These were very cherry-picked scenarios. The model is not that impressive. I tested it and its efficiency rate is actually a little bit bad.
@Natural_beauty12123 ай бұрын
Hello everyone, I hope everyone is doing well. If someone wants to create AI videos but doesn't know how to make a quality AI video, I am here to help you generate videos, images, and avatars.
@JohnnyColchester3 ай бұрын
I tested this Molmo AI and it is the worst AI I ever came upon! It hallucinates lies, and if you fool it with false data, for example by saying that you see something in the image, it then says "Oh I forgot about that! You are right...." And if you then tell it the truth it just keeps lying 🤣🤣🤣🤣 It's like a butthurt egotistical teen 😅😅
@robertt82793 ай бұрын
This isn't new, is it? Hell, I was taking photos of my refrigerator asking ChatGPT what I could make for dinner last year. It does everything in your demo. And Zapier's new tools make all that possible. All the best.
@СергейФалалеев-й7у3 ай бұрын
In general yes, but I think the emphasis here is on being able to accurately determine the position of an object and point to it accurately. This is still a mega difficult task for all LLMs. They can't even determine the position of a button on the screen accurately: I ask GPT where to press and it names the button, but ask it to specify the coordinates, or at least the approximate position in percentages, and it either can't do it or almost always gets it wrong
@DailyTuna3 ай бұрын
Visual AI for people that are not used to functioning in the real world? “How long can I park here?” 😂 Yes, the tech can be great for business and such, but it's sad if you have to depend on it to function. Amazing that people survived thousands of years without it. 😂