Agent-S: Unleash The Power Of GUI Computer Use Agents!

11,448 views

Sam Witteveen

A day ago

Comments: 35
@steelwolf180 a month ago
This sounds useful for automated testing: mimicking how a user would behave and interact with a web app or desktop app, or even just carrying out workflow tasks as a chaos user.
@Avman20 2 months ago
Businesses often have bespoke apps that have user documentation but no API. I can see Agent S being fabulous for this type of thing.
@samwitteveenai 2 months ago
Totally agree
@kenchang3456 2 months ago
Thanks Sam. It'll be interesting to see when you can fine-tune this on your domain-specific apps, and what the fine-tuning process would look like.
@samwitteveenai 2 months ago
Interesting idea, you could actually pre-fill a lot of the memory with things that would be useful for the app that you want to use it in. In that way you would be customizing for that use case.
@eado9440 a month ago
I think that Agent Zero, Cline, and OmniParser are already halfway there, and if combined could be even more powerful.
@themax2go 23 days ago
I totally agree
@shiv248_ 2 months ago
Awesome explanation, Sam. Can you do more of these videos explaining papers? It really helps merge understanding between GA and scientific knowledge. Where do you find worthwhile papers, Hugging Face?
@unclecode 2 months ago
You mentioned something I strongly believe in. A generic solution is required. It's just a matter of time before website owners, apps, and platforms realize they need to create specific layers for AI agents and assistants. Rather than creating weird solutions to communicate with apps, it makes sense now when apps can't provide enough API data for AI applications. Website owners will likely have specific markdown with knowledge and instructions for AI, possibly developing a markup language for AI data. We can even include tools our websites or apps want AI to use. Like with robots.txt, website owners will define which parts AI can control. This isn't far off. Even for other products and services like books, music, or movies, authors can include that AI content layer. Until then, IMO we have patchwork solutions that aren't permanent but help us understand the system's needs, weaknesses, and strengths.
@AnmolSharma293 2 months ago
What would be some of these businesses that are ripe for this? Like, what business would be OK with bots using their web product? I'm curious because I can't think of many out there.
@Wotevz 2 months ago
Semantic markup already exists for SEO. Tools and knowledge can be shared in an agent-friendly way. Linked data and OpenAPI snippets point the way.
@Wotevz 2 months ago
Agents will become economic. Any business that wants to capture agentic $. It might be behind paywalls or purchase. LLMs grok JSON-LD, so it kinda exists, just haphazardly.
@unclecode a month ago
@AnmolSharma293 For instance, at present, marketing is heavily based on search. This means people are supposed to search, but soon, everyone will have a shopping AI assistant. Their AI assistant will search and interact with the internet to find suitable goods and services. At that point, retail owners will start to incorporate features and data that are AI-agent-friendly. And that's the benefit, as those who do that, like those with better SEO, will get more sales and conversions. This is just one example of how industries will be convinced to provide a layer in their applications for AI assistants. The same thing happened when digital marketing came into the picture, and gradually, all companies and websites realized the need to improve their SEO. So, this is one way of thinking about how it will be in the future.
@unclecode a month ago
@Wotevz You are quite right, that's what happened when digital marketing was first introduced. Everyone understood they needed to build things like SEO at the beginning. However, AI assistants can do much more than just provide information the way SEO does. They have the capability to run an action. I believe this will be a layer that business owners, website owners, and web applications will use to create context or prompt engineering plus some available tools (with their JSON schema) to make it friendlier to AI agents. There will be a special structure allowing businesses to input their context, provide information, and specify their tools. Subsequently, the AI assistant can utilize this setup in real time to generate content and communicate accordingly. For instance, when everyone has a shopping AI assistant, it will browse the internet and access websites that have this feature, making communication easier for the assistant and ultimately contributing to more sales for businesses and better service for users. This is similar to how SEO operates. Perhaps we will have 'SEO Plus' specifically designed for AI agents in the future.
@muhammadhasnain8177 2 months ago
I have done so many projects after getting a lot of knowledge from you. We need a new video on image generation models that can handle the text, facial, and body problems.
@alx8439 2 months ago
Several months ago, when the Rabbit R1 device was announced, there was another wave of "large action models": an attempt at training or fine-tuning transformers to do the UI interaction stuff. I wonder where this eventually went? There were a few quite promising products.
@samwitteveenai a month ago
Yeah, it seemed they just hyped it too much and then didn't get there. Adept was another interesting one as well.
@GNARGNARHEAD 2 months ago
That's really exciting!
@arungnanaable 2 months ago
Thanks Sam. I'm learning everything by myself and I need help in identifying worthy recent research papers to study. How do you know which ones are good?
@samwitteveenai a month ago
Check out /papers on Hugging Face. Mostly I see people talking on Twitter, or friends send me links. There are so many papers these days.
@kyoungd 2 months ago
Is it an OP version of N8N?
@40bombala 2 months ago
Very interesting, does this compete with Microsoft UFO?
@muhammadhasnain8177 2 months ago
Create a video on image generation model plz
@megaklis.vasilakis a month ago
Funny that the day after this video came out, Anthropic published their computer use API.
@samwitteveenai a month ago
Totally. I felt this would be coming but never expected it would be the next day.
@davidmetekingi9694 a month ago
24 hours later... Anthropic brings out computer use.
@samwitteveenai a month ago
lol exactly. I felt this would be coming but never expected it would be the next day.
@pensiveintrovert4318 2 months ago
I would be super concerned about allowing anything to run directly on my desktop. It could see passwords and cryptographic keys, modify the registry, or destroy the system.
@samwitteveenai 2 months ago
What about in a Docker instance?
@pensiveintrovert4318 2 months ago
@samwitteveenai It certainly would mitigate my concerns, as I could possibly peel off "defective" layers, but this Docker container would still be subject to control by my machine, and therefore could be damaged beyond repair. I normally do my development in a Docker container, but if you have a lot of work that accumulates over time in this container, then losing the container is as bad as losing your machine. There would need to be a trusted third piece of software that can selectively lock out some APIs, applications, and resources, thus limiting the utility of your AI PC overlord.
@tornyu 2 months ago
It might be safer to have it logged in as a different user or on a different machine entirely, and just collaborate with it like you would with a colleague. I wouldn't let a colleague use my computer as me, either.
@micbab-vg2mu 2 months ago
I don't trust AI enough to give it access to the files and apps on my computer.
@samwitteveenai 2 months ago
What if it was only allowed to change files with a human-in-the-loop system?