MOST Important AGENTIC Application - Speech to Text to AI Agents (TTS, STT, LLM Router)

Views: 9,777

Days ago

Comments: 63

@marcossixto5324 8 months ago

2:24 I enjoyed the proof of concept in the beginning, because it helps me determine if it's worth tuning in to the video

@vincentjean6756 8 months ago

The GOAT of AI Coding has given us another gift. Thank you Dan!

@soulessshoe 8 months ago

Awesome. I've found that for voice LLMs, having something like Vapi, where you can interrupt or it waits for your pauses, makes it feel so much better.

@drumlivetonya 8 months ago

I like where you're going with this. This is exactly what I've been looking for!

@loryo80 8 months ago

This is a project that will let me make a living from my AI passion. I already have a lot of ideas and use cases that will make life easier. Thank you in advance for this video series.

@goforit5 8 months ago

Excited for the series about Ada

@AGI-Bingo 8 months ago

I made something similar right after OpenInterpreter launched. Some things to add: TotalRecall, 1Thread, GMoE, AI Telepathy (between agents), Adaptive SkillLibrary, branched flows, self-reflection, benchmarks, user/AI feedback, self-improvement, async anticipation of the likely next message for faster real-time responses, multilingual support, web/API exposure with user permissions, a mic threshold for continuously flowing conversation plus interrupts. Async tasks with updates/notifications/progress reports. Semantic routing instead of keywords. Background online research for URLs not in training. Differentiating between talking "about" Ada and "to" Ada (the right time to respond based on context). And a couple more.. I've got a long list / bingo board haha. Lemme know if you wanna collab ❤ All the best!

@aurora.radial 8 months ago

Holy shit man, the tower is quite high, huh? Thanks for sharing it, really interesting stuff. It makes sense of the name of your channel 😄

@AGI-Bingo 8 months ago

@aurora.radial Thanks man! I'm working on both an intro to the channel, showcasing that bingo board, and a video about the AAA+ framework (Advanced Atomic Agents), which is akin to AgentOS but already atomic and composable, with some of the advanced features I listed. Lemme know if there's one you'd prefer first :) All the best!

@blahblahdrugs 8 months ago

Do these applications use CrewAI or AutoGen or something else? I want to start building mine.

@jak-3D 8 months ago

I am working on personalized agentic assistance as well and would love to collaborate @AGI-Bingo

@blahblahdrugs 8 months ago

@jak-3D If you want to collaborate I'll need your Discord.

@jdallain 8 months ago

Seems like a great use case for LangGraph, where you can get finer control of your agents and their direction.

@MarkoTManninen 8 months ago

Deepgram with Claude tools (function calling) will give you a near-real-time voice-to-command-to-voice flow. But then you'd better let your assistant do some real productive work to pay off the bill.

@kubasmide223 8 months ago

This is the best AI YT channel. Quality content Dan

@free_thinker4958 8 months ago

When Indy speaks we gotta listen carefully 💯👏!

@AtomicPixels 8 months ago

The second I heard you speaking in this video I knew it was legit. You’ve made a name for yourself that reflects stability, authenticity and authority. The first time I saw you was the Supabase tutorial on context injection, and I’ve never learned ML the same way since. Cheers

@TimeLordRaps 8 months ago

I realistically think it would be best to focus on the dynamics of routing. Creating a base routing agent flow probably is a good first step.
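
A base routing flow of the kind suggested here could start as small as the sketch below. It is only an illustration in Python, not the code from the video: `make_router`, the three agent functions, and the keywords are all hypothetical names chosen for this example, and matching is a plain substring lookup on the transcribed prompt.

```python
from typing import Callable, Dict

# An "agent" here is just a function from a prompt to a response string.
AgentFn = Callable[[str], str]


def make_router(agents: Dict[str, AgentFn], fallback: AgentFn) -> AgentFn:
    """Build a router that sends a prompt to the first agent whose keyword it contains."""

    def route(prompt: str) -> str:
        lowered = prompt.lower()
        # Dicts preserve insertion order, so earlier keywords take priority.
        for keyword, agent in agents.items():
            if keyword in lowered:
                return agent(prompt)
        # No keyword matched: hand the prompt to the general-purpose agent.
        return fallback(prompt)

    return route


# Hypothetical stand-ins for real agents (each would normally call an LLM or a tool).
def email_agent(prompt: str) -> str:
    return f"[email agent] {prompt}"


def shell_agent(prompt: str) -> str:
    return f"[shell agent] {prompt}"


def chat_agent(prompt: str) -> str:
    return f"[chat agent] {prompt}"


route = make_router({"email": email_agent, "bash": shell_agent}, fallback=chat_agent)
print(route("Draft an email to Sam"))
print(route("What is the weather like?"))
```

Swapping the substring check for an embedding-based or LLM-based classifier would change only the body of `route`, which is one reason to keep routing behind a single function.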

@TimeLordRaps 8 months ago

However, compared with videos that focus on progressive writing, like your agent OS video where you introduce high-level ideas in progressively greater detail, I feel like this video was too focused on a working prototype. I still appreciate it given your pride in it, but as a new viewer I believe you excel at providing explanatory depth, unlike most other YouTube creators on technical topics. For example, from my perspective this video could have been a dive through the motivation, the multiple flows of multi-agent systems, other options (IDK), assumptions (ones you've taken and ones that can be taken), and the solution (LLM Router), finalized with a call to action for people to build effectively with this solution. Starting the video with a question may be a powerful way to provide the motivation.

@TimeLordRaps 8 months ago

I think dynamic agent generation pre-routing is the next paradigm btw. Great content keep it up.

@TimeLordRaps 8 months ago

I already suggested the feature to langchain/graph.

@AGI-Bingo 8 months ago

I agree, and also bake benchmarking and feedback into it, so us humans and even AIs soon will be able to revise/make new routing flows, and make sure it improves on every iteration

@TimeLordRaps 8 months ago

@AGI-Bingo I think we don't get self-improvement without explicit benchmarks to compare effectiveness against, benchmarks that then get progressively beaten at lower token constraints, in all steps of learning. Feedback is the training loop already. What I mean is that at each point in training new feedback is added, so we're probably 1 or 2 new feedback mechanisms away from enhanced training. One of them is RHO-1: they study token dynamics in training far more than anything else I've seen, using a smaller model to pick the high-impact tokens for training a larger model, which drastically reduces the quantity of tokens needed in pretraining. This, I would argue, is an AI-AI feedback system. Normally, though, pretraining lacks effective feedback beyond the loss. Fine-tuning is human-AI feedback, and partially human-human feedback: in the first, we determine what the models train on, and that changes based on the AI's performance; in the second, we observe what other humans do to decide on new paths to expand the human-AI feedback. I have an interesting theory for a feedback mechanism that simulates thinking steps at the cost of inference compute.

@JohnLewis-old 8 months ago

Very exciting. Thanks for sharing. What LLM are you using in the background?

@oldmangrizzz 8 months ago

This is spot on, exactly what I have been trying to build myself for the last eight months while doing a piss-poor job! Strong work to you and your people, and a great start on your POC. I look forward to snatching some of it up off ya lol

@johnbarros1 8 months ago

I love this project, it's very fascinating and inspiring. Thank you for sharing this with us!

@MichaelWoodrum 8 months ago

For a few months I've been building a multi-agent system that works with FastAPI and WebSockets for streaming. It's modular and capable of running anywhere. How could the system shown here work without using a computer for interaction? Could you adapt it to web access?

@CHNLTV 8 months ago

Looking forward to riding along... I'm going to try to use free STT & TTS libraries (FasterWhisper & OpenVoice). I love the concept and look forward to the innovations here.

@SkyEther 8 months ago

Another masterpiece brotha! You are my go to AGI guy now :) I feel like I can build my AGI with your guidance! Would love to have a community of all of us together making this vision a possibility!

@AGI-Bingo 8 months ago

Boom! Let's make it happen!

@kevinrstruck 8 months ago

This is amazing. I am looking forward to this one. Thank you for sharing.

@reality-drift122 8 months ago

tis a beautiful prototype

@bluebabboon 8 months ago

Awesome stuff. Really cool

@christopheboucher127 8 months ago

So great!! Looking forward to using it! Do you plan to implement self-improvement later, like the learn function of Open Interpreter O1? And do you plan to make some sort of wrappers (like MemGPT does) to use it with local LLMs? Thanks for all your videos, and thanks for sharing your skills and knowledge.

@seanzoso 8 months ago

Great insights. Question: is the "from modules import llm" code available anywhere?

@ryanscott642 7 months ago

What do you think about using something like Octopus LLM for the agentic routing?

@ronaldokun 8 months ago

Your code theme is very cool. I don't know if it would strain the eyes if using it exclusively but I'm curious. Amazing work by the way. Keep up the great work!

@EntertainmentZone-jw6bq 7 months ago

You should make it always on, so that it activates when it recognizes a possible prompt or command, if that makes sense. As far as I am aware, Deepgram is the fastest STT and TTS. Maybe give it access to the content on your main display to have more context. Just some ideas.

@indydevdan 7 months ago

This is where we're heading. Just like you mentioned: always on, more commands, faster TTS and STT, lower costs.
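
One way to approximate an always-on assistant is to gate every transcript on a wake word before anything reaches the router. The sketch below is an illustration under stated assumptions, not the video's implementation: `extract_command` is a hypothetical helper, "Ada" is used as the wake word only because commenters refer to the assistant by that name, and the optional "hey"/"ok"/"okay" prefix is an arbitrary choice.

```python
import re
from typing import Optional


def extract_command(transcript: str, wake_word: str = "ada") -> Optional[str]:
    """Return the text after a leading wake word, or None if the assistant was not addressed."""
    # The wake word must open the utterance (optionally after "hey"/"ok"/"okay"),
    # so a sentence that merely mentions the name later on is ignored.
    pattern = (
        rf"^\s*(?:hey|ok|okay)?[\s,]*{re.escape(wake_word)}\b[\s,.:!?]*(.*)$"
    )
    match = re.match(pattern, transcript, flags=re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    command = match.group(1).strip()
    # A bare wake word with nothing after it carries no command.
    return command or None


print(extract_command("Ada, open my notes"))                     # -> open my notes
print(extract_command("I was telling him about Ada yesterday"))  # -> None
```

Position-based matching like this is only a crude stand-in for telling apart speech "to" the assistant from speech "about" it; a context-aware classifier would be needed for anything more reliable.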

@aurora.radial 8 months ago

Yo man, this is extremely awesome! Thanks a lot for sharing and explaining it! I'll definitely keep an eye on the channel and see where it goes. I'm also thinking about creating my own assistant, so it will help a lot. I'll try to share if I come across anything helpful as well.

@mikew2883 8 months ago

Good stuff! 👍

@toromanow 5 months ago

What package to install for line 13: from modules import llm?

@josephtilly258 7 months ago

really cool

@josephtilly258 7 months ago

Do you have a Discord or something for your community?

@insight-guy 8 months ago

Crazy 🎉 good. Well done, Dan

@brianmi40 8 months ago

YES, somewhere between Open Interpreter, Limitless/Rewind, and Rabbit with your choice of local or online LLM lies OUR FUTURE... The first to LAND THIS could be an overnight UNICORN...

@enton9422 8 months ago

I would like the AI to have a personality, and a one-prompt installation, sir.

@YossiDahan- 8 months ago

Reaching god mode

@YorkyPoo_UAV 8 months ago

I think you need a beta tester that suffers with all things code and also somehow finds all the bugs. I might know the perfect person.

@lokeshart3340 8 months ago

Hello sir. I have also made an advanced personal AI assistant with hand gestures and many advanced features. I need something like this too. Can you help me, please? Can we collaborate? 😅❤

@AGI-Bingo 8 months ago

Agentic Developers.. Assemble

@lokeshart3340 8 months ago

@AGI-Bingo let's gooo