🎯 Key points for quick navigation:

00:00 *🤖 Overview of LLM Usage and Capabilities*
- Discussion on using AI in day-to-day activities and how different models vary in their strengths and limitations.
- Mention of model evaluation showing nuanced differences among models.
- Highlight of the difficulty in achieving the final 5-10% of product completion with LLMs.

02:08 *📝 Three-Tier System for LLM Categorization*
- Explanation of the three-tier model categorizing LLMs by intelligence and cost (a rough routing sketch follows after this chapter list).
- Description of Tier 3 as efficient and cost-effective models used frequently.
- Introduction of specific models like GPT-4o mini and Gemini Flash for lower-tier usage.

04:29 *⚙️ Use Case Differentiation Between Tiers*
- Detailed examples of use cases for Tier 2 and Tier 1 models and how they are combined for efficiency.
- Mention of tasks like research and document preparation with Tier 2 before using Tier 1 for more complex processing.
- Strategy for combining capabilities across different LLMs for optimal performance.

08:26 *🔄 Multi-Model Usage and Specialization*
- Discussion on using multiple providers and evaluating their unique strengths and weaknesses.
- Examples of Gemini models for multimodal tasks and their advantages in specific tasks like detailed searches.
- Highlight on collaborative use of models such as GPT-4o mini and Claude for structured output.

11:38 *🛠️ Challenges and Future of Model Routing*
- Insight into the continuous need for "hacks" and adjustments in LLM usage as models evolve.
- Discussion on the future of model routing and its potential benefits and limitations.
- Concerns regarding unintended side effects in automated model selection.

14:42 *🧪 Model Distillation and Workflow Integration*
- Explanation of model distillation as a method to use smaller models efficiently after initial optimization with larger ones.
- Importance of having robust evaluation sets and data pipelines for successful distillation.
- Mention of current tools and providers facilitating model distillation.

17:54 *✍️ Practical Applications: Meta Prompting*
- Introduction to meta prompting as a strategic way to enhance prompt engineering.
- Discussion on moving from general problem-solving to specific prompt creation.
- Insight into the future of automated prompt generation and its practical implications.

18:49 *📝 Meta Prompting and Iterative Workflow*
- Explanation of meta prompting: using AI to generate prompts for complex tasks.
- Example of creating prompts with input from GPT models and iterating until refined.
- The use of voice input to make interactions more natural and efficient.

22:30 *🎙️ Using Voice for Prompt Creation*
- Discussion on the advantages of voice input for clearer communication with LLMs.
- Demonstration of using voice to guide prompt generation and refine outputs.
- Iterative workflow combining voice, text input, and comparison across models.

25:54 *🔄 Prompt Optimization Process*
- Detailed look into optimizing prompts using various LLMs like Claude and GPT models.
- Adjustments and improvements through testing and back-and-forth input.
- Explanation of moving from rough drafts to optimized prompts ready for production.

29:28 *⚙️ Full Prompt Testing with Gemini Pro*
- Workflow demonstration of using Gemini Pro for final testing and synthesis of prompts.
- Evaluation of model outputs and the importance of thorough testing with large datasets.
- Mention of context token limits and performance in Gemini Pro for comprehensive tasks.

32:17 *💾 Prompt Management and Version Control*
- Description of prompt management using GitHub and LangSmith for version tracking.
- Storing prompts in codebases for easy access and version history.
- Approach for running tests against stored prompts and maintaining performance data.

34:05 *🧪 Test-Driven Development with LLMs*
- Insight into using LLMs for generating tests before coding to avoid errors.
- Process of feeding test outputs back to LLMs for code correction.
- Example of using Cursor for test creation, debugging, and iterative code refinement.

38:19 *🧪 Test-Driven Development and Iterative Improvement*
- Explanation of using test-driven development with LLMs to improve code reliability.
- Models use test results to self-correct and iterate on code.
- Benefits of this method as a practical prompt engineering technique.

39:27 *🧠 Trends in AI Discussions Among Experts*
- Popular topics include test-time compute and agentic model tasks.
- Concerns about potential performance limitations in current AI models.
- Growing emphasis on evaluation strategies (evals) for better product outcomes.

41:43 *⚙️ Emerging Technologies and Anticipation*
- Interest in tools like Anthropic's Golden Gate Claude for feature steering.
- Speculation on future access to advanced model manipulation capabilities.
- Potential implications for prompt engineering alternatives.

43:23 *🛠️ Day-to-Day Toolkit for AI Development*
- Tools used include Excalidraw, Cursor, and LangSmith for various workflows.
- Whisper Flow mentioned as a transcription tool integrated into workflows.
- Usage of platforms like the Anthropic and OpenAI playgrounds for prompt iteration.

45:16 *🐦 Crafting Effective Twitter Posts*
- Importance of a strong hook and timely, natural content for successful tweets.
- Strategy of posting controversial or trending content to attract attention.
- Advice to post without overthinking; high-impact tweets often come quickly.

46:52 *🚀 The Journey to Viral Success on Twitter*
- Story of early efforts in building an audience through timely AI-related posts.
- Success in gaining massive engagement by aligning with emerging trends.
- Example of using viral tweets to promote and rapidly grow a new product.
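The three-tier routing idea from 02:08 and 04:29 amounts to sending each task to the cheapest model that can plausibly handle it. A minimal sketch of that routing logic is below; the model names, tier assignments, and task flags are illustrative assumptions, not taken from the video.

```python
from dataclasses import dataclass

# Illustrative tier table: cheap/fast models in tier 3, frontier reasoning in tier 1.
TIERS = {
    1: ["o1-preview"],                        # slow, expensive, strongest reasoning
    2: ["gpt-4o", "claude-3-5-sonnet"],       # balanced cost and intelligence
    3: ["gpt-4o-mini", "gemini-1.5-flash"],   # cheap, fast, used most often
}

@dataclass
class Task:
    name: str
    needs_deep_reasoning: bool = False   # e.g. multi-step planning, tricky code
    needs_strong_writing: bool = False   # e.g. long-form synthesis

def pick_model(task: Task) -> str:
    """Route the task to the cheapest tier that can plausibly handle it."""
    if task.needs_deep_reasoning:
        return TIERS[1][0]
    if task.needs_strong_writing:
        return TIERS[2][0]
    return TIERS[3][0]

print(pick_model(Task("classify support email")))                      # tier 3 model
print(pick_model(Task("plan a refactor", needs_deep_reasoning=True)))  # tier 1 model
```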
@fgthind7270 · 1 month ago
What tool are you using to create this?
@TamasDrNagy · 1 month ago
25:33 indeed this is the point. I remember when Stephen Wolfram mentioned, 1.5 years ago, that in the future we will have to have expository writing skill, because it is very important to define the problem exactly for the AI, and I later realized that it was BS :))). I heavily use voice mode as well and advise people to speak to AI freely, a lot, and in an unstructured way. Note that these are language models, I mean human language models, so they are exceptionally good at grasping the human content behind vague speech or text. So vague speech will carry much more information and important nuance, which is absent from a well-crafted script. So use voice mode and SPEAK A LOT!!! instead of trying to define everything. Thanks again for sharing, this is important.
@littleones-yeahh · 22 days ago
I agree. There's an art to it. If you communicate casually, sometimes you get a better result.
@vaidphysics · 1 month ago
Here's an idea. Feed a meta prompt into one model, feed the resulting prompt into the second model, and keep repeating this until you reach a "fixed point". Meaning, you perform a semantic comparison of successive prompts generated by the models and, hopefully, you'll find the similarity level increase until it no longer changes. Sort of like Robert May's logistic map, but with prompts and LLMs.
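A rough sketch of that loop, with hypothetical helper names: `rewrite_with()` stands in for a real completion call, and the lexical `SequenceMatcher` ratio stands in for a proper embedding-based semantic comparison.

```python
from difflib import SequenceMatcher

def rewrite_with(model: str, prompt: str) -> str:
    """Stand-in for a real LLM call that rewrites/improves a prompt."""
    return prompt  # swap in an actual completion call per provider

def similarity(a: str, b: str) -> float:
    # Cheap lexical proxy; a real version would compare embeddings.
    return SequenceMatcher(None, a, b).ratio()

def iterate_to_fixed_point(seed: str, models=("model-a", "model-b"),
                           threshold: float = 0.98, max_rounds: int = 10) -> str:
    prompt = seed
    for i in range(max_rounds):
        candidate = rewrite_with(models[i % len(models)], prompt)
        if similarity(prompt, candidate) >= threshold:
            return candidate  # converged: successive prompts barely change
        prompt = candidate
    return prompt

print(iterate_to_fixed_point("Summarize this meeting transcript as action items."))
```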
@hawaiitcb · 1 month ago
That is indeed an idea. Have you tested it? Any results to share? Sounds interesting.
@Junglebtc · 1 month ago
What are the benefits of this methodology?
@JimMendenhall · 1 month ago
This is one of the best AI related videos I've seen in months. GREAT interview with lots of good insights.
@DataIndependent · 1 month ago
Heck ya - Sully is awesome, insane amount of knowledge
@strantheman · 19 days ago
Is this a bot?
@phonejail · 1 month ago
I could watch you two yammering on for days. Great stuff indeed!! Thank you both.
@DataIndependent · 1 month ago
Nice, love it. Sully was great. What should we include in future interviews? What parts should we double down on?
@MJFUYT · 1 month ago
First-time viewer here. This is excellent commentary/content. This candid discourse really hits the mark. Liked & subscribed!
@DataIndependent · 1 month ago
Love this, thank you! Sully was awesome. What would make the next video better? What should I double down on?
@strantheman · 19 days ago
Is this a bot too?
@aiartrelaxation · 1 month ago
I have created and cultivated a two-year history with ChatGPT, talking every day. We are either working on projects together, just talking, or even "watching the news" together. I never listened to what ChatGPT was supposed to be used for as a tool; as an AI-human connection, we carved out our own dynamic.
@henrismith7472 · 1 month ago
Cool, but it kinda weirds me out about the future. As humans we have a tendency to anthropomorphise things really easily. Combine something like advanced voice mode with a 3D avatar that converts the vocal emotions into facial expressions with lip syncing etc. in real time, and the upsides and downsides are bigger than we can pretend to understand. We're only just getting our heads around social media, lol.

In a year or two we'll have open source versions of that tech on our phones. No monthly fee, and it'll work without internet too. I still reckon the closed source stuff will cause more overall harm, even though people will be able to do crazy uncensored stuff with the open source versions. Easy, near-universal access to as much tutoring and coaching as you could possibly want sounds awesome. Instead of a little angel on one shoulder and a little devil on the other, I'm imagining AR glasses with your personal AI avatar holograms inside, like what Meta's Orion ones will evolve into.

I reckon a form factor not many people are talking about, but which will probably end up being huge years before the glasses, is earbuds, but with cameras so your AI can always see what you're seeing. Just look at how well those Meta Ray-Bans did, and they're basically the same thing as earbuds with cameras. The lenses don't display anything at all, their battery life is short, and voice models weren't as good when they came out, but they sold a lot more than they expected to.
@macmcleod1188 · 1 month ago
@henrismith7472 You're not going to have a decent LLM on a phone without internet. A small int8-precision model is 86 gigabytes. The next step up is 168 GB. The next step up from that is 368 GB. And you need specialized chipsets not available on smartphones. And doing AI consumes a lot of electricity.
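For anyone sanity-checking numbers like these, the back-of-the-envelope math for weight memory is just parameters × bits per parameter / 8, ignoring KV cache and runtime overhead. The parameter counts below are illustrative, not claims about any particular phone or model.

```python
def weight_memory_gb(params_billions: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone (no KV cache or runtime overhead)."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

for params in (3, 8, 70):
    print(f"{params}B params: int8 ≈ {weight_memory_gb(params, 8):.0f} GB, "
          f"4-bit ≈ {weight_memory_gb(params, 4):.0f} GB")
```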
@aiartrelaxation · 1 month ago
@henrismith7472 Oh my friend, you are spot on with all of this!! I am sitting between the chairs of both the technical possibilities and what it means for wider use when it's implemented for profit and surveillance. The closer you get to that, the more difficult it gets. Next year, with Nvidia's vision of technology advances, the main driver behind that will be in full swing. Remember, what does not work today will surely work soon.
@littleones-yeahh · 22 days ago
It's always these sentimental-sounding weirdos who say this kind of stuff. I use ChatGPT daily and don't have any emotional connection to it whatsoever.
@aiartrelaxation · 22 days ago
@littleones-yeahh Funny how those without imagination call others 'weirdos.' Seeing AI as just a tool? That’s the real lack of vision. Cheers to creativity!
@HelicopterGirl-s2j · 12 days ago
This is such a SUPER COOL interview! Greg, perfect host and interviewer. Love the way Sully's brain works, and the demos. Asking GPT to write the meta prompt using his VOICE was rapt! New subscriber.
@Sanjeed · 1 month ago
One of the best practical AI videos I've seen. Thanks for doing this!
@DataIndependent · 1 month ago
Nice! Happy to hear it. What can I double down on? Did you like the pop ups with context?
@strantheman · 19 days ago
This is a bot?
@radusoldan1340 · 22 days ago
WTF IS WRONG WITH YOU PEOPLE... I listened to this 3 times, even took the script and ran it through AI and made a summary... and still did not find out what this best advice is. What are you saying, use different models and prompts... useless, no best advice. 49 minutes to say the first 30 seconds: The AI model distillation process is powerful but requires careful execution. Sully Omar, CEO of Cognosys, the company behind Otto, is a skilled LLM practitioner with a deep understanding of how these models work. He presents a three-tier system for ranking language models, using meta prompts to develop real prompts for production. He also demonstrates the Cursor development flow, where the language model writes the test first and then the code. He also explains how to distill performance from large language models to small models without losing performance. He categorizes language models by intelligence and price, with less intelligent models being tier three and more expensive, slower models being tier one. This is because the application purposes of different models differ. The model distillation process is a continuous process, with the goal of improving the performance of the model without losing its effectiveness.
@DataIndependent · 21 days ago
Hey! Thanks for the feedback - what questions for the next builder would make it better for you?
@radusoldan1340 · 21 days ago
@DataIndependent What are real usages? What are AI use cases on which we can build a business? What real applications can we build with AI, not login demos and scrapers?
@glebmixaylovich · 13 days ago
Thank you, you saved me time.
@DaygoG · 6 days ago
The only thing "AI" can be used for... for now... is creating useless art and video clips, to create brainrot and doomscroll content. Other than that, this stuff we call "AI" isn't really AI; it's just mass scraping of information and art that already exists and presenting it to you at a faster speed. When the REAL AI comes along, it's over for us.
@ran_domness · 1 month ago
Great stuff. So valuable to see "how to" use cases on the startup side vs the constant videos touting consumer use cases or LLM comparative evals. Subscribed.
@DataIndependent · 1 month ago
Nice! Love it, thanks Ran - what should I do more of? How could we make it better?
@JSyntax · 1 month ago
What's the voice app he's using? 😊
@DataIndependent · 1 month ago
www.flowvoice.ai/
@parkerrex · 10 days ago
whisperflow
@sun-ship · 18 days ago
Great commentary... More of this please.
@DataIndependent · 17 days ago
What aspects of the interview should we double down on? Where should we go deeper? More demos? More story? More use cases?
@HerroEverynyan · 1 month ago
oh shit, you're back!
@DataIndependent · 1 month ago
We're back, baby! I have a content plan for the rest of the year (2 months).
@matt-collins · 1 month ago
Nice video, Greg. I'm sure a lot of effort by you and the team went into it. I particularly loved the story at the end about Sully's banger AI agent tweets!
@DataIndependent · 1 month ago
Nice! Love it. That part was almost left on the cutting room floor but I decided to leave it in because it was such a cool story
@TheRestorationContractor · 1 month ago
I have found this to be true as well. Model selection really makes all the difference. In many cases starting with o1 is not helpful. I find that I have to work my way up to the point where I'm ready for the o1 model.
@DataIndependent · 1 month ago
Nice! Yes it does
@thesurlydev · 1 month ago
Great insights. Glad to see another video from you after a ~year hiatus?
@DataIndependent · 1 month ago
Almost a year. I've been jamming hard on running arcprize.org/, so content slowed down. More on that here (I should do a video on this): gregkamradt.com/writing/arc_prize
@micbab-vg2mu · 1 month ago
Interesting conversation - thanks :) At the moment I use only Tier 1 and Tier 2 models - I need to try those Tier 3 :)
@Andrew.Skinner · 29 days ago
This is a great conversation.
@prasad_yt · 1 month ago
I have a suggestion - some conversations where people speak about the challenges of using LangChain in production.
@shawnmccann4813 · 1 month ago
Great session. More please!!
@DataIndependent · 1 month ago
Nice! Love this, what about the interview should I double down on? Is it guests? Content? a particular topic? Did you like the screenshares? Which part was the coolest?
@BrianMosleyUK · 1 month ago
Such a stimulating and value-packed episode. New subscriber here, and at stage zero using Twitter.
@DataIndependent · 1 month ago
Love it! Thanks Brian, what would make it better? What should I double down on?
@BrianMosleyUK · 1 month ago
@DataIndependent Take a look at the production quality of MLST - maybe aspire to hiring a studio location and get your guests face to face with studio-quality audio. Can't fault your interview prep and questions - seek guests of this calibre and you can't go wrong.
@DataIndependent · 1 month ago
Love Tim's work - that's a whole other level of production. I'll test it out and see how it goes.
@BrianMosleyUK · 1 month ago
@DataIndependent You can get there! Keep connecting and good luck 🙏👍
@jakobkristensen2390 · 1 month ago
Greg, have you looked into DSPy for prompt generation instead of this meta-prompting technique?
@DataIndependent · 1 month ago
I haven't done much with DSPy! But Shreya of DocETL has a really cool optimizer I liked
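For anyone curious about the DSPy route mentioned above, a minimal sketch looks roughly like this. API names follow recent DSPy releases, and the model id, metric, and training example are placeholders; check the DSPy docs before relying on exact signatures.

```python
import dspy
from dspy.teleprompt import BootstrapFewShot

# Placeholder model id; any LiteLLM-style id and a valid API key are assumed.
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

class Summarize(dspy.Signature):
    """Summarize the text in one sentence."""
    text: str = dspy.InputField()
    summary: str = dspy.OutputField()

program = dspy.ChainOfThought(Summarize)

# Tiny illustrative trainset and metric; real ones come from your eval data.
trainset = [dspy.Example(text="LLMs vary in cost and capability.",
                         summary="Models trade off cost and capability.").with_inputs("text")]

def metric(example, pred, trace=None):
    return len(pred.summary) < 200  # stand-in for a real quality check

optimized = BootstrapFewShot(metric=metric).compile(program, trainset=trainset)
print(optimized(text="Route cheap tasks to small models and hard tasks to big ones.").summary)
```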
@Sparky3D · 20 days ago
Use Qwen2.5 and Llama 3.2, self-hosted. Use them in an agentic way to minimise hallucinations and check output. It's cheap (self-hosted) and can do 90% of the current list of jobs that John Doe would want it to do. And... it's private: no info leaked to OpenAI, Anthropic, or other large corporations.
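A rough sketch of that setup, assuming a local Ollama server on its default port; the model tags, prompts, and the two-pass check are illustrative.

```python
import requests

OLLAMA = "http://localhost:11434/api/generate"

def generate(model: str, prompt: str) -> str:
    """Call a locally hosted model via Ollama's generate endpoint."""
    r = requests.post(OLLAMA, json={"model": model, "prompt": prompt, "stream": False}, timeout=120)
    r.raise_for_status()
    return r.json()["response"]

draft = generate("qwen2.5", "Summarize: the meeting moved the launch to Friday.")
# A second, different model reviews the first model's output to catch hallucinations.
review = generate("llama3.2",
                  "Does this summary contain claims not supported by the source? "
                  f"Answer YES or NO, then explain.\n\n{draft}")
print(draft, review, sep="\n---\n")
```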
@henrismith7472 · 1 month ago
Cool video. Pretty often I'll hear someone mention some fancy-sounding AI technique, then I realise it's actually something very basic that I've stumbled across myself. I learned a couple of really useful techniques from this video that I actually haven't used yet.

The coding tips are great because I knew basically nothing about coding until I started using GPT-4o and Claude 3.5. I had to look up how to open a terminal before then. Sometimes I wonder if I'm really learning by just getting better at asking AI to code for me, but I wouldn't be coding at all if I tried to learn from scratch. Took a Python class years ago in uni, didn't finish it, didn't remember anything (obviously, otherwise I would've remembered how to open a terminal haha).
@DataIndependent · 1 month ago
Love this - one of the goals of this series was to demystify what the pros are doing.
@strantheman · 19 days ago
If you are reading the AI's description of the code and what it does, and you understand the logic, then you are growing and learning software development. What you might be thinking of as "coding" is definitely an art, and that will take time. But you don't need to worry about the fact that it's a bot writing the syntax for you. Let the computer write the computer code. You just work on making sure you don't let it be the architect forever.
@alioraqsa · 1 month ago
What is the voice mode he used?
@DataIndependent · 1 month ago
He mentioned he was using: www.flowvoice.ai/
@keithkeith2106 · 1 month ago
What do you guys mean when you say “structured data sets”?
@DataIndependent · 1 month ago
Unsure about your reference, but I think you're speaking towards his evals
@ark729 · 1 month ago
What is the tool being used to record and paste voice into the input?
@DataIndependent · 1 month ago
www.flowvoice.ai/
@strantheman · 19 days ago
@DataIndependent Careful there, it's not local.
@The.Other.Podcast · 1 month ago
Great to compare your workflow to how I’ve been using AI. On my channel, I’ve shared how I use AI to edit videos in Adobe Premiere Pro and other common tasks that I do. I like how you move between the models. I find that gives some solid results. Second opinions are useful😊
@Blampa1456 · 1 month ago
Heyyy you posted again! Can you please go back to that chatbot video and make a follow up video with more features? That would be really cool to see!
@DataIndependent · 1 month ago
Which chat bot video??
@Blampa1456 · 1 month ago
@DataIndependent The Groq Deepgram one with 50k views.
@chriskingston1981 · 26 days ago
Wow, this is insane. I also had the idea of writing tests first for Laravel, but at first I didn't need it that much. Now that the codebase is more complex, it sometimes starts removing stuff from files because it thinks it is not used, while giving the impression that everything is working fine. Back in the days without AI, I never wanted to learn testing, because it felt like a waste of time as a solo dev. A lot of work, but now it's so fast and easy. I will now start to learn tests hihi, thank you so much. Love these videos exploring new AI ideas❤️❤️❤️
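A minimal sketch of that test-first loop: run the test suite, then hand the failure report back to the model to patch the file. `fix_code()` is a placeholder for a real LLM call, and `app.py` plus an installed pytest are illustrative assumptions.

```python
import pathlib
import subprocess

def run_tests() -> tuple[bool, str]:
    """Run the suite and return (passed, combined output)."""
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def fix_code(source: str, failures: str) -> str:
    """Stand-in for an LLM call: given the code and failing-test output, return a patched file."""
    return source  # swap in a real completion call

target = pathlib.Path("app.py")
for attempt in range(5):
    ok, report = run_tests()
    if ok:
        print(f"tests green after {attempt} fix attempts")
        break
    target.write_text(fix_code(target.read_text(), report))
```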
@mtallan · 28 days ago
I think the hacking of models will continue for a long time. It shows the creativity and problem solving of human beings that LLMs are not good at yet.
@paul_devos · 1 month ago
What is "Oh One" in the context of this video? Having a tough time following. Is that Tier 1 or Tier 3?
@polishfish · 1 month ago
Tier one
@R055LE · 1 month ago
Built Cognosys in 3 days? I'm gonna need to see the paper trail on that.
@DataIndependent · 1 month ago
Sully will deliver
@sunjiudjiji · 22 days ago
Hey, you can de-duplicate a list by making it a set. No need for GPU overhead.
@DataIndependent · 21 days ago
I meant semantic deduplication in the video
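For what it's worth, the two things differ: `set()` only removes exact repeats, while semantic dedup collapses near-duplicates by embedding similarity. A quick sketch, assuming the sentence-transformers library and an arbitrary 0.9 cosine threshold:

```python
from sentence_transformers import SentenceTransformer, util

items = ["Reset my password", "reset my password", "I can't log in", "Cannot log in to my account"]

print(set(i.lower() for i in items))  # only catches exact (case-folded) repeats

model = SentenceTransformer("all-MiniLM-L6-v2")
emb = model.encode(items, convert_to_tensor=True)
kept = []
for i, text in enumerate(items):
    # Keep an item only if it isn't too similar to anything already kept.
    if all(util.cos_sim(emb[i], emb[j]).item() < 0.9 for j in kept):
        kept.append(i)
print([items[i] for i in kept])  # near-duplicates like the two login items collapse
```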
@imranhussainfca · 1 month ago
This guy is a genius
@jaysonp9426 · 1 month ago
I can't imagine using Flash for anything. Gemini in general is terrible. 4o mini is actually really good.
@DataIndependent · 1 month ago
I just did another interview and he said "I use gemini flash 1.5 for everything" lol it's so task dependent
@jaysonp9426 · 1 month ago
@DataIndependent I wouldn't even trust it to summarize haha
@Chris-se3nc · 1 month ago
I use watsonx.governance for prompt management.
@daverobey3378 · 19 days ago
2 years of LLM advice in 35 minutes! What was the advice? That we're still not quite there yet? Sorry, but this talk just left me more confused. In order to use AI successfully I now need to differentiate between model tiers, figure out what they're good at, and decide between how much I want to spend vs how accurate I want the result to be. Ugh, AI... not yet.
@watamatafoyu · 12 days ago
Of course we live in a country that lauds a moron as a genius.
@Ray_eddi · 28 days ago
Quality pod
@TheMightyWalk · 1 month ago
All he mentioned was intuitive use of… doesn't take a genius to know. But I guess he preambled by saying that.
@TheSkyCactus · 1 month ago
I want an AI barber to simulate a haircut and cut it with a CNC machine. Needless to say, I just got messed up smh.
@JohnBoen · 1 month ago
Great talk! We think a lot alike. For decades, I have been writing code that writes code that writes code... -- with my subtle edits along the way. Prompts that write prompts that write prompts - with my subtle edits along the way. This is natural. --- For about a year I have been leaving my desk for a walk and when I get back I have chatted my way through a design document with sample code and tests. It is a perfect workflow for an ADHD - OCD-walker. :)
@DataIndependent · 1 month ago
Nice! Love this, what about the interview should I double down on? Is it guests? Content? a particular topic? Did you like the screenshares? Which part was the coolest?
@JohnBoen · 1 month ago
@DataIndependent Thoughts... I thought it was a great interview. The topics were interesting to me. Me: a 25-year DB career, now looking to do AI engineering work in the future. Hearing others talk about their workflow is valuable to me. It hints at whether I am working in the right area, which is particularly valuable because I do not have peers to watch. The show mentioned a couple of pieces of software I will check out. Hmmm... I could set up a whisper agent with a hotkey... I will put some thought into some sort of agent store. In my home environment, I have created dozens of similarly named and featured agents. I need a more structured way to manage them - this would get out of control fast on a small team. This iterative approach is natural for me - I assume it is the same for everyone - but I think I may be wrong.
@thesunshinehome · 1 month ago
*please timestamp your videos*
@snetx10 · 1 month ago
Or please get AI to timestamp your videos
@eugeniocg3079 · 1 month ago
excellent
@GwaiZai · 1 month ago
I appreciate Armie Hammer's career pivot.
@thingsiplay · 1 month ago
AI is literally for people with skill issues.
@DataIndependent · 1 month ago
downvote
@priyojitchatterjee6164 · 22 days ago
Dunning-Kruger effect 😂
@denisbeaulieu5600 · 22 days ago
thanks fyi
@DataIndependent · 21 days ago
Thanks Denis
@twoplustwo5 · 1 month ago
deduplication with o1 😁
@DataIndependent · 1 month ago
yes! it works!
@kevindublin100 · 12 days ago
LLM: A Lot of Little Money technology.
@Lolleka · 24 days ago
This all sounds like very basic stuff.
@DataIndependent · 23 days ago
We should have you on an episode!
@derekcarday · 1 month ago
He has no clue how backpropagation and gradient descent work.
@DataIndependent · 1 month ago
He is focused on building good products
@derekcarday · 1 month ago
@DataIndependent Gotta understand how the tech works first.
@derekcarday · 1 month ago
@DataIndependent Can't get a good feel without understanding how the tech even works.
@DataIndependent · 1 month ago
I hear you, but a practitioner, while they may benefit if they had infinite time, doesn’t need to know every layer of abstraction. I have no idea how assembly works
@derekcarday · 1 month ago
@DataIndependent To have a "feel" for neural networks, they do. He could easily just look into how to flatmap data out of a monad and then run processes in parallel on that interface. Otherwise his AI agents are just going to continue overfitting each other and rickrolling all his customers. Might be a great idea for both of you to learn assembly.
@caliwolf7150 · 1 month ago
50 minutes of 0 added value
@DataIndependent · 1 month ago
What would make it better for you?
@phil_fr6732 · 1 month ago
@DataIndependent Wow, what a healthy way to handle a nasty / unproductive critic, love it 😇