Yup, QwQ is CRACKED: Prompt Chaining with Qwen and QwQ reasoning model (Ollama + LLM)

21,106 views

IndyDevDan

1 day ago

Comments: 69
@vedantaggarwal1092 · a month ago
what are your device's specs?
@patrickbrady535 · a month ago
2023 M2 Macbook Pro with 64 GB RAM and 12 cores. Shows up at 3:39 in the video.
@blackwhite-3607 · a month ago
@@patrickbrady535 Wow, a MacBook Pro can run QwQ 32B on Ollama?
@Scarsuna · a month ago
@@blackwhite-3607 Thanks to the chip design, the ML Compute framework can use system RAM as VRAM. Imagine if your video card had 64 GB of RAM! Now you can see how a 20 GB model runs easily on a MacBook Pro.
@edwardduda4222 · 8 days ago
@@blackwhite-3607 Apple Silicon has unified RAM with up to 128 GB and 16 neural cores for LLM inference. I recently got the M4 with 48 GB, and it's amazing being able to both build and run very large models. It doesn't beat an Nvidia rig, but it's great for a laptop.
@stonedizzleful · a month ago
Bro, you are down the rabbit hole on this stuff; it's so impressive. Some of the best-quality AI content on YouTube!
@indydevdan · a month ago
Thanks for the kind words. The rabbit hole is so deep. Meta prompting + o1 + OpenAI 12-day launch content in the works. The things we can do with this tech are mind-boggling.
@Scarsuna · a month ago
@@indydevdan I am not a coder at all and run Ollama/OpenWebUI via WSL2. However, I quickly understood I needed a better system for prompting and just found your channel. I've got a lot to learn! Thanks to your suggestions in this video, I asked Grok to generate an XML format for specific instructions to edit a particular piece of text, and it actually worked on the first try! Normally, I have to ask the AI 2-3 times before it understands the directions.
@sebastianmalcolm3597 · a month ago
Super insightful prompt chaining examples! Thank you, IndyDevDan. 5:42
@ashgtd · a month ago
Amazing video. I love how you are pushing the tech to do as much as it can. I'll definitely try this out myself.
@cannyaii · a month ago
Let's just call it Qwik, ironic but easy to say
@notnotandrew · a month ago
One might say that it's quick to say.
@stanislavtrifan96 · a month ago
John Qwik
@Maisonier · a month ago
Wow dude, this is amazing! Liked and subscribed.
@plasmax · a month ago
Love the reasoner-extractor pattern. Prompt chaining seems very useful, especially for agents with tool use: you could have the reasoner decide what step to take next, then an extractor, then a verifier with inspection tools that goes back to the reasoner with new information in case something went wrong with the reasoning.
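The reasoner-extractor chain described here can be sketched in a few lines of Python. This is a hedged illustration, not the video's actual code: the model call is injected as a function so the same chain works against a local Ollama server or a stub, and the `{{previous_output}}` placeholder and both prompts are illustrative assumptions.

```python
# Minimal reasoner -> extractor prompt chain, with the model call injected
# so it can be backed by Ollama (e.g. ollama.generate) or any stand-in.

def chain(prompts, call_model):
    """Run prompts in order, feeding each model output into the next prompt."""
    output = ""
    for prompt in prompts:
        # {{previous_output}} is a hypothetical dynamic-variable placeholder.
        output = call_model(prompt.replace("{{previous_output}}", output))
    return output

REASONER_PROMPT = "Think step by step and answer: what is 12 * 9?"
EXTRACTOR_PROMPT = (
    "Extract only the final numeric answer from the text below, "
    "with no other words:\n{{previous_output}}"
)

if __name__ == "__main__":
    # Stub standing in for a real call such as ollama.generate(model="qwq", ...).
    def fake_model(prompt):
        if "step by step" in prompt:
            return "First, 12 * 9 = 108. So the answer is 108."
        return prompt.rsplit(" ", 1)[-1].rstrip(".")

    print(chain([REASONER_PROMPT, EXTRACTOR_PROMPT], fake_model))  # 108
```

The verifier step the comment suggests would just be a third prompt in the list, with its output routed back into the reasoner on failure.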
@jtjames79 · a month ago
I would really like to preserve all that thinking process, merge them all into one giant file, and turn it into a knowledge graph. I like to talk about philosophy and futurism, so the details are often very important. I don't care how long it takes. If the AI needs to get back to me tomorrow, that's what I'm used to anyway with people.
@johang1293 · a month ago
Sounds like you should look at GraphRAG.
@jtjames79 · a month ago
@johang1293 The database part is still Greek to me. Most of the tutorials are like "draw a circle, now draw the rest of the owl". 🤷‍♂️
@alexjensen990 · a month ago
Dude! I don't know how I haven't watched one of your videos yet. Assuming all your videos are like this, I have found my new favorite channel. I sometimes feel like I'm the only one I know who is really nerding out on prompt engineering in a more complicated (and better-performing) manner. To be honest, I feel like I am the only one I know who is really into using generative AI. In any event, you've got a new regular sub here on YT.
@alexjensen990 · a month ago
PS - Ironically, I already follow you on GitHub somehow. I don't recall checking out your repos, but I look forward to following your work, man. Cheers.
@samson3523 · a month ago
Yep, he's fucking spot on, always. I thought I was the only one on my IT team on the cutting edge (I am), but at least we've got IndyDevDan.
@ABUNDANCEandBEYONDATHLETE · a month ago
There are millions, dude... Most people don't talk about it. It almost feels like having an intern as an assistant; can't wait until PhD level. Never making a template from scratch again, lol.
@JimMendenhall · a month ago
This is a very nice approach. Thank you for sharing. I'm wondering if QwQ will output the final answer in a specified tag, similar to how some other models do. That could help with the extraction for sure.
@MaelSimonApprenTyr · a month ago
Yeah, then you can extract it with a simple regex 😀
@wSevenDays · 24 days ago
Thank you very much for this prompt guide!
@Gunnerpigs · a month ago
After I watched your previous video about prompt levels, I wondered how I should implement dynamic variables. Is the way you do it in your 5:47 example the best way, or are there better ways of utilizing dynamic variables?
@RuneX_ai · a month ago
Thank you for reviewing!
@sd5853 · a month ago
I don't always get everything from your vids, as I'm a fkin noob with limited follow-through, but fk me, I love your vids and I'm getting so much value out of them. Thank you for putting it out there.
@MarxOrx · a month ago
This is cool, but you can also just use structured outputs in Ollama and force the model to put the chain of thought in one key and the final result in another. Then you don't need the second LLM pass at all.
@TravisChristopher · a month ago
Is that an eGPU for Mac on your desktop? Would love to hear a bit about your setup...
@iainmckenzie1 · a month ago
Curious to know if you've done any work with DSPy, Dan? We've just started piloting it in my organisation, so we'll be generating preliminary results soon, but it's an interesting concept. Would be cool to hear your thoughts.
@jacquesdupontd · a month ago
Very interesting. Is this script for prompt chaining possible in an IDE like Windsurf, with each of the prompts given to an agent like Sonnet 3.5? Thanks for your work.
@GjentiG4 · a month ago
Quick question: are you adding the subtitles automatically? If so, what tool are you using?
@paulyflynn · a month ago
Crushing it! Can I call you Mr. Hands?
@Truth_Unleashed · a month ago
There can be only one!
@NLPprompter · a month ago
I disagree, he's bigger than that, mate. He's a hands wizard; look at how his hands handle the keyboard, invoking strokes. Sounds like a keyboard wizard with Vim shortcuts on his OS!
@aaagaming2023 · a month ago
Would you say prompt chaining like this is as efficient as a framework like LangGraph in a production context?
@perelmanych · 21 days ago
You can achieve almost the same output quality just by using a follow-up prompt like: "Use the previous answer to formulate the final answer in JSON format."
@supermold · a month ago
I mean, it is slow, but in the time it took me to watch this video, it wrote a snake game on my RTX 3060. Technically, I only asked it for an outline, but it decided phase three of outlining how to code a snake game was showing me the code to run it, so I don't know; I guess it failed successfully. Considering I just gave it some generic and poorly worded instructions in a state of sleep deprivation, I'm pretty impressed and excited to see what else it's capable of.
@ThoughtFission · a month ago
And here I had hoped we mere mortals could get something useful out of this. Not for those who don't program, I guess ;o)
@daryladhityahenry · a month ago
Wait... how do you fit the context length for extraction? I mean, it's so long... and your RAM is only 64 GB and you use a 32B model. Hmm. I really wonder why it fits and works so well.
@canelj · a month ago
Thanks for sharing this! 🤩
@Sam-kj8ew · a month ago
Nice. What IDE is that?
@JohnLewis-old · a month ago
Can you do a multi-run test to see how many of those chained outputs fail? Are we talking 70% or 99%?
@TheAIBlueprint · a month ago
Is the model fully open, or is it through an API? Does any info, data, or metadata go back to anyone's servers, or is it 100% local?
@marcshawn · a month ago
What are your MacBook's specs?
@magnusquest · a month ago
Using this to run a local aider --architect Qwen + QwQ stack :D
@MaelSimonApprenTyr · a month ago
Are you using this stack, or is it an idea? If you are using it, I'm really curious about the performance! I want to switch from Sonnet 3.5 to something local to reduce my climate impact.
@magnusquest · a month ago
@@MaelSimonApprenTyr Currently running QwQ > output to file > input to aider. As Dan mentions, the pitfall of this reasoning model is that it outputs its whole thought process, so it would take super long for architect mode to run efficiently. Extracting the specific steps and details with prompt chaining is best here, but it still takes quite a bit longer than using something like OpenAI's o1.
@MaelSimonApprenTyr · a month ago
@@magnusquest I'll have to check this out in detail. Thanks, man! 😊
@KennylexLuckless · a month ago
That it speaks Chinese would indicate that it is a well-trained model. Mandarin Chinese is the most spoken language in the world; English only becomes the largest if you count those who speak it as a second language. As a Swede, I often encounter U.S. bias in AI responses, where it uses feet and inches even though the metric system is the most widely used. I have to use a system prompt to make it use metric, but it often leaves the conversion in the answer, which I have to remove later.
@Scarsuna · a month ago
Europeans have a hard time creating innovative projects of their own due to draconian censorship laws in the EU. That's the problem!
@bukitsorrento · a month ago
MCP x Ada/Agent OS. I haven't seen any channel covering the memory server; also check the awesome-MCP servers repo. Can't wait.
@indydevdan · a month ago
100% feels big - I'm still digesting MCP. Massive OAI releases incoming.
@alexjensen990 · a month ago
TL;DR - I am not convinced that Qwen is all that great. Admittedly, I haven't put the newest one through its paces yet due to time constraints, but I intend to do my due diligence when time permits. I will elaborate below on why Qwen, and for that matter a good number of recent models and papers, all seem at best not to be really moving the industry forward; at worst, they seem like an intentional grift or, at a minimum, plagiarism, due to their inability to correctly answer characters lacking certain identifying aspects. I'm sorry if that sounds too general, but in an effort to keep this short I will simply say that if you use o1 and o1-mini; Claude, Haiku, and Opus; Llama 70B and 405B and many of their variants; Gemini Pro; and to some degree Mistral (though I find that Mistral has fewer, more focused strengths), what you find is that they, at least in English, are already able to generalize and abstract in mind-blowing ways, particularly OpenAI's and Anthropic's models. Despite their flaws, Gemini, Llama, Mistral, and Grok all have an amazing ability to infer from a query what the next words should factually be, given the effectiveness of the prompt, and often even with ineffective prompts. However, Qwen out of the box has not, as of yet, shown me anything but party tricks. Previous versions of Qwen performed absolutely horribly when I tested their ability to make certain connections between ideas that every other model, even the 6B-8B models, tends to make without missing a beat. I have several theories as to why, and as a scientist, enthusiast, and member of the human race, it bums me out. I will leave those theories to your imagination. With all that said, I look forward to putting this new Qwen through its paces and seeing if it finds a place in my stack. Believe it or not, this was by far the TL;DR version, because I could go on for some time about this topic. Anyway... great video. I look forward to following your content moving forward. Cheers.
@daburritoda2255 · a month ago
What are the specs of your MacBook Pro? I'm impressed by how quickly you can generate tokens on QwQ.
@daburritoda2255 · a month ago
I have an M4 Pro MBP, but with only 24 GB of RAM, so I can't run the model locally.
@haroldpierre1726 · a month ago
@@daburritoda2255 At 3:36 he has a description on screen of his MacBook's specs. He has an M2 Max with 64 GB RAM.
@box4soumendu4ever · a month ago
Great 👍👍🥰🥰 Thanks.
@pensiveintrovert4318 · a month ago
It is also NOT as smart as o1-preview. o1-preview was the first model ever that was able to solve my puzzle, while QwQ made stupid logic errors.
@indydevdan · a month ago
Totally. It's nowhere near o1-preview or o1 (just released). For local reasoning, though, it's a massive step forward.
@goldenglowitsolutions · a month ago
This looks amazing! But I don't think my Intel Core i7, 16 GB DDR4 RAM, RTX 3050 4 GB (Acer Nitro 5) laptop will be able to handle it.
@ozuromo · a month ago
Nice video.
@balqaasem · a month ago
Call it Quick
@shuntera · a month ago
Hard to use??? I'm thinking nahhhhh; I'd immediately disregard it, as there are other models that are easier to use.
@sephirothcloud3953 · a month ago
DeepSeek R1 is better than QwQ; sadly, they haven't released the model or API yet.
@thenextension9160 · a month ago
No Patreon? Come on, man. Your content is way too important for you not to be getting memberships.