O1’s Chain of Thought: I Built a System to Mimic It-Here’s How It Went!

  Рет қаралды 10,609

Prompt Engineering

Prompt Engineering

Күн бұрын

Пікірлер: 65
@engineerprompt
@engineerprompt Ай бұрын
Code for the project is up on github: github.com/PromtEngineer/Agent-0.git
@kai_s1985
@kai_s1985 Ай бұрын
This is a great idea! Imagine having multiple LLMs specialized in different types of skills like reasoning, math, coding, general knowledge, writing, creativity, etc... working in parallel or sequentially, and criticizing each other, and outputting a final agreed answer like ensemble models in random forest or xgboost. Imagine each of those models are better than PhD's in their own area of expertise.
@shuffle1659
@shuffle1659 Ай бұрын
Thank you for the idea of my next project.
@engineerprompt
@engineerprompt Ай бұрын
I think for specialized LLMs, they probably needs to be finetuned for specific task but I think its worth exploring.
@Juan-n6k3c
@Juan-n6k3c 8 күн бұрын
​@@engineerprompt even with RAG? Is finetuning necesary?
@Anas-dd5ud
@Anas-dd5ud Күн бұрын
@@Juan-n6k3c yes in my opinion. Rag only adds information to the prompt. but the generation "format" result is much more dependant on the fine tuning. You can notice that if you compare the generation results of instruct finetuned model and the base one.
@rousabout7578
@rousabout7578 Ай бұрын
Great video and first try! I was waiting for someone to replicate 01 with an agentic system. Controversy aside Matt Schumers reflection 70b has a surprisingly good system prompt. Worth trying on a few models. You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside tags, and then provide your final response inside tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside tags.
@engineerprompt
@engineerprompt Ай бұрын
Prompting definitely needs improvement here. The reflection is a tricky one without fine-tuning the model though. Was hoping Matt Schumers will actually just release the data that they created.
@1littlecoder
@1littlecoder Ай бұрын
I was working on something similar and here you managed to get it out already! Great work!
@engineerprompt
@engineerprompt Ай бұрын
thank you, mean a lot coming from you. big fan of your content. keep up the good work sir.
@1littlecoder
@1littlecoder Ай бұрын
@@engineerprompt 🙏🏾
@HassanAllaham
@HassanAllaham Ай бұрын
and here is a big fat "LIKE" to one of the best channels and one of the best youtubers and prompt engineers. Thanks for the good content 🌹🌹🌹
@BenoitChabord
@BenoitChabord Ай бұрын
I think your approach is brilliant, and I suppose the same way with other models you could find a COT good at writing, with different agents prompts. You can start to categorise all of the kind of problems and the first step is an agent prompt knows which Multi stage COT worflow it should use.
@cheaptube4918
@cheaptube4918 Ай бұрын
Thanks for putting this together!
@tollington9414
@tollington9414 Ай бұрын
Good work, love this approach
@supercurioTube
@supercurioTube Ай бұрын
Good job ! It'll be interesting to try it in parallel, from 2 to n and identity if more attempts help or not. Did you leave the temperature to the 1.0 default? To get more diversity of outputs, temperature at the max 2.0 might help.. or not 😅
@engineerprompt
@engineerprompt Ай бұрын
The temperature is set to 1.0. I plan to experiment with multiple different LLMs at some point. I think that will give diverse outputs.
@xXWillyxWonkaXx
@xXWillyxWonkaXx Ай бұрын
In simple terms, this is a "routing prompt system" aka orchestration or multi-agent prompting.
@engineerprompt
@engineerprompt Ай бұрын
Kinda, specially the parallel system will work in that way.
@epokaixyz
@epokaixyz Ай бұрын
Consider these actionable insights from the video: 1. Understand Chain-of-Thought Reasoning: Learn how this technique enables AI to solve complex problems by breaking them into smaller, manageable steps. 2. Explore Multi-Stage Systems: Study the concept of using multiple AI agents, each with a specialized role, to enhance the problem-solving process. 3. Learn how to Use Code for Analytical Reasoning: Discover how AI leverages code to efficiently process information and arrive at accurate solutions. 4. Research Solutions to Existing Challenges: Investigate how to address the limitations of Chain-of-Thought reasoning, such as sequential bottlenecks and lack of diversity in AI models. 5. Dive Deeper into Resources: Consult the provided books and research papers to gain a more comprehensive understanding of prompt engineering and Chain-of-Thought prompting.
@keshavkasat9465
@keshavkasat9465 Ай бұрын
ai comments
@nusquama
@nusquama Ай бұрын
@@keshavkasat9465and so ? Question is : is it useful?
@micbab-vg2mu
@micbab-vg2mu Ай бұрын
I use my own reflextion prompt with claude 3.5 sonnet that gives me better results in tests than o1:)
@raphaelchristi6424
@raphaelchristi6424 Ай бұрын
@@micbab-vg2mu you have repo for that?
@IRazorLeaf
@IRazorLeaf Ай бұрын
Could you share a bit more details on what you do or is that top-secret? :)
@aaron6235
@aaron6235 Ай бұрын
Yes, share, and don't monetize😮
@unclecode
@unclecode Ай бұрын
You’ve built a moat for yourself :)) Nice job! The end result of "O1" isn’t exactly new, the "moat" (if there is one) lies in how they figured out bringing the process into inferences within the model itself. I hope they eventually share the details and reasoning tokens. Right now, it could just be like what you did, or maybe they’ve found a way to embed "search" natively during inference.
@engineerprompt
@engineerprompt Ай бұрын
Its a mystery but I think it is possible to replicate that behavior with multi-agent system especially if they are specialized. Seems like the Qwen team has something in works. will be interesting to see their approach.
@unclecode
@unclecode Ай бұрын
@@engineerprompt Qwen became good in math and me too, waiting for that 😎
@obetishere9215
@obetishere9215 Ай бұрын
Nice. I have my personal project similar to this built using react and using mistral nemo api. My use case is developer, code reviewer, unit tester. I feel like i am missing something but i do not know yet.
@engineerprompt
@engineerprompt Ай бұрын
nice, I love when people are solving practical problems with LLMs.
@aaron6235
@aaron6235 Ай бұрын
Why didn't you put the link to your repo in the description? I would very much like to clone it
@engineerprompt
@engineerprompt Ай бұрын
Wanted to test it a bit more and put up a Readme. There are some failure cases that needs to be highlighted. Will do it soon
@mirkakonest
@mirkakonest Ай бұрын
Open Source?
@engineerprompt
@engineerprompt Ай бұрын
github.com/PromtEngineer/Agent-0.git
@raphaelchristi6424
@raphaelchristi6424 Ай бұрын
There's some repo with that?
@ingenierofelipeurreg
@ingenierofelipeurreg Ай бұрын
Pls share repo
@engineerprompt
@engineerprompt Ай бұрын
github.com/PromtEngineer/Agent-0.git
@ahmadrezaatashbartehrani3617
@ahmadrezaatashbartehrani3617 Ай бұрын
where is code ? thanks for sharing that
@engineerprompt
@engineerprompt Ай бұрын
Look at the pinned comment
@rajapulau
@rajapulau Ай бұрын
10:45 can this solution use autogen?
@engineerprompt
@engineerprompt Ай бұрын
Yes, that could be a good option.
@zeusconquers
@zeusconquers Ай бұрын
hi, nice work. could you provide the source code please
@engineerprompt
@engineerprompt Ай бұрын
github.com/PromtEngineer/Agent-0.git
@Dr.UldenWascht
@Dr.UldenWascht Ай бұрын
Reflection reborn 😄
@engineerprompt
@engineerprompt Ай бұрын
😄😄😄
@大支爺
@大支爺 Ай бұрын
Just review it's own answer and answer it once more time.
@monstercameron
@monstercameron Ай бұрын
this isnt 01 this is gpt4 with code interpreter
@clray123
@clray123 Ай бұрын
These attempts show your fundamental lack of understanding of what OpenAI did to improve their models. Which is excusable because they no longer publish anything worthwhile and are so vague in their marketing material that it leaves you guessing. But I suspect it went as usual: they created a big mountain of synthetic data demonstrating those "chain of thoughts" on a well defined set of problems, for which reward functions can be defined using RL, and then they trained a very normal LLM on that very mountain of "solved problem" data. What you are missing is not some shitty prompts. You are missing the mountain of training data (and another mountain of compute resources to pull off the training, but it's mostly the data which gives them a competitive advantage).
@georgearistides7704
@georgearistides7704 Ай бұрын
genuine question, how do you know this. I am asking because I built a program to implement chain of thought using the lightweight llama models today, and it does of course give improved responses with respect to amount detail and accuracy. Do you have a source for this claim, and if so can you please drop the link here. I want to try implementing this and possibly combining with what I have already done
@clray123
@clray123 Ай бұрын
@@georgearistides7704 The source is my knowledge of how LLMs work and what it takes to get better scores on benchmarks. Also that OpenAI did not mention any improvements to model architecture, which they presumably would have done if they had done anything better than provide more training data.
@georgearistides7704
@georgearistides7704 Ай бұрын
@@clray123 source: trust me bro
@clray123
@clray123 Ай бұрын
@@georgearistides7704 Well, you can go watch AI Explained's review which pretty much does say the same thing after reading 40 pages of their release docs. Believe it or not, these LLMs don't grow on trees and some people out there are fairly well aware how the entire process works.
@paulocacella
@paulocacella Ай бұрын
This simply does not matter. What matters is the output quality goal and cost/benefit. You must show a problem that openai approach is in a way that is superior than agents, RAG and tools in terms of privacy and cost/benefit. That is it. The other aspects simply are not relevant outside a research world. It is a simply as that, you do not need a Ferrari to go to the market. Up to now I can make locally outputs that are better than the approach of o1 on meaningful and useful general problems. When they have something better we can go deeply. Up to now this is only a toy using chatgpt because people to not have the agentic systems setup and have no idea on how to build them.
@alexpetrov8871
@alexpetrov8871 Ай бұрын
there is no "thought", there is no "answer": this is a big math function calculating a "next word"
@StevenSSmith
@StevenSSmith Ай бұрын
and....?
@alexpetrov8871
@alexpetrov8871 Ай бұрын
​@@StevenSSmith and what is your question? can you formulate what you are actually asking?
@ronilevarez901
@ronilevarez901 Ай бұрын
Exactly like you.
@alexpetrov8871
@alexpetrov8871 Ай бұрын
@@ronilevarez901 >Exactly like you. No, I am quite different. Because to me, “words” are just a language that I use to convey information about the reality that I perceive by analyzing the signals of my senses over time. My senses are imperfect, however, they reflect reality to one degree or another. On the contrary, this LLM has not the slightest connection with reality, all they have is to study this human “language”, a set of phonemes that people use to convey the information that they perceived through their imperfect senses and were able to comprehend and translate into "words". Think about the difference.
@ronilevarez901
@ronilevarez901 Ай бұрын
@@alexpetrov8871 yeah, yeah. The problem is that you and many others would be saying the same about AI if it was embodied and had learned language through real world experience: "it doesn't think". Mhm.
EASIEST Way to Fine-Tune LLAMA-3.2 and Run it in Ollama
17:36
Prompt Engineering
Рет қаралды 39 М.
The Best RAG Technique Yet? Anthropic’s Contextual Retrieval Explained!
16:14
MAGIC TIME ​⁠@Whoispelagheya
00:28
MasomkaMagic
Рет қаралды 38 МЛН
Trick-or-Treating in a Rush. Part 2
00:37
Daniel LaBelle
Рет қаралды 34 МЛН
У вас там какие таланты ?😂
00:19
Карина Хафизова
Рет қаралды 20 МЛН
Mia Boyka х Карен Акопян | ЧТО БЫЛО ДАЛЬШЕ?
1:21:14
Что было дальше?
Рет қаралды 9 МЛН
AWS CEO - The End Of Programmers Is Near
28:08
ThePrimeTime
Рет қаралды 544 М.
Coding Was HARD Until I Learned These 5 Things...
8:34
Elsa Scola
Рет қаралды 640 М.
The Vim Experience
45:19
Bog
Рет қаралды 136 М.
Chain-of-thought explained | Aravind Srinivas and Lex Fridman
4:38
Marker: This Open-Source Tool will make your PDFs LLM Ready
14:11
Prompt Engineering
Рет қаралды 56 М.
Space-Filling Aether Theory Makes Comeback
8:24
Sabine Hossenfelder
Рет қаралды 52 М.
4 Methods of Prompt Engineering
12:42
IBM Technology
Рет қаралды 155 М.
Contextual RAG is stupidly brilliant!
15:03
1littlecoder
Рет қаралды 16 М.
*Next-door 10x Software Engineer* [FULL]
4:50
Programmers are also human
Рет қаралды 762 М.
MAGIC TIME ​⁠@Whoispelagheya
00:28
MasomkaMagic
Рет қаралды 38 МЛН