O1’s Chain of Thought: I Built a System to Mimic It-Here’s How It Went!

Рет қаралды 10,609

Prompt Engineering

Күн бұрын

Пікірлер: 65

@engineerprompt Ай бұрын

Code for the project is up on github: github.com/PromtEngineer/Agent-0.git

@kai_s1985 Ай бұрын

This is a great idea! Imagine having multiple LLMs specialized in different types of skills like reasoning, math, coding, general knowledge, writing, creativity, etc... working in parallel or sequentially, and criticizing each other, and outputting a final agreed answer like ensemble models in random forest or xgboost. Imagine each of those models are better than PhD's in their own area of expertise.

@shuffle1659 Ай бұрын

Thank you for the idea of my next project.

@engineerprompt Ай бұрын

I think for specialized LLMs, they probably needs to be finetuned for specific task but I think its worth exploring.

@Juan-n6k3c 8 күн бұрын

@@engineerprompt even with RAG? Is finetuning necesary?

@Anas-dd5ud Күн бұрын

@@Juan-n6k3c yes in my opinion. Rag only adds information to the prompt. but the generation "format" result is much more dependant on the fine tuning. You can notice that if you compare the generation results of instruct finetuned model and the base one.

@rousabout7578 Ай бұрын

Great video and first try! I was waiting for someone to replicate 01 with an agentic system. Controversy aside Matt Schumers reflection 70b has a surprisingly good system prompt. Worth trying on a few models. You are a world-class AI system, capable of complex reasoning and reflection. Reason through the query inside tags, and then provide your final response inside tags. If you detect that you made a mistake in your reasoning at any point, correct yourself inside tags.

@engineerprompt Ай бұрын

Prompting definitely needs improvement here. The reflection is a tricky one without fine-tuning the model though. Was hoping Matt Schumers will actually just release the data that they created.

@1littlecoder Ай бұрын

I was working on something similar and here you managed to get it out already! Great work!

@engineerprompt Ай бұрын

thank you, mean a lot coming from you. big fan of your content. keep up the good work sir.

@1littlecoder Ай бұрын

@@engineerprompt 🙏🏾

@HassanAllaham Ай бұрын

and here is a big fat "LIKE" to one of the best channels and one of the best youtubers and prompt engineers. Thanks for the good content 🌹🌹🌹

@BenoitChabord Ай бұрын

I think your approach is brilliant, and I suppose the same way with other models you could find a COT good at writing, with different agents prompts. You can start to categorise all of the kind of problems and the first step is an agent prompt knows which Multi stage COT worflow it should use.

@cheaptube4918 Ай бұрын

Thanks for putting this together!

@tollington9414 Ай бұрын

Good work, love this approach

@supercurioTube Ай бұрын

Good job ! It'll be interesting to try it in parallel, from 2 to n and identity if more attempts help or not. Did you leave the temperature to the 1.0 default? To get more diversity of outputs, temperature at the max 2.0 might help.. or not 😅

@engineerprompt Ай бұрын

The temperature is set to 1.0. I plan to experiment with multiple different LLMs at some point. I think that will give diverse outputs.

@xXWillyxWonkaXx Ай бұрын

In simple terms, this is a "routing prompt system" aka orchestration or multi-agent prompting.

@engineerprompt Ай бұрын

Kinda, specially the parallel system will work in that way.

@epokaixyz Ай бұрын

Consider these actionable insights from the video: 1. Understand Chain-of-Thought Reasoning: Learn how this technique enables AI to solve complex problems by breaking them into smaller, manageable steps. 2. Explore Multi-Stage Systems: Study the concept of using multiple AI agents, each with a specialized role, to enhance the problem-solving process. 3. Learn how to Use Code for Analytical Reasoning: Discover how AI leverages code to efficiently process information and arrive at accurate solutions. 4. Research Solutions to Existing Challenges: Investigate how to address the limitations of Chain-of-Thought reasoning, such as sequential bottlenecks and lack of diversity in AI models. 5. Dive Deeper into Resources: Consult the provided books and research papers to gain a more comprehensive understanding of prompt engineering and Chain-of-Thought prompting.

@keshavkasat9465 Ай бұрын

ai comments

@nusquama Ай бұрын

@@keshavkasat9465and so ? Question is : is it useful?

@micbab-vg2mu Ай бұрын

I use my own reflextion prompt with claude 3.5 sonnet that gives me better results in tests than o1:)

@raphaelchristi6424 Ай бұрын

@@micbab-vg2mu you have repo for that?

@IRazorLeaf Ай бұрын

Could you share a bit more details on what you do or is that top-secret? :)

@aaron6235 Ай бұрын

Yes, share, and don't monetize😮

@unclecode Ай бұрын

You’ve built a moat for yourself :)) Nice job! The end result of "O1" isn’t exactly new, the "moat" (if there is one) lies in how they figured out bringing the process into inferences within the model itself. I hope they eventually share the details and reasoning tokens. Right now, it could just be like what you did, or maybe they’ve found a way to embed "search" natively during inference.

@engineerprompt Ай бұрын

Its a mystery but I think it is possible to replicate that behavior with multi-agent system especially if they are specialized. Seems like the Qwen team has something in works. will be interesting to see their approach.

@unclecode Ай бұрын

@@engineerprompt Qwen became good in math and me too, waiting for that 😎

@obetishere9215 Ай бұрын

Nice. I have my personal project similar to this built using react and using mistral nemo api. My use case is developer, code reviewer, unit tester. I feel like i am missing something but i do not know yet.

@engineerprompt Ай бұрын

nice, I love when people are solving practical problems with LLMs.

@aaron6235 Ай бұрын

Why didn't you put the link to your repo in the description? I would very much like to clone it

@engineerprompt Ай бұрын

Wanted to test it a bit more and put up a Readme. There are some failure cases that needs to be highlighted. Will do it soon

@mirkakonest Ай бұрын

Open Source?

@engineerprompt Ай бұрын

github.com/PromtEngineer/Agent-0.git

@raphaelchristi6424 Ай бұрын

There's some repo with that?

@ingenierofelipeurreg Ай бұрын

Pls share repo

@engineerprompt Ай бұрын

github.com/PromtEngineer/Agent-0.git

@ahmadrezaatashbartehrani3617 Ай бұрын

where is code ? thanks for sharing that

@engineerprompt Ай бұрын

Look at the pinned comment

@rajapulau Ай бұрын

10:45 can this solution use autogen?

@engineerprompt Ай бұрын

Yes, that could be a good option.

@zeusconquers Ай бұрын

hi, nice work. could you provide the source code please

@engineerprompt Ай бұрын

github.com/PromtEngineer/Agent-0.git

@Dr.UldenWascht Ай бұрын

Reflection reborn 😄

@engineerprompt Ай бұрын

😄😄😄

@大支爺 Ай бұрын

Just review it's own answer and answer it once more time.

@monstercameron Ай бұрын

this isnt 01 this is gpt4 with code interpreter

@clray123 Ай бұрын

These attempts show your fundamental lack of understanding of what OpenAI did to improve their models. Which is excusable because they no longer publish anything worthwhile and are so vague in their marketing material that it leaves you guessing. But I suspect it went as usual: they created a big mountain of synthetic data demonstrating those "chain of thoughts" on a well defined set of problems, for which reward functions can be defined using RL, and then they trained a very normal LLM on that very mountain of "solved problem" data. What you are missing is not some shitty prompts. You are missing the mountain of training data (and another mountain of compute resources to pull off the training, but it's mostly the data which gives them a competitive advantage).

@georgearistides7704 Ай бұрын

genuine question, how do you know this. I am asking because I built a program to implement chain of thought using the lightweight llama models today, and it does of course give improved responses with respect to amount detail and accuracy. Do you have a source for this claim, and if so can you please drop the link here. I want to try implementing this and possibly combining with what I have already done

@clray123 Ай бұрын

@@georgearistides7704 The source is my knowledge of how LLMs work and what it takes to get better scores on benchmarks. Also that OpenAI did not mention any improvements to model architecture, which they presumably would have done if they had done anything better than provide more training data.

@georgearistides7704 Ай бұрын

@@clray123 source: trust me bro

@clray123 Ай бұрын

@@georgearistides7704 Well, you can go watch AI Explained's review which pretty much does say the same thing after reading 40 pages of their release docs. Believe it or not, these LLMs don't grow on trees and some people out there are fairly well aware how the entire process works.

@paulocacella Ай бұрын

This simply does not matter. What matters is the output quality goal and cost/benefit. You must show a problem that openai approach is in a way that is superior than agents, RAG and tools in terms of privacy and cost/benefit. That is it. The other aspects simply are not relevant outside a research world. It is a simply as that, you do not need a Ferrari to go to the market. Up to now I can make locally outputs that are better than the approach of o1 on meaningful and useful general problems. When they have something better we can go deeply. Up to now this is only a toy using chatgpt because people to not have the agentic systems setup and have no idea on how to build them.

@alexpetrov8871 Ай бұрын

there is no "thought", there is no "answer": this is a big math function calculating a "next word"

@StevenSSmith Ай бұрын

and....?

@alexpetrov8871 Ай бұрын

@@StevenSSmith and what is your question? can you formulate what you are actually asking?

@ronilevarez901 Ай бұрын

Exactly like you.

@alexpetrov8871 Ай бұрын

@@ronilevarez901 >Exactly like you. No, I am quite different. Because to me, “words” are just a language that I use to convey information about the reality that I perceive by analyzing the signals of my senses over time. My senses are imperfect, however, they reflect reality to one degree or another. On the contrary, this LLM has not the slightest connection with reality, all they have is to study this human “language”, a set of phonemes that people use to convey the information that they perceived through their imperfect senses and were able to comprehend and translate into "words". Think about the difference.

@ronilevarez901 Ай бұрын

@@alexpetrov8871 yeah, yeah. The problem is that you and many others would be saying the same about AI if it was embodied and had learned language through real world experience: "it doesn't think". Mhm.