firehose to my brain. Amazing! This indicates a fairly long path of steps I need to learn so I can properly digest this beautiful wisdom. Really amazing channel, filled with advanced knowledge of the gods.
@btscheung · 7 months ago
Your presentation in this video is definitely A+ in terms of clarity and depth of understanding! Well done. Also, I am happy to see a real paper and study on the speculative Q* heuristic search algorithm. Although their results seem not to justify the effort and added complexity, we are only looking at well-known math problems that those LLMs might have been pre-trained on and heavily focused on. If we change the angle so that the algorithm is applied to a general solution search space with greater complexity, Q* is the way to go!
@scitechtalktv9742 · 7 months ago
Interesting explanation! You mentioned there is code to try it yourself, but I cannot find that. Can you point me to it?
@parthmakode5255 · 6 months ago
Please tag me as well once you find the code.
@drdca8263 · 7 months ago
I thought Q* was supposed to be a project by Google or OpenAI (I forget which, but I thought it was supposed to be one of them). The authors listed in the paper are indicated as being affiliated with either “Skywork AI” or “Nanyang Technological University”? Is this a model inspired by the rumors of there being a model with the name “Q*”, or is this the model the rumors were about? Were some of these people previously at OpenAI or Google, but not anymore? Or..?
@jswew12 · 6 months ago
I believe it was from OpenAI internal document leaks. I’m wondering the same thing! I feel like it has to be related; otherwise this feels kind of wrong. I understand wanting to get eyes on your research, and this seems like good research, so I commend them on that, but still. If anyone has more info, leave a reply.
@a_soulspark · 6 months ago
I'm also really confused. Skywork AI seems to be a legit company/research group; they have released models in the past. However, I see no indication that their Q* is related to OpenAI's. The authors of this paper don't seem to have a record at big tech companies. One of the authors, Chaojie Wang, has a GitHub page which gives some more context (you can look it up on Google if you want).
@a_soulspark · 6 months ago
I was also quite confused! It doesn't seem like the people behind the paper have any relation to big tech companies (Google, OpenAI, Microsoft, etc.), and it doesn't seem like their paper is directly related to OpenAI's supposed Q*.
@a_soulspark · 6 months ago
My old comment got deleted, perhaps because some word triggered the algorithm. I just said you can use search to find out more about the authors; the first one on the cover of the paper immediately answers many questions.
@idiomaxiom · 6 months ago
The trick is whether you only have a Q* over a whole sequence, or whether you have figured out how to credit the individual steps of a sequence for good or bad outcomes: the classic credit assignment problem. Possibly OpenAI has figured out a fine-grained Q*, which would give fast, accurate feedback and learning.
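As a rough illustration of that distinction, here is a minimal tabular Q-learning sketch (not the paper's or OpenAI's actual method; the states, actions, and numbers are made up). The bootstrapped update is what lets a single end-of-sequence reward flow back to earlier steps:

```python
# Minimal tabular Q-learning sketch: one end-of-sequence reward,
# credit propagated backwards through the bootstrapped update.
alpha, gamma = 0.1, 0.99
Q = {}  # (state, action) -> estimated value

def update(state, action, reward, next_state, next_actions):
    # Each step is credited with its immediate reward plus the discounted
    # value of the best next step; repeating this along the sequence
    # pushes credit for the final reward back toward earlier steps.
    best_next = max((Q.get((next_state, a), 0.0) for a in next_actions), default=0.0)
    old = Q.get((state, action), 0.0)
    Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

# A 3-step "sequence" where only the last step is rewarded (toy data).
episode = [("s0", "a0", 0.0, "s1"), ("s1", "a1", 0.0, "s2"), ("s2", "a2", 1.0, "end")]
actions = ["a0", "a1", "a2"]
for _ in range(200):  # replaying the episode lets credit drift backwards
    for s, a, r, s_next in episode:
        update(s, a, r, s_next, actions if s_next != "end" else [])
print(round(Q[("s0", "a0")], 3))  # nonzero: step 0 now shares credit for the final reward
```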
@李純心-y9u · 6 months ago
This is a very good introduction, and I enjoyed the contents even though I implemented Q-Learning myself around 30 years ago.
@antaishizuku · 6 months ago
Yeah, I can't find the code for this. Could you please tell us where it is?
@GodbornNoven · 7 months ago
Amazing video as always
@drdca8263 · 7 months ago
27:58: You say “estimated utility of reaching the correct answer”. Does this mean “an estimate of what the utility would be if the correct answer is obtained” (which sounds to me like the plainest interpretation, but also the least likely, as I would think the utility for that would be arbitrary), or “the expected value of the random variable which gives utility based just on whether the final answer is correct”, or “the expected value of the random variable, utility, which is determined both by whether the final answer is correct and by other things, such as length of answer”, or something else?
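For what it's worth, the usual Q-learning reading of that phrase is the second interpretation: an expected return over whether the final answer turns out correct. A hedged formalization (my assumption about the video's intent, not a quote from the paper):

```latex
% Assumed reading, not taken from the paper: Q* as an expected terminal reward.
Q^*(s_t, a_t) \;=\; \mathbb{E}\big[\, R(\tau) \;\big|\; s_t,\, a_t,\ \text{follow } \pi^* \text{ afterwards} \,\big],
\qquad
R(\tau) =
\begin{cases}
1 & \text{if the final answer of trajectory } \tau \text{ is correct},\\[2pt]
0 & \text{otherwise.}
\end{cases}
```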
@danberm1755 · 5 months ago
I have a special request. I'm really interested in understanding how query/key/value can "transform" embeddings closer to other embeddings using attention. In particular, why do you need these three values? Why not just have a single query matrix to move the embeddings during attention?
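Not the video's code, but a minimal NumPy sketch of single-head scaled dot-product attention may make the question concrete: queries and keys decide who attends to whom, while values decide what actually gets mixed in. With a single matrix, the same vector would have to play both the matching role and the content role.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Toy dimensions: 4 tokens, model width 8 (all values random placeholders).
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))          # token embeddings

W_q = rng.normal(size=(8, 8))        # query: what each token is looking for
W_k = rng.normal(size=(8, 8))        # key:   what each token advertises
W_v = rng.normal(size=(8, 8))        # value: what each token contributes if attended to

Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Attention weights compare queries against keys...
A = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
# ...but the update mixes values, so "how tokens match" is decoupled
# from "what content gets moved" into each embedding.
out = X + A @ V                      # residual update: embeddings move toward attended values
print(out.shape)                     # (4, 8)
```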
@nthehai01 · 6 months ago
Thank you for such a detailed explanation. Really enjoyed it 🚀. But is this Q* at all related to the one from OpenAI that people have been talking about 🧐?
@philtoa334 · 6 months ago
Nice.
@thesimplicitylifestyle · 7 months ago
Yay! 😎🤖
@muraraallan · 6 months ago
Omg ❤
@Bikameral · 7 months ago
First viewer here 🗡️
@smicha15 · 7 months ago
246th view. Nailed it!
@QuoteModeYt · 7 months ago
wow
@SirajFlorida · 7 months ago
LoL. Third I guess. Well Yacinezahidi was 0th user, is 1st, and I'm 2nd.