Scaling AI Reasoning: MCTS in ICL for Small LM

Views: 2,069

Days ago

Comments: 17

@s4kupfers 7 days ago

I love these optimisations that are based on relatively simple logical principles and ideas that can be implemented inexpensively. Also your research with GPT is very valuable, I was able to successfully apply your way of asking GPT to my own problems.

@CharlotteLopez-n3i 6 days ago

I'm intrigued by the idea of feeding abstract reasoning patterns into smaller language models. This could be a game-changer for efficiency and generalization. Excited to see where this research goes!

@GodbornNoven 7 days ago

This is a nice small breakthrough, but I think we need something for training. I mean, we kinda implemented "thinking" into these models, allowing them to refine their answers and take their time while doing a problem. We also implemented a form of plasticity into them ("test-time training"). What's left is making a good spiking neural network. But how would we go about improving training?

@davidhurtado9922 7 days ago

I love your videos, man! I usually imagine the fish from American Dad is teaching me AI research.

@toxy805 7 days ago

Hey, can you structure a playlist from beginner to advanced for learning agents, multi-agent systems, and so on? Right now it’s all over the place.

@Yannick-ei2tz 6 days ago

Very good. It is somewhat related to a paper about embodied RAG that I'm trying to implement. This more abstract approach to problem solving could help with my pet project on drone navigation.

@xtema1624 19 hours ago

Please share the ChatGPT results that you used in this video.

@drdca8263 6 days ago

4:21 : and, do you think that this list is likely to be accurate? 4:37 : hm. Which of these 6 do you think abductive reasoning falls under? Also, how does “relational reasoning” differ from “systemic reasoning”? 5:50 : these 5 seem rather different. (Not sure why you didn’t start with these?) 6:14 : some of these correspondences seem like a stretch…

@user-wr4yl7tx3w 6 days ago

On the surface, it looks like a lot of feature engineering in the training process.

@luke.perkin.inventor 7 days ago

The glaring omission is the usual critique of logic: its lack of robustness to approximation and estimation at every step, due to the complexity, indeterminacy, and polysemanticity of fuzzy things in the real world. It's fine if you're working with a purely abstract "if A then B", but the world is a lot more complicated, and the mappings are messy. Every reasoning step might be fine but the conclusion still wrong. It's why medicine is hard and QM doesn't work for complex systems.

@user-wr4yl7tx3w 6 days ago

What is QM?

@drdca8263 6 days ago

Why do you say that quantum mechanics doesn’t work for complex systems? Do you just mean that it becomes prohibitively difficult to do computations about?

@luke.perkin.inventor 6 days ago

@@drdca8263 In a way it's both. Experimentally it's difficult to keep things entangled, and computationally it becomes intractable to simulate with all the degrees of freedom, so you've got to make lots of simplifications.

@drdca8263 6 days ago

@@luke.perkin.inventor OK, I wouldn’t say that “QM doesn’t work” for those systems for that reason. That seems like too strong of a statement to me.

@luke.perkin.inventor 6 days ago

@@drdca8263 Just did a quick Google and they did a 120 atom carbon lattice quantum simulation in 2022. Probably a bit higher now. I'm not saying QM isn't amazing, the maths isn't incredible, and the applications like making huge wafers with nanoscale structure aren't amazing. It doesn't mean the maths isn't basically intractable as of 2024 for large systems.