Overparametrized LLM: COMPLEX Reasoning (Yale Univ)

  8,003 views

Discover AI

1 day ago

Comments: 60
@TheSingularityProject01 · 3 months ago
Thanks for bringing this to my attention. This is very directly related to my work in generating the General Theory of Intelligence and developing a corresponding mathematical and geometric model. This is definitely moving in the right direction. It is this kind of thinking that will result in the next generation of Intelligent Systems.
@vaioslaschos · 3 months ago
For those who are wondering (for me it was not obvious): for the evaluation phase, they froze the intermediate layers and trained only the heads.
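The setup the commenter describes can be sketched as a linear probe: freeze the pretrained body, train only a fresh head. This is a minimal illustrative sketch with a toy PyTorch model, not the paper's actual architecture or training code.

```python
import torch
import torch.nn as nn

# Toy stand-in for a pretrained transformer body (illustrative only,
# not the paper's actual architecture).
backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 64))
head = nn.Linear(64, 2)  # fresh, trainable probe head

# Freeze the pretrained layers: only the head will receive gradients.
for p in backbone.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(head.parameters(), lr=1e-3)

x = torch.randn(8, 32)
loss = head(backbone(x)).pow(2).mean()
loss.backward()
optimizer.step()

# The frozen body accumulated no gradients; the head did.
assert all(p.grad is None for p in backbone.parameters())
assert head.weight.grad is not None
```

Because only the head's parameters are passed to the optimizer and the body's gradients are disabled, the pretrained representations stay fixed while the probe is fit on top of them.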
@wesremy · 2 months ago
For the pretraining, did they technically use fine-tuning? Because I thought consumers couldn't change the pretraining.
@cmw3737 · 3 months ago
Good to see research confirming the intuitive hypothesis that intelligence emerges from complexity at the edge of chaos. Maybe one day AI will capture the essence of intuition itself.
@brendawilliams8062 · 2 months ago
I am not the brightest here, but AI has intelligence with another intelligence behind it, all the way from 1010101 to the top of however you like counting. A hard problem is that when it's through counting, it will go back down to where you started. So whenever the yo-yo stops spinning you must go back up again. So I just don't know about your hard problems with motion here.
@SixTimesNine · 3 months ago
Wow! This is great stuff! As an avid devotee of Wolfram and Tegmark, all of this makes complete sense. (Is Wolfram kicking himself now, or jumping up and down excitedly shouting "I told you so!"?) I'm thinking "Artificial" is going to need to be dropped soon, replaced with "Emergent" perhaps. Reading the few comments so far: if you've never heard of this stuff then, yeah, this is going to be all Greek. But damn! So right on! Please do a follow-up crash course to get everyone up to speed?
@brendawilliams8062 · 2 months ago
It's still an ongoing project, or at least it seemed so to me. The Ruliad is way beyond it all in its scope.
@mickelodiansurname9578 · 3 months ago
How come Stephen Wolfram was not all over this 3 years ago?
@wwkk4964 · 3 months ago
He explains why in his last livestream from a month ago about machine learning breakthroughs he just made. Basically, he said that when he tried it in 1985, it didn't work because he gave up too early with time-based (column or vertical) entropy rule arrays, when he really should have tried space-based (horizontal) ones.
@mickelodiansurname9578 · 3 months ago
@wwkk4964 Well, I too fiddled about with cellular automata, and I might add neural nets, back in the 80s, and the main reason it was pointless was simple: my computer just could not handle it. Simple as that, and I had better things to leave my PC running all night for. Now in fairness I was a teenager f**king about and not a researcher... but technology just was not capable of a lot back then. I mean, hell, if you were coding anything and any matrices were above two or three dimensions, you might as well go off and come back hours later. But come the arrival of LLMs, I would have imagined this idea would have resurfaced. I suppose it's all very well in the light of hindsight for me to point that out, right?
@wwkk4964 · 3 months ago
@mickelodiansurname9578 Precisely why Wolfram developed SMP and Mathematica: because he realised back then that science had been reduced to computation! Watch his stream though, it's remarkable how close he was, and he gave up too early.
@SixTimesNine · 3 months ago
@mickelodiansurname9578 Absolutely! Same. I remember getting excited about a 2-week holiday coming up so I could leave my Amiga 500 processing the whole time.
@SixTimesNine · 3 months ago
@wwkk4964 Please link to it, if possible.
@soccerkenshin · 3 months ago
Love the channel and I've learned a lot from your content! Your focus on the reasoning capabilities of models has been particularly enlightening. However, I need to address a point regarding the 2022 Emergent Abilities of LLMs paper. It does not discuss the complexity of training data. Instead, it only mentions 'quality' of training data, without defining what that means or how it might relate to complexity. This distinction is crucial because 'quality' can imply many things, not necessarily data complexity. Furthermore, I would advise caution in using this paper as a reference, as its claims have been challenged multiple times. For instance, Rylan Schaeffer et al. in "Are Emergent Abilities of Large Language Models a Mirage?" argue that these so-called emergent abilities might be artifacts of the evaluation metrics used. Similarly, the paper "Are Emergent Abilities in Large Language Models just In-Context Learning?" suggests that what seems like emergent behavior could result from in-context learning, model memory, and linguistic knowledge rather than scale alone. I hope this adds some nuance to the discussion!
@i_forget · 3 months ago
Called it. First we got perplexity. Next we get complexity.
@derghiarrinde · 3 months ago
I predicted intelligence arising from [LLM internal representations of] complex patterns nearly a year ago. I'm still getting around to actually building a proof-of-concept system that would literally sweep every AI researcher off their feet. But I am thinking of it. Complex data is the key, and the more the better.
@OumarDicko-c5i · 3 months ago
I love your passion, i don't understand everything but love it 😂
@code4AI · 3 months ago
Smile. It is science ...
@jmirodg7094 · 3 months ago
I'm very interested in the measure of complexity, as it is the key in many domains. Lempel-Ziv, Lyapunov, and Krylov are a good start, but not sufficient to fully define the complexity of a given system; a notion of scale is probably necessary as well, and I feel some other parameters are missing too, such as the interactions with other systems.
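Of the measures the comment names, Lempel-Ziv is the easiest to make concrete. Below is a minimal sketch of the classic LZ76 phrase-counting estimator for a binary string; it is one standard formulation, not necessarily the exact variant any particular paper uses.

```python
def lz76_complexity(s: str) -> int:
    """Number of distinct phrases in the Lempel-Ziv (1976) parsing of s.

    Low for regular strings, higher for irregular ones; one of the
    classic complexity estimators mentioned in the comment above.
    """
    i, count, n = 0, 0, len(s)
    while i < n:
        # Extend the current phrase while it still occurs in the
        # previously seen text (allowing one-step overlap, per LZ76).
        length = 1
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        count += 1
        i += length
    return count

print(lz76_complexity("0000000000"))  # constant sequence: 2
print(lz76_complexity("0101010101"))  # periodic sequence: 3
```

A truly random binary string of the same length parses into many more phrases, which is why this count serves as a cheap complexity proxy.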
@boyardosalmon · 3 months ago
Please add transcript. Need the NotebookLM version. Thank you
@lotus-chain · 3 months ago
Hey dude, what is the relation of these Sierpinski triangles (automata shapes) to the reasoning LLM? Please make another video with more data. Ty bro.
@code4AI · 3 months ago
The cellular automata provide the training data for the LLM: synthetic data with a clearly defined complexity level, so they can evaluate whether the performance of the LLM scales accordingly.
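Elementary cellular automata of this kind are easy to generate: the Wolfram rule number (0-255) is itself the lookup table for the 3-cell neighbourhood. A generic sketch of such a generator follows (an illustrative implementation, not the paper's exact data pipeline).

```python
def eca_step(row, rule):
    """One synchronous update of an elementary CA with periodic boundaries.

    `rule` is the Wolfram rule number (0-255); bit k of `rule` gives the
    new cell state for the 3-cell neighbourhood whose binary value is k.
    """
    n = len(row)
    return [(rule >> ((row[(i - 1) % n] << 2) | (row[i] << 1) | row[(i + 1) % n])) & 1
            for i in range(n)]

def eca_run(rule, width=31, steps=15):
    """Evolve from a single centred 1-cell; returns the list of rows."""
    row = [0] * width
    row[width // 2] = 1
    history = [row]
    for _ in range(steps):
        row = eca_step(row, rule)
        history.append(row)
    return history

# Rule 90 from a single seed traces the Sierpinski triangle seen in the
# video's images; rules like 110 sit at the "edge of chaos".
sierpinski = eca_run(90, width=9, steps=3)
```

Sweeping the rule number from trivially ordered rules to chaotic ones is what gives the synthetic dataset its controllable complexity dial.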
@blengi · 2 months ago
Is this the same as the many theories that suggest criticality in the brain poises it at the brink of a phase transition between order and randomness to optimize information processing?
@meltingscales · 3 months ago
Does this imply that you could train an AI on images of fractals, and intelligent behavior would spontaneously emerge?
@firecat6666 · 3 months ago
From what I understood from the video, the researchers didn't feed the ECA images directly into the model; they used some method to extract the pixel-state information (0 or 1) of the cellular automata (not necessarily from an image, I'd imagine), turned it into some numerical representation that I don't understand (vectors something something), and then fed THAT into the model.
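One plausible version of the encoding described above, purely as an assumed illustration (the paper's actual representation may differ): flatten each 0/1 grid row by row and chop the stream into fixed-size binary vectors, which a real pipeline would then embed before feeding the model.

```python
def grid_to_tokens(history):
    """Flatten an ECA run (a list of 0/1 rows) into one token stream."""
    return [cell for row in history for cell in row]

def tokens_to_vectors(tokens, dim=8):
    """Group the 0/1 stream into fixed-size vectors, one per position;
    an embedding layer would map these to continuous inputs."""
    return [tokens[i:i + dim] for i in range(0, len(tokens) - dim + 1, dim)]

grid = [[0, 1, 0, 1], [1, 1, 0, 0]]
vecs = tokens_to_vectors(grid_to_tokens(grid), dim=4)
print(vecs)  # [[0, 1, 0, 1], [1, 1, 0, 0]]
```

The point is only that no images are involved: the raw cell states themselves become the model's input sequence.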
@meltingscales · 3 months ago
@@firecat6666 Still extremely cool!
@wwkk4964 · 3 months ago
Soon we will have Rule 110 training on Rule 110 and Rule 30, judging by the hints Stephen Wolfram gave in his last blog post about the discrete machine learning systems he developed.
@Japneets1 · 3 months ago
What a beautiful conjecture. Complexity is all you need ❤
@code4AI · 3 months ago
So true!
@onlyms4693 · 3 months ago
I also think that truly intelligent people are those able to extract an abstract concept into a structured concept we can learn from. The introduction of complexity into LLMs actually suits that idea: it's easier to understand something that already has structure than to build structure from an abstract thing. Which in turn makes them smarter at structured concepts, because those are a derivative of that abstraction, or in this case, complexity.
@code4AI · 3 months ago
The current paradigm of training LLMs is more data, just more data. The complexity classification of data streams has not yet happened ...
@derghiarrinde · 3 months ago
Let me correct you a little... To be more precise, intelligent people have more complete internal representations, which [according to this study] arose from more complexity in the data they have processed. Think of a peasant and Einstein: how they imagine things, what their internal representations of space, time and energy are.
@s.patrickmarino7289 · 3 months ago
I am running a bit slow today so please forgive any stupidity. (I need caffeine badly.) Is this saying that we need to start a new model by giving it synthetic data of increasing complexity, then feed it the normal training data?
@code4AI · 3 months ago
It all depends on your task for the AI. If the job is to be everything to everybody (like GPT-4o), then the internet data are great and diverse. If you want to build a high-performance LLM, maybe with narrow domain knowledge like finance or chemistry, that should be able to argue at a higher complexity with more advanced reasoning, then a new, complexity-optimized dataset might be helpful for pre-training. It is not that more and more data accumulates and suddenly an intelligence emerges. The quality and complexity levels of this data are important.
@MusingsAndIdeas · 3 months ago
Not only increasing complexity, but increasing meta-complexity, forcing the model to develop a complex internal representation of the subject it is training on.
@derghiarrinde · 3 months ago
Synthetic data, if you want "synthetic intelligence". Human data if you want to model human-like intelligence (AGI).
@derghiarrinde · 3 months ago
@@MusingsAndIdeas What do you mean by meta-complexity specifically? Can you give me an example? Thanks.
@s.patrickmarino7289 · 3 months ago
@@derghiarrinde, It depends on what you want the model to do. If you want the model to chat like a human, you feed it human conversations. If you want the model to spit out facts, you feed it lots of facts. If you want the model to solve a specific class of problem, you create a simple game that is like that problem. Then you run through the permutations of that game. Then you create a more complex version of that game and you train it on a percentage of the permutations of that game.... Each round, the game gets more complex. Eventually, it can infer the strategies of the more complicated iterations from the training data of the simpler ones. If you do this with several different games, you get something that is very good at solving problems, even if it does not think the same way a human does. Then, you feed it the training data from a more traditional LLM.
@SixTimesNine · 3 months ago
Did you miss the joke? "Perhaps complexity is all you need" is a sly reference to "Attention Is All You Need".
@derghiarrinde · 3 months ago
Didn't. It was great.
@derghiarrinde · 3 months ago
This whole paper reminded me of the phi-1 paper, "Textbooks Are All You Need". Worth a re-read, I guess...
@brendawilliams8062 · 2 months ago
13:49 If structure and balance foster intelligence, then the entire universe is break dancing.
@code4AI · 2 months ago
I am just trying to imagine your idea ... smile.
@brendawilliams8062 · 2 months ago
Imagination. A pic with or without noise
@pensiveintrovert4318 · 3 months ago
If one is to assume that their conjecture is true, then the way forward is to design synthetic data generators which can produce data with increasing levels of complexity. The fact that this synthetic data may not be found in the physical world is not relevant to the conjecture. So this hypothesis can be proved or disproved very quickly (subject to computing resources). One wonders why they didn't take that extra step.
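A synthetic data generator of the kind proposed above can be sketched with a single "complexity dial". This is a toy illustration under stated assumptions: the generator interpolates between a periodic pattern and coin flips, and zlib-compressed size stands in as a crude complexity proxy; it is not the paper's ECA-based pipeline.

```python
import random
import zlib

def synth_bits(level, length=512, seed=0):
    """Toy generator: `level` in [0, 1] dials irregularity up.

    level=0.0 yields a pure periodic pattern, level=1.0 pure coin flips;
    intermediate levels mix the two (illustrative only).
    """
    rng = random.Random(seed)
    bits = [rng.randint(0, 1) if rng.random() < level else i % 2
            for i in range(length)]
    return "".join(map(str, bits))

def complexity_proxy(s):
    """Crude stand-in for a complexity measure: zlib-compressed size."""
    return len(zlib.compress(s.encode()))

# Compressed size (hence estimated complexity) grows with the dial.
sizes = [complexity_proxy(synth_bits(lv)) for lv in (0.0, 0.5, 1.0)]
```

Sweeping `level` and measuring downstream model performance at each setting is exactly the kind of controlled experiment the conjecture invites.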
@code4AI · 3 months ago
This is exactly what they have done: designed synthetic data generators that can produce data with increasing levels of complexity. And then they evaluated it, as shown in my video ...
@derghiarrinde · 3 months ago
That is not the way forward, IMO. The task is to model human intelligence. That means you need to give it "human data" with increasing complexity. Not more KNOWLEDGE, but complexity in data made by humans.
@SixTimesNine · 3 months ago
@derghiarrinde Why human intelligence? Surely intelligence just is. Personally, I suspect the human part may be a limiting factor.
@derghiarrinde · 3 months ago
@SixTimesNine Perhaps think more along the lines of human intelligence vs the current "artificial intelligence" of the AlphaGo system. Human intelligence = general intelligence, currently the only general intelligence we have. That's what we're trying to model.
@ppbroAI · 3 months ago
When a human is exposed to complex data (i.e. education), intelligence could emerge, we know. BUT think of geniuses: somehow the brain's complexity could be hereditary, or exposing children to complex data could make the leap possible. Who knows, maybe Ramanujan as a little boy seeing all those complex patterns increased his intelligence, and a reproducible process can be achieved in humans. Maybe it's not education but complexity for humans too? I wonder.
@MichealScott24 · 3 months ago
@code4AI · 3 months ago
Thank you.
@soccerkenshin · 3 months ago
Seconded
@gnsdgabriel · 3 months ago
Isn't this beautiful?