OUTLINE:
0:00 - Introduction
2:25 - Start of Interview
4:00 - The intelligence of swarms
9:15 - The game of life & neural cellular automata
14:10 - What's missing from neural CAs?
17:20 - How does local computation compare to centralized computation?
25:40 - Applications beyond games and graphics
33:00 - Can we do away with goals?
35:30 - Where do these methods shine?
43:30 - The paradox of scales & brains
49:45 - Connections to graphical systems & GNNs
51:30 - Could this solve ARC?
57:45 - Where can people get started?

Read Sebastian's article here: sebastianrisi.com/self_assembling_ai/
@henrythegreatamerican8136 · 2 years ago
I'm curious how this would play out in the long run. Without a centralized AI in charge of all the systems, would we eventually have a bunch of systems fighting amongst each other? Would that competition lead to better outcomes or would the more dominant system win? How would these self organizing systems be able to fully function to the best of their ability when resources would need to be wasted to preserve themselves from being destroyed by other systems? Would an outdated system sacrifice itself for the greater good or would it put up a fight for self preservation?
@jonathanr4242 · 2 years ago
I did my PhD in unsupervised neural networks… that was 20 years ago… it's really heartening to see this stuff coming to fruition.
@jimj2683 · 2 years ago
Wow! Things always take so much longer than we expect! We need to find a way to live longer so we can experience the cool future.
@tavaroevanis8744 · 2 years ago
The gains in this field will increase exponentially, so we may not have to wait much longer for a generalized superintelligent AI. Human society should plan for this eventuality ASAP.
@Mutual_Information · 2 years ago
This has been a great listen. Just to add, I think this perspective (many local optimizations vs a single loss to optimize) is gaining traction. Hinton, on Abbeel's podcast, mentioned that scaling up NNs won't be enough for intelligence - he thinks we need many local objective functions, which appears to be right on top of Risi's ideas. Exciting research direction.
@avishvj · 2 years ago
Need to watch this vid first, but are the many local objectives like deep ensembles?
@Mutual_Information · 2 years ago
@@avishvj Doesn't seem to be, from what I can tell. Deep ensembles involve multiple losses to get those uncertainty estimates, but I don't see anything local about them. It seems local losses refer to neighboring pockets of data... locality in Risi's work is literally 2D or 3D space. Hinton... well, I'm not sure what local means in his mind.
@avishvj · 2 years ago
@@Mutual_Information Yep, got you, just got around to watching it. So neural networks are used to learn spatially local rules that dictate how cellular automata "grow". With deep ensembles it's more about combining different point estimates of the loss landscape to get a Bayesian average. There might be an interesting link between the loss landscape and important meta-rules of the NCA, but I don't think NCAs afford the same kind of Bayesian/uncertainty-aware interpretation naturally. Also, really liking your channel! Interested to see your future videos.
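To make the "spatially local rule" idea concrete, here is a minimal sketch of a single NCA update step in Python (a toy illustration with made-up state sizes and an untrained random rule, not code from any of the papers discussed):

    import numpy as np

    def nca_step(grid, w1, b1, w2, b2):
        # One update of a toy neural cellular automaton: every cell sees
        # only its 3x3 neighborhood, and the same tiny network (w1, b1,
        # w2, b2) is shared by all cells.
        h, w, c = grid.shape
        padded = np.pad(grid, ((1, 1), (1, 1), (0, 0)), mode="wrap")
        new_grid = np.empty_like(grid)
        for i in range(h):
            for j in range(w):
                neighborhood = padded[i:i+3, j:j+3, :].reshape(-1)  # 9*c inputs
                hidden = np.maximum(0, neighborhood @ w1 + b1)      # ReLU layer
                new_grid[i, j] = grid[i, j] + hidden @ w2 + b2      # residual update
        return new_grid

    # Toy usage: 16x16 grid, 4 channels, random (untrained) rule.
    rng = np.random.default_rng(0)
    c, hid = 4, 32
    grid = rng.random((16, 16, c))
    w1 = rng.normal(0, 0.1, (9 * c, hid)); b1 = np.zeros(hid)
    w2 = rng.normal(0, 0.1, (hid, c));     b2 = np.zeros(c)
    grid = nca_step(grid, w1, b1, w2, b2)

Training would then adjust the shared rule so that repeated local updates grow a target pattern; the point is that no cell ever sees the whole grid.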
@avishvj · 2 years ago
@@Mutual_Information There seem to be a few ways of viewing locality: neighbouring parts of data (I guess feature space?), latent space, loss landscapes. I imagine Hinton was focussed on the latter two, but I would be unsurprised if he considered the first, given his track record of basically thinking of everything before the 2010s 😂
@Mutual_Information · 2 years ago
@@avishvj Interesting... yeah, I know very little about deep ensembles, only that they are a good approach to the super difficult task of uncertainty estimation. It was a nice refresher. And thanks! :)
@sau002 · 2 years ago
I find the humbleness of Sebastian very refreshing. He refrains from making tall claims or having any pretensions of knowing everything.
@raminbakhtiyari5429 · 2 years ago
One of the best informative YouTube channels. Thanks for not producing nonsense, and thanks for such great content.
@redyican5341 · 2 years ago
I think it might be worth discussing this subject with Prof. Michael Levin, who is a former AI person and works on communication and morphogenesis in biological organisms at the cellular level.
@barrettvelker198 · 2 years ago
Commenting for the algorithm. This could be a really important conversation
@videowatching9576 · 2 years ago
Sounds interesting!
@OrlOnEarth · 2 years ago
I immediately thought of Michael Levin's work when I saw this. Good stuff.
@revimfadli4666 · 2 years ago
@@barrettvelker198 Commenting too. Hope this goes above that suspicious Mrs Sonia comment.
@richardtobing5012 · 2 years ago
I am currently enjoying a long-term obsession with neural nets, Minecraft, and the emergent behavior of cellular automata, so this video really, really hit the spot. Kinda mad someone thought of all this before I did tho.
@SinanAkkoyun · 2 years ago
19:05 I think what he meant is that it is very hard to accomplish absolute positioning and communication for nanobots, in contrast to relative positioning and communication, from a power consumption perspective etc. Imagine a voxel robot with internal damage. How would you detect the damaged voxels? In the end, you need single voxels to communicate; absolute positioning is hard to do, not even mentioning absolute path finding, so relative local communication is key to these systems.
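To illustrate the point, here is a hypothetical sketch of purely local damage detection (the grid, the protocol, and the function names are my own assumptions, not anything from the interview): each voxel only pings its immediate neighbors, so a dead voxel is localized without global coordinates or a central controller.

    import numpy as np

    def find_damage_locally(alive):
        # alive: 3D boolean array, True where a voxel still responds.
        # Each living voxel checks its 6 face neighbors and reports any
        # that fail to answer; the union of reports localizes the damage.
        reports = set()
        dims = alive.shape
        offsets = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
        for idx in np.argwhere(alive):
            for off in offsets:
                n = tuple(int(x) for x in idx + off)
                if all(0 <= n[d] < dims[d] for d in range(3)) and not alive[n]:
                    reports.add(n)  # a neighbor noticed the dead voxel
        return reports

    alive = np.ones((4, 4, 4), dtype=bool)
    alive[2, 1, 2] = False               # simulate internal damage
    print(find_damage_locally(alive))    # {(2, 1, 2)}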
@drdca8263 · 2 years ago
55:06: I also initially thought it was "the smallest one", but it isn't. Rather, the solution is "pick the one that lacks left/right symmetry". Edit: though "pick the one that fits in the box" also gives the right answer in this case. I don't think it was the intended solution though. I think "take the one that isn't left/right symmetric" is it.
@Bencurlis · 2 years ago
In the ARC test you are supposed to guess the output box dimensions, so picking the shape that lacks left/right symmetry must be the right rule.
@drdca8263 · 2 years ago
@@Bencurlis Ah, so the game differs from the test rig thing in that the test rig doesn’t give the program being tested the dimensions of the result? I didn’t know that. Thanks!
@drdca8263 · 2 years ago
It might be interesting, in order to get some insight into how people solve ARC problems, to have 2 people X and Y, where one of them (X) can see the 3 example input/output pairs and the other (Y) can't. Y can ask X any number of yes/no or numeric questions about the examples and get answers from X, and then, using these answers, Y has to try to produce the output when given the novel input. This would of course be a harder task than doing it normally, but it would have the benefit of making the "determining what the pattern is" part of the reasoning more explicit and recorded. (Maybe also allow a few special-case questions that have non-numeric and non-Boolean answers to make some things more convenient to ask, like "which colors appear in the input" / "which colors appear in the output", "which colors appear in ", etc., where "which colors" asks which subset of the colors ARC uses appear at all, not for the entire content of the image, which would of course defeat the point.)

Perhaps by analyzing the logs of what questions people who have gotten good at this ask, after having gotten the answers they did to previous questions, one could design a program or model which can do well at the task?

The first question I think I would ask would be "are all the outputs the same size as the corresponding input?" If yes, then "are all the inputs the same size?" If yes, then "is there a clear 'background color'?" If yes, "does the output differ from the input either mostly or entirely by replacing parts of the background color with another color?" If not, "does the output differ from the input almost entirely by replacing some parts that don't have the background color, with the background color?" etc. Actually, probably a better question before some of those would be "do squares at the same location tend to mostly be the same between the input and output?", or maybe just "are the input and output at a given square location highly correlated?" Not sure which order would be better for those questions, when one implies the other. I suppose it depends on the relative likelihood of the answers, because (assuming the goal is to use as few questions as you can and then solve it correctly) you want each question to be as close as possible to cutting the possibilities in half.

I want to try doing this with someone, but I don't think the people I know would really enjoy this game? I don't know. I think an easier step, which is necessary to solve it, is just being able to predict the size of the output with high accuracy.

One idea is "take the question/answer logs of a bunch of people playing this game (prefixed with whether the final guess was successful), train (rather, fine-tune) a language model on those logs, and prompt it with an indication that the following log is from a successful attempt". While that might produce something which could give effective questions, and paired with something which could evaluate those questions and something to try to solve the task using the answers, it might "work", I suspect it would be too slow to do well at the competition. Though perhaps if you then made a model which could encode the questions in a more succinct but not human-readable manner, and had something generate and answer the questions in that format, it might work? This encoding should ideally be chosen in a way where it isn't penalized for encoding two questions which always have the same answer / two phrasings of the same question with the same encoding.

There are so many research projects that I would like to frequently hear about attempts (both successful and unsuccessful) within. Obviously way too many for one person to work on. I suppose if I were obscenely wealthy I might hire a team of people to try a bunch of ideas I have and tell me on a weekly basis how they are going / whether they worked. That would be very enjoyable, I think. Perhaps that is what some super wealthy people do?
@tsunamio7750 · 2 years ago
57:29 The Alchemy task reminds me of propositional logic. Depending on how you encode it, it could be pretty trivial... I haven't looked at the exercise yet, but logical conclusions are the jam of propositional logic.
@goldnutter412 · 2 years ago
That future may or may not be a distant one! Outstanding video, cheers.
@RalphDratman · 2 years ago
I agree that self-organizing systems, visualized as growing organisms rather than engineered systems, must be the long-term future of our intelligent devices. But I must disagree about the use of purely local connections, which are not the main communication pathways of biological organisms.

Based on my personal work with cellular automata in the 1980s, and confirmed by Stephen Wolfram's exhaustive survey of a certain class of cellular automata, I feel confident that such a permanent limitation would drastically curtail the power of new self-organizing systems. Fortunately, a slight relaxation of the "local-only" rule, when embedded in 3 or more dimensions, can banish that problem, for in 3 or more dimensions a fiber can connect any two parts of a complex organism.

Consider neurons in the vertebrate nervous system. Each neuron can use the basic principle of local contact to grow a long axon that reaches anywhere in the brain and even into other parts of the body. The neuron can also create large numbers of synapses with other neurons. Consider also the veins and arteries, the lymph nodes and muscle fibers, and it is easy to see how to transcend the rule of local communication through the growth of electrochemical wires, tubes, and even contractile fibers. But all this can only take place in spaces with at least three dimensions.
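As a toy illustration of that relaxation (my own sketch, not Dratman's code): a 1D cellular automaton whose cells follow a local majority rule, plus a few explicit long-range "fibers" that splice distant cells into each other's neighborhoods, the way an axon would.

    import random

    def step(state, fibers):
        # Majority vote over the local neighborhood plus any fiber inputs.
        n = len(state)
        nxt = []
        for i in range(n):
            votes = [state[(i - 1) % n], state[i], state[(i + 1) % n]]
            votes += [state[j] for j in fibers.get(i, [])]  # long-range inputs
            nxt.append(1 if sum(votes) * 2 > len(votes) else 0)
        return nxt

    random.seed(1)
    state = [random.randint(0, 1) for _ in range(32)]
    fibers = {0: [16], 16: [0], 5: [27]}  # hypothetical axon-like links
    for _ in range(10):
        state = step(state, fibers)
    print(state)

A handful of such links is enough to carry a signal across the whole array in one step, while the rest of the system stays strictly local.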
@drdca8263 · 2 years ago
Aren’t there like, “wire-world” 2D cellular automata that can connect regions? Like, as long as you can have wires that cross without interacting, that seems like it allows connecting things as you want? I guess being able to cross wires is a little bit like having a small third dimension?
@RalphDratman · 2 years ago
@@drdca8263 You're absolutely right about such wires being constructible, but my experience suggests that constructing wires from multiple automata cells, along which impulses can travel at most one cell per automaton clock, is not a practical approach to studying these systems. Instead I think it will be necessary to pre-create special automaton cells that can extend in 3D space so as to connect distant regions in just one or a few clock cycles, regardless of the distance.
@oncedidactic · 2 years ago
@@RalphDratman Nice comments. Definitely richer geometry offers power unavailable otherwise. I wonder if minimally abstracting offers enough, i.e. track topological relationships in a graph, not actual 3D coordinates. A lot to think about.
@RalphDratman · 2 years ago
@@oncedidactic Using an arbitrary graph to interconnect cells can, of course, create any sort of system. The question is, which graphs to try among the countably infinite choices? On the other hand, the much-studied architecture of the human brain has a unique record of cognitive success. Along with lessons learned from successful ML models, the brain -- including both micro- and macro-structures -- seems the obvious jumping-off point for development based on local-neighborhood systems such as cellular automata.
@oncedidactic · 2 years ago
@@RalphDratman Agreed. I was thinking more along the lines of how to compress what 3D offers and get the nonlocal communication you're talking about in the form that adds the least amount of overhead to 2D. Picture noninterfering circuitboard wires running as needed, just assign a "Z" index and don't worry about actual 3D. So some structure to the graph choice. The recent GNN work on coercing topology into geometric behaviors seems uncannily related. CA are all about local relationships, just like a graph, but we usually interpret their outputs as geometric (shape, size, distance).
@adambarker3130 · 2 years ago
Excellent. I felt some nostalgia listening to this and remembering reading Steven Levy's 1993 book "Artificial Life" when it first appeared. Still a fun read for context, for those who were not around at the time.
@thomasseptimius · 2 years ago
Great stuff. Would love to hear Wolfram's comments on this. Future materials will benefit greatly from this.
@jurischaber6935 · 2 years ago
I finally have to thank you for your great videos. Well done. Chapeau.
@nielsgroeneveld8 · 2 years ago
I think you were gonna say something at 1:07, but YouTube put a cut in the video.
@mikeadenff · 2 years ago
This is great work and I am convinced that this approach will become mainstream! I first started exploring the use of multiple independent learners to avoid overfitting when modeling a population with disparate characteristics. There were 1000s of possible features that could have been used to model the population. In the work that I did for this problem, each learner used its own feature selector. If one local slice of the population could be modeled best using one feature, that is all that would be used.

After appreciating what I could do in this case, I started to wonder about other use cases. I am convinced that if I rebuilt some manufacturing models that I worked on in the past, I could improve them. Equipment and processes operate in different states at different periods of time. By using different learners, each can focus on modeling an individual state and recognizing members of this state. I would expect these state-specific models to be more accurate and more stable than the "uber models" I built before.

The other area that I have explored is recommender systems. I find this one really interesting, as each learner can also have a different objective/loss. I haven't gotten to try this, but I would love to see a swarm of recommender models resembling a bunch of expert curators, each focused on their particular view of the world. Recommendations served up to users would be a personalized and ever-changing blend of items from each model. This really appeals to me because users get to decide what is important to them, so they are not subject to the bias inherent in any centrally defined objective/loss.
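A minimal sketch of the per-learner feature-selection idea (my reading of it, on synthetic data; the model choice and sizes are assumptions):

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))
    y = (X[:, 3] + 0.5 * X[:, 7] > 0).astype(int)      # synthetic target

    learners = []
    for _ in range(10):
        feats = rng.choice(20, size=3, replace=False)  # per-learner feature selector
        model = LogisticRegression().fit(X[:, feats], y)
        learners.append((feats, model))

    # Ensemble prediction: average the member probabilities.
    probs = np.mean(
        [m.predict_proba(X[:, f])[:, 1] for f, m in learners], axis=0)
    print("train accuracy:", ((probs > 0.5) == y).mean())

Members whose random feature subset happens to include the informative features end up carrying the prediction, which is the "local slice modeled by one feature" effect described above.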
@videowatching9576 · 2 years ago
Wow, recommender system type of approach sounds interesting - any links to projects / papers etc that you think have interesting approaches to doing recommenders, perhaps in such a way?
@ChocolateMilkCultLeader · 2 years ago
Great catch on self-assembling learning. I want to add one reason why, mathematically, these systems (especially evolution-based ones) would scale better than deep learning: adding a new simple weak learner only adds a linear cost to your training, while scaling DL by making models bigger is a superlinear process. Combine this with the fact that having more learners allows you to traverse the search space much more comprehensively.
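As a back-of-envelope illustration (my own numbers; strictly speaking, widening every dense layer grows the parameter count quadratically rather than exponentially, but the contrast with the ensemble's linear cost is the point):

    # Adding one more weak learner grows ensemble cost linearly,
    # while widening all dense layers of a monolithic net grows
    # its parameter count quadratically in the width.
    def ensemble_params(n_learners, params_per_learner=1_000):
        return n_learners * params_per_learner   # linear in members

    def dense_net_params(width, depth=4):
        return depth * width * width             # quadratic in width

    for k in (1, 2, 4, 8):
        print(k, ensemble_params(10 * k), dense_net_params(100 * k))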
@revimfadli4666 · 2 years ago
I wonder if that would also synergise with quantum computer scaling. The quantum randomness might also help some stuff
@ChocolateMilkCultLeader · 2 years ago
@@revimfadli4666 it makes sense but I'm not sure if it would work that way. I don't know how much more random quantum computing would be than normal randomness (which I know isn't truly random but is close)
@revimfadli4666 · 2 years ago
@@ChocolateMilkCultLeader I suspect approximate amounts would be sufficient, as usual with ML
@Kram1032 · 2 years ago
I wonder what might happen if you tried to do this self-classifying MNIST stuff but, like, with CLIP or such: the goal would be to classify, based on an image, what prompt relates to it. Instead of a fixed number of classes, you gotta generate a text input sentence as a class, locally pixel-by-pixel. Going through the evolution of clusters of a complex image could be really interesting.
@EudaderurScheiss · 2 years ago
I'm wondering, could this be used as an encoding algorithm? Like, how much data does it need to generate/grow certain pictures?
@alexandermedina4950 · 2 years ago
This seems to be very interesting, thank you for this.
@videowatching9576 · 2 years ago
Really interesting. Intriguing to consider in self-organizing and complexity that can emerge in such a way.
@barrettvelker198 · 2 years ago
This heavily reminds me of the biological work done by Dr Michael Levin on how limbs, eyes and basically all body parts 'know' how to build into their future state despite wildly varying starting conditions.
@Mr-Casko · 2 years ago
You are a very important source of info on the next level of creating AGI... and the future of integrated AI.
@Luck_x_Luck · 2 years ago
55:08, you select the asymmetric one
@JTMoustache · 2 years ago
Great interview - well done 👍🏼
@-E42- · 2 years ago
Thank you, this was really informative and inspirational! Along the way I almost felt like I was reading "Computer-Kurzweil" in Spektrum der Wissenschaft again :D
@ThePlayfulJoker · 2 years ago
It is nice to look at how you can scale this approach from 2D to 3D. I get really excited thinking about how this approach could be used if you discretize time and try it out in 4D. What would it mean to grow shapes in this context? Can you simulate a beating heart by including the time dimension?
@_bustion_1928 · 2 years ago
I have not read the papers yet, but I already have a lingering question. What is the biggest difference between modular models with local communication and genuinely huge monolithic models (like a conv network)? Maybe my intuition is wrong. What I think modular networks do is share their hidden-state outputs with neighbour networks. However, the monolithic models do the same thing: one layer shares its output (combined hidden state) with the layer that lies deeper. They do the same thing, but on different scales.
@drdca8263 · 2 years ago
Seems like a "sequential vs parallel" thing to me? Like, you can compute convolutions in a parallel way, but in order to do a pooling layer (hoping I'm getting that term right), you have to wait for all parts of the previous layer to be done, while if it is all local-only stuff, you don't need such syncing, sorta? Idk
@_bustion_1928 · 2 years ago
@@drdca8263 You are correct, but I am speaking about a slightly different thing. As far as I understand, local networks can be treated as small networks that share their output with local neighbours; you can imagine them as small interconnected cells. *They are small, there are a lot of them, and they have local connections*. However, layers in bigger networks can be treated as bigger cells with one solid sequential connection. Therefore, they are quite similar. Additionally, we can provide deeper connections using residual connections (their efficiency is evident because they are used everywhere). Maybe we can set up better residual connections somehow by providing deeper residual connections with learnable scales.
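A toy contrast of the two patterns being discussed (my own framing, with made-up sizes):

    import numpy as np

    rng = np.random.default_rng(0)

    def monolith(x, weights):
        # Monolith: information flows depth-wise, layer k feeding layer k+1.
        for w in weights:
            x = np.tanh(x @ w)
        return x

    def local_step(states, w):
        # Modular: one shared cell update applied everywhere in parallel;
        # each cell mixes its own state with its two neighbors' states.
        left = np.roll(states, 1, axis=0)
        right = np.roll(states, -1, axis=0)
        return np.tanh(np.concatenate([left, states, right], axis=1) @ w)

    x = rng.normal(size=(1, 8))
    weights = [rng.normal(size=(8, 8)) for _ in range(3)]
    print(monolith(x, weights).shape)        # (1, 8)

    cells = rng.normal(size=(16, 8))         # 16 cells, 8-dim state each
    w_local = rng.normal(size=(24, 8))
    print(local_step(cells, w_local).shape)  # (16, 8)

The depth-wise pass happens once per input, while the local update is iterated, so depth in the modular version is replaced by repeated neighbor-to-neighbor communication.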
@antonioILbig · 2 years ago
About ARC games… I notice that the problems cannot be posed just as "you have some input, what is the output?" This is because the problems have very strict constraints about what the input is, and not only what the output should be, but also how big the output grid is. These constraints are not written in the problem and have some high degree of complexity.

For the one in which you have to choose the figure that fills the output grid exactly, for example: it must be defined what admissible inputs are. For instance, you don't want two figures in the input that fit in the same grid (obviously, if you want a unique solution). Also, you should define the output grid as a function of the input figures, and in particular as a function of the solution; this relation is hidden and must be discovered. Last but not least, there has to be a sort of definition of "figure", and the input must be made in a way that you can always identify different separate figures; otherwise, where are the boundaries between one figure and the other?

For a human it is a not-so-complex task because:
1) humans have an intuitive notion of objects in space and a way to identify them
2) they reason by language, so it's easy to formulate relations like "the solution is the one that fits the output grid"
3) they already assume that the problem has constraints that need no more investigation and are kinda intuitive, which allows them to tackle the problem in the easiest way possible
4) these problems are made by humans, so they already contain the human-based information needed to solve them
@RuztomLamundao · 2 years ago
Thanks for the tutorial
@SinanAkkoyun · 2 years ago
This is revolutionary
@TheKirillfish · 2 years ago
In that ARC game, it seems to me, the real rule is to pick the only formation that *breaks* the symmetry. The perfectly fitting box is an unnecessary hint
@thegistofcalculus · 2 years ago
Always found it interesting that in an economy, a (supposedly) distributed system, individuals and companies create their own goals, which are sometimes rewarded with profit. Agent-specific reinforcement learning with profit/(goal or subgoal) prediction, perhaps?
@sau002 · 2 years ago
very interesting.
@joshuasmiley2833 · 1 year ago
Absolutely amazing and inspiring! This is very similar to bioelectricity, electric cellular morphing, and xenobots! Was this work inspired by Michael Levin? If not, you guys should contact each other.
@johanngambolputty5351 · 2 years ago
I'm wondering if it would be beneficial, before even going to 3D, to explore non-uniform grids, i.e. graphs where each node may have a different neighbourhood. Maybe this can help focus attention on just the relevant regions of space, especially if you let each node move.
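A small sketch of what that might look like (my own illustration; the graph and state sizes are made up): the same shared update rule, but each node aggregates over its own, possibly irregular, neighbor set instead of a fixed grid.

    import numpy as np

    rng = np.random.default_rng(0)
    neighbors = {0: [1, 2], 1: [0], 2: [0, 3, 4], 3: [2], 4: [2, 0]}
    states = rng.normal(size=(5, 4))            # 5 nodes, 4-dim state each

    w = rng.normal(size=(8, 4))                 # shared local update rule

    def graph_ca_step(states, neighbors, w):
        new = np.empty_like(states)
        for i, nbrs in neighbors.items():
            pooled = states[nbrs].mean(axis=0)  # aggregate neighbor states
            new[i] = np.tanh(np.concatenate([states[i], pooled]) @ w)
        return new

    states = graph_ca_step(states, neighbors, w)
    print(states.shape)                         # (5, 4)

Letting nodes move would then just mean rewiring the neighbors dictionary over time, which is essentially the GNN view mentioned in the outline.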
@gorojo1 · 2 years ago
Local communication with top-down communication seems to be the goal, similar to the multidimensional structure of our brain.
@jabowery · 2 years ago
Algorithmic Information Theory provides the answer to why compression leads to better prediction (i.e. "generalization"). It also meta-predicts that the large language models that yield better perplexity can have their parameters distilled to a tiny fraction of the advertised numbers -- something that has been demonstrated. AIT has been around since the 1960s, guys.
@revimfadli4666 · 2 years ago
Isn't that why IIRC bottlenecked-widened networks perform better than homogeneous width ones?
@jabowery · 2 years ago
@@revimfadli4666 Yes. Bottlenecking is a quasi-regularization in FF networks, e.g. highway networks.
@deusxyz · 2 years ago
I would love to see some code examples of this kind of algorithm :o (in Unity C# code)
@MarcelBlattner · 2 years ago
Regarding morphogenesis: have a look at Michael Levin's work. It's awesome.
@cate9541 · 2 years ago
so cool
@alexanderk5835 · 2 years ago
like immediately
@r.s.e.9846 · 2 years ago
Fresh
@timeTegus · 2 years ago
Finally some cool new ideas, and not just "make the network bigger".
@revimfadli4666 · 2 years ago
Or a better dataset
@cedricvillani8502 · 2 years ago
It all comes back to classical physics and a bit of quantum physics. Check out Stanford's talks from 13 years ago; actually, here on YouTube there is a 7-part series called Advanced Quantum Physics. It's really interesting because it goes back to the classics, and it's not terribly hard to follow because the algorithms that are used are in the language we grew up with in college. These new paper pushers, as I call them, like to rewrite the basics by adding a new assumption and then call it something else entirely. It's almost as if it's intentional, as if people who know the old vernacular are either 💀 or not going to speak out because they're retired and who cares, because they're not stepping on anyone's toes when you rename everything. 🙃🤔🙄
@revimfadli4666 · 2 years ago
Quantum NNs are interesting, especially with embarrassingly parallel stochastic algorithms like evolution strategies. The quantum randomness might also help some stuff.
@inversemetric · 1 year ago
Self healing roads would be a big plus
@erickmarin6147 · 2 years ago
Independent agents will one day make games like Dwarf Fortress a whole other beast.
@silberlinie · 2 years ago
Big claims. Big skepticism.
@BlackMita · 2 years ago
I’d better rewatch this…
@evanparsons123 · 2 years ago
Games with fully destructible environments were all the rage in about 2007 and never really materialized, no pun intended; although I'll leave whether or not the pun was *actually* intended up to the reader's imagination. Really let your imagination run wild on that one. Anyway, they've never really done or even tried a blockbuster Superman game, and to make it right you would really need fully destructible environments and something like this for world building. I think they could finally pull it off.
@nurkleblurker2482 · 2 years ago
Why did no one mention genetic algorithms? Isn't that basically what this is?
@flethacker · 2 years ago
so basically copy and paste but with lots of money to pay very high salaries?
@willbrand77 · 2 years ago
grey goo?
@immortalsofar7977 · 2 years ago
Self-organizing, self-assembling collective systems are what is needed for the next iteration of cryptocurrencies (i.e. next bitcoin).
@Elmownz · 2 years ago
Screw mining Bitcoin! These guys (miners) need to train neural networks instead!!!
@JL-iu4zc · 2 years ago
I'm so generalized, I don't have any field I care about.
@marilysedevoyault465 · 2 years ago
Very interesting! Wow! Thank you! I hope someday it is used for curing genetic diseases transmitted to babies. Let's say these self-constructing parts are in a gene-to-body simulation, using genes and the protein folding from AlphaFold as the cues for self-construction of bodies. Let's say you self-construct models of healthy bodies of men and women. Then you take real men and women with the genetic disease and compare their scanned bodies with the healthy models. And then you reverse the reconstruction: instead of an automatic reconstruction, you ask for an automatic deconstruction to see which gene is implicated. Then you use CRISPR gene editing on the fertilized eggs of mothers with the genetic disease (or eggs fertilized by fathers with the disease) to make sure that no babies have the disease anymore.
@emiliocantu1469 · 2 years ago
Robust botnets ... ¿?
@pratikshethcool · 2 years ago
44:26, We are all inherently communist
@tipycalflow1767 · 2 years ago
Isn't the process of Self-Organizing and Self-Assembling a case against God?
@Childlesscatlaby · 2 years ago
Only the "God" that religious institutions condition followers to believe in.
@tipycalflow1767 · 2 years ago
@@Childlesscatlaby indeed
@surelyyourenotsuggesting5005 · 2 years ago
AI sounds selfish.
@cedricvillani8502 · 2 years ago
Look how cool 😎 and cute 🥰, but where is the actual math 🧮? This drives me crazy; YouTube is making people stupid.