This is why Deep Learning is really weird.

303,763 views

Machine Learning Street Talk

1 day ago

In this comprehensive exploration of deep learning with Professor Simon Prince, who has just authored a textbook on the subject, we investigate the technical underpinnings that contribute to the field's unexpected success and confront the enduring conundrums that still perplex AI researchers.
Understanding Deep Learning - Prof. SIMON PRINCE [STAFF FAVOURITE]
Watch behind the scenes, get early access and join private Discord by supporting us on Patreon:
/ mlst
/ discord
/ mlstreettalk
Key points discussed include the surprising efficiency of deep learning models, where high-dimensional loss functions are optimized in ways that defy traditional statistical expectations. Professor Prince provides an exposition on the choice of activation functions, architecture design considerations, and overparameterization. We scrutinize the generalization capabilities of neural networks, addressing the seeming paradox of well-performing overparameterized models. Professor Prince challenges popular misconceptions, shedding light on the manifold hypothesis and the role of data geometry in informing the training process. He also explains how layers within neural networks collaborate, recursively reconfiguring instance representations in ways that contribute both to the stability of learning and to the emergence of hierarchical feature representations. In addition to the primary discussion of technical elements and learning dynamics, the conversation briefly turns to the ethical implications of AI advancements.
Pod version (with no music or sound effects): podcasters.spotify.com/pod/sh...
Follow Prof. Prince:
/ simonprinceai
/ simon-prince-615bb9165
Get the book now!
mitpress.mit.edu/978026204864...
udlbook.github.io/udlbook/
Panel: Dr. Tim Scarfe -
/ ecsquizor
/ ecsquendor
TOC:
[00:00:00] Introduction
[00:11:03] General Book Discussion
[00:15:30] The Neural Metaphor
[00:17:56] Back to Book Discussion
[00:18:33] Emergence and the Mind
[00:29:10] Computation in Transformers
[00:31:12] Studio Interview with Prof. Simon Prince
[00:31:46] Why Deep Neural Networks Work: Spline Theory
[00:40:29] Overparameterization in Deep Learning
[00:43:42] Inductive Priors and the Manifold Hypothesis
[00:49:31] Universal Function Approximation and Deep Networks
[00:59:25] Training vs Inference: Model Bias
[01:03:43] Model Generalization Challenges
[01:11:47] Purple Segment: Unknown Topic
[01:12:45] Visualizations in Deep Learning
[01:18:03] Deep Learning Theories Overview
[01:24:29] Tricks in Neural Networks
[01:30:37] Critiques of ChatGPT
[01:42:45] Ethical Considerations in AI
References:
#61: Prof. YANN LECUN: Interpolation, Extrapolation and Linearisation (w/ Dr. Randall Balestriero)
Scaling down Deep Learning [Sam Greydanus]
arxiv.org/abs/2011.14439
"Broken Code" a book about Facebook's internal engineering and algorithmic governance [Jeff Horwitz]
www.penguinrandomhouse.com/bo...
Literature on neural tangent kernels as a lens into the training dynamics of neural networks.
en.wikipedia.org/wiki/Neural_...
Zhang, C. et al. "Understanding deep learning requires rethinking generalization." ICLR, 2017.
arxiv.org/abs/1611.03530
Computer Vision: Models, Learning, and Inference, by Simon J.D. Prince
www.amazon.co.uk/Computer-Vis...
Deep Learning Book, by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
www.deeplearningbook.org/
Predicting the Future of AI with AI: High-quality link prediction in an exponentially growing knowledge network
arxiv.org/abs/2210.00881
Computer Vision: Algorithms and Applications, 2nd ed. [Szeliski]
szeliski.org/Book/
A Spline Theory of Deep Networks [Randall Balestriero]
proceedings.mlr.press/v80/bal...
Deep Neural Networks as Gaussian Processes [Jaehoon Lee]
arxiv.org/abs/1711.00165
Do Transformer Modifications Transfer Across Implementations and Applications? [Narang]
arxiv.org/abs/2102.11972
ConvNets Match Vision Transformers at Scale [Smith]
arxiv.org/abs/2310.16764
Dr Travis LaCroix (Wrote Ethics chapter with Simon)
travislacroix.github.io/

Comments: 382
@MachineLearningStreetTalk · 3 months ago
What did you like about this video? What can we improve?!
@jd.8019 · 3 months ago
Firstly, I wanted to say--I can't believe I just realized I wasn't subscribed to your channel; this mistake has been rectified! Secondly, I could easily write an essay (or more accurately, a love letter) about this channel: there are very few insightful AI channels on YouTube, a few mediocre ones, and the rest. This channel, without a doubt, is in a league of its own. As an engineer, when I see a new MLST video posted, it's like sitting down for a mouthwatering gourmet meal after being force-fed nothing but junk food for weeks on end. Finally, with that said, allow me to attempt a justification of my admiration:
1) Guests: You always have amazing guests! Your interview style never fails to engender engaging, thoughtful, and most of all - fun conversations! In this video in particular, Simon seems like he's having a blast speaking about something he's passionate about, and that enthusiasm genuinely put a smile on my face.
2) Editing: The videos are always well put together and the production value is always phenomenal! I mean, wow... Compared to the other AI channels on YouTube, Machine Learning Street Talk makes them look like amateurs.
3) Knowledge: Most other channels seem content to merely discuss the latest ML hype as it happens in real time; this is fine, and most aren't objectively wrong, but it's mostly surface-level discussion and smacks of novice insight. They are, for lack of a better description, an animated news feed. With the exception of Yannic, MLST is the only other mainstream channel I'm aware of with a solid academic pedigree, and it's palpable. I've been completely starving for this kind of more in-depth/rigorous discussion. I can only speak for myself, but I imagine there are many who come from a STEM/technical background who feel the same way, so thank you on our behalf. Keep up the great work!
@MachineLearningStreetTalk · 3 months ago
@@jd.8019 Thank you sir!!
@agenticmark · 3 months ago
snag the first Ilya interview since the openai debacle :D
@therainman7777 · 3 months ago
I love the overall aesthetic and the production value. I think many technical channels don’t realize that even for highly technical subject matter, these things are incredibly important for building a large audience that keeps coming back. I also like the way on some episodes you seamlessly edit and link together clips from other interviews; my only criticism is that sometimes when you do that the narrative thread that ties them together is not always made sufficiently clear. If you could include some quick, unobtrusive means of showing the viewer why those clips are being edited in and what the thread is, I think that would significantly up the utility and educational value people get out of it. Kind of like a mind map, but within your videos.
@alib5503 · 3 months ago
Music?
@samgreydanus6148 · 3 months ago
I'm the author of the MNIST-1D dataset (discussed at 1h15). Thanks for the positive words! You do an excellent job of explaining what the dataset is and why it's useful. Running exercises in Colab while working through the textbook is an amazing feature.
@therainman7777 · 3 months ago
Nice, I love the dataset for testing and learning purposes, so thank you so much for creating and releasing it 🙏
@rubyciide5542 · 3 months ago
How does one learn ML from scratch and create stuff like you do?
@datafiasco4126 · 26 days ago
Thank you for that. I always start my training runs with it.
@Friemelkubus · 4 months ago
Currently going through it and it's one of the best textbooks I've read. Period. Not just DL books, books. I love it.
@dennisdelgado4276 · 14 days ago
Sorry you’re going through it man. Hope things get better for you
@oncedidactic · 4 months ago
Brilliant, clear, direct conversation. Thank you!!
@chazzman4553 · 3 months ago
The best channel I've seen for AI. Cutting edge, no amateur overhyped BS. Down to earth.
@chodnejabko3553 · 26 days ago
The overparametrization conundrum may be related to the fact that we look at what NNs are in the wrong way. To me a NN is not a "processor" type of object; it's a novel type of memory object, a memory which stores and retrieves data by overlaying them on top of one another and also recording the hidden relations that exist in the data set. This is what gets stored in the "in between" places even if the input resolution is low: the logic of coexistence of different images (arrays), which is something not visible on the surface.
I'm a philologist by training, and in twentieth-century literature there was this big buzz around the concept of the "palimpsest". Originally palimpsests were texts written on reused parchment from which previous texts were scraped off with a razor. Despite the scraping, the old text still remained under the new one, which led to having two texts in the same space on the page. In literature this became a conceptual fashion of merging two different narratives into one, usually to very surreal effect. One of the authors that comes to mind is William S. Burroughs. In the same way that merged narratives evoke a novel situation through novel logical interactions between the inputs, the empty space in an overparametrized NN gets filled with the logic of the world from which the input data comes, and this logic exists between the data points even when the resolution is low.
Maybe a NN is a Platonic space. Many images of trees somehow hold in them the "logic of the tree", which is something deeper and not obvious to the eye, since in their form alone converge the principles of molecular biology, atmospheric fluid dynamics, and ecosystemic interactions, up to the very astronomical effects of sun, moon, earth rotation, etc. All of it contributes to this form in one way or another, so the form reflects those contributions and therefore holds partial logic of those interactions within it.
Information is a relation between an object and its context (in linguistics we say: its dictionary). A dataset not only introduces objects; it also, as a whole, becomes a context (dictionary) through which each object is read. In that sense, maybe upscaling input data sets prior to learning is detrimental to the "truth" of those relations. I would be inclined to assume we'd be better off letting the NN fill in those spaces based on the logic of the dataset, unless we specifically want the logic of the transformations to influence the output data (say, we are designing an upscaling engine).
@aprohith1 · 4 months ago
What a gripping conversation. Thank you!
@makhalid1999 · 4 months ago
Love the studio, would love to see more face-to-face podcasts here
@beagle989 · 4 months ago
great conversation, appreciate the skepticism
@rajdeepbosemondal7648 · 3 months ago
Cool thoughts on digging into the nitty-gritty of deep learning frameworks. The connection between language models and our brains, especially in Transformers, really makes you think. Checking out how things stay consistent inside and finding ways to boost brainpower raises some interesting questions. Looking forward to diving deeper into these fancy concepts!
@amesoeurs · 4 months ago
i've read most of this book already and it's fantastic. it feels like a spiritual sequel to goodfellow's original DL book.
@sauravsingh9177 · 4 months ago
Can I read this book even without reading Goodfellow's book? I am currently reading Jeremy's DL with fastai/PyTorch book.
@amesoeurs · 4 months ago
@@sauravsingh9177 yes although you're expected to have basic familiarity with stats, lin alg, calculus etc.
@RogueElement. · 3 months ago
Hello... I'm a med student and not so proficient in math. Could you please list the math tools required to understand DL? I'LL BE RIGHT ON IT! 🙏😊 @@amesoeurs
@timhaldane7588 · 2 months ago
I really appreciate being used as an example near the end of the discussion. Version 2.0 is coming along slowly, but I am confident I'll get there.
@dariopotenza3962 · 3 months ago
Simon taught the first semester of my second-year "Machine Learning" module at university! Really nice man; we used this book as the module notes. He was much missed when he left in the second semester, and the rest of the module never lived up to his teaching.
@satvikarora5813 · 3 months ago
What university?
@kylejensen8634 · 6 days ago
University of Bath?
@stevengill1736 · 4 months ago
Sounds like there are as many questions as answers at this point. Looks like a great book with plenty of instructive graphics; I look forward to reading it. Cheers & happy Gnu year!
@dorian.jimenez · 4 months ago
Thanks Tim, great video. I learned a lot.
@AliMoeeny · 4 months ago
Tim, this is incredibly insightful. Thank you
@mattsigl1426 · 4 months ago
It’s interesting that in Integrated Information Theory consciousness literally is a super-high dimensional polytope (with every dimension corresponding to a whole system state in an integrated network) in an abstract space called Qualia space.
@jasonabc · 4 months ago
Best source and community for ML on the internet by far. Love the work you guys do, MLST.
@DailyFrankPeter · 3 months ago
The sombre Chopin tones in the background emphasize how deep the learning truly is but leave me with little hope of ever fully understanding it... :D
@exhibitD79 · 4 months ago
Fantastic - Thank you so much for this content. Loved it.
@truehighs7845 · 3 months ago
The way I see it, deep learning is statistical power modulated by randomness to emulate reasoned speech: really the top 3-5-7 reasonable continuations selected randomly at every word. So in theory, whatever the AI says, it should not be able to say it twice unless you tweak its parameters; with the randomness (temperature) taken away, it will always repeat the same thing. It's good at emulating speech that gives the appearance of intelligent articulation, but it is really the syntax and vocabulary (data) placed in a statistically congruent manner that gives that illusion. It's like a super sales guy: he talks very well, but there is no substance behind his apparent passion.
@franzwollang · 3 months ago
It's still an open question though how different this is from what humans do. What if human brains operate on a similar principle, of learning patterns, activating a subset of them based on contextual input, and then selecting from them via some noisy sampling? That's what really bakes people's noodles. The biggest difference, in my mind, is that most ML objective functions are explicitly and statically evaluated and human ones are implicit in the effect of dynamical chemical processes on learning rates or whatever hyperparameters govern organic learning. Reinforcement learning approaches hint at what more human-like ML systems could look like.
@premium2681 · 3 months ago
A lot of what I say has no substance or passion.
@AnthonyBecker9 · 2 months ago
Turns out learning to predict the next token requires understanding a lot of things
@arnaudjean1159 · 1 month ago
Yes, these models need world models: not only statistics, but reasoning with self-learning that improves relevance. For the moment it's just a well-educated salesman, but in a very, very narrow way.
@truehighs7845 · 1 month ago
@@franzwollang Yes, and your point is compelling. If anything, the very fact that humans learn while they infer or confer already makes them a different beast. As we cogitate and plan, even while not talking, we "defrag" our knowledge. And if we confer with someone else, we have even faster learning curves. While writing this, I am thinking about expressing my knowledge of AI, and it is a composite knowledge: essential theoretical knowledge, psycho-plasticity acquired while using AI, and inference matured from black-box prompting of various AIs over the last years. So yes, compared to us, its learning is uni-dimensional and purely linguistic, while we have a convergence of learning mechanisms working together all the time. Currently there is so much to take on in AI that I have a dozen half-baked projects open, but ideally your very inference should automatically fine-tune your bot. OpenPipe is trying to do such a thing, but ideally it should be ported to the Unsloth engine, as OpenPipe uses OpenAI, and it's going to cost you a boatload of money to run anywhere near well between inference, dataset generation, and loads of fine-tuning sessions.
@almor2445 · 3 months ago
Great chat, loved it. Just wanted to query the final thought experiment a little. He said the hypothetical AGI would be created by a random superpower in a random company. That's not what employees of DeepMind or OpenAI are doing. They are both switching on a different dial, one that is specifically being activated in the USA and in their companies respectively. That means different results for most people.
@ethanlazuk · 15 days ago
SEO learning AI and ML here. Thoroughly enjoyed the video -- especially the bits on ethics -- and appreciate the channel. I just caught this discussion but will share the vid and continue exploring the channel. I think it's critical, whether or not people use the technology, to understand AI's implications at large and on a deeper level. Cheers!
@Tesla_Sentiment_Tracker · 4 months ago
Loved the discussion! Thank you
@joshismyhandle · 4 months ago
Great stuff, thanks
@Earth2Ross · 3 months ago
So glad I found this channel, I have some catching up to do!!
@debmukherjee4818 · 4 months ago
Thanks!
@Pianoblook · 4 months ago
Thank you for another excellent conversation! I really loved the discussion of the practical, grounded ethical concerns - I hope y'all consider having more ethicists on the show!
@MichaelBeale · 3 months ago
Have you considered the ethical implications of them doing that, @Pianoblook??
@Pianoblook · 3 months ago
​@@MichaelBeale yes, hence why I recommended it
@amesoeurs · 4 months ago
Great episode. Tim, you should try to get Chris Bishop on the show too. He finally released the companion book to PRML this month.
@MachineLearningStreetTalk · 4 months ago
He's coming on 🤘
@ProBloggerWorld · 12 days ago
2:07 So glad you mentioned Schmidhuber. 😅
@MrMootheMighty · 3 months ago
Really appreciate this conversation.
@arturturk5926 · 3 months ago
The most amazing thing about this video to me is that Simon's hair matches that microphone perfectly, nice work lads...
@Daniel-Six · 1 month ago
😂 It's called a dead-cat wind filter in the video trade. Good one!
@JuergenAschenbrenner · 3 months ago
Here you have a guy on the hook. I love how you throw these common buzzword phenomena at him, like emergent agency, and let him sort them out, which he does in a way that gives me the feeling of actually understanding. Really nice stuff.
@dm204375 · 2 months ago
One of the best videos I've seen on the topic. I hold most of the Professor's views on the matter, so it's refreshing to see that not everyone in the ML community drank the AI Kool-Aid. Though I am a lot more pessimistic, in that I think we can't slow down or put the genie back in the bottle, and the effort is wasted. So enjoy the cool new tech while you can enjoy anything...
@user-gz2po7dx3k · 2 months ago
Awesome presentations, thank you for the great content!
@tomripley7148 · 4 months ago
good new year
@richardpogoson · 2 months ago
The author was my lecturer last year in my first semester! The dude is brilliant!
@lioncaptive · 2 months ago
Simon's contribution adds to MLST's ambitious book club.
@FunwithBlender · 3 months ago
I see it as alchemy, in the sense of it sitting on the line where science meets magic or the unknown... interesting times.
@spqrspqr3663 · 2 months ago
The video is a textbook example of the best traditions of British science, i.e., being objective, no-nonsense, and intellectually honest. At one point the tech was summarized as modelling probability distributions in a multidimensional space and universal function approximation, and hence having nothing to do with "thinking", with which I fully agree (as a professional software engineer). What was, however, shocking to see towards the end of the video was that the professor (despite the spot-on tech summary I just quoted) then went into completely unfounded sci-fi statements about doctors, lawyers, and engineers (and even greeting-card designers :-)) losing their jobs to the tune of 800 million (or was it 80 million, whatever). I can't comprehend how the professor managed to reconcile these two things in his head: the non-AGI nature of the current tech (AGI still a pipe dream, no real thinking/intelligence, just deriving and modelling probabilities from the data to fit patterns in it) and replacing "knowledge workers" by the ton.
@SimonPrince-lr9dk · 2 months ago
Good point, and thank you for your kind words. I guess I reconcile these things because I think the technology we already have (even if there were no more significant development) might be enough to cause massive job losses. It could make many individuals much more productive, and that would mean most companies would need fewer people. For example, I used Grammarly to proofread my book. That was 2 months of proofreading work for someone, just gone... Happy to be proved wrong about this, though!
@Joorin4711 · 8 days ago
It's not science fiction that Dall-E can take a sketch, generate a rendered image in a specific style, and then offer up 10 variations which in turn can be sold (as greeting cards in this example). If using Dall-E is more cost-effective than employing artists to do the same thing, fewer artists will be able to make money creating greeting cards. This has nothing to do with AGI and everything to do with economics. The same can be said about lawyers who, typically, sift through data, apply their knowledge of laws and precedents, and generate missives used to argue in favour of their clients. If any step in that process can be replaced by, say, GPT-4 and made more cost-effective, fewer lawyers will be able to make money offering that service. No AGI, just economics. So I see no problem with his stance on AGI, or the lack thereof, and his accepting a prediction of workers losing their employment when non-AGI technology is used in more and more sectors.
@kyrgyzsanjar · 2 months ago
Alright I'm sold. Ordering the book!
@quinnlintott406 · 3 months ago
This is done very well. Bravo sir!
@AdrianMark · 3 months ago
Thank you Professor Prince. Your book is invaluable.
@sproccoli · 3 months ago
> me, who knows nearly nothing about the theory of all this stuff, but has implemented an image classification network, hearing him talk about "trying to push softmax functions to infinity"
I get that reference.
@philipamadasun · 3 months ago
Haven't finished the video yet, but does this have an e-book as well?
@federicoaschieri · 3 months ago
Brilliant interview. Refreshing to watch professional content and not the garbage AI channels that the algorithm suggests first. This whole AI hype is built around OpenAI's marketing to finance its cash-burning company, and the recommendation algorithm fuels it. I cannot agree more on the distinction between science and engineering. When I obtained my PhD in mathematical logic I was excited about AI, because I thought it could unravel the mystery of intelligence. But we are not learning anything about it, and it is so depressing. We badly need a profound theory of neural networks, instead of watching these engineers trying random stuff until something works.
@pedrojesusrangelgil5064 · 3 months ago
Hey, great book recommendation! Any others with a similar approach and style on machine learning? Thanks!
@hamzadata · 1 month ago
What an opportunity to listen to this episode when I just started reading the book recently :)
@snarkyboojum · 15 days ago
Beautifully produced! Love these types of videos from you. The amount of work that goes into creating one of these videos is mind boggling. Serious kudos. What a service you’re doing for the current and future generations of technologists.
@u2b83 · 4 months ago
Wow!
@charlesalexanderable · 4 months ago
Cool it with the sound effects
@MachineLearningStreetTalk · 4 months ago
Podcast version has no music or sound effects 😄
@markkilgore6121 · 4 months ago
The camera work and editing are very annoying. Nearly unwatchable!
@Shaunmcdonogh-shaunsurfing · 1 month ago
How have I only just stumbled on this channel? Good to be home.
@ehfik · 2 months ago
thank you for these great interviews
@u2b83 · 4 months ago
1:13:26 Figure 18.3a explains [to me] why we call it diffusion. I guess the hypothesis goes: as long as you take a small enough step size, you'll stay within the conditional distribution q(z_t | x*) when iterating on the diffusion kernel, i.e., the dashed cyan border representing the time-evolving distribution bounds. Anyone here think this diffusion process looks kind of like the stock market? Where we have piecewise-linear dumdums jumping out with their limit orders to steer q(z_t | x*) at every iteration lol
@Daniel-Six · 1 month ago
Yeah... It seems reasonable to surmise that a kind of fractal-like connection pervades these phenomena. I got the same vibe when I watched Veritasium's discussion of the option-pricing equation, for some reason.
@u2b83 · 4 months ago
1:13:45 Three-dimensional orange (volume): for a regular three-dimensional orange, which we can approximate as a sphere, the volume is V = (4/3) * pi * r^3. Four-dimensional orange (hypervolume): in four dimensions, the object analogous to a sphere is called a "hypersphere", and the hypervolume of a 4D hypersphere is V = (1/2) * pi^2 * r^4.
@docequillibrium5288 · 3 months ago
what a nice orange
@docequillibrium5288 · 3 months ago
I would like to see an interdimensional orange.
@dengyun846 · 3 months ago
"reflecting it like a crazy house of mirrors into the rest of the space"...Darren Aronofsky pioneered some of the thoughts in this direction 26 years ago. Time for me to watch it again :)
@morgan9hough · 4 months ago
Right on. It’s an equation
@-BarathKumarS · 4 months ago
Agree with the other comment; it's more of a follow-up to the Goodfellow book. Definitely needs a prerequisite.
@andrice42 · 4 months ago
What's a good pre-req alternative?
@-BarathKumarS · 4 months ago
@@andrice42 Goodfellow is the bible; nothing else comes close.
@adamhaney9447 · 2 months ago
Thank you so much for this.
@FranAbenza · 4 months ago
The suspense is killing me: what's the punchline? Jokes aside, what an amazing chapter this was! I am getting this book.
@clifdavis2 · 3 months ago
The punchline is that humans are complicated catflaps that confuse their simulations of themselves with themselves.
@bernardofitzpatrick5403 · 3 months ago
Perhaps “Open Mind” will be the ultimate AGI. 😂. Awesome discussion btw.
@petroflorence7962 · 3 months ago
I understand this is so that we have a more advanced way of computing for more difficult computational outcomes.
@bofloa · 3 months ago
I did an experiment: not using multiple weights per input to a node, for the purpose of reducing computation cost. To my surprise it generalised. I created a network of (2,1): 2 inputs, 1 output. Normally you would require 2 weights, but in my experiment I used just one weight, then trained the node to find a pattern, and it did with a single weight. So I ask: how is this possible?
@paratracker · 2 months ago
What a great guest! I'm disappointed that you think ChatGPT doesn't do anything. So, the Rabbit doesn't do anything (useful) either? Agency warrants more discussion. Assumptions will change when LLMs are imbued with curiosity, exploring observed ambiguities, autonomously experimenting with tweaked/alternate architectures, and a motivation to discover higher level abstractions, and explain them.
@platotle2106 · 3 months ago
Fantastic interview. One gripe: towards the end, when Tim was asked about flipping a switch to have AI clones created of himself, the guest tried to push as much as possible while being kind, but it's disappointing that Tim would not engage. The point of hypotheticals is to provide less ambiguous scenarios for understanding your values. As long as they're not offensive, you absolutely must engage with them, or else there is no dialogue and no point to the discussion. You should also try your best to steelman them and address the main point of the question instead of nitpicking some contingent aspect of the hypothetical. So instead of raising the identity problem with cloning yourself and leaving it there, you can modify the hypothetical so that the AI isn't a clone of Tim but rather only a clone of Tim's abilities, i.e., it can do anything Tim does, but not necessarily in the same fashion. That way, you remove a non-essential concern and engage with the obvious intended point of the hypothetical.
@space-time-somdeep · 3 months ago
You guys are great... please try to connect with Indian and Chinese professors if possible; it will enrich us all ❤
@LokeKS · 5 days ago
Is there code to follow along with the book?
@datafiasco4126 · 26 days ago
Extremely important information about the MNIST-1D dataset. I always start with it. Unfortunately, it is also the most underrated. People don't understand the power of simplicity.
@michaelwangCH · 3 months ago
Hi, Tim. Please explain why randomizing image pixels does not have a negative impact on DNN training results, only slowing down the learning speed. How does a DNN detect objects in an image without consideration of the neighborhood of pixels? Theoretically, object detection should not work, but it does. Why? Thanks, Tim.
@SimonPrince-lr9dk · 3 months ago
The resulting model won't generalize (at least not well) -- this is just to say that you can still train it successfully even when the data has been tampered with. It's a statement about training, not about performance.
@michaelwangCH · 3 months ago
The order of pixel values matters. If we flatten the pixel matrix into a large vector as the input layer of a DNN, the DNN learns the order of the input data as well. What matters is consistency in how we flatten the pixel matrix into one large vector; how you do it is your choice, but it has to be consistent across the training and test datasets. PyTorch and Keras take care of this consistency when we train DNNs for our projects, so these questions do not appear in the daily work of ML engineers. ML scientists have an important role to play here: to fill this gap and ask the uncomfortable questions, diving deeply into a topic that ML engineers and CS people take as given without asking why.
@michaelwangCH · 3 months ago
@@SimonPrince-lr9dk Your book is a good summary of the fundamentals from the ML classes of MSc and PhD students in CS. Thank you for the time and effort you put into it; it clarified many questions I had during my university years and simplified many concepts in ML. And thank you for making it free for the public.
@u2b83 · 4 months ago
I need to get my library to order this book, asap!
@u2b83 · 4 months ago
I'm such a nerd that I slowed the vid to 0.25x to try to read some of the page flips lol
@MachineLearningStreetTalk · 4 months ago
You can read it online for free! udlbook.github.io/udlbook/
@SimonPrince-lr9dk · 4 months ago
@@u2b83 You can read it for free, but don't let that stop you getting your library to order it, lol.
@u2b83 · 4 months ago
@@MachineLearningStreetTalk wow! thanks! :))
@ian-haggerty · 1 month ago
Great to bring a friendly face to the book. Can't wait to devour it!
@robincheungmbarealtorbroke8461 · 4 months ago
And it's a good thing there's no extant models for representing a complex system yet because it is a Reductionist application ignorant of the dynamics of systems with higher order complexity. We need to first ask neural networks to reengineer themselves to produce a "complex systems native" model (this is the underlying reason for the seemingly idiosyncratic paradoxical observation: quantitatively, the difference between a Reductionist outcome and one after prediction using a second generation neural network-re-conceptualization of a neural network in
@therainman7777 · 3 months ago
I don’t think you understand most of the words that you’re using. Your comment really didn’t make sense. For example, what’s a “second generation neural network”? I’m sorry to tell you that that’s not a thing. You did use a TON of big words though, so congrats 😂
@robincheungmbarealtorbroke8461 · 3 months ago
@@therainman7777 Again, what is it you're trying to accomplish by pointing out that you think I don't know what I'm talking about because you can't understand what I wrote? Why don't you first explain back to me what you understood, and I'll explain it based on what you write, in a way you most certainly will understand; and I'll bet you $100 you'd agree, too.
@muhokutan4772 · 4 months ago
Where is the Patreon link?
@MichaelBeale · 3 months ago
If you had to, what would you guess that their patreon url is? Yep! You nailed it.
@amdenis · 4 months ago
I have only been doing deep learning research and development for 9 years, but if you really wish to understand those fundamental aspects of deep learning that you believe nobody understands (how and why deep learning works), I would love to help you obtain that understanding. I can explain it from an analytical, geometric, or other perspective, as you find most useful.
@raul36 · 4 months ago
@@Eet_Mia Obviously not. A minimum knowledge of advanced math is required to deeply understand deep learning.
@olaf0554 · 3 months ago
@@jonathanjonathansen In any case it would be algebra, calculus, and statistics, genius. Precisely what you just said absolutely invalidates your answer, because you don't have the slightest idea. Literally everything is algebra. If you don't understand that, you don't have the slightest idea about mathematics. 😂😂
@DJWESG1 · 3 months ago
Sociological??
@FRANKWHITE1996 · 3 months ago
subscribed
@1potdish271 · 4 months ago
Will this book be available on O'Reilly?
@therainman7777 · 3 months ago
No, MIT Press.
@trvst5938 · 3 months ago
Lex Fridman's interviews on AI are 👍👍
@TheLummen. · 1 month ago
Thank you for that!
@gergerger5888 · 4 months ago
54:26 If I understood correctly, what the host is trying to say here is this: for a fixed number of parameters, deep neural networks have more linear regions than shallow neural networks. HOWEVER, the function modelled by the DNN has lots of inner symmetries; it is not a "free function". So why does it work so well? That is to say: when you add more layers to a neural network, you can fit more complex functions with fewer parameters, but the shape of these functions is weird and, to some extent, "fractal". It is hard to understand why this kind of highly constrained function fits real-world relations so well. UDL explores this topic, although there is no answer yet.
@user-th1xk2dz3j · 3 months ago
Do I need any previous knowledge in order to be ready to read this book?
@drsjdprince · 3 months ago
You need a little linear algebra and basic calculus. I teach the first half to 2nd-year undergraduate CS students and they can follow it.
@danielrodio9 · 3 months ago
I love the idea that the moment LLMs start generalizing, no one understands why (even though Hinton or Sutskever might claim that they do).
@abdulwasey7985 · 1 month ago
the most human discussion on AI ever
@gergerger5888 · 4 months ago
2:01:55 I also observe a certain (well-intentioned) paternalism in Simon's opinion. If there are people who can have rewarding occupations and build self-taught personalities, it is precisely because there is a large part of the population doing alienating jobs in factories, supermarkets and offices. No one wants to spend the best years of their life loading boxes in a forklift or washing dishes in a restaurant, it is circumstances that push those people in that direction. I hope the day comes when human beings are freed from those chains. On the other hand, I agree that the transition to automation can be hard, but I think we made a mistake in delegating this transition to governments. States are by nature perverse, what we need is a committed and strong civil society, with the capacity to save, that is willing to temporarily welcome those who are left out. In this sense, I believe that we can learn much more from indigenous societies, and from how anthropologists tell us that they build their social support networks, than from the social engineers who are to come.
@SimonPrince-lr9dk · 4 months ago
This is a fair criticism of my viewpoint. But it's also paternalistic to say that human beings need to be "freed from these chains". Personally, I actually would rather be loading boxes / washing dishes than be unemployed.
@bennguyen1313 · 2 months ago
I wonder what the key difference is that causes some very smart people to think AGI could lead to the end of humanity and that research should therefore be paused, while others, especially those who do AI for a living, think that is pure science fiction. Would love to see a roundtable discussion with Simon Prince, Connor Leahy, Jürgen Schmidhuber, Geoffrey Hinton, George Hotz, Elon, Ken Ford, John Carmack, and Eliezer Yudkowsky.
@Daniel-Six · 1 month ago
Everyone but Yudkowsky. I cannot sit through another minute watching him hold off a dookie. That perpetual grimace...
@123string4 · 1 month ago
41:58 He says he doesn't know why the smooth interpolation works, but isn't it just dimensionality reduction? If you take the SVD of high-dimensional data, you can throw out the small singular values and dramatically reduce the dimensionality without introducing a lot of error.
@chodnejabko3553 · 26 days ago
My favorite analogy for the divergence of ML and brain science is the case of airplanes. Flying was for a long time reserved for certain products of evolution, and in the beginning engineers imagined themselves copying the forms and behaviors of birds to understand them. But in the end, engineering and an abstract understanding of the physics of flight are what gave us a technology that applies those principles, usually in ways very strange to nature itself. Only in recent years, thanks to modeling tools that came out of engineering, have people begun to model bird-like or insect-like machines and to understand how nature does it.
I like to believe ML will develop what I call "scientific principles of thinking": an overview of complex information-processing techniques that, some decades down the line, will inform us on how we should look at our own brains to understand them. But just as with artificial bird wings, artificial human-like brains will be nothing more than the work of hobbyists and proof-of-concept academic research, since engineered ML solutions will be that much more powerful and tailored to specific tasks. Not just neuro-mimicry, but the very idea of "general intelligence" - as much as it is a great narcissistic dream - is also a rather useless engineering solution. What engineering problem does it solve anyway? Pretending to be human? That would mean the current world somehow does not want humans to exist, which maybe is the real sinister problem we must address before we "provide solutions" to it.
@blahblah7410 · 7 days ago
Well presented.
@Daniel-Six · 1 month ago
I am more and more convinced that the inscrutably large matrices ubiquitous in machine learning are a black-box interface to a mechanism situated elsewhere and unseen in our computational domain, one which potentially employs an entirely different kind of logic to generate the suspicious efficiencies we observe now. As Dr. Waku put it on his channel: we are actually making API calls to a more elaborate machine.
@robincheungmbarealtorbroke8461 · 4 months ago
I think an ai Alignment approach that doesn't also include prioritization of neural networks being put through deep learning models and to shift neural networks, which are only human inspired, to integrate the second generation improvements of allowing a more holistic, less structurally-reductionist to replace the neural network by the Neutral lneural network's use in shifting from a syllogistically reductionist architecture of neural networks to its more holistically re-envisioning version of what was created by our reductionist culture
@therainman7777 · 3 months ago
Dude, you left two comments on this same video using a whole bunch of “big words” but actually not saying anything coherent. This is absolute word salad, please spend some time learning the material first before you try to start sounding off with big opinions. It is painfully clear you don’t know what you’re talking about.
@robincheungmbarealtorbroke8461 · 3 months ago
@@therainman7777 if you say so--then what is your goal in pointing it out? At least I put forth my line of reasoning (the point of it is that in business--and I may not know what I'm talking about because I didn't make the dean's list in my mba--but pretty close :p) that the misalignment of fiduciary duty with "social purpose," which was the original quote I was referring to--is not so much idiosyncratic as it is backwards, across the board. If you survey what society has done, across the board, from a polymathic point of view (that is, not only breadth, which would be "Jack of all trades but master of none," polymathic implies both breadth and depth) it's kind of a chronometer that I'd venture a guess would coincide with the magnetic pole reversal when pretty much everything is BACKWARDS. Anyway, that is the concept I was outlining, and the jump to my observation that science is not only still stuck applying Reductionist approaches to a whole whack of complex systems that are completely inappropriate to look at from a reductionist view is the root of that problem and the example in specific is to highlight something that almost everyone seems to take for granted as true, when it cannot be.
@sayanbhattacharya3233 · 10 days ago
Thanks man, this is so beautiful ❤
@FunwithBlender · 3 months ago
really cool
@DrkRevan · 2 months ago
You are AWESOME!!!!!
@CosasCotidianas · 3 months ago
I have the feeling that this book is going to be a must in universities.
@Eggs-n-Jakey · 3 months ago
Before I watch: are you just explaining why I have a stroke when trying to internalize these concepts, or are you guiding me to build the model in my mind? I need some type of mental model to understand things fully.
@vfwh · 3 months ago
He’s asking you if you would flip the switch. His question is perfect, and the fact that you avoid answering it should make you think, no?
@terjeoseberg990 · 3 months ago
I believe that it makes perfect sense that gradient descent causes a complex function to find a solution that works.
@GulliverImpreso · 12 days ago
Can I read this even if I have no idea about AI engineering or coding? I don't even know calculus, and it seems the book has a lot of calculus inside, so I think I wouldn't understand it when I start reading. BTW, I'm still interested in reading this book because I plan to take an AI engineering course.
@SimonPrince-lr9dk · 8 days ago
Yeah -- you do need some math. No escaping it. The PDF is free online so you can check it out if you want.
@zhandanning8503 · 2 months ago
I wonder, though: machine learning is very similar to humans. The example about computer vision and humans being able to recognize things only holds if the thing was within the human's training data, right? So in this regard a model can do what humans can do. I think an interesting question is: can neural networks do more than humans? I suppose that would be when AGI is accepted to have come along. This was a very fascinating video, thank you.
@newbiadk · 2 months ago
great episode
@steliostoulis1875 · 3 months ago
More regions than there are atoms in the universe? So more than 10^80 regions? That doesn't sound right.
@SimonPrince-lr9dk · 3 months ago
I know it sounds weird, but surprisingly, ReLU networks can easily produce this many regions. Consider a 100-dimensional input space with 100 ReLU functions aligned with the 100 axes. Each divides the space into 2 regions (flat or sloped), so collectively they make 2^100 regions. See Figure 3 of my book.