#50 Dr. CHRISTIAN SZEGEDY - Formal Reasoning, Program Synthesis

10,936 views

Machine Learning Street Talk

1 day ago

Dr. Christian Szegedy from Google Research is a deep learning heavyweight: he discovered adversarial examples, developed one of the first deep learning-based object detection algorithms, created the Inception (GoogLeNet) architecture, and co-invented batch normalization. He thinks that betting on AI now is as sound a bet as computers and software were in 1990. Yet he also thinks we have been programming computers in essentially the same way since the 1950s, a huge stagnation. Mathematics is the process of taking a fuzzy thought and formalising it. Could we automate that? Could we create a system that acts like a superhuman mathematician, yet one you can talk to in natural language? This is what Christian calls autoformalisation. He thinks that automating much of what we do in mathematics is the first step towards software synthesis and human-level AGI, and that mathematical ability is the litmus test for general reasoning. Christian has a fascinating take on transformers too.
With Yannic Lightspeed Kilcher and Dr. Mathew Salvaris
Whimsical Canvas with Tim's Notes:
whimsical.com/mar-26th-christ...
Pod version: anchor.fm/machinelearningstre...
Tim Introduction [00:00:00]
Show Kick-off [00:09:12]
Why did Christian pivot from vision to reasoning? [00:12:07]
Autoformalisation [00:12:47]
Kepler conjecture [00:17:30]
What are the biggest hurdles you have overcome? [00:20:11]
How does something as fuzzy as DL come into mathematical formalism? [00:23:05]
How does AGI connect to autoformalisation? [00:30:32]
Multiagent systems used in autoformalisation? Create an artificial scientific community of AI agents! [00:36:42]
Walid Saba -- the information is not in the data [00:41:58]
Is generalization possible with DL? What would Francois say? [00:45:02]
What is going on in a neural network? (Don't Miss!) [00:47:59]
Inception network [00:52:42]
Transformers negate the need for architecture search? [00:55:58]
What do you do when you get stuck in your research? [00:58:08]
Why do you think SGD is not the path forward? [00:59:59]
Is GPT-3 on the way to AGI? [01:02:01]
Is GPT-3 a hashtable or a learnable program canvas? [01:05:01]
What worries Christian about the research landscape? [01:07:14]
The style that research is conducted [01:11:10]
Layerwise self-supervised training [01:13:59]
Community Questions: The problem of reality in AI ethics [01:15:33]
Community Questions: Internal covariate shift and BatchNorm [01:20:03]
Community Questions: What is so special about attention? [01:23:08]
Jürgen Schmidhuber [01:24:18]
Community Question: Data efficiency and is it possible to "learn" inductive biases? [01:27:13]
Francois's ARC challenge, is inductive learning still relevant? [01:31:13]
A Promising Path Towards Autoformalization and General Artificial Intelligence [Szegedy]
link.springer.com/chapter/10....
Learning to Reason in Large Theories without Imitation [Bansal/Szegedy]
arxiv.org/pdf/1905.10501.pdf
Mathematical Reasoning via Self-Supervised Skip-Tree Training [Rabe ... Szegedy]
openreview.net/pdf?id=YmqAnY0...
LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning [Wu ... Szegedy]
arxiv.org/abs/2101.06223v1
Deep Learning for Symbolic Mathematics [Lample]
arxiv.org/pdf/1912.01412.pdf
It’s Not What Machines Can Learn, It’s What We Cannot Teach [Yehuda]
arxiv.org/pdf/2002.09398.pdf
Investigating the Limitations of Transformers with Simple Arithmetic Tasks [Nogueira]
arxiv.org/pdf/2102.13019.pdf
Provable Bounds for Learning Some Deep Representations [Arora]
arxiv.org/pdf/1310.6343.pdf
Neural nets learn to program neural nets with fast weights [Schmidhuber]
people.idsia.ch/~juergen/fast...
How does Batch Normalization Help Optimization? [Ilyas]
gradientscience.org/batchnorm/
How to Train Your ResNet 7: Batch Norm
myrtle.ai/learn/how-to-train-...
Training a ResNet to 94% Accuracy on CIFAR-10 in 26 Seconds on a Single GPU [Kuhn]
efficientdl.com/how-to-train-...
en.wikipedia.org/wiki/HOL_Light
en.wikipedia.org/wiki/Coq
en.wikipedia.org/wiki/Kepler_...
en.wikipedia.org/wiki/Feit%E2...
We used a few clips from the ScaleAI interview with Christian.

Comments: 37
@machinelearningdojowithtim2898 3 years ago
First! Woohoo! This was a brilliant conversation with Christian, what a legend! 💥🙌👍
@daveman683 3 years ago
Every time I listen to this podcast, I come away with new ideas.
@daveman683 3 years ago
I would also recommend Dr. Sanjeev Arora as a guest. I have gone through all his lectures and they are absolutely amazing for grounding a lot of theoretical understanding of deep learning.
@daveman683 3 years ago
@@things_leftunsaid Yes, a typo in my writing. I really enjoyed his lectures; they opened up my perspective.
@danielalorbi 3 years ago
8:25 - "Too much rigor led to rigor mortis" Bars.
@hellofromc-1374 3 years ago
Kendrick would be proud
@stalinsampras 3 years ago
Congratulations on the 50th episode. I'm a huge fan of this podcast/videocast. Looking forward to the 100th episode; I'm sure with the rate of improvement you guys have shown in video editing, you could be making mini docuseries about AI by then. Congrats and all the best.
@MachineLearningStreetTalk 3 years ago
Thanks!! Happy 50!
@hellofromc-1374 3 years ago
get Schmidhuber here!!
@MachineLearningStreetTalk 3 years ago
Please can you all tell him you want to see him on the show! We were soooo close to getting him on, he just needs a little bit of convincing 😃😃
@abby5493 3 years ago
Loving the graphics, gets better and better 😍
@rohankashyap2252 3 years ago
Absolute legend Christian, the best episode!
@therealjewbagel 3 years ago
Awesome! An episode I've been waiting for!!! Can't wait for next week either.
@AICoffeeBreak 3 years ago
Thanks for answering my question! I think Christian Szegedy made a great point on bias: people make little case studies around very particular questions but do not analyse AI feedback loops in general (including nonproblematic ones). Wouldn't this be a place where AI could collaborate with sociology?
@_tnk_ 3 years ago
Great session!
@HappyMathDad 1 year ago
If we are able to solve the formalism problem that Dr Szegedy is working on, I think we also solve his concern, because we could sidestep our ignorance of current deep networks: they would just become a vehicle for arriving at formalisms.
@coder8i 3 years ago
The Whimsical canvas is really nice for following along with the conversation.
@dinasina3558 3 years ago
NNs are hashtables. Humans are too: we don't compute multiplication, we memorise multiplication tables in school.
@user-yn8rg2xv4w 3 years ago
AMAZING INTERVIEW !!!!!! .... When are you guys going to interview Ben Goertzel and Christos Papadimitriou??
@MachineLearningStreetTalk 3 years ago
Thanks for letting us know about Christos Papadimitriou, he looks great! We did try to invite Ben, I think on Twitter.
@janosneumann1987 3 years ago
Awesome interview, really enjoyed the show, learned a lot. I liked the question "so what do you think is going on in the deep learning model?" :)
@keithkam6749 3 years ago
RE: multiagent systems in auto-formalization: this reminded me of the interactionist theory of reasoning from Sperber and Mercier's 'The Enigma of Reason'. It's a psychology book, so maybe not the ML/AI crowd's cup of tea, but many of the ideas are very applicable.

The core idea is that we humans do not have what Kahneman describes as 'system 2', logical thinking (in 'Thinking, Fast and Slow' he proposes that we have two types of cognition: system 1 = fast, cheap intuitions; system 2 = slow, expensive logical deduction). Instead, Sperber and Mercier suggest that all we have are intuitions, the ability to pattern match. Specifically, our ability to reason is actually intuitions about reasons, combined with intuitions for evaluating the validity of reasons. They argue that the primary purpose of reasoning, from an evolutionary perspective, is not to generate new knowledge from existing knowledge, but to generate emergent consensus and allow for cooperation between non-related individuals:
1. Alice wants to convince Bob of something, e.g. a new idea, a proposal to do something together, a justification for an action.
2. Of course, Bob would not accept everything Alice proposes. If he did, he would be gullible and easily taken advantage of.
3. However, it is not beneficial to reject everything Alice proposes either, since the knowledge could be useful (for Bob, or for both of them).
4. To get around this, Alice proposes clear-to-follow, logical reasons that Bob then has to evaluate.
Perhaps the key to reasoning in an ML context would be this generative, adversarial process, combined with an ability to direct attention to existing knowledge bases or new experiments.
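[Editor's note: a toy sketch of the propose-and-evaluate loop the comment above describes; this is an illustration, not something from the episode, and all names in it are made up. One "agent" generates candidate rules, the other accepts only those that survive checking against shared evidence.]

```python
import random

def propose_conjecture(rng):
    """Proposer ("Alice"): guesses a linear rule relating n to f(n)."""
    a, b = rng.randint(-3, 3), rng.randint(-3, 3)
    return (lambda n: a * n + b), f"f(n) = {a}*n + {b}"

def accepts(conjecture, evidence):
    """Evaluator ("Bob"): accepts a proposed reason only if it
    survives checking against all the shared evidence."""
    return all(conjecture(n) == fn for n, fn in evidence)

evidence = [(n, 2 * n + 1) for n in range(5)]  # hidden rule: f(n) = 2n + 1
rng = random.Random(0)
for _ in range(1000):
    f, description = propose_conjecture(rng)
    if accepts(f, evidence):
        print("consensus reached:", description)  # f(n) = 2*n + 1
        break
```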
@HappyMathDad 1 year ago
AGI used to be 10 years away, that's improvement
@dr.mikeybee 3 years ago
Is there a way to do transfer learning with transformers? Can the slow weights be re-trained without starting from scratch?
@dr.mikeybee 3 years ago
Tim, you are asking one of my questions in somewhat different wording. Here's my query: in CNNs, pooling effectively changes the size of the sliding window, so the model learns larger and larger features. Is there something like this in transformers?
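[Editor's note: a minimal sketch of the standard receptive-field arithmetic behind the comment above (my own illustration, not from the episode): each pooling or strided layer multiplies the "jump" between input pixels, so later convolutions cover ever larger patches of the input.]

```python
def receptive_field(layers):
    """Receptive field of a stack of layers, each given as
    (kernel_size, stride), measured in input pixels along one axis."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each extra tap reaches `jump` pixels apart
        jump *= stride             # pooling/striding widens the spacing
    return rf

# conv3x3 -> pool2 -> conv3x3 -> pool2 -> conv3x3
print(receptive_field([(3, 1), (2, 2), (3, 1), (2, 2), (3, 1)]))  # 18
```

In a transformer, by contrast, every self-attention layer can already attend across the whole sequence, so there is no analogous hard-coded widening; that is roughly the contrast the question is probing.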
@Isinlor 3 years ago
Does anyone know what is being mentioned at 1:14:55? It's somehow connected to infinite RNNs without back-propagation through time.
@willd1mindmind639 3 years ago
Human reasoning starts at the earliest steps of existence because it takes place in a different part of the brain from, for example, the visual cortex. The visual cortex is, for all intents and purposes, a visual encoding system that converts light waves into neuronal networks, and those neuronal networks represent all the visual details of the real world, stitched together by the brain into a coherent mental picture or projection of the real world.

So, in terms of 'visual intelligence', the features of the real world become inputs to the higher abstract reasoning parts of the brain as neural network "features", which in turn become conceptual entities used in logic and reasoning. A dog is recognized because of the features of fur (itself a collection of features representing individual hair shapes), plus the shapes of the body parts (snout, ears, body pose, legs), plus the number of legs, plus the tail, and so on. The trick is that each of those feature collections (clouds of neural network data) is an input, a parameter, to a higher-order part of the brain that does reasoning and logic, and the weighting of parameters and the relationships between parameters for understanding happen there, not in the visual cortex. The power of that is that it isn't a static set of parameters or a fixed set of weights; it is variable, which is how logic and reasoning are expressed. That ability then carries over to every other aspect of human intelligence.

We see this in humans in the sense that if I draw a stylistic outline of a dog with some of its characteristic shape features, humans recognize it instantly as a dog (e.g. a dog emoji). The thinking area of the brain recalls that the shape of the emoji, even in a single color, matches the features found in real dogs, and as a parameter into the analysis and reasoning area it can be evaluated as "shape of a dog" in a logical, conceptual way, even though there is no actual dog there, just pen/marker shapes on paper, brush strokes in a painting, or pixels on a computer screen. In fact, the brain can handle the idea that the shape of a dog is drawn on paper with a pen as part of this reasoning and understanding process, because the feature encoding of the shapes is separate from the reasoning part.

So these aren't hidden layers of logic, as in most monolithic machine learning models, which are expected to compress all "thinking" into a single logical output based on hidden parameters and hidden weights. Feature layers and weights aren't hidden in the brain. The brain's conceptual ability to reason is an extension of the fact that once features are encoded (a biological analog-to-digital conversion), they become separate from the actual source in the real world. The higher-order parts of the brain use that to make sense of the world, just as your view of the world is actually a projection built from neural data, no longer the same as the light waves that triggered those neurons.
@SimonJackson13 3 years ago
APL to syntax tree translators?
@bethcarey8530 2 years ago
Great question Tim, based on Walid's interview: can transformers 'get there' with enough data, for language understanding? I take Christian's answer to be 'yes', because transformers are currently too small but they can get there. This is at odds with symbolic AI advocates and inventors, because as Walid says, not everything we use to generalize is 'in the data'. And there is a gold-standard blueprint that brains follow to be able to generalize, one that provides our 'common sense', whatever our native language.
@bethcarey8530 2 years ago
I'd love to know what Christian, or any transformer guru, believes is enough training data to necessarily produce the reasoning required for natural language. My math could be wrong, but GPT-3 used ~225x10^9
@ratsukutsi 3 years ago
Excuse me, gentlemen, I have a question to ask: where is Mr Chollet at this time? I'm curious to know.
@dosomething3 3 years ago
TL;DR: Math is a closed system, or as close as possible to being closed, which makes it simpler for a neural network to process than any other system. Reality is the furthest thing from a closed system, hence the most difficult for a neural network to process.
@machinelearningdojowithtim2898 3 years ago
I think it's tempting to think it is, but it is only closed for things you already know. You still need to make conjectures for things you don't know, and this is very open-ended in the absence of the more general frames of reference we currently lack. It is precisely for this reason that Christian thinks mathematics is the litmus test for AGI. I hope Christian will chime in with a comment on this, because I think it gets to the core of the work.
@sabawalid 3 years ago
The conclusion of the paper "Investigating the Limitations of Transformers with Simple Arithmetic Tasks" is that "models cannot learn addition rules that are independent of the length of the numbers seen during training". This is expected, because if you fix the number of digits, the space of that function is finite (it is a finite table). Addition over numbers of varying length is infinite, and, again as in language, when the space is infinite, as in most real problems in cognition, DL has nothing to offer. It is INFINITY all over again that makes ANY data-driven approach nothing more than a crude approximation that can always be adversarially attacked.
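[Editor's note: a toy illustration of the finite-table point above (mine, not from the paper): fixed-width addition can literally be memorised as a lookup table, while variable-length addition requires the carry algorithm, which generalises to any length.]

```python
# "Memorisation": all single-digit additions fit in a finite table.
table = {(a, b): a + b for a in range(10) for b in range(10)}

def add_by_algorithm(xs, ys):
    """Schoolbook addition over digit lists (least-significant digit
    first). Works for any length because it applies the carry rule."""
    out, carry = [], 0
    for i in range(max(len(xs), len(ys))):
        s = (xs[i] if i < len(xs) else 0) + (ys[i] if i < len(ys) else 0) + carry
        out.append(s % 10)  # digit in this position
        carry = s // 10     # carry into the next position
    if carry:
        out.append(carry)
    return out

# 457 + 68 = 525, digits written least-significant first:
print(add_by_algorithm([7, 5, 4], [8, 6]))  # [5, 2, 5]
```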
@MachineLearningStreetTalk 3 years ago
You will enjoy our next episode with Francois Chollet, dropping in the next few days!
@sabawalid 3 years ago
What??? "Mathematics is the process of taking a fuzzy thought and formalizing it" - I can't believe I heard that. Mathematics exists independently of physical reality; thus mathematics is not invented, it is "discovered". We do not "invent" mathematical theorems, we discover them and then learn how to prove them. As just a simple example, we did not invent the fact that 1 + 2 + 3 + ... + n = n(n+1)/2; we simply discovered that fact.
@MachineLearningStreetTalk 3 years ago
I am pretty sure Christian would agree with you - I think you misunderstood. We still make conjectures to discover the mathematics. The arithmetic series you cited is a great example of something Christian would want to use deep learning to discover (if we didn't know it already). The autoformalisation stuff just means converting from language and text into abstract syntax trees.
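[Editor's note: as a concrete illustration of what such a formal target might look like, here is a minimal sketch in Lean 4 of the sum formula from the comment above. The function and theorem names are my own, not from any particular library, and the tactic details may vary by Lean version; the statement uses a factor of 2 to avoid natural-number division.]

```lean
-- 1 + 2 + ... + n, defined by recursion on n.
def sumUpTo : Nat → Nat
  | 0     => 0
  | n + 1 => (n + 1) + sumUpTo n

-- The informal claim "1 + 2 + ... + n = n(n+1)/2", formalised.
theorem gauss_sum (n : Nat) : 2 * sumUpTo n = n * (n + 1) := by
  induction n with
  | zero => rfl
  | succ k ih =>
    show 2 * ((k + 1) + sumUpTo k) = (k + 1) * ((k + 1) + 1)
    rw [Nat.mul_add, ih, ← Nat.add_mul, Nat.add_comm 2 k, Nat.mul_comm]
```

Autoformalisation, as discussed in the episode, aims to produce statements like `gauss_sum` automatically from the natural-language version.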