05.1 - Latent Variable Energy Based Models (LV-EBMs), inference

17,461 views

Alfredo Canziani (冷在)


Comments: 68
@liuculiu8366 · 3 years ago
Really appreciate your generosity in sharing these courses online. We live in the best era.
@alfcnz · 3 years ago
😇😇😇
@hasanrants · 29 days ago
Alfredo, thank you very much for this concrete explanation. Yann's lectures are full of dense knowledge, and I had some doubts and gaps that were filled by this practicum. Appreciated!
@alfcnz · 28 days ago
🥳🥳🥳
@donthulasumanth5415 · 13 days ago
@23:59 The number of trainable parameters in the decoder is 2: one for w1·cos(z) and one for w2·sin(z), respectively.
@alfcnz · 11 days ago
Precisely 🙂
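For reference, a minimal PyTorch sketch of such a two-parameter decoder, ỹ = [w1·cos(z), w2·sin(z)]; the names and initial values here are illustrative, not taken from the course notebook:

```python
import torch

class Decoder(torch.nn.Module):
    """Map a scalar latent z to a 2-D point: ỹ = [w1·cos(z), w2·sin(z)]."""
    def __init__(self):
        super().__init__()
        self.w1 = torch.nn.Parameter(torch.tensor(1.0))  # scales the cos component
        self.w2 = torch.nn.Parameter(torch.tensor(1.0))  # scales the sin component

    def forward(self, z):
        return torch.stack((self.w1 * torch.cos(z), self.w2 * torch.sin(z)), dim=-1)

dec = Decoder()
print(sum(p.numel() for p in dec.parameters()))  # → 2 trainable parameters
```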
@punkbuster89 · 3 years ago
“We gonna actually see next week how do we learn shi... eeerm stuff...” cracked me up XD. Thanks for the amazing course BTW, really enjoying it!
@alfcnz · 3 years ago
🥳🥳🥳
@bibhabasumohapatra · 1 year ago
I have watched this video 5-6 times over the last year. But now that I understand 50% of it, I was like: F... Insane... Amazing!
@alfcnz · 1 year ago
Check out the 2023 edition! It’s much better! 🥳🥳🥳
@abdulmajidmurad4667 · 3 years ago
Thanks, Alfredo (pretty cool plots, BTW).
@alfcnz · 3 years ago
😍😍😍
@soumyasarkar4100 · 3 years ago
Wow, what cool visualisations!!
@alfcnz · 3 years ago
🎨🖌️👨🏼‍🎨
@kseniiaikonnikova576 · 3 years ago
yay, new video! 😍 Alfredo, many thanks!
@alfcnz · 3 years ago
🥳🥳🥳
@НиколайНовичков-е1э · 3 years ago
Thank you, Alfredo!
@alfcnz · 3 years ago
[in Bulgarian] Welcome, Nikolay. 😊😊😊
@НиколайНовичков-е1э · 3 years ago
It's in another language :))))
@alfcnz · 3 years ago
[in Russian] Oops, do you speak Russian?
@НиколайНовичков-е1э · 3 years ago
Yep, it's my native language. I'm from Russia :))
@alfcnz · 3 years ago
@@НиколайНовичков-е1э Haha, okay, I got the language wrong earlier 😅😅😅
@rsilveira79 · 3 years ago
Awesome material, really looking forward to the next classes. Thanks for all the effort you put into designing these classes.
@alfcnz · 3 years ago
You're welcome 😁
@bibhabasumohapatra · 1 year ago
Basically, in layman's terms, we are choosing the best y_pred out of n candidate y_pred's for each ground truth y. Right?
@alfcnz · 6 months ago
Yes, that’s correct! And we need to do so because otherwise we would be learning to predict the average target y.
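For reference, a toy sketch of that selection step, and of why averaging fails; all values here are made up for illustration:

```python
import torch

z = torch.linspace(0, 2 * torch.pi, 24)                   # 24 latent values
y_hat = torch.stack((torch.cos(z), torch.sin(z)), dim=1)  # candidate ŷ's on the unit circle
y = torch.tensor([0.9, 0.1])                              # one ground-truth target

E = ((y_hat - y) ** 2).sum(dim=1)  # energy of each candidate wrt y
best = y_hat[E.argmin()]           # selected ŷ: the closest point on the manifold
avg = y_hat.mean(dim=0)            # what plain regression would drift towards

print(best)  # on the circle, near y
print(avg)   # near the origin: averaging collapses the manifold
```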
@pastrop2003 · 3 years ago
Thank you, Alfredo, great video. I have been reading about energy models for a few weeks already and still have a nagging question: is the energy function a generalized loss function? I keep thinking that I can reframe any traditional neural network loss as an energy function. What am I missing here?
@alfcnz · 3 years ago
Next episode I'll talk about the loss. Stay tuned.
@robinranabhat3125 · 1 year ago
THANK YOU!
@alfcnz · 1 year ago
You're welcome! 🥰🥰🥰
@chetanpandey8722 · 1 year ago
Thank you for making such amazing videos. I have a doubt. Whenever you talk about optimizing the energy function to find the minimum value, you say that we should be using gradient descent and not stochastic gradient descent. In my understanding, in gradient descent we compute the gradients using the whole dataset and then make an update, while in the stochastic case we take random data points to compute the gradient and then make the update. So I am not able to understand what the problem with stochastic gradient descent is.
@alfcnz · 10 months ago
The energy is a scalar value for a given input x, y, z. You want to minimise this energy, for example wrt z. There's nothing stochastic here. When training a model, we minimise the loss by following a noisy gradient computed from a per-sample (or per-batch) loss.
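For reference, a hypothetical sketch of that inference step: plain (non-stochastic) gradient descent on z for one given y, with the decoder weights frozen. The weights and values are assumptions for illustration:

```python
import torch

w1, w2 = 1.5, 0.5                          # frozen decoder weights (assumed values)
y = torch.tensor([1.0, 0.3])               # the given observation
z = torch.tensor(0.0, requires_grad=True)  # latent initialisation

def energy(z):
    """Reconstruction energy E(y, z) = ‖y − ỹ(z)‖² for the fixed y above."""
    y_tilde = torch.stack((w1 * torch.cos(z), w2 * torch.sin(z)))
    return ((y - y_tilde) ** 2).sum()

opt = torch.optim.SGD([z], lr=0.1)  # 'SGD' in name only: full gradient, no sampling
for _ in range(100):
    opt.zero_grad()
    energy(z).backward()
    opt.step()

print(z.item(), energy(z).item())  # ž = argmin_z E(y, z) and the resulting energy
```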
@mikhaeldito · 3 years ago
Office hours, please!
@alfcnz · 3 years ago
Okay, okay.
@francistembo650 · 3 years ago
First comment. Couldn't resist my favourite teacher's video.
@alfcnz · 3 years ago
❤️❤️❤️
@blackcurrant0745 · 2 years ago
At 17:56 you say there are 48 different z's; later, at 29:43, there are only 24 of them; and later yet one can count 48 lilac z points in the graph at 42:45. What's the reason for changing the number of z points back and forth?
@alfcnz · 2 years ago
Good catch! z is continuous; the number of distinct values I pick is arbitrary. In the following edition of these slides there are no distinct z points any more: they are shown as a continuous line, so there are infinitely many z. Why 24 and 48? The 24 are my y: I generated them with 24 equally spaced z. When I show the ‘continuous’ manifold, I should show more points than training samples, so I doubled them, hence 48. It looks like I didn't use the doubled version for the plot with the 24 squares. In the following edition of this lesson (not online, because only minor changes have been made and these videos take me forever to put together) and in the book (which replaces my video editing time) there are no discrete dots for the latents any more.
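For reference, a rough matplotlib sketch of that plotting choice; the weights and point counts are assumptions, not the lecture's actual values:

```python
import numpy as np
import matplotlib.pyplot as plt

w1, w2 = 1.5, 0.5  # assumed decoder weights

z_train = np.linspace(0, 2 * np.pi, 24, endpoint=False)  # 24 equally spaced z → the 24 training y
z_dense = np.linspace(0, 2 * np.pi, 480)                 # many more z → a 'continuous' manifold

plt.plot(w1 * np.cos(z_dense), w2 * np.sin(z_dense), lw=1)                     # the manifold
plt.scatter(w1 * np.cos(z_train), w2 * np.sin(z_train), marker='s', zorder=3)  # training squares
plt.gca().set_aspect('equal')
plt.show()
```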
@WolfgangWaltenberger · 3 years ago
These are super cool, pedagogical videos. I wonder what software stack you guys are using to produce them.
@alfcnz · 3 years ago
Hum, PowerPoint, LaTeXiT, matplotlib, Zoom, Adobe After Effects and Premiere.
@sutharsanmahendren1071 · 3 years ago
Thank you for your great explanation and for making your course material available to all. I have a small doubt at 45:25, where you compute the energy of all the inferences from the z samples. Is it right to use Euclidean distance to compute the distance from the reference point (y) to all the points (ŷ) on the manifold? Would it be more appropriate if points from the bottom half of the manifold resulted in more energy than those from the top half?
@alfcnz · 3 years ago
E is a function of y and z. Given a y, say y', E is a function of z only. What I'm computing there is E(y', z) for a few values of z. In this example, for every z the decoder will give me ỹ. Finally, the energy function of choice, in this case, is the reconstruction error.
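For reference, a sketch of that computation, reusing the assumed frozen two-weight decoder from above; the grid of z values is arbitrary:

```python
import torch

w1, w2 = 1.5, 0.5                   # frozen decoder weights (assumed)
y_prime = torch.tensor([1.0, 0.3])  # a given y'

z = torch.linspace(0, 2 * torch.pi, 48)                               # a few values of z
y_tilde = torch.stack((w1 * torch.cos(z), w2 * torch.sin(z)), dim=1)  # decoder outputs ỹ
E = ((y_prime - y_tilde) ** 2).sum(dim=1)                             # E(y', z) per z

print(E.min().item(), z[E.argmin()].item())  # the minimum over z is the (free) energy F(y')
```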
@sutharsanmahendren1071 · 3 years ago
@@alfcnz Thank you so much for your reply. I understand that reconstruction error is one of the choices for the energy function here.
@alfcnz · 3 years ago
@@sutharsanmahendren1071 okay, so your question is… not yet answered? Or did I nail it above?
@sutharsanmahendren1071 · 3 years ago
@@alfcnz Actually my question is: is reconstruction error the best choice for an EBM? (Funny ideas: construct a KNN graph with the ŷ manifold and the observation y, and find the shortest path from y to all other ŷ; or, instead of computing the energy between two points, can't we measure the energy between the two distributions formed by y and ŷ in an EBM?)
@keshavsingh489 · 3 years ago
Great explanation. Just one question: why is it called an energy function, when it looks just like a loss function with a latent variable?
@alfcnz · 3 years ago
A “loss” measures the network's performance and is minimised during training. We'll see more about this in the next episode. An “energy” is an actual output produced by a model and is used during inference. In this episode we didn't train anything, yet we still used gradient descent to perform inference of the latent variables.
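For reference, one way to see the distinction in code (a hypothetical sketch, not the course notebook): the same kind of scalar is minimised, but with respect to different variables at different times:

```python
import torch

w = torch.tensor(1.0, requires_grad=True)  # model parameter: learned during training
z = torch.tensor(0.0, requires_grad=True)  # latent variable: inferred per sample
y = torch.tensor(0.5)                      # one observation

def energy():
    return (y - w * torch.cos(z)) ** 2     # toy reconstruction energy E(y, z; w)

# Inference (this episode): gradient descent on z only; w stays frozen
infer = torch.optim.SGD([z], lr=0.1)
for _ in range(50):
    infer.zero_grad()     # zeros z.grad only; w is never updated here
    energy().backward()
    infer.step()

# Training (next episode): a loss built from energies is minimised wrt w instead
```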
@keshavsingh489 · 3 years ago
Thank you so much for explaining. Looking forward to the next lecture.
@flaskapp9885 · 3 years ago
Amazing video, Alfredo :)
@alfcnz · 3 years ago
And more to come! 😇😇😇
@flaskapp9885 · 3 years ago
@@alfcnz Thanks! Pls make a guide video on becoming an NLP engineer or something :) there's nothing of that sort for NLP engineers on the internet :)
@alfcnz · 3 years ago
@@flaskapp9885 NLP engineering? 🤔🤔🤔 What is it?
@flaskapp9885 · 3 years ago
@@alfcnz Yes sir, NLP engineering. I'm thinking of doing that. :)
@alfcnz · 3 years ago
@@flaskapp9885 I don't know what that is. Can you explain?
@datonefaridze1503 · 2 years ago
You explain like Andrew Ng; giving examples is essential for proper understanding. Thank you so much, great content!
@alfcnz · 2 years ago
❤️❤️❤️
@mythorganizer4222 · 3 years ago
Hello Mr. Canziani!
@alfcnz · 3 years ago
“Prof” Canziani 😜
@mythorganizer4222 · 3 years ago
@@alfcnz I am sorry, Professor Canziani. I want to tell you that your videos are the best learning resource for people who want to study deep learning but can't afford it. Oh, your videos and also Deep Learning by Ian Goodfellow; it is a very good book. Thank you for all the effort you put in, sir :D
@alfcnz · 3 years ago
😇😇😇
@anondoggo · 2 years ago
So inference means we're given x and y and we want to predict an energy score E(x, y)? I thought LV-EBMs were supposed to produce predictions for y; better go back to the slides :/
@anondoggo · 2 years ago
OK, so I think what's going on is: during training, y is the target for which we should give low E; during inference, we're choosing a y that gives the lowest energy, so y is an input. Mind is blown :/
@alfcnz · 2 years ago
I think your sentence is broken. «during inference…» we'd like to test how far a given y is from the data manifold.
@kevindoran9031 · 2 years ago
How we learn s*** 😂
@alfcnz · 2 years ago
Without a timestamp it's hard to double check 🥺🥺🥺