Stanford CS236: Deep Generative Models I 2023 I Lecture 12 - Energy Based Models

617 views

Stanford Online

18 days ago

For more information about Stanford's Artificial Intelligence programs visit: stanford.io/ai
To follow along with the course, visit the course website:
deepgenerativemodels.github.io/
Stefano Ermon
Associate Professor of Computer Science, Stanford University
cs.stanford.edu/~ermon/
Learn more about the online course and how to enroll: online.stanford.edu/courses/c...
To view all online courses and programs offered by Stanford, visit: online.stanford.edu/

Comments: 1
@CPTSMONSTER · 1 day ago
5:40 Contrastive divergence: the gradient of the log partition function w.r.t. theta is easy to estimate if samples from the model can be accessed (the identity is written out below)
8:25 Training energy-based models by maximum likelihood is feasible to the extent that samples can be generated, via MCMC
14:00? MCMC methods, detailed balance condition
22:00? log x = x' term
23:25? Computing the log-likelihood is easy for EBMs
24:15 Very expensive to train EBMs: every training data point requires a sample generated from the model, and generating a sample involves Langevin MCMC with on the order of 1000 steps (a sampling sketch follows below)
37:30 The derivative of the KL divergence between two densities convolved with Gaussian noise, taken w.r.t. the size of the noise, is the Fisher divergence
38:40 Score matching, theta is continuous
47:10 Score matching derivation, independent of p_data
51:15? Equivalent to the Fisher divergence
52:35 Interpretation of the loss function: the first term (which is minimized) makes data points stationary points (local minima or maxima) of the log-likelihood, so small perturbations of a data point should not increase the log-likelihood by much; the second term makes data points local maxima rather than minima (a loss sketch follows below)
55:30? Backprop n times to compute the Hessian
56:20 Proved equivalence to the Fisher divergence; infinite data would yield the exact data distribution
57:45 Fitting an EBM has a similar flavor to GANs: instead of contrasting data with samples from the model, contrast it with noise
1:00:10 Instead of letting the discriminator be an arbitrary neural network, define it with the same form as the optimal discriminator. Rather than feeding x into an arbitrary network, evaluate the likelihoods under the model p_theta and the noise distribution. The optimal p_theta must match p_data because of the pre-defined form of the discriminator. Parameterize p_theta with an EBM. (In a GAN setting, the discriminator itself would be parameterized by a neural network.) (an NCE loss sketch follows below)
1:03:00? Classifiers in noise correction
1:11:30 The loss function is independent of sampling, but getting samples from the trained EBM still requires Langevin MCMC steps
1:19:00 GAN vs NCE: the generator is trained in a GAN, whereas the noise distribution is fixed in NCE but its likelihood must be evaluable
1:22:20 Noise contrastive estimation where the noise distribution is a flow that is learned adversarially
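
To make the 5:40 note concrete, here is the standard maximum-likelihood gradient for an EBM p_theta(x) = exp(f_theta(x)) / Z(theta); the second (partition-function) term is the expectation that contrastive divergence approximates with samples drawn from the model:

\nabla_\theta \log p_\theta(x) = \nabla_\theta f_\theta(x) - \nabla_\theta \log Z(\theta)
                               = \nabla_\theta f_\theta(x) - \mathbb{E}_{x' \sim p_\theta}\left[ \nabla_\theta f_\theta(x') \right]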
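
For the 24:15 and 1:11:30 notes, a minimal sketch of unadjusted Langevin MCMC for drawing approximate samples from an EBM. Here energy_fn is a hypothetical callable returning E_theta(x) per sample, with p_theta(x) proportional to exp(-E_theta(x)); the step size and step count are illustrative, not values from the lecture.

import torch

def langevin_sample(energy_fn, x_init, n_steps=1000, step_size=1e-2):
    # Unadjusted Langevin dynamics:
    #   x <- x - (eps/2) * grad_x E(x) + sqrt(eps) * z,  z ~ N(0, I)
    x = x_init.clone().requires_grad_(True)
    for _ in range(n_steps):
        energy = energy_fn(x).sum()                # sum over the batch so the gradient has x's shape
        grad_x, = torch.autograd.grad(energy, x)
        with torch.no_grad():
            x = x - 0.5 * step_size * grad_x + (step_size ** 0.5) * torch.randn_like(x)
        x.requires_grad_(True)
    return x.detach()

# usage: samples = langevin_sample(my_energy_fn, torch.randn(64, 2))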
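
For the 47:10-55:30 notes, a sketch of Hyvarinen's score-matching objective with an exact Hessian trace, which is where the "backprop n times" cost comes from. log_p_fn is a hypothetical callable returning the unnormalized log-density f_theta(x) per sample; the partition function drops out because only gradients w.r.t. x are taken.

import torch

def score_matching_loss(log_p_fn, x):
    # E[ 0.5 * ||grad_x log p_theta(x)||^2 + tr(Hessian_x log p_theta(x)) ]
    x = x.clone().requires_grad_(True)
    log_p = log_p_fn(x).sum()
    score, = torch.autograd.grad(log_p, x, create_graph=True)    # shape (batch, dim)
    sq_norm = 0.5 * (score ** 2).sum(dim=-1)
    trace = torch.zeros(x.shape[0], device=x.device)
    for i in range(x.shape[-1]):                                  # one extra backprop per input dimension
        hess_row, = torch.autograd.grad(score[:, i].sum(), x, create_graph=True)
        trace = trace + hess_row[:, i]                            # accumulate diagonal Hessian entries
    return (sq_norm + trace).mean()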
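
For the 57:45-1:11:30 notes, a sketch of the noise contrastive estimation loss with the discriminator fixed to its optimal form, so the classifier's logit is log p_theta(x) - log q(x). The names log_p_model (the EBM's log-density, with the log partition function treated as a learnable scalar) and log_p_noise (the exact log-density of the noise distribution q) are hypothetical.

import torch
import torch.nn.functional as F

def nce_loss(log_p_model, log_p_noise, x_data, x_noise):
    # Binary cross-entropy with the logit pinned to log p_theta(x) - log q(x);
    # data points are labeled 1, noise samples are labeled 0.
    logit_data = log_p_model(x_data) - log_p_noise(x_data)
    logit_noise = log_p_model(x_noise) - log_p_noise(x_noise)
    loss_data = F.softplus(-logit_data).mean()    # -log sigmoid(logit) on data
    loss_noise = F.softplus(logit_noise).mean()   # -log(1 - sigmoid(logit)) on noise
    return loss_data + loss_noise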