Stanford CS236: Deep Generative Models I 2023 I Lecture 13 - Score Based Models


Stanford Online


1 month ago

For more information about Stanford's Artificial Intelligence programs visit: stanford.io/ai
To follow along with the course, visit the course website:
deepgenerativemodels.github.io/
Stefano Ermon
Associate Professor of Computer Science, Stanford University
cs.stanford.edu/~ermon/
Learn more about the online course and how to enroll: online.stanford.edu/courses/c...
To view all online courses and programs offered by Stanford, visit: online.stanford.edu/

Comments: 2
@user-zr4ns3hu6y (1 month ago)
Best explanation!
@CPTSMONSTER (27 days ago)
2:00 Summary
8:00 EBM training: maximum likelihood requires estimating the partition function, and contrastive divergence requires generating samples (Langevin MCMC with ~1000 steps); instead, minimize the Fisher divergence (score matching) rather than the KL divergence
19:15 EBMs parameterize a conservative vector field, the gradient of an underlying scalar function; score-based models generalize this to an arbitrary vector field. EBMs directly model the log-likelihood, score-based models directly model the score (the vector field need not be the gradient of a scalar energy)
29:15? Backprops in EBMs: the derivative of f_theta is s_theta, then the Jacobian of s_theta
34:00 Fisher divergence between the model and the perturbed data distribution
36:25 Noisy data distribution q removes the trace of the Jacobian from the calculation
39:30 Linearity of the gradient
42:15 Estimating the score of the noise-perturbed data density is equivalent to estimating the score of the transition kernel (the Gaussian noise density)
44:10 Trace of the Jacobian removed from the estimation; the loss function is a denoising objective (see the sketch below)
46:05 Sigma of the noise distribution should be as small as possible
48:55? Stein unbiased risk estimator trick: evaluating the quality of an estimator without knowing the ground truth; denoising objective
51:55 Denoising score matching: the two objectives are equivalent up to a constant, so minimizing the denoising objective is equivalent to minimizing the objective that estimates the score of the distribution convolved with Gaussian noise
52:20? Individual conditionals
53:25 Generative modeling reduced to denoising
55:35? Tweedie's formula, an alternative derivation of the denoising objective
58:25 Interpretation of the equations: the conditional of x given the perturbed x is proportional to the joint density of x and the perturbed x; q_sigma is the marginal (integral) of the joint density; Tweedie's formula expresses x in terms of the perturbed x plus an optimal adjustment (the gradient of log q_sigma, which relates to the density of x conditional on the perturbed x)
1:01:35? Jacobian-vector products (directional derivatives) are efficient to compute with backprop
1:04:20 Sliced score matching needs a single backprop (a directional derivative); without slicing, d backprops are needed (see the sketch below)
1:07:00 Sliced score matching is not on perturbed data
1:12:00 Langevin MCMC: sampling with the score (see the sketch below)
1:14:35 Real-world data tends to lie on a low-dimensional manifold
1:21:00 Langevin mixes too slowly between modes; the mixture weights disappear when taking gradients of the log density
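A minimal PyTorch sketch of the denoising score matching objective summarized above (42:15-53:25): perturb the data with Gaussian noise and regress the model's score onto the score of the Gaussian transition kernel. The score_net(x) interface is illustrative, not from the lecture; in practice the network is often also conditioned on sigma.

```python
import torch

def denoising_score_matching_loss(score_net, x, sigma):
    # Perturb the data with Gaussian noise: x_tilde ~ N(x, sigma^2 I).
    noise = torch.randn_like(x)
    x_tilde = x + sigma * noise
    # Score of the transition kernel q_sigma(x_tilde | x) is -(x_tilde - x) / sigma^2.
    target = -(x_tilde - x) / sigma ** 2
    pred = score_net(x_tilde)  # s_theta(x_tilde), same shape as x
    # Squared error, summed over data dimensions, averaged over the batch.
    return 0.5 * ((pred - target) ** 2).flatten(1).sum(dim=1).mean()
```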
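A sketch of sliced score matching (1:01:35-1:07:00) under the same illustrative score_net(x) interface: the trace of the Jacobian is replaced by random projections v^T (ds/dx) v, each obtained with a single backprop.

```python
import torch

def sliced_score_matching_loss(score_net, x, n_projections=1):
    x = x.detach().requires_grad_(True)
    s = score_net(x)  # s_theta(x), same shape as x
    loss = 0.0
    for _ in range(n_projections):
        v = torch.randn_like(x)  # random projection direction
        sv = (s * v).flatten(1).sum(dim=1)  # v^T s_theta(x)
        # One backprop gives grad_x (v^T s); dotting with v yields v^T J v.
        grad_sv = torch.autograd.grad(sv.sum(), x, create_graph=True)[0]
        vjv = (grad_sv * v).flatten(1).sum(dim=1)
        loss = loss + (vjv + 0.5 * sv ** 2).mean()
    return loss / n_projections
```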
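A sketch of sampling with unadjusted Langevin dynamics (1:12:00) given a trained score model; the step size and number of steps are illustrative assumptions.

```python
import torch

@torch.no_grad()
def langevin_sample(score_net, x_init, step_size=1e-5, n_steps=1000):
    # x_{t+1} = x_t + (eps / 2) * s_theta(x_t) + sqrt(eps) * z_t,  z_t ~ N(0, I)
    x = x_init.clone()
    for _ in range(n_steps):
        z = torch.randn_like(x)
        x = x + 0.5 * step_size * score_net(x) + (step_size ** 0.5) * z
    return x
```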