Stanford CS236: Deep Generative Models I 2023 I Lecture 14 - Energy Based Models

Stanford Online

For more information about Stanford's Artificial Intelligence programs visit: stanford.io/ai
To follow along with the course, visit the course website:
deepgenerativemodels.github.io/
Stefano Ermon
Associate Professor of Computer Science, Stanford University
cs.stanford.edu/~ermon/
Learn more about the online course and how to enroll: online.stanford.edu/courses/c...
To view all online courses and programs offered by Stanford, visit: online.stanford.edu/

Comments: 1
@CPTSMONSTER
7:00 Sliced score matching is slower than denoising score matching because it requires taking derivatives
13:45 The denoising objective favors small sigma, but the smallest sigma is not the best choice for perturbing data when sampling
27:15 Annealed Langevin dynamics, e.g. 1000 sigmas
38:50 Fokker-Planck PDE; the scores at different noise levels are interdependent, but enforcing this is intractable, so the loss functions (scores) are treated as independent
45:00? Weighted combination of denoising score matching losses: estimate the score of the data perturbed by each sigma_i, then take a weighted combination of the per-level losses (see the noise-conditional training sketch after this list)
48:15 As efficient as estimating a single unconditional score network; the joint estimation of all the scores is amortized by a single noise-conditional score network
49:50? Noise levels go from smallest to largest during training, and from largest to smallest during inference (Langevin); see the annealed Langevin sketch below
52:10? Notation: p_sigma_i is equivalent to the earlier q (the perturbed-data distribution)
57:20 Mixture denoising score matching is expensive at inference time (many Langevin steps); the sampling computation graph is deep, but it does not have to be unrolled at training time, since no samples are generated during training
1:07:00 An SDE describes the perturbation iterations in continuous time
1:08:50 Inference (largest to smallest noise) is described by a reverse SDE that depends only on the score functions of the noise-perturbed data densities
1:12:00 Euler-Maruyama discretizes time to numerically solve the SDE (see the reverse-SDE sketch below)
1:13:25 Numerically integrating the reverse SDE goes from noise to data
1:15:00? SDE predictor with a Langevin corrector
1:20:25 Infinitely deep computation graph (refer to 57:20)
1:21:45 It is possible to convert the SDE model to a normalizing flow and obtain latent variables
1:22:00 The SDE can be described by an ODE with the same marginals (see the probability-flow sketch below)
1:23:15 This machinery defines a continuous-time normalizing flow whose invertible mapping is given by solving an ODE; ODE trajectories with different initial conditions can never cross (hence invertible, a normalizing flow); the model is trained not by maximum likelihood but by score matching; it is a flow of infinite depth, and likelihoods can still be obtained
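A minimal sketch of the weighted combination of denoising score matching losses mentioned at 45:00. The names here (ncsn_loss, score_net, sigmas) are illustrative, not from the lecture; score_net(x_tilde, sigma) is assumed to return an estimate of grad_x log p_sigma(x_tilde), and the lambda(sigma) = sigma^2 weighting is one common choice for balancing the noise levels.

```python
import torch

def ncsn_loss(score_net, x, sigmas):
    """Weighted sum of denoising score matching losses over noise levels.

    x: batch of data, shape (batch, dim).
    sigmas: 1-D tensor of noise scales sigma_1, ..., sigma_L.
    """
    loss = 0.0
    for sigma in sigmas:
        noise = torch.randn_like(x) * sigma
        x_tilde = x + noise                       # perturb data with Gaussian noise
        target = -noise / sigma**2                # = grad_x log q_sigma(x_tilde | x)
        pred = score_net(x_tilde, sigma)          # noise-conditional score estimate
        # lambda(sigma) = sigma^2 keeps the per-level losses on a comparable scale
        loss = loss + sigma**2 * ((pred - target) ** 2).sum(dim=-1).mean()
    return loss / len(sigmas)
```

One network conditioned on sigma handles all noise levels, which is the amortization point made at 48:15.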
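A sketch of annealed Langevin dynamics sampling (27:15, 49:50): run Langevin steps at each noise level, sweeping from the largest sigma to the smallest. The function name, the step-size schedule eps * (sigma / sigma_min)^2, and the defaults are assumptions for illustration.

```python
import torch

@torch.no_grad()
def annealed_langevin(score_net, sigmas, shape, steps_per_level=100, eps=2e-5):
    """Annealed Langevin dynamics; sigmas is assumed sorted from largest to smallest."""
    x = torch.randn(shape) * sigmas[0]            # start from broad noise
    for sigma in sigmas:                          # largest -> smallest noise level
        alpha = eps * (sigma / sigmas[-1]) ** 2   # step size scaled per level
        for _ in range(steps_per_level):
            z = torch.randn_like(x)
            x = x + 0.5 * alpha * score_net(x, sigma) + (alpha ** 0.5) * z
    return x
```

This is the inference-time cost discussed at 57:20: many Langevin steps per level, a deep computation graph that never appears during training.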
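A sketch of the Euler-Maruyama discretization of a reverse-time SDE (1:12:00, 1:13:25), written for a variance-exploding SDE where the forward drift is zero and g(t)^2 = d[sigma(t)^2]/dt. sigma_fn(t) is an assumed noise schedule; the finite-difference approximation of g(t)^2 is an illustrative choice.

```python
import torch

@torch.no_grad()
def euler_maruyama_reverse(score_net, sigma_fn, shape, T=1.0, n_steps=1000):
    """Integrate the reverse-time VE SDE
        dx = -g(t)^2 * score(x, t) dt + g(t) dw_bar
    from t = T (pure noise) down to t = 0 (data)."""
    dt = T / n_steps
    x = torch.randn(shape) * sigma_fn(T)                        # sample from the prior at t = T
    for i in range(n_steps, 0, -1):
        t = i * dt
        g2 = (sigma_fn(t) ** 2 - sigma_fn(t - dt) ** 2) / dt    # ~ d sigma^2 / dt
        score = score_net(x, sigma_fn(t))
        z = torch.randn_like(x)
        x = x + g2 * score * dt + (g2 * dt) ** 0.5 * z          # one reverse Euler-Maruyama step
    return x
```

The predictor-corrector variant at 1:15:00 would interleave each of these steps with a few Langevin corrector steps at the current noise level.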
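A sketch of the probability-flow view (1:22:00, 1:23:15): the same marginals can be produced by a deterministic ODE, dx/dt = -1/2 * g(t)^2 * score(x, t) for the VE case, so integrating it defines an invertible map, i.e. a continuous-time normalizing flow. The function below reuses the assumed score_net and sigma_fn from the previous sketch and is a simple Euler integrator, not a likelihood computation.

```python
import torch

@torch.no_grad()
def probability_flow_ode(score_net, sigma_fn, shape, T=1.0, n_steps=1000):
    """Deterministic probability-flow ODE with the same marginals as the VE SDE."""
    dt = T / n_steps
    x = torch.randn(shape) * sigma_fn(T)
    for i in range(n_steps, 0, -1):
        t = i * dt
        g2 = (sigma_fn(t) ** 2 - sigma_fn(t - dt) ** 2) / dt
        x = x + 0.5 * g2 * score_net(x, sigma_fn(t)) * dt       # Euler step, no noise term
    return x
```

Because the map is deterministic and invertible, latent variables and (with the usual change-of-variables machinery) likelihoods can be obtained, even though the model was trained by score matching rather than maximum likelihood.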