suuuucchhhh a great talk. really clear, thank you.
@nikitagupta9369 · 7 years ago
Loved the talk, it helped me understand the intuition of HMC. Thanks
@ProfessorBeautiful · 8 years ago
Excellent talk; thank you. And yes, to respond to your question at the end, it was that clear.
@deepbayes6808 · 7 years ago
Amazing talk.
@stevebez2767 · 7 years ago
?watson
@oflasch · 7 years ago
Great talk, thank you!
@Samanthaz · 4 years ago
Are your slides available? Perhaps with the lecture transcript for each slide?
@stevebez2767 · 7 years ago
Brands Hatch=Sim C egg,sam,pools? (Monile Radiation)
@stevebez2767 · 7 years ago
[ibm 'hmc'?]
@alute5532 · 1 year ago
Notes on the talk:
- Bayesian inference in the wide-data regime: adding a prior as a regularizer gives the math extra help, and the likelihood and prior together quantify our total information.
- Any statistical question becomes a manipulation of the posterior, which reduces to computing an expectation, i.e. an integral. We approximate it numerically, since exact calculation is hard in high dimensions.
- To estimate expectations, identify where to focus computation: where the biggest contributions to those expectations come from. Integrating a density means weighing it by volume, and in high dimensions volume grows exponentially fast (lots of corners).
- Two competing forces: (1) volume, which pulls focus toward large radii, and (2) density, which concentrates at the mode. For a Gaussian, say, they balance out in between; that region of concentration is the typical set, a thin shell around the mode.
- A Markov chain is a way of exploring a set like that: a random transition map, so after each jump the next point comes from a new distribution. If we engineer the chain to preserve the target distribution, it carries us to the typical set and starts exploring that surface (in high dimensions every starting point is far from it). The end result is a nice quantification of where the probability really is.
- To compute any expectation, average the function over the Markov chain history: Markov chain Monte Carlo (MCMC). Run long enough and we always converge to the true expectation (always the right answer, asymptotically).
- Two questions: how well can we do it, and how quickly do we converge? If transitions are expensive, as in the wide-data regime, we exhaust our computational resources long before completing the exploration. Partial exploration means bias (missing probability), and lots of MCMC algorithms are like that.
- Metropolis: (1) proposal: add some noise to the current point; (2) decision: accept or reject the proposal based on where we came from. Moves toward the mode are accepted; moves away from the mode are (probabilistically) rejected. But in high dimensions volume is weird: this doesn't scale.
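A minimal sketch of the random-walk Metropolis step described in the notes above, assuming a 1-D standard-normal target; the function names and settings are my own, not from the talk:

```python
import math
import random

def log_target(q):
    # Log-density of a standard normal target (an assumption for this sketch).
    return -0.5 * q * q

def metropolis(n_samples, step=1.0, q0=0.0, seed=42):
    """Random-walk Metropolis: propose by adding noise, then accept/reject."""
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        # 1. Proposal: perturb the current point with Gaussian noise.
        q_new = q + rng.gauss(0.0, step)
        # 2. Decision: accept with probability min(1, pi(q_new) / pi(q)).
        if math.log(rng.random()) < log_target(q_new) - log_target(q):
            q = q_new
        samples.append(q)
    return samples

samples = metropolis(50_000)
mean = sum(samples) / len(samples)  # should approach the true expectation E[q] = 0
```

In one dimension this works fine; the talk's point is that as the dimension grows, almost all proposals land outside the typical set and get rejected unless `step` shrinks toward zero.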
- Outside the typical set there is far more volume, so random proposals mostly get rejected. The only fix is to shrink the perturbation to a tiny neighborhood, and then we don't go anywhere: tiny transitions, very inefficient exploration, very poor MCMC.
- So avoid guess-and-check where the acceptance probability is small. Use a transition that knows the shape of the surface, i.e. how to stay on the contour. And we need this automated.
- How do we extract information about the surface? Hamiltonian Monte Carlo uses differential geometry. Use a vector field: assign a direction to every point. If the direction is right, we don't guess anymore; every new point leads to others in the same typical set.
- How: take the gradient of the target density. The gradient is also a vector field, but following it just leads to the mode (not useful on its own). The gradient has to be corrected, and differential geometry corrects it automatically.
- Physics analogy: a planet orbiting in a gravitational field. With no momentum we fall straight in; transverse momentum keeps us from falling; too much momentum and gravity won't catch us at all. The key is to add momentum in just the right way.
- For every parameter q, expand with a momentum p to lift the target distribution onto the joint space, with probabilistic structure pi(q, p) built from a conditional distribution over the momenta. Marginalizing the joint distribution always recovers the target: project down and get rid of the momenta.
- Use a symplectic integrator: the errors of the transformation from the exact to the approximate flow can be bounded, and the accuracy of a solution is calculated by integrating over all deviations.
- There is a trade-off between the cost of the algorithm and the step size, giving lower and upper bounds on the error. Plot x = average acceptance probability against y = cost: for almost all models the relationship is bounded between two lines, and near 0.6–0.8 the cost is nearly flat, near optimal. So choose the step size so that the average acceptance probability lands in 0.6–0.8.
- This gives intuition for how to (1) choose the kinetic energy, (2) choose the integration time, and (3) choose the step size, fully automated.
- This decouples the two steps of inference: (1) the modeling step, where we choose the prior and likelihood, and (2) the computation step, where we compute those expectations. Changing your model shouldn't mean reimplementing the computation or rethinking your priors.
- Automatic differentiation ensures exact computation of the necessary gradients, and supports (1) control statements (if/else), (2) probability density functions (PDFs and CDFs), (3) linear algebra (addition, multiplication, decompositions), and (4) ODEs (non-stiff and stiff).
- A space equipped with a Lie group gives a flow; the typical set is preserved by a measure-preserving flow.
- Adiabatic Monte Carlo handles multimodal distributions.
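A toy sketch of the lift/integrate/project scheme from these notes, again assuming a 1-D standard-normal target with unit-mass Gaussian momenta; the leapfrog integrator here is one standard symplectic integrator, and all names are my own, not from the talk:

```python
import math
import random

def potential(q):
    # U(q) = -log pi(q) for a standard normal target (assumption for this sketch).
    return 0.5 * q * q

def grad_potential(q):
    # dU/dq: the gradient that HMC follows and corrects with momentum.
    return q

def leapfrog(q, p, step, n_steps):
    """Symplectic (leapfrog) integration of Hamilton's equations."""
    p -= 0.5 * step * grad_potential(q)      # half step for momentum
    for _ in range(n_steps - 1):
        q += step * p                        # full step for position
        p -= step * grad_potential(q)        # full step for momentum
    q += step * p
    p -= 0.5 * step * grad_potential(q)      # final half step for momentum
    return q, p

def hmc(n_samples, step=0.2, n_steps=10, q0=0.0, seed=1):
    rng = random.Random(seed)
    q = q0
    samples = []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)              # lift: draw a fresh momentum
        h0 = potential(q) + 0.5 * p * p      # Hamiltonian before the trajectory
        q_new, p_new = leapfrog(q, p, step, n_steps)
        h1 = potential(q_new) + 0.5 * p_new * p_new
        # Accept/reject corrects the integrator's bounded discretization error.
        if math.log(rng.random()) < h0 - h1:
            q = q_new
        samples.append(q)                    # project: discard the momentum
    return samples

samples = hmc(20_000)
var = sum(s * s for s in samples) / len(samples)  # should approach Var[q] = 1
```

Because the leapfrog steps follow the (momentum-corrected) gradient flow, proposals land back in the typical set and the acceptance rate stays high even for much larger moves than random-walk Metropolis would allow; tuning `step` so the average acceptance probability sits near 0.6–0.8 is the heuristic the notes mention.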