I'm doing a PhD in ML and was recently getting into functional analysis and measure theory. Great to see this kind of content at this specific time 😅
@JoelRosenfeld 6 months ago
I’m glad I can be of service! Are there any topics you are wanting to learn more about in particular?
@DistortedV12 6 months ago
This content is what I need as a grad student studying machine learning
@JoelRosenfeld 6 months ago
I’m glad I can help! The next several videos will talk about the Riesz theorem, kernel spaces, and the representer theorem. Then we’ll stop and discuss SVMs and VC dimension. Then back to more Hilbert space stuff, like Fourier series, bases, etc.
@JoelRosenfeld 6 months ago
Tell all your friends, lol!
@octaviusp 4 months ago
Do you have any machine learning theory courses that follow this series of videos? I would be happy to see a machine learning theory course from you.
@JoelRosenfeld 4 months ago
Not a formal course yet. It’s something I plan on doing in the future. For now, I am working on a sort of complete series here on KZbin.
@JoelRosenfeld 6 months ago
Ack! Left that note for myself in the video at 0:49. Whoops!
@tux1968 6 months ago
Thanks for the video, it is helpful! One small nitpick, the background music obscured your voice, and was distracting.
@JoelRosenfeld 6 months ago
I'm sorry you found it distracting. I try to turn it off before we get to any real details. Which part of the video did you find it the worst?
@noahsegal1334 6 months ago
How does this method work on extrapolation? You had defined a domain for the function, but if you then go outside it, the polynomial is likely nowhere near the function, since it is a large-degree polynomial. I think the method is really cool, and the moment matching is a neat probabilistic tie-in. What is the method like when your domain is really large?
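For illustration, here's a minimal sketch of the extrapolation concern (my own toy example, with the target function, degree, and evaluation points chosen arbitrarily — not anything from the video):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200)
y = np.sin(2 * np.pi * x)

# Degree-15 least-squares fit on [0, 1]; Polynomial.fit rescales
# the domain internally, which keeps the fit numerically stable.
p = np.polynomial.Polynomial.fit(x, y, 15)

print(p(0.25))  # close to sin(pi/2) = 1 inside the fitted domain
print(p(2.0))   # large in magnitude, nowhere near sin(4*pi) = 0
```

Inside [0, 1] the fit is excellent, but not far outside it the leading terms dominate and the polynomial runs away from the function.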
@JoelRosenfeld 6 months ago
Threw together this short to answer your question: kzbin.infoMDrjOcmWQA8
@smolboye1878 6 months ago
This is amazing.
@JoelRosenfeld 6 months ago
Thank you!
@cunningham.s_law 6 months ago
You should reduce the saturation, or maybe get a better display to see the actual colors.
@JoelRosenfeld 6 months ago
Interesting, I'll give it some thought. No one has ever mentioned it before.
@skillick 6 months ago
Great video! Thanks for putting it together. If I may, I’d like to suggest slowing down a bit; you sound out of breath! We have plenty of time to learn. Thanks again.
@JoelRosenfeld 6 months ago
I'm glad you liked the video! I'll work on slowing it down a bit. I honestly had 30 minutes to set up and record the first half.
@ckq 6 months ago
New to this content and I'm not that strong with matrices, but essentially this is just polynomial regression. Not sure why the Weierstrass thing is so bad; even a piecewise linear function would do much better.
@ckq 6 months ago
Interestingly, ~6 years ago, when I was a kid obsessed with memorizing sin(x), I tried to come up with the best order ~3 polynomial for approximating sin from 0 to 45 degrees that I could compute in my head.
@JoelRosenfeld 6 months ago
Yeah, when Weierstrass first conceived his approximation theorem, he was much more concerned with showing that it CAN work rather than work well. And this version by Rudin I think even works better, but I'd have to go double-check that to be sure. This is polynomial regression, but where we are using moments as the data rather than point samples. We will get into regression methods a lot more in future videos. This was an easy lift for where my Hilbert space/Machine Learning course is right now.
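To see the moments-as-data idea concretely, here's a minimal sketch (my own illustration — the interval [0, 1], target sin, and degree are arbitrary choices, and the video may set things up differently): projecting f onto the polynomials of degree ≤ n in L²[0, 1] gives normal equations H c = m, where H is the Hilbert matrix and m holds the moments of f.

```python
import numpy as np
from scipy.integrate import quad

f = np.sin   # target function (arbitrary choice for the demo)
n = 5        # polynomial degree

# Gram matrix of the monomials in L^2[0, 1]: H[i, j] = 1 / (i + j + 1)
# (the Hilbert matrix -- badly conditioned for large n).
H = np.array([[1.0 / (i + j + 1) for j in range(n + 1)]
              for i in range(n + 1)])

# Moments of f against the monomials: m[i] = integral_0^1 f(x) x^i dx.
m = np.array([quad(lambda x, i=i: f(x) * x**i, 0.0, 1.0)[0]
              for i in range(n + 1)])

# Solving H c = m gives the coefficients of the L^2-best polynomial.
c = np.linalg.solve(H, m)
p = np.polynomial.Polynomial(c)

xs = np.linspace(0.0, 1.0, 7)
print(np.max(np.abs(p(xs) - f(xs))))  # small error across [0, 1]
```

Note that the data here are the integrals m, not pointwise samples of f.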
@KitagumaIgen 6 months ago
Well, that's not the BEST best approximation now, is it? For that, you ought to take into account how the approximation method behaves in the face of noisy data (function values), and then one has to balance the accuracy of the estimate against overfitting to noise. The best methods for that are the AIC/BIC model-selection methods...
@JoelRosenfeld 6 months ago
It gets its name from arising as a projection onto a closed subspace of a Hilbert space. It's the best according to the Hilbert space metric. I'd wager that this will actually still do OK in the face of noisy data, because the moments arise as integrals rather than point samples, so mean-zero white noise will likely be integrated away. But sure, there are better ways to do this. I'm building up a Hilbert space course on my channel now, and this is what we have available at this point in the course. I'll look into your suggestion and see if I can integrate it into the course later. I appreciate it!
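As a quick numerical sanity check of that noise claim (my own toy numbers, not from the video), you can approximate the moment integrals by averages over noisy samples and watch the mean-zero noise largely wash out:

```python
import numpy as np

rng = np.random.default_rng(0)
xs = np.linspace(0.0, 1.0, 10_000)
f = np.sin(xs)
noisy = f + rng.normal(0.0, 0.1, size=xs.shape)  # mean-zero white noise

# i-th moment as a Riemann sum; averaging over many samples
# largely cancels the noise contribution.
for i in range(4):
    clean = np.mean(f * xs**i)
    dirty = np.mean(noisy * xs**i)
    print(i, abs(clean - dirty))  # differences on the order of 1e-3
```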
@KitagumaIgen 6 months ago
@@JoelRosenfeld The level and objective you explain are about what I grasped from the presentation, which I really liked. Looking forward to the next ones. Regarding the best fit: in any real situation the integrals turn into Riemann sums, and your polynomials will start to fit the noise in the data; this is the problem with any sequence of least-squares fits with an increasing number of basis functions spanning the Hilbert space. The Akaike Information Criterion/Bayesian IC adds a cost to the free parameters of the model; that cost increases with the number of parameters, which should give a reasonable balance between fit and robustness.
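For anyone curious, here's a minimal sketch of that idea using the Gaussian least-squares form of AIC (my own toy data, degree range, and parameter count — a sketch of the criterion, not anything from the video):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 50)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=x.shape)

best_deg, best_aic = None, np.inf
for deg in range(1, 11):
    coeffs = np.polynomial.polynomial.polyfit(x, y, deg)
    rss = np.sum((y - np.polynomial.polynomial.polyval(x, coeffs)) ** 2)
    k = deg + 1  # counting only the polynomial coefficients
    # AIC for Gaussian least squares, up to an additive constant:
    # the 2k penalty counterbalances the RSS shrinking as deg grows.
    aic = len(x) * np.log(rss / len(x)) + 2 * k
    if aic < best_aic:
        best_deg, best_aic = deg, aic

print("AIC-selected degree:", best_deg)
```

Without the 2k term, the loop would always pick the largest degree, i.e. the noisiest fit.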
@JoelRosenfeld 6 months ago
@@KitagumaIgen Yep, this method will only take us so far. I think the closest thing to what you're describing that's currently in the works for a future video is regularized regression, where we penalize the size of the weights. That's about two or three videos out, if I stick to my plan. I haven't investigated the AIC/BIC methods you're talking about, but I will certainly try to interweave them when I discuss model selection. Most of my own experience comes from pure Functional Analysis, Operator Theory, and Dynamical Systems. I'm always happy to learn new things.
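A minimal sketch of the weight-penalty idea mentioned above, i.e. ridge regression (my own made-up data and penalty strength, not code from the channel):

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 1.0, 30)
y = np.sin(2 * np.pi * x) + rng.normal(0.0, 0.2, size=x.shape)

deg, lam = 12, 1e-3
X = np.vander(x, deg + 1, increasing=True)  # features 1, x, ..., x^deg

# Ridge regression: minimize ||Xw - y||^2 + lam * ||w||^2.
# The closed form comes from the regularized normal equations.
w = np.linalg.solve(X.T @ X + lam * np.eye(deg + 1), X.T @ y)
w_ls = np.linalg.lstsq(X, y, rcond=None)[0]  # unregularized comparison

print("ridge weight norm:        ", np.linalg.norm(w))
print("least-squares weight norm:", np.linalg.norm(w_ls))  # typically far larger
```

The penalty keeps the coefficients small, which is what tames the wild oscillations a high-degree fit would otherwise use to chase the noise.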