"Linearized ways of describing how a system evolves over one timestep" is BRILLIANT! I never heard PDEs described in such a beautiful, comprehensible way, Thank you Yannic Kilcher.
@errorlooo8124 · 4 years ago
So basically what they did is kind of like taking a regular neural network layer, adding JPEG compression before it and JPEG decompression after it, then building a network out of that and training it on Navier-Stokes images to predict the next images. The reason I say JPEG is that the heart of JPEG is transforming an image into the frequency domain using a Fourier-like function; the extra processing JPEG does is mostly non-destructive (duh, you want your compressed version to be as close to the original as possible), plus a neural network would probably not be impeded by it, and their method throws away some of the modes of the Fourier transform too.
@errorlooo8124 · 4 years ago
@Pedro Abreu Yeah, the DCT is derived from the DFT, which is basically the Fourier transform, except that it works on actual sampled data instead of needing a continuous function. (The DCT is just the real component of the DFT, with a bit of offsetting (it uses n + 1/2) and less rotation (it uses pi instead of 2pi).)
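For anyone who wants to check that identity, here is a minimal sketch (NumPy/SciPy assumed, not from the paper's code): it verifies that the DCT-II is the real part of a half-sample-shifted, zero-padded DFT.

```python
import numpy as np
from scipy.fft import dct

N = 8
x = np.random.default_rng(0).standard_normal(N)

# Zero-padded length-2N FFT gives  sum_n x_n * exp(-1j * pi * k * n / N)
Y = np.fft.fft(x, 2 * N)[:N]
k = np.arange(N)

# Half-sample phase shift (the "n + 1/2" offset), then take 2 * Re(.)
dct_via_dft = 2 * np.real(np.exp(-1j * np.pi * k / (2 * N)) * Y)

assert np.allclose(dct_via_dft, dct(x, type=2))  # matches SciPy's DCT-II
```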
@dominicisthe1 · 4 years ago
Cool to see a paper like this pop up on my YouTube. I did my MSc thesis on the first reference, solving ill-posed inverse problems using iterative deep neural networks.
@taylanyurtsever · 4 years ago
Vorticity is the cross product of the nabla operator and the velocity vector field, and can be thought of as the rotational flow in that region (blue clockwise and red counter-clockwise).
Fourier neural operators aren't limited to periodic boundary conditions: the linear transform W works as a bias term which keeps track of non-periodic BCs.
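To make the definition concrete, here is a small sketch (NumPy; the rotation field is a made-up example, not the paper's data). In 2D the vorticity reduces to the scalar curl w = dv/dx - du/dy, positive for counter-clockwise rotation:

```python
import numpy as np

n = 64
x = y = np.linspace(0, 2 * np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, y, indexing="ij")

# Solid-body rotation about the domain centre: u = -(y - pi), v = x - pi
u, v = -(Y - np.pi), X - np.pi

dx = x[1] - x[0]
vorticity = np.gradient(v, dx, axis=0) - np.gradient(u, dx, axis=1)

print(vorticity.mean())  # ~2.0: uniform counter-clockwise spin everywhere
```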
@이현빈학생공과대학기 · 1 year ago
This is an excellently clear description. Thanks for the help.
@channuchola1153 · 4 years ago
Wow, simply awesome. Fourier and PDEs, good to see them together.
@soudaminipanda · 11 months ago
Fabulous explanation. Crystal clear
@kazz811 · 4 years ago
Cool video as usual. Quick comment: vorticity is simply the curl of the velocity field and doesn't have much to do with "stickiness". Speaking of which, viscosity (which measures forces between the fluid's molecules) is not actually about "stickiness" either; that property is measured by surface tension (how the fluid interacts with an external solid surface). You can have highly viscous fluids that don't stick at all.
@PatatjesDora · 4 years ago
Going over the code is really nice!
@clima3993 · 2 years ago
Yannic always gives me the illusion that I understand things I actually don't. Anyway, it's a good starting point. Thank you so much!
@dawidlaszuk · 4 years ago
Coming from signal processing and getting my head into the Deep™ world, I'm happy to see Fourier showing up. Great paper and a good start, but I agree about the overhype. For example, throwing away modes is the same as masking with a rectangular function, which in signal space is like convolving with a sinc function, a highly "ripply" function. Navier-Stokes in general is chaotic, and small perturbations will change the output significantly over time. I'm guessing they don't see/show these effects because of their data composition. But it is a good start and maybe an idea for others: for example, replace the Fourier kernel with Laplace and use proper filtering techniques.
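That mode-truncation point is easy to demonstrate. A small sketch (NumPy, arbitrary sizes): zeroing the high FFT modes is exactly circular convolution with the inverse transform of the rectangular mask, a "ripply" periodic sinc (the Dirichlet kernel).

```python
import numpy as np

rng = np.random.default_rng(1)
N, keep = 256, 16
x = rng.standard_normal(N)

# Rectangular mask in frequency space: keep only modes |k| <= keep - 1
mask = np.zeros(N)
mask[:keep] = 1
mask[-(keep - 1):] = 1
x_lowpass = np.fft.ifft(np.fft.fft(x) * mask).real

# Signal-space view: circular convolution with the mask's inverse FFT,
# a periodic sinc (Dirichlet kernel) full of ripples.
kernel = np.fft.ifft(mask).real
x_conv = np.array([sum(x[m] * kernel[(n - m) % N] for m in range(N))
                   for n in range(N)])

assert np.allclose(x_lowpass, x_conv)
```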
@DavenH · 4 years ago
Hey Dawid, do you produce any YT content? I'm also from DSP and doing deep learning; curious what you're working on.
@DavenH · 4 years ago
I hope this is going to lead to much more thorough climate simulations. Typically these require vast amounts of supercomputer time and are run just once a year or so, but it sounds like just a small amount of cloud compute would run them on this model. Managing memory would then be the challenge, however, because I don't know how you could afford to discretize the fluid dynamics of the atmosphere into isolated cells, where each part affects and flows into the others. It's almost like you need to do it all at once.
@PaulanerStudios · 4 years ago
Well, from what I have seen, climate simulations are at the moment also discretized into grids for memory management... at least the ones where I have looked at the code... I guess it's more of a challenge to enforce boundary conditions in this model such that neighbouring cells don't diverge at their shared boundaries... I guess traditional methods for dealing with this would suffice though... you'd still have to blend the boundaries occasionally, so the timesteps can't be arbitrarily large.
@DavenH · 4 years ago
@@PaulanerStudios Hmm. Maybe take a page from CNNs and calculate 3x3 grid cells, so you get a centre cell with boundaries intact, then stride one cell and do another 3x3 calculation; hopefully the interaction falloff is steep enough to then stitch the centre cells together without discontinuities. Or maybe you need to do 5x5 cells, throwing away all but the centres. Another thing: I thought the intra-cell calculations in these climate simulations were hand-made heuristics, not actual Navier-Stokes. Could be wrong, but if so, even eliminating those heuristics and putting in "real" simulations would be a good improvement.
@PaulanerStudios · 4 years ago
@Mustache Merlin The thing with every compute job is the von Neumann bottleneck... running massively parallel compute jobs on CPU or GPU, the limiting factor is always memory bandwidth... since neural networks are, in the most basic sense, matrix multiplications interspersed with nonlinearities, VRAM is the limiting factor for how large a given multiplication/network, and thus network input, can be... there is really no sense in streaming anything from a drive, no matter how fast, because performance will tank by orders of magnitude for backprop and such if the network (and computation graph) can't be held in graphics memory at once... If you're arguing the case for regular simulations, well, supercomputers already have terabytes or petabytes of RAM... the issue is swapping the data used for computation in and out of cache and subsequently registers... Optane drives will not solve the memory bottleneck there either... the only thing they can solve is maybe memory price, which really is not a limiting factor in HPC (most of the time).
@herp_derpingson · 4 years ago
36:30 I like the idea of throwing away high FFT modes as regularization; I wish more papers did that. 37:35 IDK if throwing out the little jiggles is a good idea, because Navier-Stokes is a chaotic system and those little jiggles were possibly contributing chaotically. Perhaps the residual connection corrects for that, however. 46:10 XD I wish the authors had ablated the point-to-point convolution and shown how much it helps, and likewise for throwing away modes. I also wish they had shown an error-accumulation-over-time graph. I really liked the code walkthrough; do it for other papers too if possible.
@simoncorbeil4081 · 9 months ago
Great video! However, I would like to correct a few facts. If the Navier-Stokes equations need the development of new, efficient methods like neural networks, it's essentially because they are strongly nonlinear, especially at high Reynolds number (low viscosity, as with air and water, the typical fluids we meet daily), where turbulence is triggered. I also want to rectify something: the Navier-Stokes system shown in the paper is in the incompressible regime, and the second equation is the divergence of the velocity, i.e. the mass conservation equation, nothing related to vorticity (it's closer to the opposite: vorticity would be the cross product of the nabla operator with the velocity field).
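For reference, a standard statement of the incompressible vorticity-form system the comment is describing (u is velocity, ω vorticity, ν viscosity, f the forcing):

```latex
\begin{aligned}
\partial_t \omega + u \cdot \nabla \omega &= \nu \, \Delta \omega + f, \\
\nabla \cdot u &= 0, \qquad \omega = \nabla \times u .
\end{aligned}
```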
@antman7673 · 3 years ago
"Vorticity" is derived from "vortex". The triangle pointing down is the nabla operator; it points to the lowest value.
@pradyumnareddy5415 · 4 years ago
I like it when Yannic throws shade.
@mansisethi8127 · 3 months ago
Thank you for the paper presentation!!
@idiosinkrazijske.rutine · 4 years ago
Looks similar to what is done in so-called "spectral methods" for simulating fluids. I'm sure this is where they drew their inspiration from.
@Mordenor · 4 years ago
Normal broader impact statement: "This may have negative applications for society and military applications." This paper: "I AM THE MILITARY."
@markh.876 · 4 years ago
This is going to be lit when it comes to quantum chemistry.
@raunaquepatra3966 · 4 years ago
I wish the authors had shown the effects of throwing away modes in some nice graphs 😔. Also, show the divergence of this method from the ground truth (using the simulator) when used in an RNN fashion (i.e. feeding the final output of this method back to itself to generate time steps, possibly to infinity), and show at what point it starts diverging significantly.
@kristiantorres1080 · 3 years ago
Thank you! I was just reading this paper and somewhere around page 5, I started to fall asleep. Your video will help me to understand this paper better.
@mohsensadr2719 · 2 years ago
Very nice work explaining the paper. I was wondering if you have any comments on the following:
- Fourier works well if you have equidistant grid points. I think that if the initial data points are random in space (an unstructured grid), one has to include more and more terms in the Fourier expansion given the irregularity of the mesh.
- FNO has to be coupled with an exact solver, since one has to give the solution of the first several time steps as input.
- I think it is not possible to train FNO on a small solution domain and then use it for larger ones.
Any comments on that?
@weishkysiliy4420 · 2 years ago
Training on a lower resolution and evaluating directly on a higher resolution: I don't understand how it can do that.
@billykotsos4642 · 4 years ago
Damn, the opening title blew my mind.
@yusunliu4858 · 4 years ago
The process Fourier transformation -> multiplication -> inverse Fourier transformation seems like a low-pass filter. If that is so, why not just apply a low-pass filter to the input A'? Maybe I didn't get the idea correctly.
@YannicKilcher · 3 years ago
I think one of the steps is actually explicitly a low-pass filter, so you're right.
@weishkysiliy4420 · 2 years ago
@@YannicKilcher Training on a lower resolution and evaluating directly on a higher resolution: I don't understand how it can do that.
@YannicKilcher · 2 years ago
@@weishkysiliy4420 The architecture is somewhat agnostic to the resolution, unlike traditional image classifier models.
@weishkysiliy4420 · 2 years ago
@@YannicKilcher So after training on a small size (64x64) and loading the model directly, you change the input dimensions to 256x256? Can I understand it this way?
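A minimal sketch of why that works (PyTorch, simplified from the paper's SpectralConv2d; the real layer also keeps the negative-frequency corner block): the learned weights only cover the lowest `modes` frequencies, while the FFT and inverse FFT adapt to whatever grid size comes in, so the same weights run at 64x64 and at 256x256.

```python
import torch

class SpectralConv2d(torch.nn.Module):
    def __init__(self, channels: int, modes: int):
        super().__init__()
        self.modes = modes
        self.weights = torch.nn.Parameter(
            torch.randn(channels, channels, modes, modes,
                        dtype=torch.cfloat) / channels)

    def forward(self, x):                  # x: (batch, channels, H, W)
        x_ft = torch.fft.rfft2(x)          # (batch, channels, H, W//2 + 1)
        out_ft = torch.zeros_like(x_ft)
        m = self.modes                     # mix only the lowest m x m modes
        out_ft[:, :, :m, :m] = torch.einsum(
            "bixy,ioxy->boxy", x_ft[:, :, :m, :m], self.weights)
        return torch.fft.irfft2(out_ft, s=x.shape[-2:])

layer = SpectralConv2d(channels=4, modes=12)
print(layer(torch.randn(1, 4, 64, 64)).shape)    # torch.Size([1, 4, 64, 64])
print(layer(torch.randn(1, 4, 256, 256)).shape)  # same weights, finer grid
```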
@weishkysiliy4420 · 2 years ago
@@YannicKilcher I really like your song. Nice prelude
@diegoandrade3912 · 1 year ago
Fabulous, thank you for the explanation and for the time you took to create this video. Keep it coming.
@CoughSyrup · 1 year ago
This is really huge. I see no reason this couldn't be extended to solve the magnetohydrodynamic behavior of plasma, and made to work for the 3D equations. That currently requires supercomputers to model; imagine making it run on a desktop PC. This means modeling plasma instabilities inside fusion reactors. Maybe with fast or real-time modeling, humanity can finally figure out an arrangement of magnets in 3D that keeps the plasma stable and robust to excursions.
@lestroarmonico · 3 years ago
6:26 Vorticity is a derivation of viscosity? No, it is not. Viscosity is a property of the fluid; vorticity is ∇×V (the curl of the velocity). Edit: And at 8:18, that is not the vorticity equation; that is the continuity equation, which is about conservation of mass. A very helpful video, as I'm currently studying this very paper myself, but there are a few mistakes that need correction :)
@tedonk03 · 2 years ago
Thank you for the awesome explanation, really clear and helpful. Can you do one on PINNs (physics-informed neural networks)?
@konghong3885 · 4 years ago
Jokes aside, as a physics student I wonder: is it possible to apply periodic boundary conditions in the FNO? And how does one actually estimate the error of the solver? For MCMC the error can be estimated probabilistically, but not in the ML case.
@artyinticus7149 · 4 years ago
Highly unlikely
@dominicisthe1 · 3 years ago
I think it is the non-periodic boundary conditions you are worried about.
@DamianReloaded · 4 years ago
47:00 If they wanted to predict longer sequences, they could use the solver for the first tensor they input and just feed the last 11 steps of the latest prediction back in, right? I wonder after how many steps it would begin to diverge if they used the maximum possible resolution of the data.
@YannicKilcher · 3 years ago
True, but as you say, the problems would pile up.
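The rollout being discussed would look roughly like this (a sketch with a hypothetical `model`; the 11-step window follows the comment above, not necessarily the paper's exact interface). Each new frame is predicted from a sliding window of previous predictions, which is why errors compound:

```python
import torch

def rollout(model, history, n_future):
    """history: (batch, 11, H, W) of known frames; returns predicted frames."""
    frames = []
    window = history
    for _ in range(n_future):
        nxt = model(window)                              # (batch, 1, H, W)
        frames.append(nxt)
        window = torch.cat([window[:, 1:], nxt], dim=1)  # slide the window
    return torch.cat(frames, dim=1)                      # (batch, n_future, H, W)
```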
@esti445 · 7 months ago
8:30 It is the Laplacian operator: the second derivative with respect to space.
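Written out (standard definition), for anyone following along:

```latex
\Delta u \;=\; \nabla \cdot \nabla u
        \;=\; \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2}
```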
@JurekOK · 4 years ago
So... they have taken an expensive function (which is itself already an approximation of an even more expensive function) and trained up an approximate function. Then there is no comparison of the predictions with any experiment (let alone a rigorous one), only with that original "reference" approximate function. Is this a big deal? I was doing that in the 2nd year of my undergrad in mechanical engineering, 18 years ago. Come on. How about the long-term stability of their predictor? How does it deal with singularities at corners? Moving or deforming objects? Convergence rate? Is the damping spectrally correct? My point is that this demo is really unimpressive to a person who actually uses fluid dynamics for product design. It might be visually impressive for the entertainment industry. Hyped titles galore.
@surbhikhetrapal1975 · 3 months ago
Hi, I found this review of the paper very helpful. I could not locate the code at the link shared in the video description. Does anyone know under what name this code appears in the neuraloperator GitHub repository?
@andyfeng6 · 3 years ago
The triangle means the Laplace operator.
@antman7673 · 3 years ago
So this is kind of like an approximation of the development of the fluid with pixels, instead of the infinite-resolution "vector graphic" provided by the equation.
@Andresc93 · 1 year ago
Thank you, you just saved me a bunch of time.
@lucidraisin · 4 years ago
Woohoo! New video!
@sujithkumar824 · 4 years ago
Download this video to save it personally, because it could be taken down under pressure from the author, for stupid reasons.
@herp_derpingson · 4 years ago
Why?
@judgeomega · 4 years ago
@@herp_derpingson I think the author can neither confirm nor deny any reasoning for a takedown.
@sujithkumar824 · 4 years ago
@@judgeomega Yes, I'm glad Yannic didn't even respond publicly to her; this is exactly the treatment every attention seeker should get.
@matthewtang1489 · 3 years ago
What?? The paper author or the article author? Is there a fiasco about this?
@amarilloatacama4997 · 3 years ago
??
@MaheshKumar-iw4mv · 1 year ago
Can FNO be used to train on data from reaction-diffusion dynamics with no-flux boundary conditions?
@airealguy · 3 years ago
So I think this approach has some flaws and has been hyped too much. The crux of the problem is the use of FFTs, which impose some severe constraints on CFD problems. First, consider complex geometries (i.e. those that are not rectangular). How does one take an FFT of something that is not rectangular? You can map the geometry to a rectangular coordinate system using a spatial transform, but then the learned parameters are specific to that transform and thus to that geometry. Secondly, there are no good ways to do FFTs efficiently at large scales (i.e. scales above the memory space of one processor). Even the best algorithms, such as heFFTe, which can achieve 90% of the theoretical max performance, are quite poor in comparison to the algorithmic performance of standard PDE solvers; heFFTe only achieves an algorithmic performance of 0.05% of peak on Summit. So while this is fast on small-scale problems, it will likely suffer major performance problems at large scales and will be difficult if not impossible to apply to complex non-rectangular geometries. The neural operator concept is probably a good one, but the basis function makes it difficult to apply to general-purpose problems. We need a basis function which is expanded in perception but not global like an FFT. Even chopping the FFT off can have issues. If you want to compute a N
@crypticparadigm2180 · 3 years ago
Great points... On the topic of memory consumption and allocation for neural networks: what are your thoughts on neural ordinary differential equations?
@meshoverflow2150 · 4 years ago
Would there be any advantage to doing convolution in frequency space with a conventional CNN, for, say, image classification? On the surface it seems like it could be faster than regular convolution (given that an FFT is very fast), but I assume there's a good reason why it isn't common practice.
@nx6803 · 4 years ago
Octave convolutions are sort of based on the same intuition, yet don't actually use the FFT.
@andrewcutler4599 · 4 years ago
Convolution preserves spatial relationships, which makes it useful for images; neighboring pixels are often related to one another. A CNN in FFT world would operate on frequencies, and it's not clear that there is a window where only nearby frequencies should be added together to form feature maps.
@meshoverflow2150 · 4 years ago
@@andrewcutler4599 The CNN wouldn't operate on frequencies, though. Multiplication in frequency space IS convolution, so a feed-forward network in frequency space should do the exact same thing as a conventional CNN. I feel like the feed-forward network should be smaller than the equivalent CNN, hence the question.
@DavenH · 4 years ago
@@meshoverflow2150 Interesting observation.
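A quick check of the identity this thread rests on, sketched with NumPy/SciPy (arbitrary sizes): convolution in signal space equals pointwise multiplication in frequency space, which is exactly what FFT-based convolution exploits.

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
x, k = rng.standard_normal(128), rng.standard_normal(9)

direct = np.convolve(x, k)   # O(N*K) sliding-window convolution
via_fft = fftconvolve(x, k)  # zero-pad, FFT, multiply, inverse FFT

assert np.allclose(direct, via_fft)
```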
@beginning_parenting · 3 years ago
On line 87 of the FNO3D code, it is mentioned that the input is a 5D tensor (batch, x, y, t, in_channels). What does in_channels represent? Does that mean that each point in (x, y, t) is a vector containing 13 channels?
@boffo25 · 4 years ago
Nice explanation
@davenovo69 · 4 years ago
Great channel! What app do you use to annotate PDFs?
@YannicKilcher · 3 years ago
OneNote
@digambarkilledar003 · 7 months ago
What are the numbers of input channels and output channels?
@reinerwilhelms-tricarico344 · 3 years ago
I found this article quite abstract (which may explain why it's interesting ;-). I could sort of get it after first reading an article by the same authors where they explain neural operators for PDEs in general ("Neural Operator: Graph Kernel Network for Partial Differential Equations", 2020). There they show that the kernel they learn is similar to learning the Green's function for the PDE.
@kristiantorres1080 · 3 years ago
It is abstract and there are some things that I don't understand. Is this the paper you are referring to? arxiv.org/abs/2003.03485
@reinerwilhelms-tricarico344 · 3 years ago
@@kristiantorres1080 Yes. I read that paper and it somehow helped me understand the paper presented here.
@sui-chan.wa.kyou.mo.chiisai · 4 years ago
8:30 Is the triangle the Laplace operator?
@sui-chan.wa.kyou.mo.chiisai · 4 years ago
www.wikiwand.com/en/Laplace_operator
@machinelearningdojo · 4 years ago
😀😀😀😀 pwned
@finite-element · 3 years ago
Also, Navier-Stokes should be nonlinear, not linear (around the same timestamp).
@JM-ty6uq · 3 years ago
that is the dorito operator
@JM-ty6uq · 3 years ago
24:40 I suppose it's worth mentioning that you can make a cake with 0.5 eggs or 2 eggs.
@sinitarium · 9 months ago
Amazing! This must be how Nvidia DLSS works!?
@konghong3885 · 4 years ago
Behold, the new title format for the ML community.
@southfox2012 · 3 years ago
great
@perlindholm4129 · 3 years ago
Idea: scale down the ground-truth video, then train a model on a small 4x4 matrix part of the frame and learn the expansion into the 16x16 submatrix of the original frame. This way you can train two models, each on a different aspect of the calculation: one for scaled-down time learning and one for scale-up learning.
@cedricvillani8502 · 2 years ago
Should update your video
@acharyavivek51 · 2 years ago
Very scary how AI is progressing.
@RalphDratman · 1 year ago
All those little bumps could be creating the digital environment in which the upper layers of GPTx are doing their magic.
@Beingtanaka · 1 year ago
Here for MC Hammer
@jean-pierrecoffe6666 · 4 years ago
Hahahahaha, excellent intro
@Neomadra · 4 years ago
I don't quite get why you said (if I understood you correctly) that the prediction cannot be made arbitrarily far into the future. Couldn't you just use the output of the forward propagation as new input for the next round of forward propagation? You would apply a chain of forward propagations until you reach the time you want. If memory is a problem, then you can simply clear the memory of the previous outputs.
@seamusoblainn · 4 years ago
Perhaps because the network is making predictions, as opposed to the ground-truth sim, which is using physics. In the latter there is only what its rules generate, while in the former you are feeding predictions forward, which must by necessity diverge, and at a fine degree of granularity probably does from the beginning.
@YannicKilcher · 3 years ago
It's true, but then you regress to the problem you have when running classic simulations.
@kesav1985 · 4 years ago
So much fuss about curve-fitting! Curve-fitting is not a numerical scheme for solving PDEs. :-)
@artyinticus7149 · 4 years ago
Imagine using the intro to politicize the paper.
@artyinticus7149 · 3 years ago
@adam smith Imagine using the military for non-political purposes.