FluidX3D source code: github.com/ProjectPhysX/FluidX3D
Each 20 seconds of video shows the car driving at 100 km/h for 1 second.
Mattia Binotto's Ferrari SF71H in #CFD. In this 10 billion voxel #FluidX3D simulation you see the wild aerodynamic optimization of a very successful F1 car. #OpenCL compute (2152×4304×1076 resolution grid, 217k time steps) plus rendering 3x 20s 4K60 video took 14 hours. Shown are velocity-colored Q-criterion isosurfaces, extracted with marching cubes. The Reynolds number is 3.75 million, with a Smagorinsky-Lilly subgrid model.
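As a rough check of the numbers above, here is a minimal sketch that recomputes the voxel count and the time resolution. It assumes that "217k" means roughly 217,000 steps covering the 1 second of driving, and that the 1201 frames mentioned further down are evenly spaced over those steps; neither detail is spelled out here, so treat the results as approximate.

```python
# Grid and timing figures quoted in the description.
NX, NY, NZ = 2152, 4304, 1076   # lattice resolution
TIME_STEPS = 217_000            # "217k time steps" (rounded in the text)
SIM_SECONDS = 1.0               # physical time covered (assumption: all steps span this 1 s)
FRAMES = 1201                   # frame count from the description

cells = NX * NY * NZ                          # total number of lattice nodes
dt_us = SIM_SECONDS / TIME_STEPS * 1e6        # physical time per LBM step, in microseconds
steps_per_frame = TIME_STEPS / (FRAMES - 1)   # assumption: evenly spaced frames, first at step 0

print(f"cells: {cells:,}")
print(f"time step: {dt_us:.2f} us")
print(f"steps per frame: {steps_per_frame:.0f}")
```

The cell count comes out just under 10 billion, which is where the "10 billion voxel" figure comes from.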
How is it possible to squeeze 10 billion grid points into only 512GB of VRAM?
I'm using two techniques here, which together form the holy grail of lattice Boltzmann, cutting memory demand down to only 55 Bytes/node for D3Q19 LBM, or 1/3 of conventional codes:
1. In-place streaming with Esoteric-Pull. This almost cuts memory demand in half and slightly increases performance due to implicit bounce-back boundaries.
Paper: doi.org/10.3390/computation10...
2. Decoupled arithmetic precision (FP32) and memory precision (FP16): all arithmetic is done in FP32, but LBM density distribution functions in memory are compressed to FP16. This almost cuts memory demand in half and almost doubles performance, without impacting overall accuracy for most setups.
Paper: www.researchgate.net/publicat...
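The 55 Bytes/node figure can be reproduced with simple arithmetic. The sketch below is only an illustration under stated assumptions: besides the 19 density distribution functions (DDFs), each node is assumed to store an FP32 density, three FP32 velocity components and one flag byte (17 bytes in total), and a conventional code is assumed to keep two FP32 copies of the DDFs. The FP16 round trip uses plain IEEE half precision from Python's `struct` module; it is not meant to mirror the actual number format used in FluidX3D, and the sample value is hypothetical.

```python
import struct

Q = 19                      # D3Q19: 19 density distribution functions (DDFs) per node
CELLS = 2152 * 4304 * 1076  # grid from the description

def bytes_per_node(ddf_bytes, ddf_copies, extra_bytes=17):
    # extra_bytes is an assumption: FP32 density + 3 FP32 velocity components (16 B) + 1 flag byte
    return Q * ddf_bytes * ddf_copies + extra_bytes

conventional = bytes_per_node(ddf_bytes=4, ddf_copies=2)  # FP32 storage, two DDF copies
compact = bytes_per_node(ddf_bytes=2, ddf_copies=1)       # FP16 storage, in-place streaming

def fp16_roundtrip(x):
    # Compress a value to IEEE-754 half precision and expand it again (struct format 'e')
    return struct.unpack('<e', struct.pack('<e', x))[0]

value = 0.0523                  # hypothetical DDF value, arithmetic stays in full precision
stored = fp16_roundtrip(value)  # what would come back from memory after FP16 storage
rel_error = abs(stored - value) / value

print(f"conventional: {conventional} B/node, compact: {compact} B/node")
print(f"ratio: {compact / conventional:.2f}")
print(f"total: {CELLS * compact / 2**30:.1f} GiB")
print(f"FP16 relative error: {rel_error:.1e}")
```

Under these assumptions the compact layout needs about a third of the conventional one, and the whole grid fits if 512GB is read as 512 GiB.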
Graphics are done directly in FluidX3D with OpenCL, with the raw simulation data already residing in ultra-fast video memory. One frame of the velocity field is 120GB; with 1201 frames generated, storing them all would take 144TB. No volumetric data ever has to be copied to the CPU or hard drive; only the rendered 4K frames (33MB each) are. Once a frame is on the CPU side, a copy of it is made in memory and a thread is detached to handle the slow .png compression, all while the simulation is already continuing. At any time, about 16 frames are being compressed in parallel on 16 CPU cores while the simulation keeps running on the GPU.
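The sizes and the copy-then-compress pattern described above can be sketched as follows. This is a simplified illustration, not the FluidX3D code: it assumes 3 FP32 velocity components per cell and 4 bytes per pixel at 3840×2160, uses `zlib` as a stand-in for the .png encoder, works on a tiny hypothetical framebuffer, and joins the worker threads at the end instead of detaching them so that the result can be inspected.

```python
import threading
import zlib

CELLS = 2152 * 4304 * 1076
velocity_frame_bytes = CELLS * 3 * 4              # assumes 3 FP32 velocity components per cell
all_frames_bytes = velocity_frame_bytes * 1201    # if every volumetric frame were stored
image_bytes = 3840 * 2160 * 4                     # assumes 4 bytes per pixel at 4K

def compress_frame(index, pixels, results):
    # zlib stands in for the slow .png encoding (PNG also uses DEFLATE)
    results[index] = zlib.compress(pixels)

def run(num_frames=4, width=64, height=36):
    results = {}
    workers = []
    framebuffer = bytearray(width * height * 4)   # hypothetical buffer, reused for every frame
    for i in range(num_frames):
        framebuffer[:] = bytes([i]) * len(framebuffer)   # placeholder for rendering frame i
        snapshot = bytes(framebuffer)   # private copy, so the next frame may overwrite the buffer
        worker = threading.Thread(target=compress_frame, args=(i, snapshot, results))
        worker.start()
        workers.append(worker)
        # the simulation would keep time-stepping here while the worker compresses
    for worker in workers:
        worker.join()   # joined here only so the sketch can return the compressed frames
    return results

compressed = run()
print(f"velocity frame: {velocity_frame_bytes / 1e9:.0f} GB")
print(f"all frames: {all_frames_bytes / 1e12:.0f} TB")
print(f"4K image: {image_bytes / 1e6:.0f} MB")
```

The copy is what decouples the two sides: the worker only ever touches its own snapshot, so the render buffer can be overwritten immediately.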
Paper: www.researchgate.net/publicat...
Timestamps:
0:00 side view
0:20 front view
0:40 top view
Thanks to the people at Jülich Supercomputing Centre for letting me test their hardware!
The 3D model is from Thingiverse: www.thingiverse.com/thing:299...
#CFD #GPU #FluidX3D #OpenCL