I am working on analysing a chaotic system, and I needed to compute a large number of trajectories for different initial conditions. Even with multiprocessing and all the optimization I could think of, it was taking forever. Then I watched this video and promptly added 3 lines of code (import numba, plus two decorators), and got a speed-up by a factor of 20. Words cannot express how grateful I am for this video.
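A minimal sketch of what those three lines might look like. The comment does not show the actual code, so the logistic map here is only a stand-in chaotic system, and the function names are hypothetical. The `try`/`except` fallback is an addition so the sketch still runs (without any speed-up) when Numba is not installed.

```python
try:
    import numba
    njit = numba.njit  # compiles the decorated function to machine code
except ImportError:
    # Assumption for this sketch only: fall back to a no-op decorator
    # so the example still runs (unaccelerated) without Numba.
    def njit(func):
        return func


@njit
def logistic_step(x, r):
    # One iteration of the logistic map, a simple chaotic system
    # chosen for illustration (not the commenter's actual system).
    return r * x * (1.0 - x)


@njit
def trajectory_endpoint(x0, r, n_steps):
    # Plain Python loop: slow in CPython, but the kind of code
    # that Numba's JIT compiler handles well.
    x = x0
    for _ in range(n_steps):
        x = logistic_step(x, r)
    return x


# Endpoints for a handful of initial conditions.
endpoints = [trajectory_endpoint(0.1 * k, 3.9, 1000) for k in range(1, 10)]
```

The actual speed-up depends heavily on the workload; the factor of 20 above is the commenter's own measurement, not something this sketch reproduces.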
@thinkmichaelthink 6 years ago
The seven strategies are: 1. Line profiling (4:37) 2. NumPy (5:33) 3. Specialized Data Structures (9:10) 4. Cython (12:07) 5. Numba (13:47) 6. Dask (15:23) 7. Find an Existing Implementation (18:56)
@maciejurbanski6146 5 years ago
To make this list closer to complete: PyPy, a JIT like Numba but for all your code, at the cost of compatibility; and PyTorch, think NumPy but GPU-based (arrays -> tensors).
@FinallyAFreeUsername 6 years ago
Another great JVDP talk.
@ErickMuzartFonsecadosSantos 6 years ago
I was expecting one of the strategies for numerical optimization in Python to be GPU execution, through PyTorch/CUDA, for example. Any comment on that?
@ErickMuzartFonsecadosSantos 6 years ago
Ray Donnelly, executing numerical calculations on GPU goes beyond the use cases of machine learning. Take a look at general purpose programming on GPUs: en.m.wikipedia.org/wiki/General-purpose_computing_on_graphics_processing_units An obvious use case of this approach would be to offload to the GPU compute intensive calculations, such as matrix multiplications, thus improving performance of python code. Pytorch provides functionality similar to numpy and so could be a viable alternative for "optimizing your numerical code". I would have liked feedback on this approach, benchmark references or just experiences porting, say, numba code to pytorch.
@JJ-xi2vp 6 years ago
I was also a bit disappointed, although there is another good talk about CuPy, which aims to be a NumPy for the GPU. That sounds great.
@yinzhangfred 5 years ago
I am a bit surprised that he didn't mention pypy, which can be fast not only for numerical computation, but for string operations as well. But then, you would have to point out that it is not yet very compatible with pandas/numpy.
@magno5157 5 years ago
Can Pypy compete with the speed of Numpy and Pandas? Or is Pypy slower?
@yinzhangfred 5 years ago
@magno5157 PyPy and NumPy are for different things. If your data fits into RAM and is in an array or data frame format, then I'd say go for NumPy or pandas (I personally use pandas, though it uses about 4x the RAM of your data size). On the other hand, if you write 'vanilla' Python or need to process your data line by line (for whatever reason), PyPy is good. It can speed up even string operations, while Numba cannot.
@magno5157 5 years ago
@yinzhangfred You've just made me curious. Could you give me examples of numerical computations that do not require arrays and data frames?
@yinzhangfred 5 years ago
@magno5157 Ah, you are right, I did not read the title of the talk carefully: it says "optimizing your NUMERICAL code". The examples I have are not quite "numerical". For example, I use PyPy to turn a large structured log file into CSV. If I hadn't come across PyPy, I would have had to deal with C.
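For readers wondering what that kind of job looks like, here is a rough sketch of a line-by-line log-to-CSV conversion in plain Python. The log layout, the regular expression, and the `log_to_csv` name are all assumptions made up for illustration; the comment does not describe the actual file format. It uses only the standard library, so the same script could be run unchanged under CPython or PyPy.

```python
import csv
import io
import re

# Hypothetical log layout: "<date> <time> [<LEVEL>] <message>"
LINE_PATTERN = re.compile(r"^(\S+) (\S+) \[(\w+)\] (.*)$")


def log_to_csv(lines, out):
    """Write one CSV row per parseable log line; return the row count."""
    writer = csv.writer(out)
    writer.writerow(["date", "time", "level", "message"])
    converted = 0
    for line in lines:
        match = LINE_PATTERN.match(line.rstrip("\n"))
        if match is None:
            continue  # skip lines that do not fit the assumed layout
        writer.writerow(match.groups())
        converted += 1
    return converted


# Small in-memory demonstration; a real run would pass open file objects.
sample = [
    "2019-03-01 12:00:01 [INFO] job started\n",
    "garbage\n",
    "2019-03-01 12:00:05 [ERROR] disk full, retrying\n",
]
buffer = io.StringIO()
converted = log_to_csv(sample, buffer)
```

The loop does string matching and per-line Python calls rather than array arithmetic, which is the kind of workload the comment says PyPy can speed up and Numba cannot.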