Compiled Python is FAST

79,621 views

Doug Mercer

a day ago

Sign up for 1-on-1 coaching at dougmercer.dev
-----------------------------------------
Python has a bit of a reputation -- fast to write, but slow to run.
In this video, we focus on a simple-to-understand dynamic programming problem that would be terribly slow in native Python or numpy. We show that Python can achieve (and actually exceed) C++-level performance with the help of just-in-time and ahead-of-time compilers such as mypyc, Cython, numba, and taichi.
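For orientation, here is a minimal pure-Python sketch of the kind of dynamic program the description refers to. It assumes the problem is the length of the longest common subsequence (LCS), which is what the code shared in the comments computes; the function name `lcs_length` is illustrative and not taken from the video.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of sequences a and b."""
    m, n = len(a), len(b)
    # dp[i][j] = LCS length of a[:i] and b[:j]; row 0 and column 0 stay 0.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


print(lcs_length("AGGTAB", "GXTXAYB"))
```

Each cell depends on three previously computed cells (left, up, and diagonal), so the loops cannot simply be replaced by whole-array operations, and the per-element interpreter overhead dominates in plain Python.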
Also, I finally got a camera, so, uh... face reveal, I guess.
#python
Chapters
---------------
00:00 Intro
01:07 The Problem
02:38 numpy
03:08 mypyc
04:08 cython
06:46 numba
07:58 taichi
09:47 Results
11:48 Final Thoughts

Comments: 516
@dougmercer a month ago
If you're new here, be sure to subscribe! More Python videos coming soon =]
@thesnedit5406 a month ago
You're very underrated
@FabianOtavo 21 days ago
Mojo and Codon (Exaloop)?
@flutterwind7686 a month ago
Numba and cython are an easy way to improve performance beyond what most people require for python, and they don't require much boilerplate either.
@dougmercer a month ago
Absolutely!
@emilfilipov169 a day ago
@dougmercer taichi doesn't look very boilerplate-y either, with just the use of a decorator.
@megaspazos1496 a month ago
Great video, I enjoyed it! In my eyes, the video actually shows how fast C++ is. An unoptimized, line-by-line translation from Python to C++ can be as fast as compiled Python optimized with an HPC library.
@dougmercer a month ago
Absolutely. C/C++ and gcc -O3 is basically magic.
@BartekLeon-jx5jv a month ago
@dougmercer I am pretty convinced that taichi creates a 1D array under the hood, not a 2D one. Using vector<vector<int>> hurts performance quite a bit (not the most reliable test, but changing vector<vector<int>> to a plain vector<int> gave a ~10% boost), although both C++ versions were faster than taichi for me (compiled with MSVC in release mode). There are still some minor things, but they shouldn't influence anything, since in my case ~40-50% of the time was spent in std::max and 20-30% in creating the vector. All in all, nice video showcasing the tools.
@BartekLeon-jx5jv a month ago
Ah, also... just out of curiosity:

    import numba

    @numba.njit
    def lcs2(a, b):
        m, n = len(a), len(b)
        dp = [0] * (n + 1)
        prev_row = [0] * (n + 1)  # Temporary storage for the previous row
        for i in range(1, m + 1):
            for j in range(1, n + 1):
                if a[i - 1] == b[j - 1]:
                    dp[j] = prev_row[j - 1] + 1
                else:
                    dp[j] = max(prev_row[j], dp[j - 1])
            # Copy the finished row so the next iteration can read it
            for j in range(1, n + 1):
                prev_row[j] = dp[j]
        return dp[n]

Less memory allocation than the 2D array. Testing this against C++ / taichi would be a nice one :) [and there is some vectorisation you can throw in there]
@ruroruro a month ago
@BartekLeon-jx5jv it's not a 1D array, but a homogeneous ND array. It's somewhere between vector<vector<int>> and int[A][B]. It is represented as a flat array in memory, but unlike int[A][B], the data type, number of dimensions, sizes of these dimensions and the iteration strides are dynamic. Also, it's not just taichi that's using ndarrays; numpy and numba are also using ndarrays here.
@BartekLeon-jx5jv a month ago
@ruroruro That's what I meant, in a sense. Although it all still boils down to: are you allocating once, or are you allocating N times (in the case of vector<vector<int>>)?
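The flat-buffer-plus-strides layout described in this thread can be illustrated with a small pure-Python sketch. The helper names `row_major_strides` and `flat_offset` are hypothetical, and strides are counted in elements rather than bytes for simplicity; this mirrors the idea behind an ndarray, not the internals of numpy, numba, or taichi.

```python
def row_major_strides(shape):
    """Element strides for a C-ordered (row-major) array of the given shape."""
    strides = [1] * len(shape)
    for axis in range(len(shape) - 2, -1, -1):
        strides[axis] = strides[axis + 1] * shape[axis + 1]
    return tuple(strides)


def flat_offset(index, strides):
    """Position of a multi-dimensional index inside the flat buffer."""
    return sum(i * s for i, s in zip(index, strides))


shape = (3, 4)                      # 3 rows, 4 columns
buffer = list(range(3 * 4))         # one allocation holding every element
strides = row_major_strides(shape)  # computed at run time, not compile time
print(buffer[flat_offset((2, 1), strides)])  # element at row 2, column 1
```

Because the shape and strides are ordinary run-time values, the same flat buffer can back arrays of any dimensionality, which is the contrast with a fixed `int[A][B]` drawn above.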
@mr_voron 11 months ago
This channel is highly underrated. Excellent analysis.
@dougmercer 11 months ago
Thanks for the support Maks! =]
@s8r4 6 months ago
I've also had some fun using various methods to speed python up, and this video is a great overview of the major ways of going about it, but while it's a big departure, I've found nim to have the most python-like syntax while being as fast as things get (compiles to c, among many other languages). I've seen that you know about the true power of python already, but James Powell did a great talk about this exact topic titled "Objectionable Content", big recommend. Thanks for the video!
@dougmercer 6 months ago
I'll check it out! Also, I have looked at Nim in the past. It seems nice. Eventually I may do another video on this topic, and branch out to other languages (Nim, Julia, and now Mojo). Thanks for the idea, the video rec, and thoughtful comment =]
@Masterrex 5 months ago
Subbed, nicely done. I can tell you were having fun. IMO don't worry so much about the glitzy graphics - your storytelling is great!
@dougmercer 5 months ago
Thanks so much =]
@dhrubajyotipaul8204 a month ago
Thank you for making this. Trying out mypyc, cython, and numba right now! :D
@dougmercer a month ago
Enjoy! And good luck =]
@ethanymh 11 months ago
Love this video so much! The quality of content, animation, and visualization is unmatched...
@dougmercer 11 months ago
Thank you so much!
@stereoplegic 2 months ago
After reading the other comments while thinking up my own, I feel compelled to echo this sentiment first. Fantastic job, @dougmercer - both technically and visually - I loved it all.
@dougmercer a month ago
Thanks @stereoplegic! That means a lot =]
@pietraderdetective8953 9 months ago
This is very high-quality content, mate! Well done! A question: for a gamedev use case, can we just use the tools mentioned to speed things up? I've seen horrible performance when someone is using a Python-based game engine (like pygame etc.).
@dougmercer 9 months ago
Thanks! =] Yes, you should be able to accelerate a pygame-based game with these tools. You can't speed up pygame functions and methods, but you can speed up your code between those calls. They'll be best suited for larger, number-crunchy parts between methods rather than quick little one-off operations. Let me know if you end up tweaking something and seeing a boost in performance!
@onogrirwin a month ago
damn, this is a high effort channel. your stock footage game is especially on point. hope you pop off big time :)
@dougmercer a month ago
That's so nice! thanks =] 🤞
@matswikstrom7453 6 months ago
Wow! Really informative and interesting - Thank You! I am now a subscriber 😊👍
@dougmercer 6 months ago
Thanks so much =]
@jcldc 5 months ago
Nice video. I have just learned cython and achieved a speedup of 500x vs pure Python (+numpy) in one of my programs. It is worth mentioning that with cython you can automatically parallelize your loop by using prange instead of range.
@dougmercer 5 months ago
500x is great! And good point on prange-- I should have covered the parallel aspect more of all the solutions (numba, Taichi, and cython) but I glossed over it due to the serial nature of the example problem. Thanks for the comment =]
@Finnnicus 11 months ago
good content, great presentation. love the style!
@dougmercer 11 months ago
Thanks Finnnicus! Much appreciated =]
@mariuspopescu1854 a month ago
So, I'm not a big Python guy, so I was curious. I repeated your experiment for C++ vs numba. The only real difference: for the C++, I rewrote it just a bit (used auto and changed the indexing a bit to be more C-like) and wrote the function as a template with the sizes m and n as template parameters. This allowed me to change from a vector to a stack-allocated array, the main benefit I believe being that the whole memory is contiguous, which allows for better caching. The C++ version was about 1.5x faster than numba on my machine. I really enjoyed this video though! It made me question my biases, and I think there's a lot to be said for letting compilers/optimizers do the thinking for you. I think this was really insightful, and I'm gonna give numba a go for many of my future quick projects.
@dougmercer a month ago
Oh, that's awesome! I think that's the fastest anyone has gotten it so far! Someone else in the comments encouraged me to try a 1D vector of size (m+1)(n+1) and index into it with arithmetic -- that gave me a roughly 1.1-1.2ish x speedup over the original C++ . So, I guess much of the remaining speedup came from data locality-- very cool that it was another 0.3x-ish boost. I'm glad you found the video interesting =]
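The 1D-vector idea mentioned in this reply can be sketched in Python as follows. This is not the C++ code from the video; `lcs_flat` is a hypothetical name, and the sketch only shows the single-allocation layout of size (m+1)(n+1) with arithmetic indexing.

```python
def lcs_flat(a, b):
    """LCS length using one flat table of size (m + 1) * (n + 1)."""
    m, n = len(a), len(b)
    width = n + 1
    dp = [0] * ((m + 1) * width)  # single contiguous allocation
    for i in range(1, m + 1):
        row = i * width           # offset of row i
        prev = row - width        # offset of row i - 1
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[row + j] = dp[prev + j - 1] + 1
            else:
                dp[row + j] = max(dp[prev + j], dp[row + j - 1])
    return dp[m * width + n]


print(lcs_flat("AGGTAB", "GXTXAYB"))
```

In interpreted Python this layout is unlikely to matter much; the benefit discussed here applies to compiled code, where one allocation replaces one allocation per row and neighbouring rows sit next to each other in memory.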
@YuumiGamer1243 a month ago
I was already aware of numba, but it's good to see all the others like this. Enjoyable video, and I was happy you showed most of the code, while somehow making it feel like a documentary
@dougmercer a month ago
That's an awesome compliment-- I'm gonna put "Code Documentarian" on my resume. Thanks for watching and commenting =]
@MrXav360 9 months ago
I learned C++ in the last month (came from a Python background!) and tried my luck at coding real-time animations of fractals. I wanted to compare with Python's performance, but now I am scared I learned C++ for nothing... Thanks! (Just kidding I loved learning C++ and I am glad I did. It's super impressive however to see that we can achieve similar performances with these packages in Python! Thanks for the video).
@dougmercer 9 months ago
Taichi is great for fractals! I like that it has good built in infrastructure for plotting to a canvas. That said, I'm sure you'll find a use for your new-found C++ knowledge =]
@user-yk8yb5xy8r a month ago
My favourite was numba, as we were able to achieve our goal with very little code. There are certain shortcut algorithms that can be applied to make up for the functions it doesn't support.
@user-np9il4is1t 9 months ago
Love this video! It was amazing and useful!
@dougmercer 9 months ago
Thanks so much!
@billyhart3299 a month ago
Great video man. I'm going to try this on my web server project that uses numpy quite a lot.
@dougmercer a month ago
Numba should work great! You may just need to tweak your implementation slightly to use the subset of numpy features supported by Numba.
@billyhart3299 a month ago
@dougmercer have you tried anything that helps with matplotlib?
@dougmercer a month ago
Hmm. Hard to say. Could try mypyc-- maybe it'll just magically work. Alternatively, though this might be a bit disruptive, you could swap out CPython with PyPy (a JIT compiled replacement for the CPython interpreter). In the video I'm working on now, PyPy was shockingly convenient and fast.
@dougmercer a month ago
What are you plotting, out of curiosity? Maybe do a quick sanity check to make sure the amount of data you're plotting hasn't exceeded the usefulness of matplotlib. If it's a scatter plot with millions of points, maybe you should use something like datashader or similar.
@billyhart3299 a month ago
@dougmercer I'm using it to do histograms for images that have been turned black and white and then converted to 8-bit PNG files, in order to convert them to stippling.
@dar1e08 a month ago
Easily the best video I have seen on performance Python, subbed.
@dougmercer a month ago
Thanks so much! I should have another performance related video out in mid April so see ya then =]
@josebarria3233 5 months ago
Gotta love mypyc, I've been using it in my project and never felt disappointed
@giannisic1544 6 months ago
Brilliant video and useful content. It's a pity there's so few of us... Glad the algorithm suggested this video
@dougmercer 6 months ago
Thanks! Glad you found it helpful =]
@enosunim a month ago
Thanks! This is really great info!
@dougmercer a month ago
Glad it was helpful!
@alexsere3061 15 days ago
Dude, the quality and depth of this video is insane. I feel like I have a deeper understanding of the strengths and limitations of python, and I have been using it for about 7 years. Thank you
@dougmercer 14 days ago
Glad it was helpful =]
@famaral42 7 months ago
Thanks for the analysis, I got motivated to look at numba and cython more carefully. Taichi looked cool, but not having it in the anaconda repo is a negative point for me. Have you tried running this code with TORCH?
@dougmercer 7 months ago
Oh interesting, I didn't realize taichi wasn't on conda-forge. I wonder if they'd accept a PR 🤔. For what it's worth, you can pip install it (and that's possible even if you're using an environment.yml). I did not try torch, but I suspect it would be very slow. Reason being-- the main use case for torch is parallel computing via tensors. Since this problem is inherently not parallelizable, my guess is it'd be super slow in torch.
@famaral42 7 months ago
@dougmercer Thx for the insights
@beaverbuoy3011 23 days ago
Super enjoyable video, thank you this was very helpful!
@dougmercer 23 days ago
Thanks! Glad it was helpful!
@sdmagic a month ago
That was exceptional. Thank you very much.
@dougmercer a month ago
Thanks for watching and commenting!
@EdeYOlorDSZs 2 months ago
crazy good video! I'm gonna check out Taichi for sure
@dougmercer 2 months ago
Thanks =]
@abhisheks5882 8 months ago
This channel is a hidden gem
@dougmercer 8 months ago
Thanks 💎 =]
@chkone007 10 months ago
That was funny. I've done both C++ and Python, but now I'm more on the C++ side. I had the meme in mind: "look what they need to mimic a fraction of our power". I haven't tested it, but I bet that with the proper compilation options the C++ would be faster again. To my understanding, this is what taichi does: it generates SIMD code for your current hardware, under the hood via the LLVM optimizer, based on the data structure (taichi is tailored for sparse data structures). As you work with dense data, Halide would [maybe] give you better results. In all cases, the code generated by a Python front end can also be generated from C++, and Python will always have an overhead. This is what machine learning people do: they don't care about Python performance, because all the computation that takes 90% of their frame time is implemented in CUDA and C++; Python is only there to feed data to the lower-level system.
@dougmercer 10 months ago
> "look what they need to mimic a fraction of our power"
Haha, true! In another comment, I said I loved that even if I write terrible C++ it still turns out pretty fast. That said, the same argument could be reversed if we consider productivity and third-party library access. If an application is 95% high-level glue and one hot spot, I'd rather write the majority in Python and the hot spot in an AOT or JIT compiled variant of Python than write my entire app in a low-level language. The overhead would be worthwhile from a productivity perspective.
> Proper compilation flags
Do you have flags you want me to try in particular? I did -std=c++11 -O3, but maybe I'm missing something.
> SIMD
Since this is all sequential, can SIMD help? I thought SIMD was for packing multiple of the same operations in a single instruction (but again, I'm not a C++ dev).
> the Python just provides an interface to a lower level language.
True! And I'm OK with that! I def agree that well-written, native code in a lower-level language will outperform generated code from Python. That said, for all but the most trivial algorithms, I can't write well-written C++. So, if I can get even a 95% solution for free from these high-level LLVM interfaces, then I'm stoked!
@chkone007 10 months ago
@dougmercer ( : That reminds me of a benchmark done by Microsoft, Debug C++ /NoSIMD vs. Release C# SIMD, where they noticed that C# was faster :D Yeah, sure... The point of Python is not to be faster; it's mostly to be gentle with people who aren't long-bearded engineer programmers. The users are mostly scientists and data analysts.
> Productivity
For this example I see no productivity difference between C++ and Python. But personally I'm more productive in C++ with Eigen and a few other libs, just as an experienced Python user will be faster with numpy and their other favorite libs.
> Proper compilation flags
I don't know what your compiler is, but for Visual Studio: /Ot {favor speed}, /Oi {enable intrinsics}. To increase STL speed: disable C++ exceptions and "Basic Runtime Checks", /GS-, /GR-, ... To help intrinsic generation: /Zp8 or /Zp16 (here you're processing int), but we can process And based on your hardware: /arch:AVX, ...
> SIMD
You have gather and scatter instructions that could help; need to profile ( :
> Improve
On both sides, I bet we can improve performance by using only the type you need. If your numbers cannot go higher than 100, just use a byte/uint8_t, etc. As I said, the video was funny. The point is not to say Python is faster than C++, but rather "if you're careful, you can get performance close to or higher than baseline C++".
@dougmercer 10 months ago
I'm using g++, I'll try to find the analogs for the compiler flags you recommended. And true, a uint8 is enough. I'll mess around with that too. In any case, thanks for the comments! I'd def like to learn more about C++ but I don't get the opportunity very often
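The point about narrower integer types can be illustrated with the standard-library `array` module. This is only a sketch of the memory effect; the helper `make_row` is hypothetical. Note that an unsigned byte holds values up to 255, so for an LCS table it would only be safe when the shorter input has at most 255 elements.

```python
from array import array


def make_row(length, max_value):
    """Zero-filled row using the narrowest unsigned type that fits max_value."""
    for typecode in ("B", "H", "I", "Q"):  # unsigned char, short, int, long long
        probe = array(typecode)
        if max_value < 2 ** (8 * probe.itemsize):
            return array(typecode, [0]) * length
    raise OverflowError("max_value does not fit in 64 bits")


small = make_row(1000, max_value=100)
large = make_row(1000, max_value=10**12)
print(small.typecode, small.itemsize * len(small), "bytes")
print(large.typecode, large.itemsize * len(large), "bytes")
```

Smaller elements mean more of the table fits in cache, which is the motivation behind the uint8 suggestion; whether it pays off in a given compiled implementation would still need measuring.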
@user-zi2zv1jo7g 12 days ago
@chkone007 OK, I get the point, but there's a lot of production code written in Python. Most code does not require performance, and for the few bits that do, you can write a C extension or simply use C++ and Python together.
@chkone007 12 days ago
@user-zi2zv1jo7g I kind of strongly disagree. Have you ever experienced a slow UI, a stuttering app, a lagging game, ...? If yes, you have already met a programmer who said "most code writing does not require performance". Saying that code does not require performance just means you consider your time more valuable than the user's time. As developers, we don't own that time; it's not ours, it's the user's. That's what makes the difference between a smooth app and slow, memory-heavy software, like everything web-based: Slack, etc., and all the Chromium stuff. Most devs say "It's just a chat app, I don't need C++, just something Chromium-based." The consequence: my Mac/PC uses 8 GiB for doing nothing, just running a VM.
From an industrial point of view, you can launch your startup with Python code and say "I don't care, it's CUDA under the hood". You just expose yourself to a competitor who implements their stuff directly in C++/CUDA, and whose profitability will explode because their AWS bill will be much cheaper. We always need memory-efficient and fast code. If none of those arguments convince you, consider the CO2 argument: it's more eco-friendly for your PC, your server, or the N instances of your program running on AWS.
I love Python for prototyping and speeding up my exploration of ideas, but I cannot be serious with it for my clients. I know a lot of "AI startups" are like that: download the model from the researchers, create a Docker image, build a website => step 2 => profit. Most of them rely on Python, but any competitor with cheaper infrastructure can scale more and be more efficient. I have in mind Facebook: it was developed in PHP, fine, cool, but at the beginning each new user cost more than the previous one... FB wasn't able to scale. They created "HipHop", a compiler from PHP to C++, and the company became profitable as each new user became cheaper than the previous one. Conclusion => performance always matters.
Don't get me wrong, that doesn't mean I over-engineer everything to save 1 byte or 1 picosecond in the median. But keep in mind that the quote "early optimization is the root of evil" was written at a time when everybody was writing C and assembly code... Code is different today: with Python, JavaScript, ... "early non-optimization is the root of evil".
@ManuelBorges1979 a month ago
Excellent video. 👏🏼 Subscribed.
@dougmercer a month ago
Thanks Manuel! Glad to have you =]
@miriamramstudio3982 a month ago
Text on the screen was definitely engaging ;) Thanks
@dougmercer a month ago
Yay! Success =]
@pranavswaroop4291 a month ago
Just excellent in every way. Subbed.
@dougmercer a month ago
=]
@ThisRussellBrand 5 days ago
Beautifully done!
@dougmercer 5 days ago
Thanks Russell =]
@NicolauFernandoFerreiraSobrosa a month ago
Very cool video! Did you consider compilation time in C++ tests? I used Numba daily, and the first run is always slow due to the JIT feature.
@dougmercer a month ago
I did not count compilation time for the C++ times, but did include JIT time for the first run of Numba. However, it doesn't have a big impact, because we are typically doing hundreds or thousands of runs and adding up their times (so the first run being slow only accounts for a small part of the overall time).
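The timing approach described here (one slow first call amortized over many runs) can be sketched with the standard library. The `time_runs` helper and the `sleep`-based stand-in for JIT compilation are hypothetical; they only mimic a function whose first call is slower and do not reproduce the video's benchmark.

```python
import time


def time_runs(func, runs):
    """Return (first_call_seconds, total_seconds) over `runs` calls."""
    start = time.perf_counter()
    func()
    first = time.perf_counter() - start
    for _ in range(runs - 1):
        func()
    total = time.perf_counter() - start
    return first, total


def make_fake_jit(compile_seconds=0.05):
    """Stand-in for a JIT-compiled function: only the first call pays a compile cost."""
    compiled = False

    def func():
        nonlocal compiled
        if not compiled:
            time.sleep(compile_seconds)  # pretend to compile
            compiled = True
        return sum(range(1000))          # the "real" work

    return func


first, total = time_runs(make_fake_jit(), runs=1000)
print(f"first call: {first:.4f}s, all runs: {total:.4f}s, "
      f"first-call share: {first / total:.1%}")
```

Reporting the first call separately makes it easy to see how much of the total is one-off compilation; with few runs or a cheap function, that share can be large enough to change the comparison.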
@abc_cba 20 days ago
If you don't keep your content consistently uploaded, you'd be committing a felony. Subbed!!
@dougmercer 20 days ago
I'm gonna try! Hahaha Thanks for subbing =]
@cmilkau a month ago
pypy is a jit for full python with special bindings for numpy and scipy. you can use it for any python code, but for max performance might need to write critical parts of your code in rpython, a subset of python that can be statically compiled to native binary. The example subsequence code is valid rpython btw.
@dougmercer a month ago
PyPy is fantastic -- I'm actually going to cover it in my next video!
@user-by8fp5uw2o a month ago
Consider using Golang if you want speed + simple to learn (mostly, ofc). Python is fantastic at some tasks, but if you’re really trying to get the best of both worlds (fast to write and fast to run), then Golang could be a great fit
@dougmercer a month ago
I do plan to do a project in Go sometime soon
@guowanglin4537 3 months ago
Well, I use numba in my research, concerning the human genome, it was really fast!
@dougmercer 3 months ago
That's awesome! I love numba-- super convenient and fast
@khawarshehzad487 10 months ago
Amazing content, engaging presentation and sadly, underrated channel. Subbed!
@dougmercer 10 months ago
Thanks so much! Be sure to share with friends/coworkers you think might enjoy this, and hopefully the channel will grow over time 🤞
@khawarshehzad487 10 months ago
@dougmercer keep up the good work, it sure will 🙌
@luaguedesc a month ago
Great video! Did you compile the C++ code with optimization flags?
@dougmercer a month ago
Yup! You can check out the C++ code/compile command here, gist.github.com/dougmercer/1a0fab15abf45d836c2290b98e6c1cd3
@MrNolimitech a month ago
When you reach a 100x speedup, I don't think it really matters that you could do better (except maybe in some cases). Most of the time, code is slow only because it is wrong. People who are new to Python (or even pros) think it's slow because they heard it somewhere. But in fact, it's often only because they can't write better code. They duplicate everything. They initialize the same thing multiple times. They repeat themselves. They use multiprocessing or threads with a huge function (method) that does everything inside, instead of separating things and using the CPU/GPU for specific calculations. These are good libraries, but I hope people will try to optimize their code with better lines before using those libraries.
@dougmercer a month ago
I agree! there is usually a lot of room to make your algorithm/implementation better
@overbored1337 11 days ago
Python is super slow by default. The only skill issue is actually the choice of Python when performance matters, because it was never designed for speed, or power draw, and optimizing it goes against its fundamentals. If it does not fit, as is, then use another language instead of a shoehorn.
@roshan7988 11 months ago
Great video! Super underrated channel. Love the graphics
@dougmercer 11 months ago
Thanks Roshan! Means a ton to hear that =]
@sageunix3381 25 days ago
Limited-branch C code will usually be faster in most applications, but if you want code to be ridiculously fast, use assembly. Inline assembly is cool too; it works directly with C. However, speed often comes at the cost of convenience.
@jamesarthurkimbell a month ago
Nice video! Well done
@dougmercer a month ago
Thanks for watching!
@cmleibenguth 6 months ago
Interesting results!
@dougmercer 6 months ago
Thanks! I was surprised too
@Iejdnx a month ago
5k subs? I swear I thought you had like 1 million because of how good this video was. I'm subscribing.
@dougmercer a month ago
Thanks =] I appreciate it. It's been a slow grind, but the past few days the algorithm has blessed me with some impressions, so I hope it keeps going 🤞
@ianposter2161 4 months ago
Hey, thanks for an amazing video! Which one would you suggest so that I can just grab my regular python code with dataclasses and get a performance boost with no tweaks whatsoever?
@dougmercer 4 months ago
Thanks for watching! =] I'd try mypyc first. The others are way more disruptive and would probably require changes to your code
@ianposter2161 4 months ago
@dougmercer Thanks for your answer! I was thinking of something. Nowadays we almost always use type hints because they are great, but only for clarity and for type checkers like mypy. So we are not getting any performance benefit out of them, although I think we could! Cython translates Python to C and forces us to write statically typed Python for that, which type hints could also be used for... It turns out that Cython supports type hints as well! Then we have tools like MonkeyType that allow us to automatically type-hint code based on runtime behavior. Nice for annotating legacy code.
1) We write Python code with type hints
2) If needed, use MonkeyType to apply them everywhere
3) Compile with Cython
4) Get C-like performance
I wonder why this isn't actually practiced. Do you have any idea?
@dougmercer 4 months ago
Mmm, for using type hints to achieve better performance through compilation, I think there's a high-level design question: "should your code (1) look/feel like vanilla Python, or (2) are you OK with using non-standard Python features, or (3) are you willing to use syntax that only works in your special language, as long as it still vaguely resembles Python and interoperates with it"? I think mypyc is the closest to achieving the goal of speeding up vanilla Python. Cython's pure Python mode is pretty OK, but you need to add extra metadata to make it performant (e.g., the locals decorator). Cython also has its own type system rather than using Python's built-in types (e.g., cython.int vs int). Cython as a language (in non-Python mode) isn't really Python any more, but interoperates with it well. Some other languages (e.g., Mojo) claim to have a "Python-like" syntax and support interacting with Python, but the code isn't really Python.
@ianposter2161 4 months ago
@dougmercer Yeah, it would be amazing if we could just write vanilla Python with standard type hints and compile it with Cython. Apparently Cython somewhat supports it. KZbin blocks my comment if I paste a link, but you can search this on Google: Can Cython use Python type hints? Because today type hints are everywhere and we don't get any performance benefit out of them at all, which feels weird.
@dougmercer 4 months ago
It's hard to say-- when I was experimenting with this problem I remember not observing any speed up when adding vanilla Python typehints, and it wasn't until I started adding things like the @locals decorator that I really noticed any improvement. Let me know if you do any testing that shows a meaningful speed up!
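As a concrete illustration of the "vanilla Python with standard type hints" option discussed in this thread, here is a small typed function that runs unchanged under CPython. Saved in a hypothetical module such as `lcs_typed.py`, it is the sort of code mypyc is meant to compile (via `mypyc lcs_typed.py`); whether and how much that speeds it up depends on the code and was not measured here.

```python
from typing import List


def lcs_typed(a: List[int], b: List[int]) -> int:
    """LCS length with two rows, annotated with standard type hints only."""
    n: int = len(b)
    prev: List[int] = [0] * (n + 1)
    curr: List[int] = [0] * (n + 1)
    for x in a:
        for j in range(1, n + 1):
            if x == b[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev, curr = curr, prev  # reuse the two rows instead of reallocating
    return prev[n]


print(lcs_typed([1, 2, 3, 4], [2, 4]))
```

No decorator or compiler-specific type is involved, so the same file stays importable as ordinary Python; the trade-off raised above is that tools needing extra metadata may be able to optimize more aggressively.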
a month ago
Very useful. A quick question: what was the optimization level for compiling the C++ code? It can really make a difference.
@dougmercer a month ago
I used -O3. Another commenter recommended using a 1D array and handling indexing through arithmetic, and that does speed up the C++ by about 1.1-1.2x. (still pretty similar to the ndarray approach from Taichi) Here's the c++ code and build script if you want to play around with it yourself =] gist.github.com/dougmercer/1a0fab15abf45d836c2290b98e6c1cd3
@JohnMitchellCalif a month ago
interesting and useful! Subscribed.
@dougmercer a month ago
Thanks! And welcome =]
@IamusTheFox a month ago
I'm enjoying the video, serious question though. How can JIT be faster than C++? Did you have the C++ optimizer on? Never mind, found a comment where you said that you used -O3. Great work. I feel like anyone who complains about your C++ isn't being fair. While I may have done it another way, it's valid.
@dougmercer a month ago
Probably means that I left some performance on the table in the C++, or the JIT pulled some tricks that most people wouldn't pull when writing it natively. Someone else in the comments found that using a flat 1D array gave the C++ a 1.1-1.2x speedup. That probably puts it on par with the Numba/Taichi ndarray approaches. That said, the point of the video still stands-- for at least this particular problem, there are several approaches for getting performance on par with native C++.
@IamusTheFox a month ago
Absolutely! Fantastically well done. I'm really quite impressed by what you did.
@dougmercer a month ago
Thanks =]
@lchunleo 8 months ago
Good work
@dougmercer 8 months ago
Thanks =]
@timlambe8837 6 months ago
Really interesting video. I'd love to learn more about it. Maybe I will be laughed at for this statement, but even with this video I feel like bringing Python to C-level performance seems to be quite a bit of effort. Isn't it worth learning C/C++ for special tasks? How would you evaluate the developer experience, comparing "make everything possible with Python" with "learning C/C++ or Rust"? Thanks a lot!
@dougmercer 6 months ago
You're right! It's not easy to get C++ performance in Python. I think these tools are appropriate when there are a few "hot spots" in your code, but the majority of your application benefits from Python's ecosystem. It's possible to directly build C extensions and call them from python, but I think these tools are way easier. For some (new) projects, it might make sense to write the whole thing in Rust from the start. In practice, most of my projects use a lot of Python libraries, and my team is not very flexible (they mostly only know Python), so it'd be pretty disruptive if I wrote a critical component in a different language and with different tooling. Good question! (Sorry I don't have a good answer =P)
@timlambe8837 6 months ago
@dougmercer That is indeed a good answer, thanks. Since I am working in the data analysis field (geospatial), I love Python for its possibilities. I was wondering if it makes sense to learn another language like C++ for intensive calculations. But I think I will try your tools 😊 Many thanks!
@BaselSamy 3 months ago
Wonderful video, even for a beginner like myself! I wonder if you could share the animation tool you used? I feel it would be awesome for my presentations :))
@dougmercer 3 months ago
Thanks! I primarily used Davinci Resolve, but used the Python library `manim` (community edition) for the code animations.
@BaselSamy 3 months ago
Thanks! @dougmercer
@atharv9924 6 months ago
@Doug: Your channel's popularity should be at least 100x more!!!
@dougmercer 6 months ago
Thanks so much! Fingers crossed the channel does grow 100x 🤞. At that point I prob could make videos full time 🤯
@ivolol 11 months ago
Would be interested to see what Pypy and nuitka do for it as well.
@dougmercer
@dougmercer 11 ай бұрын
If this video ends up getting some more views, maybe I'll do another pass at adding other options. I have a *guess* though... PyPy would speed this up significantly, probably on par with numba. I've heard good things about it *but* it didn't install first try when using conda on my M1 Mac, so I skipped it ¯\_(ツ)_/¯ Nuitka would only speed things up a little bit. From what I've read, nuitka is more so about compatibility (supports *all* python language constructs) and for making standalone, portable builds. For nuitka, speed is secondary to those concerns
@thesnedit5406
@thesnedit5406 Ай бұрын
The theme, info, ambience and the whole vibe of the video is so good. Subscribed !
@dougmercer
@dougmercer Ай бұрын
That's like the best compliment =] thanks!
@lapppse2764
@lapppse2764 Ай бұрын
10:48 I think it would be nice to define on the left that lower is better (I've usually seen it done in benchmarks). Thank you for the video! About CPP, I think you might've used SIMD instructions.
@dougmercer
@dougmercer Ай бұрын
Good point, I def could have made the metrics interpretation clearer. As for SIMD, it's hard to parallelize this because it's an inherently serial problem (everything requires previous solutions)
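To make the "everything requires previous solutions" point concrete, here is the textbook longest common subsequence recurrence. It is assumed here that the video's dynamic program follows this standard form; the symbols $L_{i,j}$ (LCS length of the first $i$ elements of $a$ and the first $j$ elements of $b$) are notation introduced for this sketch.

```latex
L_{i,j} =
\begin{cases}
0 & \text{if } i = 0 \text{ or } j = 0,\\
L_{i-1,j-1} + 1 & \text{if } a_i = b_j,\\
\max\left(L_{i-1,j},\ L_{i,j-1}\right) & \text{otherwise.}
\end{cases}
```

Each cell reads its left neighbor $L_{i,j-1}$, so the cells of a row cannot simply be computed side by side in one vector instruction.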
@MaxShapira2real
@MaxShapira2real 11 ай бұрын
You should put out an advanced Python course. Great job buddy!
@dougmercer
@dougmercer 11 ай бұрын
Maybe one day! Thanks Max!
@etiennetiennetienne
@etiennetiennetienne Ай бұрын
There are also ways to write C++ directly in Python, I think, for instance with cppyy or with a torch extension
@dougmercer
@dougmercer Ай бұрын
True! Through C/C++ extension libraries, you can directly write/link C/C++ libraries and write your own Python interface to it. Cppyy, ctypes, cffi, pybind11, and Cython are all fair game for this.
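As a small illustration of calling existing C code from Python, here is a minimal sketch using the standard library's `ctypes` to call `strlen` from the C runtime. It assumes a POSIX system (Linux or macOS) where the C library can be loaded this way; the wrapper name `c_strlen` is made up for this example.

```python
import ctypes
import ctypes.util

# Load the C standard library. On POSIX, if find_library returns None,
# CDLL(None) falls back to the symbols already loaded into the process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: bytes) -> int:
    """Return the length of a byte string as computed by C's strlen."""
    return libc.strlen(text)


if __name__ == "__main__":
    print(c_strlen(b"python"))
```

Declaring `argtypes` and `restype` up front lets ctypes check and convert arguments instead of guessing, which matters more as the C functions get less trivial.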
@gorrofrigio5570
@gorrofrigio5570 20 күн бұрын
Thank you Doug for this awesome video! Btw, just curious: has anyone tried some of this on Pygame? I know Python is not a common language in the videogame industry, but maybe some of this could bring it some justice (and good surprises).
@dougmercer
@dougmercer 20 күн бұрын
You can definitely use Cython or Numba to help speed some things up with pygame. I found a few old reddit threads that included demos and discussions by searching "Numba pygame reddit".
@OliverBatchelor
@OliverBatchelor 2 ай бұрын
Taichi for the win. You didn't even use GPU programming with it, which is all I do - the inter-op with torch is excellent and works the same way as the ndarray.
@dougmercer
@dougmercer 2 ай бұрын
Taichi was super fun. I did use GPU (well, metal) for rendering the fractal animation. Was pleasantly surprised at how easy it was.
@OliverBatchelor
@OliverBatchelor 2 ай бұрын
@@dougmercer Sorry, that possibly came out the wrong way - I meant that you did a great job demonstrating it *even without* using the GPU!
@dougmercer
@dougmercer 2 ай бұрын
Oh, I see now-- hah! Thanks =] I definitely would like to try using Taichi for an ML project. Taichi + Torch seems like a great fit. Do you have any open source projects you've done with it? (I have skimmed through the docs section involving torch, but haven't looked at real projects). I also thought it might be fun to make a "shader" to process video (but I can't for the life of me figure out how to extend Davinci Resolve with Python code, so that's kind of an unrelated blocker).
@OliverBatchelor
@OliverBatchelor 2 ай бұрын
@@dougmercer Yep! A few now - most of them are for bits and pieces I do at work, and largely undocumented, e.g. for an HDR camera ISP pipeline or a spatial subdivision grid for distance queries. By far the biggest one so far is a Taichi library for Gaussian Splatting rasterization. I called it taichi-splatting (distinct from the original taichi_3d_gaussian_splatting, which it originally derived from but is very different now!). It has a few rough edges, but I think it has enabled quite a clean yet performant implementation. I replied yesterday, but I see my comment is nowhere to be seen, I think because I put a link in it, so I haven't this time! I must admit that before watching this video I did not realise that the CPU implementation in Taichi performed so well, especially with the outer loop serialised!
@dougmercer
@dougmercer 2 ай бұрын
Oh right! I saw your posts in the discord-- I read through the readme a bit. It looks very interesting-- I'll take another look at the source code sometime tomorrow. And sorry about the link issue! For some reason it's not showing up in comments held for review on the mobile app. I'll check in a browser tomorrow and hopefully approve it (if not, KZbin totally ate it-- sorry)
@helkindown
@helkindown Ай бұрын
Great video! From what I've tested, your C++ code is good enough. The main bottleneck of your code seems to be the dp result variable. I was able to double the speed (from 3.78832 to 1.77546 seconds) by replacing the 2D dp array with two 1D arrays: one "current row" array and one "previous row" array, swapping the references at each iteration. This is probably because the code doesn't have as many cache misses when it is not fetching new rows of the "dp" array, which are filled with zeros anyway. I did not test this with the Python code, but the same speedup should be obtainable by using two variables (or a tuple of 2 arrays) to keep up with C++.
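The two-row idea above can be sketched in plain Python. This is an illustration of the technique, not the commenter's C++ code or the code from the video; the function name `lcs_length_two_rows` is hypothetical, and it assumes the standard LCS recurrence.

```python
def lcs_length_two_rows(a, b):
    """Length of the longest common subsequence, keeping only two rows."""
    n = len(b)
    prev = [0] * (n + 1)  # row i - 1 of the conceptual 2D table
    curr = [0] * (n + 1)  # row i, overwritten on every outer iteration
    for i in range(1, len(a) + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        # Swap references instead of copying; index 0 stays 0 in both rows.
        prev, curr = curr, prev
    return prev[n]


if __name__ == "__main__":
    print(lcs_length_two_rows("ABCBDAB", "BDCABA"))
```

Memory drops from O(m*n) to O(n), at the cost of no longer being able to backtrack through the table to recover the subsequence itself.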
@dougmercer
@dougmercer Ай бұрын
Good point! I may have to re-run this experiment at some point-- I wonder how Numba/cython would perform with that more memory efficient approach 🤔
@ButchCassidyAndSundanceKid
@ButchCassidyAndSundanceKid 4 ай бұрын
Was your taichi (arch) based on cpu or gpu when you carried out the benchmark testing ?
@dougmercer
@dougmercer 4 ай бұрын
The LCS dynamic program was on CPU. The visualization I showed at the beginning of the section of a kind of warping fractal was on GPU.
@ButchCassidyAndSundanceKid
@ButchCassidyAndSundanceKid 4 ай бұрын
@@dougmercer Thanks. Taichi certainly looks promising, but I still prefer Numba for its simplicity, i.e. adding a couple of decorators, without altering the code too much. Have you tried Spark and Dask ? They're both parallel programming libraries.
@dougmercer
@dougmercer 4 ай бұрын
Yup, both are great! Since this problem couldn't be easily parallelized, I didn't mention them. And I agree, in general Numba will be easier than Taichi by a long shot. I just thought Taichi was kind of neat so I included it in the video ¯\_(ツ)_/¯
2 ай бұрын
Nice. Thanks!
@dougmercer
@dougmercer 2 ай бұрын
No prob! Glad it was helpful
@Daekar3
@Daekar3 Ай бұрын
I feel like this is one reason why my PC is literally god-tier compared to what I went to college with, but the day to day experience really isn't any different. My games are prettier and my SSD is bigger, but the mechanics of using the OS is NOT orders of magnitude better.
@imadlatch7206
@imadlatch7206 3 ай бұрын
We just use PyPy as the interpreter, no need for anything else
@dougmercer
@dougmercer 3 ай бұрын
Yeah, pypy is a great option
@sootguy
@sootguy Ай бұрын
what about pypy?
@dougmercer
@dougmercer Ай бұрын
I'm working on a video that uses it right now =]
@user-up8fm3vb1r
@user-up8fm3vb1r 23 күн бұрын
Amazing work, as someone who has to use python against my will, I enjoy your videos
@dougmercer
@dougmercer 23 күн бұрын
Thanks =]. What's your preferred language if Python is against your will?
@user-up8fm3vb1r
@user-up8fm3vb1r 22 күн бұрын
@@dougmercer Haskell is my love and I like lambda calculus, so I am writing an interpreter and compiler for my own LC implementation for fun (in Haskell).
@dougmercer
@dougmercer 22 күн бұрын
@@user-up8fm3vb1r very cool. I haven't touched Haskell much, but I'm learning ocaml for fun recently and enjoying it
@user-up8fm3vb1r
@user-up8fm3vb1r 22 күн бұрын
@@dougmercer glad to see you join the functional land.. enjoy!!
@RobertLugg
@RobertLugg Ай бұрын
How did you make those amazing looking bar charts?
@dougmercer
@dougmercer Ай бұрын
Hah, *very carefully* in Davinci Resolve (Fusion Page) =P I manually drew the graph using rectangles, then applied (noise + displace) to make it more irregular + (fade it out with noise + the "painterly" effect from Krokodove) to give it the water color appearance + paper texture + adding lens blur One of my favorite animations I've made =]. Thanks for commenting on it
@janAkaliKilo
@janAkaliKilo Ай бұрын
Another option - learn Nim. It is an easy to learn language with a pythonic syntax. Because Nim is a compiled language, its speed is on par with C, C++ and Rust.
@dougmercer
@dougmercer Ай бұрын
I've been meaning to give it a shot... It definitely seems very approachable
@ThatJay283
@ThatJay283 Ай бұрын
with the c++ version, did you compile it with -O3 optimisations enabled?
@dougmercer
@dougmercer Ай бұрын
Yup! gist.github.com/dougmercer/1a0fab15abf45d836c2290b98e6c1cd3
@ThatJay283
@ThatJay283 Ай бұрын
@@dougmercer thanks! i just managed to get it 169% faster (see fork). still, the speed improvements offered by numba, pyx, and taichi are really impressive :)
@dougmercer
@dougmercer Ай бұрын
Very cool! Yesterday I implemented the 1D index approach (not nearly as cleverly-- just hand jammed the indexing arithmetic in line) and I got about 1.1-1.2x speed up. Does the noexcept make a difference in performance? Or is there something else causing the extra 0.4ish speed up 🤔
@arta6183
@arta6183 4 ай бұрын
Can you also share the C++ code? It's very easy to write slow C++ code. If the code involves vectors, then AVX optimizations can drastically improve performance on x86 CPUs.
@dougmercer
@dougmercer 4 ай бұрын
Hey @arta6183 - Sure! Here's a link to the code and compile command in a gist - gist.github.com/dougmercer/1a0fab15abf45d836c2290b98e6c1cd3 Note-- this algorithm is inherently *not* parallelizable unless you do some really wonky stuff (wave front optimization). So, I'm not sure if AVX will help. That said, I would love to see you squeeze 10x more performance out of it and share a gist back to me. Like I said in the video -- I only know the absolute basics of C++, so my C++ code is *bad*.
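For readers curious what the "wave front" ordering looks like, here is a rough Python sketch. It assumes the standard LCS recurrence, where each cell depends on its left, upper, and upper-left neighbors, so all cells on one anti-diagonal (constant i + j) are independent of each other. The function name `lcs_length_wavefront` is made up for this example, and the inner loop is still run serially here; it only shows the traversal order that would make parallelism possible.

```python
def lcs_length_wavefront(a, b):
    """LCS length computed anti-diagonal by anti-diagonal."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    # d = i + j indexes the anti-diagonal; cells with i, j >= 1 start at d = 2.
    for d in range(2, m + n + 1):
        i_lo = max(1, d - n)  # keeps j = d - i <= n
        i_hi = min(m, d - 1)  # keeps j = d - i >= 1
        # Every cell in this loop reads only diagonals d - 1 and d - 2,
        # so the iterations do not depend on one another.
        for i in range(i_lo, i_hi + 1):
            j = d - i
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


if __name__ == "__main__":
    print(lcs_length_wavefront("ABCBDAB", "BDCABA"))
```

The catch is that cells on an anti-diagonal are not contiguous in a row-major table, which is part of why this reordering counts as "wonky" in practice.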
@AndersonPEM
@AndersonPEM 23 күн бұрын
[Tries with Rust] the result shows up before you even start the program 😂
@legion_prex3650
@legion_prex3650 2 ай бұрын
Love your channel! Nice 80s sound!
@dougmercer
@dougmercer 2 ай бұрын
Thanks! I had fun choosing music for this one =]
@BrunoGallant
@BrunoGallant Ай бұрын
Great production value. Thanks for the tips. Grumpy linux sysadmin here, definitively does not want to learn C++. With good speed, python is perfect.
@dougmercer
@dougmercer Ай бұрын
Glad it was helpful! And definitely, I'm a big fan of "good enough" speed, and generally I can get that with Python
@PySnek
@PySnek 23 күн бұрын
What about Nim?
@marcelobravo3074
@marcelobravo3074 5 ай бұрын
this is gold
@dougmercer
@dougmercer 5 ай бұрын
Thanks! Glad you liked it =]
@system64_MC
@system64_MC 2 ай бұрын
What happens if you use the -O2 or -O3 optimisation flag for the C++ implementation?
@dougmercer
@dougmercer 2 ай бұрын
I did compile with -O3 for my C++ test
@dougmercer
@dougmercer 2 ай бұрын
gist.github.com/dougmercer/1a0fab15abf45d836c2290b98e6c1cd3
@system64_MC
@system64_MC 2 ай бұрын
@@dougmercer Oh, you did. This is surprising how Python can be faster than C++!
@dougmercer
@dougmercer 2 ай бұрын
Definitely surprising! That said, I'm sure someone could write faster C++! But, it did beat my first attempt at translating the code into C++ ¯\_(ツ)_/¯
@Petch85
@Petch85 6 ай бұрын
Great video. I will give numba a try... I use numpy all the time, and that is super fast for my work. But I always end up needing to plot some numbers and save the result as a PNG file or something. I use matplotlib, and most of the time I can read and manipulate my data in less than 0.1 sec. But then making the plot takes maybe 1 sec, and saving the PNG file also takes 1 sec. Is there anything I could do? (I have more than one file of data, and need more than one plot saved... I know 3 sec does not seem like a long time, but it adds up)
@dougmercer
@dougmercer 6 ай бұрын
Hmm, I don't have any sure-fire recommendations. Could potentially try using multiprocessing if your plotting function is easy to map over an iterable of inputs? That way you can maybe speed up by the number of cores your CPU has.
@lbgstzockt8493
@lbgstzockt8493 Ай бұрын
Are you showing the plot? There is a way to not show the plot windows but still save to a file, it is still slow but much less than two seconds.
@Angel33Demon666
@Angel33Demon666 29 күн бұрын
How does this compare with Julia? I found that it's fast just out of the box
@dougmercer
@dougmercer 29 күн бұрын
I didn't try Julia, but I've used it a bit in the past and it is quite fast. In a future video, I'd like to throw Julia and Nim into the mix
@varunbhaaskar3338
@varunbhaaskar3338 21 күн бұрын
how many of them are production ready? is there anything like this that is production ready?
@dougmercer
@dougmercer 21 күн бұрын
I would say Cython and Numba are definitely "production ready"
@jimmysaxblack
@jimmysaxblack 2 ай бұрын
fantastic thanks a lot
@dougmercer
@dougmercer 2 ай бұрын
Glad it was helpful =]
@cucen24601
@cucen24601 Ай бұрын
"Numba is so much easier than Cython" In reality it is so much more painful to code in Numba and doesn't really work very well. At least for Cython, if I coded correctly, it works correctly. Numba doesn't let me do things I expect them to do...
@dougmercer
@dougmercer Ай бұрын
That is fair. It works well when you already mostly (or entirely) use supported features, and is incredibly painful when you don't.
@cucen24601
@cucen24601 Ай бұрын
@@dougmercer Yup. Thank you for the great content by the way, I wasn't paying much attention to numba and thought Cython to be as fast it could get, but the results are shocking.
@dougmercer
@dougmercer Ай бұрын
There might have been some cython tricks that I missed. I'm definitely not an expert, but I found it really hard to get up to Numba speeds.
@Caspar__
@Caspar__ Ай бұрын
But most of the time I use Python libraries. Can I just-in-time compile those as well?
@dougmercer
@dougmercer Ай бұрын
I do not believe any of these options will compile or JIT third party libraries. If I'm wrong, hopefully someone will correct me. That said, you can try using a different Python interpreter altogether. PyPy would JIT whatever code it runs (but you need to use the PyPy interpreter instead of CPython)
@Caspar__
@Caspar__ Ай бұрын
@@dougmercer Thanks a lot : )
@mayankmaurya8631
@mayankmaurya8631 Ай бұрын
Ideally, C++ can't be slower than any language's any kind of implementation because in C++ you can literally write hardware-level controls. What I'm saying is your C++ code was not very well written. The machine code that taichi or numba produced can be also produced by C++, so it was not a good comparison.
@dougmercer
@dougmercer Ай бұрын
@mayankmaurya8631, I think you missed the point of the video. For a Python developer trying to accelerate a hot spot in their code numba, cython, and taichi were found to be just about as fast as C++. So, rather than hand writing C++ and complicating their build system, they can pip install numba and get just as good performance for very little work.
@vonnikon
@vonnikon 15 күн бұрын
"Hand writing C++" Python is not "hand written"? Most of the fast Python solutions presented in the video resulted in messy code and/or dependency/compatibility issues. C++ has none of those problems. I suppose that's the real point of the video. Can you make Python run fast? Yes, but it is easier and more maintainable to get the same result using C++.
@Zeioth
@Zeioth Ай бұрын
I'm missing nuitka on that comparison, but very cool.
@dougmercer
@dougmercer Ай бұрын
I've never tried it! Does it work well? I'll have to mess with it sometime 🤔 That said, I am working on a video where I cover one library that I wanted to include in this video (PyPy).
@streamdx
@streamdx Ай бұрын
You should not use a vector of vectors in C++. First of all, you will allocate memory m+1 times (once for each inner vector). This is slow. Also, this data layout is not cache friendly, because each vector is allocated on its own and the whole table is scattered around. What you really should do is define one big (m+1)*(n+1) vector and use this contiguous space as if it had two dimensions, like this: v[i*(n+1) + j]. So you skip i rows, then select column j. I bet you can easily beat Python with this simple modification. Also, be sure to compile it with at least -O2 optimization in release configuration, so no debug stuff slows you down at runtime
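The flattened layout described above can be sketched in Python as follows. This is a port for illustration only, not the C++ from the gist; the name `lcs_length_flat` is hypothetical. It assumes a row-major table with m + 1 rows and n + 1 columns, so the row stride is n + 1.

```python
def lcs_length_flat(a, b):
    """LCS length using one contiguous list indexed as dp[i * stride + j]."""
    m, n = len(a), len(b)
    stride = n + 1  # number of columns in the conceptual 2D table
    dp = [0] * ((m + 1) * stride)  # a single allocation for the whole table
    for i in range(1, m + 1):
        row = i * stride           # offset of row i
        prev_row = row - stride    # offset of row i - 1
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[row + j] = dp[prev_row + j - 1] + 1
            else:
                dp[row + j] = max(dp[prev_row + j], dp[row + j - 1])
    return dp[m * stride + n]


if __name__ == "__main__":
    print(lcs_length_flat("ABCBDAB", "BDCABA"))
```

In pure Python this mostly trades list-of-lists lookups for index arithmetic; the cache-locality argument applies to compiled code, where the table really is one contiguous block.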
@dougmercer
@dougmercer Ай бұрын
Another commenter actually already tried a single contiguous vector. They found that -O3 optimizes away any difference in performance. Here's the comment thread where they talked about their attempts kzbin.info/www/bejne/q56vi6OerKZgmtk&lc=UgyNE2s94tUKjG3hayF4AaABAg.9rJ4vi7-9UyA-ES8Dn0d1t (needs to be opened on desktop) Here's a gist to the implementation and compile command used in the video gist.github.com/dougmercer/1a0fab15abf45d836c2290b98e6c1cd3 So, feel free to let me know if you get a significantly faster -O3 optimized version. If you do, I'll pin your comment.
@streamdx
@streamdx Ай бұрын
If you experiment, try swapping the roles of i and j in the index expression. That changes the memory traversal order from row major to column major, which changes the cache miss ratio and the resulting speed. You can google "cache friendly data layout" to learn more. Those things are very important if you want speed!
@dougmercer
@dougmercer Ай бұрын
Building libraries
Length 30000 Sequences, 10 Reps
Time using lcs_taichi_ndarray: 23.286619998994865s
Time using lcs_taichi_field_once (this one is cheating): 18.95515070901456s
Time using original C++: 26.1285s (n_rep=10)
Time using O(n) memory C++: 26.7876s (n_rep=10)
Time using flattened 2D into 1d C++: 22.3163s (n_rep=10)
So, a 1.10-1.20ish speed up, but not enough to meaningfully change the analysis.
@streamdx
@streamdx Ай бұрын
@@dougmercer That might totally be true, because compilers are very smart with optimizations nowadays. The table is a local variable, so the compiler is allowed to do basically anything (as long as observed behavior is not changed). Also, the difference might be invisible if the whole table can fit in cache. I don't remember the size you tested. Anyway, it is good to know that you are already familiar with these little details. If everything is done properly, it is really interesting why C++ loses some speed. I should look at your video more carefully…
@dougmercer
@dougmercer Ай бұрын
I had two input arrays of length 30,000, so that would induce a 30,000 by 30,000 matrix. So, kind of big? That said, the 1D indexing did close the gap between the taichi ndarray approach and the C++. So, I don't think it lost any significant speed to taichi. Reason being, the taichi approach that allocates the field once is unfair (insofar as other approaches could have also made that optimization, but I didn't implement them).
@ethan91372
@ethan91372 Ай бұрын
4:00 where do you get this footage?
@dougmercer
@dougmercer Ай бұрын
Storyblocks
@markkim5117
@markkim5117 4 ай бұрын
WOW I'm impressed!
@dougmercer
@dougmercer 4 ай бұрын
Thanks! =]
@dearheart2
@dearheart2 22 күн бұрын
I wish all videos (not just YouTube) had voice and music as separate channels. I hate music in educational videos.
@Erros
@Erros Ай бұрын
the speed up at 2:26 is a funnier number than 100x but also much lower 2:56 minutes -> 176 seconds / 2.56 seconds
@dougmercer
@dougmercer Ай бұрын
Ah, the clock visualization is confusing. The vanilla Python approach did take 256 seconds, not 2 minutes 56 seconds.
@BobbyMully
@BobbyMully Ай бұрын
75% of use cases you run into, it'll be fine to just use Python.
@dougmercer
@dougmercer Ай бұрын
Definitely agree
@nevokrien95
@nevokrien95 26 күн бұрын
More like 90%... I am trying to find an excuse to use C, and it's actually very hard to find something that doesn't already have optimized code for you
@dougmercer
@dougmercer 26 күн бұрын
@@nevokrien95 same-- I need to carve out time to learn Go this year but have literally no reason to do so ¯\_(ツ)_/¯
@nevokrien95
@nevokrien95 26 күн бұрын
@dougmercer I am using it to write a proxy server that switches between VPN connections. Go lets you do networking stuff that Python just can't.
@Rajivrocks-Ltd.
@Rajivrocks-Ltd. 27 күн бұрын
But did you put as much effort in your CPP implementation as your python implementation? I love python as much as the next guy and I know a lot of python peeps don't want to write CPP but, at some point you gotta really wonder, "should I just learn CPP?"
@dougmercer
@dougmercer 27 күн бұрын
In some of the other comments, people were able to squeeze another 10-20% performance out. It doesn't meaningfully change the message of the video.
@cleteblackwell1706
@cleteblackwell1706 Ай бұрын
Can you do these kinds of comparisons for building flask apps?
@dougmercer
@dougmercer Ай бұрын
Hmm, what specifically did you have in mind? As an aside, I typically use FastAPI for Python web projects, but have used Flask in the past
@cleteblackwell1706
@cleteblackwell1706 Ай бұрын
Either is fine. Maybe an api that calls a couple other APIs and reads from a database. That would be your typical business api.
@demonman1234
@demonman1234 Ай бұрын
Yk.. I’m not a software dev or anything for companies and I code for either myself or requests from friends… I’ll wait or they can wait for my program to finish (: I get enough headaches as it is for absolutely no logical reason… no need for another (:
@dougmercer
@dougmercer Ай бұрын
Hah, I think that's a perfectly good approach =]. I'll take fast to write over fast to run on most days
@demonman1234
@demonman1234 Ай бұрын
@@dougmercer Exactly… plus this just seems like too much of a hassle for me. LOL my programs aren’t typically big enough for it to matter.
@rm9050
@rm9050 5 ай бұрын
Is it useful to use Taichi for loading CSV files, like pandas? I discovered Dask and it is fantastic
@dougmercer
@dougmercer 5 ай бұрын
Hmm, I might be wrong, but I don't believe Taichi has any filesystem support. I believe the simple thing to do would be to read data in Python and pass it to Taichi for processing. That said, I love Dask and Pandas! They rock!
@stereoplegic
@stereoplegic 2 ай бұрын
Polars is faster than Pandas with almost identical API, right?
@dougmercer
@dougmercer 2 ай бұрын
Yes, it is. I'm actually working on a video that talks about trying to read a very large CSV file and do some basic number crunching with it. (The one billion rows challenge, 1brc, but in Python) Spoiler alert, Polars and Duckdb are great choices.
@incremental_failure
@incremental_failure 2 ай бұрын
Polars is by far the fastest to load CSV. It might even be faster when you load in polars and convert to pandas.
@Uveryahi
@Uveryahi Ай бұрын
Came for the video, stayed for the stock footage inserts x)
@dougmercer
@dougmercer Ай бұрын
=] I also used Nosferatu in my other video called "Your code is almost entirely untested"... I wonder what it means that I keep putting horror movie clips into my Python explainers 🤔
@budidarmawan6959
@budidarmawan6959 Ай бұрын
this is a very nice video.
@dougmercer
@dougmercer Ай бұрын
Thanks =]
@monza8844
@monza8844 4 ай бұрын
Meh... I'm just going to learn a language that is fast, instead of dealing with this hassle.
@B_a_s_t_e_r_b_i_n_e
@B_a_s_t_e_r_b_i_n_e 3 ай бұрын
Well, if you work with special case like i do. It's better to deal with this hassle instead learning another language lmao.
@JonitoFischer
@JonitoFischer 2 ай бұрын
Not everything should run fast. Maybe running slow and use less time to develop is the way to go.
@HuxleysShaggyDog
@HuxleysShaggyDog 2 ай бұрын
Is it academics you work with or libraries nobody ported to other languages or made into DLLs for interop?
@anandsuralkar2947
@anandsuralkar2947 Ай бұрын
Good luck making a website in c++ or js compared to python
@HuxleysShaggyDog
@HuxleysShaggyDog Ай бұрын
@@anandsuralkar2947 It's really not that hard, MVC frameworks exist for everything.
@frikkied2638
@frikkied2638 Ай бұрын
Hey man, very interesting content. Some unsolicited advice that is meant to help and not be mean, but in my opinion all the stock video you use to try and describe every single sentence is a bit distracting and doesn’t add value and background music is a bit loud/unnecessary. Very interesting content though 👍
@dougmercer
@dougmercer Ай бұрын
Hey, thanks for the really polite and sincere feedback! I agree with both points. In more recent videos, I've used less and less stock footage, and I think I've gotten a bit better at mastering my audio to keep my voice easier to hear. Hopefully I keep getting better at this moving forward. Cheers!
@frikkied2638
@frikkied2638 Ай бұрын
@@dougmercerI should have checked out your latest stuff before commenting, I will check it out now, and subscribe 👍
@Gardenmonkey78
@Gardenmonkey78 4 ай бұрын
Numba is super cool, you can also parallelize super easily
@dougmercer
@dougmercer 4 ай бұрын
Absolutely, I ❤️ numba