The Fastest Way to Loop in Python - An Unfortunate Truth

  Рет қаралды 1,402,951

mCoding

mCoding

3 жыл бұрын

What's faster, a for loop, a while loop, or something else?
We try several different ways to accomplish a looping task and discover which is fastest.
― mCoding with James Murphy (mcoding.io)
Source code: github.com/mCodingLLC/VideosS...
SUPPORT ME ⭐
---------------------------------------------------
Patreon: / mcoding
Paypal: www.paypal.com/donate/?hosted...
Other donations: mcoding.io/donate
BE ACTIVE IN MY COMMUNITY 😄
---------------------------------------------------
Discord: / discord
Github: github.com/mCodingLLC/
Reddit: / mcoding
Facebook: / james.mcoding

Пікірлер: 2 200
@AlexPadula
@AlexPadula 3 жыл бұрын
You know, if the time difference between a while loop and a for loop matters to your application, Python might not be the most appropriate language.
@abebuckingham8198
@abebuckingham8198 2 жыл бұрын
Comparisons are expensive and iterating is cheap. That's true in assembly, C and Python.
@firstname4337
@firstname4337 2 жыл бұрын
WRONG -- he just showed that numpy works fine
@HiltownJoe
@HiltownJoe 2 жыл бұрын
@@firstname4337 Yes, because if you use numpy you use C. That is the whole point of numpy, to not use python for operations that are repeated millions of times. Alex' point stands, because in the numpy case we use Python for everything that is not time sensitive, but for the thing that is time sensitive we use a small C monster wrapped in a thin fluffy Python blanket.
@boredhero
@boredhero 2 жыл бұрын
@@HiltownJoe Have you heard of PyPy. I ran his source file in pypy3 and got less than a quarter of a second for all methods except sum generator and sum list comp (0.72 seconds for those)
@coreylazy
@coreylazy 2 жыл бұрын
Clearly not a protip. A professional understands the constraints of their environment and the project they're working on. The choice of programming language is not (always) up to the individual writing the application.
@nathandecena7694
@nathandecena7694 3 жыл бұрын
"The Fastest way to Loop in Python is not to Loop in Python" Great sentence.
@karx11erx
@karx11erx 2 жыл бұрын
Talk about understanding the letters, but not the meaning.
@samuelhulme8347
@samuelhulme8347 2 жыл бұрын
To loop,or no to loop that is the question
@nathandecena7694
@nathandecena7694 2 жыл бұрын
@@samuelhulme8347 Hahaha DEEP
@mohamedfilali977
@mohamedfilali977 2 жыл бұрын
@@samuelhulme8347 ok, nice job, now im lost ..lool
@Nerketur
@Nerketur 2 жыл бұрын
"a strange language. the only winning way is not to use it. how about a nice game of chess?"
@peterwilson8039
@peterwilson8039 Жыл бұрын
I'm a physicist and my experience is that when you spend the majority of your time writing the code, and running it takes only a few seconds, reducing the development time is much more important than reducing the run-time, and that's why I like Python.
@mCoding
@mCoding Жыл бұрын
So true, and something many eager-optimizers need to hear! They will certainly learn from experience as they discover where they spend their time.
@yutubl
@yutubl Жыл бұрын
Hi Peter, well I've been a software developer delievering a (Basic) scriptable technical measurement data PC-application NI DIAdem, and we put most of our development effort into giving our users like you (engineers & physicists) optimized script commands, which behave similar to that of Python. So using already implemented loop command may give you better performance than own loops as described here.
@FlanPoirot
@FlanPoirot Жыл бұрын
that's the thing tho, they're not mutually exclusive, you can have a very easy to write and a decently performant language (look at julia) why python is slow is not bc it can't go faster or bc it's a sacrifice for expressiveness (as there are python implementations that are way way faster but that lack total compatibility with libs) The reason why python is slow is because it's poorly designed. It uses linked lists for lists instead of vectors (log of n vs log of 1), it compiles to an intermediate language but it leaves it at that, there's no JIT in Cpython, the global interpreter lock impedes people from writing super performant code, etc, etc Python is slow bc of the choices they've made over the years, they're consistently made choices that affect performance, they never took the time to think about it and see how they could implement abstractions that would not take a big toll on performance, they just added things on top as they saw fit. But that's not a big surprise once you see that python was a pet project of Guido and he just made it bc he could and people happened to pick it up and spread it (even tho it's performance and features were worse back then)
@CottidaeSEA
@CottidaeSEA Жыл бұрын
You're using Python for simpler scripts, like what it was made for. That's a valid use case. Making large systems in Python is just stupid for the simple reason that it will be expensive to run.
@spl420
@spl420 Жыл бұрын
That's depends on task, but when we talking about science - it's 100% true.
@tomaskot9278
@tomaskot9278 3 жыл бұрын
Fun fact: A good C compiler would detect this common pattern (summing sequential numbers) and replace the for loop with the direct calculation of the result. And if even the number of elements was hard-coded, the compiler would even simply replace the whole code with the hard-coded numeric answer.
@Czeckie
@Czeckie 3 жыл бұрын
I always wondered what's the point of this? Are series we have summation formulas for really coming up in a real software? As far as I can tell the only application is to show it in a presentation about optimizing compilers. And frankly, this is even a wrong showcase of optimization, cause the useful techniques are general and pliable enough for compiler to reason about. This is just a hardcoded nonsense.
@vinno97
@vinno97 3 жыл бұрын
@@Czeckie this exact problem isn't common. But good compilers can recognize loads of (small) patterns in a codebase that, when combined, can amount to a large speedup. Sometimes some optimized chunks can again be combined into a new, smaller and/or faster, chunk.
@joestevenson5568
@joestevenson5568 3 жыл бұрын
@@oj0024 Yes, if you write shitty enough code the compiler cannot fix it for you because it can’t make assumptions about what you’re trying to accomplish. This is not a revelation or even a shortcoming, its a feature.
@oj0024
@oj0024 3 жыл бұрын
​@Matt Murphy He is right that compilers can do it, but if we look at the two most modern compilers right now only one of them can detect it and only if you write the code in a non-obvious way. The second part showed that compilers don't do it with constants either, they just execute the code internally. I'm not saying that compilers aren't doing great optimizations, just that the claim that "A good C compiler would detect this common pattern" is wrong. Especially since such optimizations don't gain anything (I'm talking about the pattern transformation, not the constant evaluation). They can only be done in very few cases, and you usually assume that the programmers know what they are doing. (e.g. the only reason to implement the summation code would probably be for benchmarking purposes)
@oj0024
@oj0024 3 жыл бұрын
@@joestevenson5568 clang didn't detect the pattern when using a normal for loop, it only detected it if you write the non obvious and usually not used while(n--) loop.
@daifee9174
@daifee9174 3 жыл бұрын
pro tip how to boost python: use C
@gutoguto0873
@gutoguto0873 3 жыл бұрын
or C++, either choice is good.
@embeddor2230
@embeddor2230 3 жыл бұрын
C++ should be the better option here, cause of the OOP support.
@mCoding
@mCoding 3 жыл бұрын
Or any compiled language, really.
@Jonas_Meyer
@Jonas_Meyer 3 жыл бұрын
Assembler the only real choice.
@leopoldbroom7679
@leopoldbroom7679 3 жыл бұрын
@@Jonas_Meyer You're not a real programmer until you can program in assembly
@FaranAiki
@FaranAiki 3 жыл бұрын
What I did not know about this video, is, 100_000_000. I did not know if python has a feature to add underscore between number. Thanks, anyway :)
@mCoding
@mCoding 3 жыл бұрын
I add secret lessons to all my videos!
@lawrencedoliveiro9104
@lawrencedoliveiro9104 3 жыл бұрын
Added in Python 3.6.
@monojitchatterjee3185
@monojitchatterjee3185 3 жыл бұрын
Same! I didn't know this too!
@loutragetadk453
@loutragetadk453 3 жыл бұрын
Same
@rogervanbommel1086
@rogervanbommel1086 2 жыл бұрын
Same
@kitrodriguez992
@kitrodriguez992 3 жыл бұрын
I'm in my 3rd year in Computer Science and trust me, I've watched SO many youtubers that do this and you're by far the smallest for an unknown reason. You explain things really well, you don't beat around the bush and overall just make quality content. :D I'm now subbed and I'll be coming back for more :D
@mCoding
@mCoding 3 жыл бұрын
Thanks so much for the kind words!
@deedewald1707
@deedewald1707 3 жыл бұрын
Excellent content and testing summaries !
@krele14
@krele14 2 жыл бұрын
Then again, he's the only one that would place "just know your result ahead of time lul" as a viable suggestion.
@julianmahler2388
@julianmahler2388 2 жыл бұрын
5:08 It's not about NumPy being primarily written in C (so is CPython), but about Python' dynamic vs. NumPy's static typing. In a dynamically typed language like Python, every time the interpreter encounters an expression like c = a + b, it will first have to check a's type (and find out it's an integer). Then it checks b's type (and finds out it's an integer too). Then it checks if there's any known operation to add (+) two integer types (of course there is). If you do this one million times in a row (e.g. in a loop), the same checks will be performed over and over again. It's like asking "Is a still an integer? ... What about b? ... Do I know how to add two integers?" one million times in a row. And of course all of that happens at runtime. That's why Python is so damn slow in that regard.
@dariuszspiewak5624
@dariuszspiewak5624 Жыл бұрын
Yeah... that's the price you pay for flexibility. But there are ways to overcome this. They all require some kind of hinting that the structures you use contain only elements of the same type.
@sumnerhayes3411
@sumnerhayes3411 Жыл бұрын
> In a dynamically typed language like Python, every time the interpreter encounters an expression like c = a + b, it will first have to check a's type (and find out it's an integer). Then it checks b's type (and finds out it's an integer too). Then it checks if there's any known operation to add (+) two integer types (of course there is). A dynamically typed language implementation doesn't *have* to do things this way, that's just the naïve approach (albeit the easiest approach out of the box for new implementations). A smarter specializing JIT can create precompiled versions of functions and expressions for commonly used types (e.g. integers), and can determine that they are invariant and avoid having to re-check the types each time you hit the expression. The default CPython implementation does things the way you describe, but alternate implementations like psyco and PyPy take the specializing approach in their JITs. Not surprisingly, PyPy executes these for and while loops on the order of 50-100 times faster than CPython does on my machine (and ~10 times faster than numpy). $ python3 benchmark.py while: 5.254076757992152 for: 2.9640753349813167 sum_range: 1.1016407490242273 sum_numpy: 0.6697274879843462 $ pypy3 benchmark.py while: 0.05818561598425731 for: 0.07444041498820297 sum_range: 0.07667819698690437
@abhishankpaul
@abhishankpaul Жыл бұрын
I hope this problem can be partly solved if people declare data types of nearly all the variables as they need to do in C before hand like I do most of the time
@kylehart8829
@kylehart8829 Жыл бұрын
@@sumnerhayes3411 There are tons of other reasons why dynamic typing is pointless and bad. It saves you having to *gasp* tell the program the type of a variable one single time; and it costs you dearly. All python projects should enforce static typing, even though that doesn't gain performance. All languages should be statically typed, period, because it is infinitely more readable and provides no development benefits at all.
@grawss
@grawss 11 ай бұрын
@@kylehart8829 I would have disagreed prior to learning Python, back when it was just dogma in my head. Interesting how that works. However, the learning process was made easier by removing a step. I think a better solution would be to make it extremely clear how to toggle Python into enforcing static typing, and why that's a huge benefit.
@opticalmoose8091
@opticalmoose8091 2 жыл бұрын
Sometimes it’s just impossible to express something in numpy - iterative algorithms, for instance. I had a very hot piece of code, which implemented an A* pathfinding algorithm, which was slowing my entire program down. I looked for solutions and found a package called Numba - it allows you to decorate a single function and it would try to compile it the best it can. There are various optimization settings to consider, and in order to get *significant* improvements in performance (like, order of magnitude and more) you basically had to rewrite your program as if you were writing in C, but using Python’s syntax - so, no dynamic typing, using very limited subset of built-in types and refrain from using custom classes as much and possible. But it didn’t require too much work, and it was a great benefit in the end - I ended up with a blazing fast function without needing to write it in actual C (not very hard) and then figure out how to couple this with Python (quite finicky if you’ve never done it). So yeah, I’d recommend looking at Numba if you need a more general solution.
@NJ-wb1cz
@NJ-wb1cz Жыл бұрын
Or just ditch python and go for java or other jvm languages or go or rust etc
@pavelperina7629
@pavelperina7629 Жыл бұрын
@@NJ-wb1cz Eh, rust is likely one of the hardest languages to learn (after "modern" C++ with template metaprogramming). Julia seems quite easy to learn and pretty fast (if strongly typed it's close to C, maybe twice or three times slower, but not two orders of magnitude like Python). On the other hand I'm not sure how to structure code which is too complex for a single file and too simple to put it into package. It was basically developed for data crunching. Also year or two ago Rust was lacking some libraries, especially for plotting graphs (C++ lacks them too), painting (Cairo) and even exporting data as images (no support for writting of 16bit PNG files)
@julians.2597
@julians.2597 3 ай бұрын
​@@NJ-wb1cz right, who would do something as stupid as just adding another library as dependency when they could add an entirely different development stack just to optimise one function.
@NJ-wb1cz
@NJ-wb1cz 3 ай бұрын
@@julians.2597 why is adding libraries stupid? That's a strange stance. Sure, moving to another language is an option, but you'll likely use libraries there as well.
@SteinGauslaaStrindhaug
@SteinGauslaaStrindhaug 3 жыл бұрын
Half through the video I thought: actually for this simple sum you don't even need to loop just the formula. And you didn't disappoint. While this might seem like a very synthetic example which won't often be possible in real life, you'd be surprised how many times I (a programmer who is not even very good at maths) have been able to replace large loops and delete large chunks of code and replace it with a simple formula. Not usually this clean and simple, but quite often you find that the code is looping through a lot of data calculating stuff and then filtering out stuff, and if you simply rearrange things you might be able to do the filtering before the loop which might even reduce the loop to a few statements without a loop. Very often this is not obvious in the code but very obvious when sketching out the desired results on paper. Pencil and paper is definitely not obsolete.
@mCoding
@mCoding 3 жыл бұрын
Indeed! Never underestimate math! The feeling of deleting 30 lines of scratch and replacing it with 3 lines of math is indescribable.
@OatmealTheCrazy
@OatmealTheCrazy 3 жыл бұрын
@@mCoding it's the moments when you're lying in bed and it kinda hits you out of nowhere that are the best Of course, it is kinda a let down when the divine revelations are actually wrong though
@ttt69420
@ttt69420 2 жыл бұрын
Doing it on paper with a pencil probably triggers a different part of your thinking process from all the years of doing math that way.
@delqyrus2619
@delqyrus2619 2 жыл бұрын
You can actually be even faster than with math, if you know how CPUs work. If you instead of calculating (n * (n -1)) // 2 just calculate (n * (n -1)) >> 1 it should (in theory) be a few clockcycles faster. While most math operations are pretty trivial for the CPU(+, -, *) division is a pretty annoying task, so you might spare some time with just bitshifting.
@timewave02012
@timewave02012 2 жыл бұрын
@@delqyrus2619 Bit shifting is math too. No different than multiplying and dividing by powers of ten being easy in decimal. A good compiler will optimize it though, so it's probably best not to bother.
@deemon710
@deemon710 Жыл бұрын
Oh snap! I thought I was saving on resources by not using additional modules like numpy but now I see it's ridiculously faster. I wasn't aware pre-built functions were that much better either. Thanks for the reveal!
@eric_d
@eric_d Жыл бұрын
He actually did say in the video that Numpy uses a lot more resources, and with a bigger number, you may not have enough memory. Your original thought was correct, that you do save resources by not using additional modules. You may save time, but not resources. Also, if you compile your code, those additional modules will need to be compiled in, which will make the executable larger, and require more resources as well. As we are quickly approaching the limits of how far we can go with hardware (CPU technology is pretty much capped out, unless someone figures out how to make atoms smaller), we need to be more mindful of the amount of resources we use. Programmers need to stop being lazy. They have to stop thinking that they can bloat their code with hundreds of modules because CPU's will be faster by the time their code is in use. Just because speed and bandwidth are available, doesn't mean they should be used. For example, I know of some websites that include 50-100MB of Javascript code for each page view. Sure, dial-up is a thing of the past in most parts of the world, and people have faster internet connections than ever before. The page loads pretty fast on a gigabit connection. What about the people who have much slower connections? Is it really necessary to design your website where every single page view uses an entire day's worth of bandwidth on a dial-up connection? If you have several pages of the same website open in different browser tabs, your memory utilization goes through the roof! I've seen a single browser tab on one of those websites using over 2.5GB of RAM! I haven't even checked how much unnecessary Javascript is downloaded when using something like Facebook, but I've seen a single tab with Facebook open using over _5GB RAM_!!! This kind of waste needs to stop. Even if it does save a second or two here and there, it could cost several more seconds (or longer) in other areas. I've always hated importing modules in any language I've ever programmed in. I would much rather write my own code to do what I need done, whenever possible. I realize that's not always possible, but it's always better than adding bloat to a program that's not needed. Look at what's known as the "demo scene". It's something that started LONG ago, where people try to make the most exotic program possible in assembly language that fits within a 64KB executable. People have fit 10 minute long full screen videos with music into 64KB! They have created complete first-person-shooter games in 64KB! These programs run plenty fast on hardware from over 30 years ago! If more of todays programmers took the time to write good code, we wouldn't even need computers as fast as we have. Sorry, that started off as a 1 line reply, and ended up going 100 directions. There's still soo much more I want to say about the topic, but who is going to read it anyway?
@moustafael-shaboury2659
@moustafael-shaboury2659 Жыл бұрын
@@eric_d I will! I'm very new to programming and I really appreciated reading that. Thank you for the reminder :)
@CodeDisease
@CodeDisease 3 жыл бұрын
“Python is an incredibly slow language” You just insulted my entire race of people... but yes.
@CodeDisease
@CodeDisease 3 жыл бұрын
I guess I’ll be first then...
@suryanshthakur5820
@suryanshthakur5820 2 жыл бұрын
Your guess right
@soupnoodles
@soupnoodles 2 жыл бұрын
Its not incredebily slow, compared to others its slower yes but it can acomplish the tasks its meant for with no issues Machine learning, Automation, Data Science and Cyber Security
@CodeDisease
@CodeDisease 2 жыл бұрын
@@soupnoodles Yes but you can do all of that much faster with C, you're better off using C.
@CodeDisease
@CodeDisease 2 жыл бұрын
@@soupnoodles Also why did you like your own comment?
@NickByers-og9cx
@NickByers-og9cx 3 жыл бұрын
You want to be careful when timing functions, your processor likely speeds up or down based on system load and ambient temperature changes. Although you can still capture order of magnitude changes very easily like you did. If you're on a laptop though where cooling and power are limited, take extra care, and run multiple simulations in different orders at random times!
@mCoding
@mCoding 3 жыл бұрын
I cannot stress this enough! Benchmarking is such a difficult topic even though it seems so simple.
@luziferius3687
@luziferius3687 3 жыл бұрын
One thing you can do is to disable CPU boost clocks during benchbarks. This way, you get stable clocks regardless of load created by the tested function. For example on Linux, you can just write 0 to /sys/devices/system/cpu/cpufreq/boost to disable the boost or write 1 to enable it again. This’ll get you much more stable results that are comparable even if ambient temperature changes.
@NickByers-og9cx
@NickByers-og9cx 3 жыл бұрын
@@luziferius3687 that's a good approach! Myself I usually try to do perf stat whenever possible, and count the actual CPU cycles it takes. Run each command say 3 or 5 times and take the median!
@Teriton
@Teriton 3 жыл бұрын
@@mCoding Obviously the best way is to time it yourself with a stopwatch
@MisterAssasine
@MisterAssasine 3 жыл бұрын
imo it would have been better to use a smaller n in the function and a higher number of runs
@saivardhanchowdary7918
@saivardhanchowdary7918 3 жыл бұрын
The KZbin algorithm got you mate. All the best!
@mCoding
@mCoding 3 жыл бұрын
Thanks for the support!
@sploofmcsterra4786
@sploofmcsterra4786 Жыл бұрын
This was a valuable lesson to learn when I was coding a neural network. I wasn't aware of this fact and was trying to parallelise, before realising numpy automatically did that. Turning everything into vector operations led to a speed increase that was crazy.
@DeuxisWasTaken
@DeuxisWasTaken 5 ай бұрын
Very good video. Note: NumPy isn't just C, it's highly optimised C with SIMD capabilities. The only way you can go faster than that is with GPGPU like OpenCL, CUDA, Vulkan or Metal. (And even that isn't assured since GPGPU has initial overhead.)
@GuilhermeFArantesSouza
@GuilhermeFArantesSouza 3 жыл бұрын
"The best way to loop in python is to not loop", hahaha, loved it. Also got curious about doing the numpy sum but with the built-in range (because of the memory problem), how much would that impact the performance?
@mCoding
@mCoding 3 жыл бұрын
In my experience, I've never hit a RAM limit because of this, but in memory constrained systems this is definitely something to watch out for. I wish numpy has something like std::ranges::iota to use as an array-like.
@aquilesviza5550
@aquilesviza5550 3 жыл бұрын
​@@mCoding There is a library called dask, is used in distributed computing and has support for numpy arrays. working with blocks consume less memory but might be a bit slower.
@tonyflury2291
@tonyflury2291 Жыл бұрын
i think numpy.arange(...) creates a potential memory issue (it allocates the a large array in this case), but the Python3 builtin range(...) doesn't allocate a large array - it is a lazy iterator that simply tracks the last value generated so it knows what to generate next.
3 жыл бұрын
This is the reason why python is such a good programing language. You can write your program with easy and pretty syntax and use packages wrapping C code to do the complex and time consuming tasks. This just shows that you can use python and have fast running code. I would also add that if your program's only focus is speed and performance then don't use python, simple as that.
@mCoding
@mCoding 3 жыл бұрын
People too often leave out the speed of development in their calculation of speed!
@waldolemmer
@waldolemmer 2 жыл бұрын
Python isn't special in this regard
@imrobbinganyonewhotalkstom4881
@imrobbinganyonewhotalkstom4881 Жыл бұрын
People oftentimes forget lua.
@HydratedBeans
@HydratedBeans Жыл бұрын
A faster processor is cheaper than another developer to speed up development. That’s why I use python.
@NJ-wb1cz
@NJ-wb1cz Жыл бұрын
@@HydratedBeans unless you depend on oython-specific libraries I don't think the development speed is noticably greater compared to, say, kotlin. And a loosely typed language will always have more mistakes and harder to statically check
@npip99
@npip99 3 жыл бұрын
Fun fact: If you write this in C, gcc/clang will automatically optimize the for-loop sum into n(n-1)/2 for you.
@underfilho
@underfilho 2 жыл бұрын
really?
@tharfagreinir
@tharfagreinir Жыл бұрын
There's a fun story about that summation formula. Apparently mathematician Carl Friedrich Gauss came up with it at a young age when his teacher set his class the task of adding up all the numbers from 1 to 100 and was very surprised by Gauss' solution. Regardless of the veracity of the story, the formula is indeed often referred to as the Gauss summation formula. Mind you, a lot of things in math are named after Gauss as he was extremely prolific.
@magicmulder
@magicmulder Жыл бұрын
You left out one important aspect; the teacher wanted to keep the kids busy for a while and Gauss came up with the solution within minutes.
@thomas-sk3cr
@thomas-sk3cr 6 ай бұрын
Another important aspect, the formula is actually (n+1)*n/2.
@MattDunlapCO
@MattDunlapCO 5 ай бұрын
@@thomas-sk3cr Exactly. I was looking for this correction.
@adrianmizen5070
@adrianmizen5070 5 ай бұрын
@@thomas-sk3cr which actually illustrates a pitfall of using external math formulas instead of letting the computer do the work, that they are going to be less clear about what the code is supposed to do. getting the wrong answer really fast is worse than getting the right answer slowly
@thomas-sk3cr
@thomas-sk3cr 5 ай бұрын
@@adrianmizen5070 not sure if I understand you correctly. I do think that it's definitely better to use a formula if there is one. But it should be properly documented of course and you have to make sure that it's the right one. Maybe we can agree on that? I don't think you should reimplement well-known formulas with loops just because it's more clear to the reader at first glance :)
@hemanthkotagiri8865
@hemanthkotagiri8865 3 жыл бұрын
To the point, quality content. Subscribed!
@mCoding
@mCoding 3 жыл бұрын
Thanks for the support!
@chrisray1567
@chrisray1567 3 жыл бұрын
Great video. I’ve always found list comprehensions to be faster than the equivalent for loop or while loop.
@simonmultiverse6349
@simonmultiverse6349 3 жыл бұрын
There is a language called Lython, so you don't try to loop in Python; instead you poop in Lython.
@Connor-of6mu
@Connor-of6mu 3 жыл бұрын
I really like your channel! These are very interesting insights into the Python language in small videos which makes them easy to digest. Keep up the good work!
@mCoding
@mCoding 3 жыл бұрын
Awesome, thank you!
@cottawalla
@cottawalla 2 жыл бұрын
It was the one addition statement executed many times that slowed the python "for" loop. The conclusion is that looping can be very fast but executing python statements anywhere in the code is very slow.
@laughtale1607
@laughtale1607 3 жыл бұрын
That's why Gauss exists : simple use of (N-1)N/2 in this case
@mCoding
@mCoding 3 жыл бұрын
And he came up with this at such a young age too!
@bFix
@bFix 3 жыл бұрын
wait isn't it: (N+1)*N/2?
@laughtale1607
@laughtale1607 3 жыл бұрын
@@bFix He takes sum to N-1 in the example
@Samsam-kl2lk
@Samsam-kl2lk 3 жыл бұрын
@@laughtale1607 Yes, so it is n*(n-1)/2
@laughtale1607
@laughtale1607 3 жыл бұрын
@@Samsam-kl2lk yup
@andreabradpitto
@andreabradpitto 3 жыл бұрын
This is very informative and well presented. Also, it seems like you are very talented in explaining stuff, but this is only the first video of yours I have watched, so I'll have to do some further benchmarking ;) Thank you!
@aidanwelch4763
@aidanwelch4763 4 ай бұрын
Thanks for this, I was trying to explain to people in JS that essentially always calling a native function is faster than writing a more optimized(in terms of Big-O) way of doing something. They were asking Big-O proof and didn't believe me that it's just not relevant. Looping through a string 10 times with a native regex check is faster than iterating through it once in pure JS.
@TARS..
@TARS.. 3 жыл бұрын
This channel is a gold mine, thank you so much for these. Keep them up!
@mCoding
@mCoding 3 жыл бұрын
Thanks, will do!
@kenhaley4
@kenhaley4 3 жыл бұрын
Another reason the numpy approach is faster: The elements of a numpy array must all be of the same type, and there's no overhead to check (and convert to numeric, if necessary) the type of each array element.This probably contributes much more to the increase in speed than the language difference (C vs. Python.)
@vishnukumar4531
@vishnukumar4531 2 жыл бұрын
Other than that, the sum itself can be computed parallelly and fused together later, thus we are not forced to iterate serially element by element!
@dmitripogosian5084
@dmitripogosian5084 2 жыл бұрын
In compiled languages loops are often optimized on compilation stage, such as being unrolled where necessary, with particular emphasis on efficient cache/prefetch from memory utilization. One should look on original Fortran code of LAPACK routines where some loop optimizations were done 'by hand'
@shinobuoshino5066
@shinobuoshino5066 4 ай бұрын
@@dmitripogosian5084 loop unrolling is never efficient when cache and prefetch is the topic unless you explicitly wrote "from 0 to 4" just to avoid copy pasting code 4 times by yourself, and even that may not be worth unrolling depending on body of the for loop which most compilers can decide by themselves if you do profile guided optimization pass. There's a reason why -O2 more performant for most software than -O3 and unrolling is one culprit of that.
@GuRuGeorge03
@GuRuGeorge03 3 жыл бұрын
Oh this is an amazing recommendation by youtube. I came to the same conclusion as you while having massive performance issues in my machine learning models in non obvious parts of the code. Once I noticed that it is the loops that are awfully slow I tried refactoring them as much as possible and ended up using a mixture of things u've shown here. I didn't even know or speculate that it's because of C but now everything is much clearer in my head. Thanks!
@mCoding
@mCoding 3 жыл бұрын
You should also check out Jax, Numba, and Cython for potentially enormous speedup for little effort if you are doing ML.
@robertmielewczyk4219
@robertmielewczyk4219 3 жыл бұрын
you're not supposed to do ML with for loops in theory you can and it's very undestandable if you do it that way but any solution using matrices/tensorflow will be faster. Like he says in the video numpy is fast
@christianhill8651
@christianhill8651 3 жыл бұрын
You just convinced me to learn algorithms and functions. This was an amazing illustration.
@Jcewazhere
@Jcewazhere 3 жыл бұрын
In my one of my programming classes we learned about loop unrolling. This is much more handy to know about. :)
@supergamerfr
@supergamerfr 3 жыл бұрын
This channel is gonna blow up great work
@shingo371
@shingo371 3 жыл бұрын
Awesome content, got very surprised when I saw your low sub count
@mCoding
@mCoding 3 жыл бұрын
I appreciate that!
@anb1142
@anb1142 3 жыл бұрын
me too
@TheSandkastenverbot
@TheSandkastenverbot 2 жыл бұрын
Thanks for the information! It's really helpfull to get a glance into the Python interpreter's inner workings
@googleuser4203
@googleuser4203 3 жыл бұрын
A lot of important tips! Thank you. Subscribed!
@jetison333
@jetison333 3 жыл бұрын
Very informative video, keep at it and im sure you'll grow a ton.
@mCoding
@mCoding 3 жыл бұрын
I appreciate that!
@dwarftoad
@dwarftoad Жыл бұрын
I think numpy also will generally, if it can, use parallel CPU operations (SIMD) to process the array, which is probably why it's faster than standard sum() (and also why it makes sense to create the array of data with arange() even if each value could be calculated in the loop. But definitely good to be aware of this memory requirement.)
@spacelem
@spacelem 2 жыл бұрын
My own results were 6.06 seconds for the while loop, 4.25 seconds for the for loop, and 569 ms for the numpy loop. I tried it in Julia (almost identical code except for the end tags), and the results were 1.3 ns for both the for and the while loop. Checking the lowered code it turned out it had optimised the loop away completely, so we got sum_math for free! So instead to force it to do the actual sum, I modified it to s += i * (rand() < 0.5), and it came in at 576 ms. So even with a call to rand() and a branch, Julia was still nearly 10x faster than raw Python, and was about on par with NumPy (which didn't have a call to rand() ). If I force Julia to construct the range object with collect(1:n), then I don't need so that it matches NumPy, then it gets even faster at 290 ms, but now it allocates 763 MiB of memory, so clearly there are pros and cons of using range objects, although I'd stick with the range object for most usages. So, if you like Python but you want C performance (and you like to use arrays a lot and don't like to add all the extra numpy syntax just to do that), maybe check out Julia.
@incremental_failure
@incremental_failure Жыл бұрын
Most like myself use Python for the libraries, nothing else comes close except for maybe C++. It would take a lifetime to recreate all the libraries in Julia.
@spacelem
@spacelem Жыл бұрын
@@incremental_failure there has a lot of ongoing work to make native Julia libraries (mostly in the scientific and statistical communities), but Julia can also call C, C++, and Fortran code, and the PyCall package already allows you to embed Python in Julia, so you're never left in the lurch if you choose to use Julia. I've already seen Julia be hugely faster and more flexible when doing a lot of ODE modelling than when using R and calling deSolve which is written in Fortran.
@brianchandler3346
@brianchandler3346 3 жыл бұрын
Expertly presented. I never would have thought to look at it where C was used behind the scenes. I will definitely try and make the better to worst progression habit when writing Python.
@dddbra1748
@dddbra1748 3 жыл бұрын
I think your channel deserves more subscribers.
@mCoding
@mCoding 3 жыл бұрын
I appreciate that!
@BartMassey-PO8
@BartMassey-PO8 3 жыл бұрын
Rust with 64-bit numbers looks to be about 50× faster than numpy on my box (once I convinced the compiler to stop solving the problem at compile-time without losing optimization). Rust with BigUint bignums looks to be about 5x faster than numpy. The obvious for loop with pypy3 is about 12x faster than numpy. So there's still plenty of room for speedup.
@MrMonishSoni
@MrMonishSoni 3 жыл бұрын
This Video is so awesome, never came across such a beautiful way to understand loop in such depth. Thank you so much for making this video. ❤❤❤💕💕💕
@mCoding
@mCoding 3 жыл бұрын
Glad you enjoyed it!
@fillfreakin2245
@fillfreakin2245 Жыл бұрын
If you're using a counter within the while loop, then yes that counter is slower than the for loop. If you don't, and you use the while loop as intended and doing the same stuff inside the loop, then it is essentially the same speed. def while_loop(n=100_000_000): s = 0 while s < n: s += 1 return s takes about the same time as: def for_loop(n=100_000_000): s = 0 for i in range(n): s += 1 return s Use the right loop for the right task. If you have to put an incremental counter in your while loop then you're using the wrong loop.
@ssfjor
@ssfjor Ай бұрын
Comparing the sum with the counter limit certainly won't give the correct result in s: We are calculating the sum of all integers and analyzing various ways to do that. I. e., if the limit where 10, with 1,2,3,4 we reach it, way before reaching the correct sum 55.
@minghaoliang4311
@minghaoliang4311 3 жыл бұрын
Can you do a comparison of recursion over loop?
@mCoding
@mCoding 3 жыл бұрын
I have many video ideas in the works, this is definitely a contender.
@ikickss
@ikickss 3 жыл бұрын
Usually recursion is slower and takes up more memory. And many "big" problems cannot be coded recursively due to stack size limitations. But if compiler/interpreter can do tail recusion optimization, it's a very nice option. Unfortunately, not all problems are applicable to it.
@shukterhousejive
@shukterhousejive 3 жыл бұрын
The CPython interpreter doesn't do tail recursion so even if you write it out correctly it's still faster to loop, save the recursion for when you forget how to convert your code into a loop correctly
@sbwlearning1372
@sbwlearning1372 3 жыл бұрын
@Sec Coder Check out Lisp or Prolog. The recursion is conceptual in terms of the problem or model. Imagine a family tree. The definitions and relationships are recursive. Mom and Dad produce offspring each of which in turn can become Mom's or Dads who produce offspring. Comes into its own when these Relationships are highly complex ( Imagine every leaf on a real tree was in fact another tree and so on and so forth. The Mantlebrot set is recursive ) Essentially they become huge lists or recursive data structures. Those languages became prioritized and optimised around lists and recursive data structures.( With a few constructs and language primitives not found in the likes of C or Pascal such as "is a type of " "is connected to" "is related to", "is derived from" etc etc ) Pythons modelling capabilities Tuples Lists etc etc Means most of it can now at least be modelled easily before being translated to something else for speed. Ps for fun check out Occam . It's editor was indent sensitive just like Python It also let you mix sequential and parallel programming .!!!! Oh lordy 😁
@fat_pigeon
@fat_pigeon 3 жыл бұрын
@Sec Coder It lets you express common algorithms in a purely functional way, i.e. without needing to mutate variables. Doing so makes your code easier to analyze mathematically and makes it more composable.
@ahmadhafian3785
@ahmadhafian3785 2 жыл бұрын
It would be interesting to see how recursive function would perform
@danejohnson8657
@danejohnson8657 2 жыл бұрын
Python doesn't have tail-call optimization, so you probably just overflow the stack.
@sergeyshchelkunov5762
@sergeyshchelkunov5762 3 жыл бұрын
Very informative. Excellent insight into Py inner workings. Great incentive to employ Numpy (and Multiprocessing) whener possible. Give a simple yet splendid example of why knowing math is important for any programmer who tries to use Py for data science type applications.
@Parthsean
@Parthsean 3 жыл бұрын
This is probably the first coding channel that I looked up just for fun. Great work. Can you cover advance pythonic syntaxes and topics?
@mCoding
@mCoding 3 жыл бұрын
I'm always open for suggestion! What do you want to see?
@Parthsean
@Parthsean 3 жыл бұрын
@@mCoding Nothing in particular just any fancy pythonic things which help improve performance and code readability. Also, one thing you can do is make a mess of a code and start refactoring it. That would be amazing to see.
@supermariobrosz8
@supermariobrosz8 3 жыл бұрын
Great video! Another idea is to time yield in functions and generator expressions in comparison to the traditional loops
@mCoding
@mCoding 3 жыл бұрын
Great suggestion!
@ferociousfeind8538
@ferociousfeind8538 2 жыл бұрын
yes, graph the size of n against the time it takes different strategies to compute the sum >:3 make the computer dance for my entertainment!
@ColinBroderickMaths
@ColinBroderickMaths 2 жыл бұрын
Just wanted to note that using an iteration index in a while loop isn't really good practice anyway, since that alone would almost always mean a for loop is the correct pattern. So it's not really surprising that the while loop was slower in this experiment. I'd also like to warn people that numpy is only faster if you are operating on large sets or large arrays of numbers. If you apply a numpy function to a single number, it's almost always much slower than the equivalent function in Python, presumably due to some overhead.
@rursus8354
@rursus8354 Жыл бұрын
That isn't the point. The point is the efficiency of raw Python.
@dragonsage6909
@dragonsage6909 2 жыл бұрын
Wow .. I learned about 5 new things from this.. using timeit is a great idea.. thank you!
@maxteer2800
@maxteer2800 3 жыл бұрын
Great stuff! I never considered that while loops would be slower than for loops, but even as you started I was guessing it would!
@mCoding
@mCoding 3 жыл бұрын
Glad you liked it!
@reprC
@reprC 3 жыл бұрын
Cool video. I’d be curious how `numba` stacks up in this, both the JIT and AOT variants
@youuuuuuuuuuutube
@youuuuuuuuuuutube 2 жыл бұрын
Numba would make it closer to the unoptimized C/C++ code. I used it to render fractals and I got it to run 50x faster than unoptimized Python, and almost the same speed as the C code. The thing is ... that you can make the loop a lot faster in C too because you can use SIMD intrinsics, so you could do 4 or 8 or even more operations in a single instruction.
@tobiasgrun5116
@tobiasgrun5116 2 жыл бұрын
Just tested.. The JIT execution time is similar to the math version. To all Numba / LLVM developers: Great job!
@bartkl
@bartkl 2 жыл бұрын
Nice! Question: if you increment i inside the for loop, won't that affect the amount of iterations (skip every other one)?
@mCoding
@mCoding 2 жыл бұрын
It actually does not affect the number of iterations. The iterator of range keeps its own internal counter that python uses to assign to i each iteration.
@incremental_failure
@incremental_failure Жыл бұрын
The range is built before executing the loop contents.
@lakshyachopra_
@lakshyachopra_ 2 жыл бұрын
This video was very informative. Thanks a lot James!
@mCoding
@mCoding 2 жыл бұрын
Glad it was helpful! You are very welcome!
@thelazyrabbit4220
@thelazyrabbit4220 2 жыл бұрын
Dayumn... TIL something new as well as some new trick. You definitely earned my sub!
@zf0666
@zf0666 3 жыл бұрын
Good example of how being good at math and algorithms beats all micro optimizations of the code itself :)
@mCoding
@mCoding 3 жыл бұрын
Exactly!
@edwinjonah
@edwinjonah 3 жыл бұрын
Awesome video! A few days ago I was comparing the execution time on using a loop vs. the Gaussian formula, it's nice to see some other options in between.
@mCoding
@mCoding 3 жыл бұрын
Thanks for the support!
@lb2040
@lb2040 2 жыл бұрын
@@mCoding Very interesting video indeed, something I never really thought about … but the math-guy in me needs to say it: You got the formula wrong… 😬
@fanoflinoa6109
@fanoflinoa6109 2 жыл бұрын
Great comparative video The math solution if you include n in the addition is ((n+1)*n)/2 Because you used range() and the while loop was set to < and not
@aioia3885
@aioia3885 2 жыл бұрын
Flooring the division is very much recommended, because otherwise python will convert the result to a floating point number which may cause you to lose precision
@fanoflinoa6109
@fanoflinoa6109 8 ай бұрын
@kurogami2771 it does in this case as we are adding up a series of numbers from 0 to n, which by definition excludes negative values. As for flooring, it is true that we should avoid using floats when not needed even for just one operation so we can either floor or int() the results.
@abebuckingham8198
@abebuckingham8198 26 күн бұрын
@@aioia3885 Either n or n-1 is even so it's never a float. Using floor isn't required.
@aioia3885
@aioia3885 26 күн бұрын
@@abebuckingham8198 edit: "either n or n-1 is even so it's never a float" is not correct because in python division with / will always result in float conversion (see type(1/1)). the rest of this reply is just trying to clarify what I now realize was not necessary to clarify in my comment I said "flooring the division is very much recommended" but what I really meant was not using the floor operator on the result but instead using the integer division operator. in python if you do a/b the result will always be a float bu if you do a//b then the result will be an integer if both a and b are as well. doing a//b is basically like doing int(a/b) except that there will be no conversion to a float and so no precision will be lost. for example, int((10**18+2)/2) evaluates to 5e17, because the division was a floating point one, but (10**18+2)//2 does result in the expected value of 5*10**17+1
@FallinPython
@FallinPython 3 жыл бұрын
Thanks for the tips! The comparison with NumPy is not fair because it is made for array computing and hold an unique data type. By the way, you deserve far more subscribers.
@mCoding
@mCoding 3 жыл бұрын
Thanks!
@yolamontalvan9502
@yolamontalvan9502 3 жыл бұрын
That’s why I would use C to program a missile.
@mCoding
@mCoding 3 жыл бұрын
Watch out for those null pointers though, wouldn't want your missile program to CRASH!
@bamberghh1691
@bamberghh1691 3 жыл бұрын
Usually Ada or SPARK are used in high reliability systems like aircraft instead of c since they are much safer
@mCoding
@mCoding 3 жыл бұрын
I had heard about this before, thanks for sharing, it makes a lot of sense in safety critical applications. I know NASA still uses C for (as least some?) space shuttles!
@mwanikimwaniki6801
@mwanikimwaniki6801 3 жыл бұрын
@@bamberghh1691 I knew ADA is used for missiles... But is there a specific reason why.
@arpandhatt6011
@arpandhatt6011 3 жыл бұрын
@@mwanikimwaniki6801 Ada has been specifically designed to catch as many mistakes at compile time. It’s static analysis tools are extremely thorough, and in general, it’s designed to prevent undefined behavior, from its type system, to its syntax. It’s an interesting language.
@SillyLittleMe
@SillyLittleMe Жыл бұрын
If you are writing pure and iterative heavy Python code, then PyPy is your best bet for performance. I ran the same test as mentioned in the video on Python 3.10 and PyPy 3.9 and my god the difference was staggering! Here are my results : Python 3.10: for loop : 6.6 secs while loop : 11 secs PyPy3.9: for loop : 0.22 secs while loop : 0.36 secs PyPy is really meant for long running code and real world results show that.
@thebiomicroguy
@thebiomicroguy 3 жыл бұрын
Neat, beautiful, concise and to the point. You nailed it. Thanks.
@mCoding
@mCoding 3 жыл бұрын
You're very welcome!
@SanixDarker
@SanixDarker 2 жыл бұрын
Good content ! Liked and subscribed !
@OMGclueless
@OMGclueless 3 жыл бұрын
It's also worth taking a second to remember that even in its slowest form, the python code is adding up 100 million numbers in 20 seconds. It's slow, but "slow" is a relative term.
@mCoding
@mCoding 3 жыл бұрын
This is a good point that everyone should keep in mind!
@carmelo5991
@carmelo5991 7 ай бұрын
Is slower than other languages but still enough because computers are very fast
@edwinontiveros8701
@edwinontiveros8701 6 ай бұрын
On any CPU capable of many teraflops, 20 seconds for 100 million numbers means eons. It's like walking and taking a step only every 5 seconds, short distances It's not that bad, but will badly scale once you need to travel farthest.
@Generlc_Human
@Generlc_Human 3 жыл бұрын
when you dont know how to code but you know english: "i know some of these words"
@smjure
@smjure 3 жыл бұрын
Awesomely clear presentation, thanks a lot 👍
@mCoding
@mCoding 3 жыл бұрын
You are welcome!
@daveys
@daveys 4 ай бұрын
I’m a Python novice and I found this to be really interesting and useful. Many thanks!
@TheParkitny
@TheParkitny 3 жыл бұрын
There is another option. Often you can use the numba library to get a speed boost over numpy.
@gregorymorse8423
@gregorymorse8423 3 жыл бұрын
Did you run these tests with Cython? Probably bringing everything to C would have interesting results. The async model on the for iterator for example should not be faster than a while loop unless the async code were optimized out.
@ButtKraken01
@ButtKraken01 2 жыл бұрын
This was a great explanation and earned you my subscription!
@qorbanimaq
@qorbanimaq 3 жыл бұрын
This is the first time I subscribe to a channel based on seeing the first video I watch from that channel. Great job!
@mCoding
@mCoding 3 жыл бұрын
Such an honor! Thank you!
@qorbanimaq
@qorbanimaq 3 жыл бұрын
@@mCoding THANK YOU for creating such quality content.
@KonradKeck
@KonradKeck 2 жыл бұрын
Thanks for that! I had no idea about the ratios. Two ideas though: I'm doing statistics and ETL jobs with pandas. I would've been interested in other methods as well like recursion or doing the same with pandas instead of numpy (whether it's any slower than pure numpy), pandas + apply + lambda, etc. Second, going one step further and optimising performance for if-elif-else / mapping function would be great for another video.
@logicprojectsspeed2023
@logicprojectsspeed2023 3 жыл бұрын
Great video. Commenting mostly to help the algorithm push this. Would have loved to see a comparison with straight C thrown in there given that C code was the source of the speedup. What's the cost of the python to C transitions?
@mCoding
@mCoding 3 жыл бұрын
A C compiler would probably compute the answer at compile time so there is nothing to compare! A Python to C transition would be a huge undertaking even for a small project.
@oj0024
@oj0024 3 жыл бұрын
Here are the results on my pc: python with for loop: 8.74s c with gcc and no compiler optimizations: 0.235734s c with gcc and full compiler optimizations: 0.048770s (gcc doesn't detect the pattern and actually executes the entire code) c with clang and full compiler optimizations: 0.000001s (clang detects the pattern and optimizes it using the Gaussian formula)
@alejandromorales8784
@alejandromorales8784 2 жыл бұрын
@@oj0024 wow, there is absolutely nothing to compare. I knew python was slow and C was fast, but I didn't know that the difference is to this extent
@tonyflury2291
@tonyflury2291 Жыл бұрын
@@mCoding Nuitka will take Python code and transpile to C (which uses the Python standard Library to do the object iterations) - on some applications it is 2x faster than normal Python (and it is a work in progress).
@edwardwong654
@edwardwong654 Жыл бұрын
Quite interesting. I will probably never need to know this but if I get a chance to show off, I will find a way, haha. Thanks and very interesting.
@jamessawyer8565
@jamessawyer8565 3 жыл бұрын
Great tip! Thanks for the valuable content
@mCoding
@mCoding 3 жыл бұрын
Glad it was helpful!
@mrtnsnp
@mrtnsnp 3 жыл бұрын
The numba package may be useful in some cases.
@mCoding
@mCoding 3 жыл бұрын
Absolutely, if you can use Numba to @jit your problem, them it will very likely speed up the solution! I'll probably cover this in a future video.
@mrtnsnp
@mrtnsnp 3 жыл бұрын
@@mCoding I did a small test. from numba import njit @njit def int_sum(n=1_000_000_000): s: int = 0 for i in range(n): s += i return s timeit.timeit(int_sum, number=1) 5.373000021791086e-06 timeit.timeit(int_sum_p, number=1) 44.33714248199999 int_sum_p is the same function, without decorator. The result is very close to C speed. But obviously, the best improvement comes from doing the maths properly.
@dmccallie
@dmccallie 3 жыл бұрын
Numba is amazing, if you can limit the code in the @jit’ed routine to simple loops and math and/or numpy operations. You can’t use things like lists or dicts
@gregorymorse8423
@gregorymorse8423 3 жыл бұрын
Should do all the examples with numba this is also interesting. Cython also. Masterclass in squeezing loop performance, could also show sum(x for x in range(n)). Ideally use matplotlib to show the data for different sum sizes and prove the obvious that it is linear
@mrtnsnp
@mrtnsnp 3 жыл бұрын
@@gregorymorse8423 When trying to make a comparison with compiled C++ code, I discovered that clang removes the loop altogether, applying the maths that the programmer should have done in the first place. And I suspect that numba applies the same optimisation, otherwise the total time does not make sense with the clockspeed I have.
@andreiiaz2097
@andreiiaz2097 3 жыл бұрын
Python is kinda like the doctor: It's the last place you want to be, but you go when you need it, because otherwise things might get more complicated
@akshaybodla163
@akshaybodla163 3 жыл бұрын
I personally like to start with python. Its vast libraries and simple structure is easy for me to test programs. Later when I decide on a specific implementation, i'll code that in C++.
@andreiiaz2097
@andreiiaz2097 3 жыл бұрын
@@akshaybodla163 Yeah, I can see that, it's nice, but how do you transfer a python library to C++?
@antonliakhovitch8306
@antonliakhovitch8306 3 жыл бұрын
@@andreiiaz2097 A lot of C/C++ libraries have Python bindings so you can just use the same ones a lot of the time.
@maythesciencebewithyou
@maythesciencebewithyou 3 жыл бұрын
In reality, python is the first thing most people will use because it is easy to use. Python is only usefull because if works as a glue for other languages. Easy access to libraries written in more difficult languages.
@jaredgreathouse3672
@jaredgreathouse3672 3 жыл бұрын
Such a fact. I do statistics research using Stata. My rule is Stata where I can, Python where I must.
@bva0
@bva0 2 жыл бұрын
Cool video. I would add that numba and cython are also super fast alternatives.
@itsamelara
@itsamelara 3 жыл бұрын
great video, great channel, I'm subscribing!
@mCoding
@mCoding 3 жыл бұрын
Awesome, thank you!
@JensRoland
@JensRoland 3 жыл бұрын
No loop unrolling variant? Nice roundup of methods, and nice video in general!
@mCoding
@mCoding 3 жыл бұрын
You're right, I totally forgot about some "optimizations" one could try. Maybe a later video. Thanks for the support!
@superscatboy
@superscatboy 3 жыл бұрын
Unroll to 100,000,000 lines of additions, then reduce them to a single value and have the function just return that value. Boom, orders of magnitude faster!
@n00blamer
@n00blamer 3 жыл бұрын
@@mattmurphy7030 Unrolling is more about reducing dependencies than iteration overhead, the multiple accumulators are summed in the end. Branch predictor doesn't play much role here as this loop is fully predicable, only the last iteration flushes the pipeline after the loop gets up to speed. Data dependencies can potentially stall the ALU, but if the CPU has a really short pipeline it won't have any effect.. this is highly dependent on the microarchitecture so arguing one way or another is a bit pointless unless we specify the hardware..
@WhiteThunder121
@WhiteThunder121 3 жыл бұрын
"Is there anything o beat numpy?" Try compiling numpy top native machine code using numba: @nb.njit(cache=True, parallel=True) def numba_sum(): return np.sum(np.arange(100_000_000)) > 0.00064 seconds Or suing your GPU (Cupy): def cupy_sum(): return cp.sum(cp.arange(100_000_000)) > 0.001747 seconds Cupy should be an order of magnitude faster than you CPU, but only if you do some more complex computation than calculating the sum of the array.
@mCoding
@mCoding 3 жыл бұрын
Good point! Maybe some day I'll do a comparison of which C library for Python is the fastest... so many choices though.
@WhiteThunder121
@WhiteThunder121 3 жыл бұрын
@@mCoding Would be interesting to watch. Good video btw. I did not expect the "i = i+1" to have such a huge impact when iterating.
@fisch37
@fisch37 2 жыл бұрын
Question: How are those rates when you precompile using py_compile to create a .pyc file?
@verycrazyguy1
@verycrazyguy1 3 жыл бұрын
I'd love to see a discussion including jit which is also a parallelism c based implementation for for loops.
@nvmcomrade
@nvmcomrade 3 жыл бұрын
Half way through the video I thought to my self why even loop to find an answer to a simple arithmetic progression and then I was surprised that this was addressed as well. Wow, impressive.
@mCoding
@mCoding 3 жыл бұрын
Thanks!
@lepepito
@lepepito 3 жыл бұрын
It serves the purpose to explain the problem of loops and also the process of finding different solutions to a problem It’s a nice exemple
@emilemil1
@emilemil1 2 жыл бұрын
It would be a good idea to show examples that does a bit more work within the loop, to show that if your loop body is costly then how you loop is largely insignificant.
@K9Megahertz
@K9Megahertz 2 жыл бұрын
Yeap, I was thinking the same thing. These tests really don't tell you anything meaningful.
@entropyz5242
@entropyz5242 3 жыл бұрын
I like this video very much. Subscribed. Also, interestingly enough, I am taking a class about functional approach to engineering problems and the professor repeats every class to avoid looping like you would avoid the Black Plague. Lol
@braulioacosta6816
@braulioacosta6816 3 жыл бұрын
what's the name of that course?
@entropyz5242
@entropyz5242 3 жыл бұрын
@@braulioacosta6816 functional approach to engineering problems lol
@Shaddymaze
@Shaddymaze Жыл бұрын
Big thanks for explaining this clearly!
@MrGryph78
@MrGryph78 3 жыл бұрын
Pretty certain the numpy functions are built using SSE/AVX intrinsics, hence their massive speed up compared to scalar only methods.
@micirei
@micirei 3 жыл бұрын
Your channel is great! I suggest you use a high pass filter for your voice to cut off the low frequencies that make your voice sound muddy, great job so far.
@mCoding
@mCoding 3 жыл бұрын
Thanks for the tip! I have no idea what I'm doing video editing so tips like this can really help improve my video quality!
@sophiaelementaris4203
@sophiaelementaris4203 3 жыл бұрын
You video was amazing man, more one sub.
@user-hk3ej4hk7m
@user-hk3ej4hk7m 3 жыл бұрын
I'm quite satisfied seeing the fastest way to do stuff is using a declarative style, it's probably it safest option too since you eliminate off by one errors.
@PKua007
@PKua007 3 жыл бұрын
Actually any modern C or C++ compiler with optimization turned on (-O3 or similar) would by itself compute the value of the sum compile-time and just inline the result. I am shocked that Python interpreter does not even try to for example throw out unused instruction during compilation. There must be a way to enable this. Or maybe this instructions actually have a side effect because of its weird namespace policy and cannot be thrown away without potentially spoiling the code. I am not sure...
@shinobuoshino5066
@shinobuoshino5066 4 ай бұрын
Both C and C++ compilers such as GCC and LLVM are written in C++. Python is written in C. JavaScript is main Python's competitor in easy to use scripting language... And it's V8 is written in C++ and its JIT is incredibly fast, the only thing that can compare is LuaJIT which is practically abandonware because one dev who made it has stopped caring and noone else is intelligent enough to pick it up and maintain it properly. Still, the difference is that LuaJIT is very fast... But it can't stream code and execute it in my browser as fast as possible, most projects don't even use lua for one reason or another. Just shows that C is obsolete language and all good programmers use C++. There of course will still be C zealots that will pretend otherwise, but they can't even write their own compiler and have to depend on C++ programmers who they hate more than they hate C++ for some reason.
@andrewkim5556
@andrewkim5556 3 жыл бұрын
Wow excellent!
@mCoding
@mCoding 3 жыл бұрын
Glad you like it!
@tamoregi
@tamoregi 6 ай бұрын
If I recall correctly there was an optimisation in LLVM/Clang that was able to detect looping over elements in order to add them and inject the n(n-1)/2 directly into the assembler. :)
@Sorestlor
@Sorestlor 2 жыл бұрын
One can also use numba if applicable which can speed up numpy oddly in some cases. I believe because it bypasses the intermediary python steps.
@davidbellamy1388
@davidbellamy1388 2 жыл бұрын
Your lessons are awesome. They have a Corey Schafer feel to the presentation in their brevity and clarity.
@mCoding
@mCoding 2 жыл бұрын
Thank you! I've seen many of Corey's videos, so I'm glad to hear that!
@VivekYadav-ds8oz
@VivekYadav-ds8oz 3 жыл бұрын
This seems insane to me that summation took more than a second! What is the record for the same in raw C/C++?
@brandonkirincich8181
@brandonkirincich8181 3 жыл бұрын
It would definitely be much faster. Especially since this is a special optimization case. It might not even generate a loop.
@mCoding
@mCoding 3 жыл бұрын
100_000_000 is a big number! Agreed, C/C++ would likely optimize it away.
@toebs_
@toebs_ 3 жыл бұрын
@@mCoding I think you should be more clear regarding the difference between language and implementation. It was definitely an interesting video, but this is not a problem inherent to the language itself, but rather a problem of (probably?) CPython. Also, C doesn’t optimize anything, gcc does. At least if you don’t tell it otherwise.
@xrafter
@xrafter 3 жыл бұрын
@@toebs_ Oh yes you need to enable the optimization flag -On n for random number .
@acemanftw
@acemanftw 3 жыл бұрын
Great video! Subbed
@mCoding
@mCoding 3 жыл бұрын
Awesome, thank you!
@artar4363
@artar4363 3 жыл бұрын
Didn't knew that Python is SO slow with those loops! Good video!
@chiragsingla.
@chiragsingla. 2 жыл бұрын
btw it's doing 100M iterations so it won't affect too much in small iterations
Don't make this big O mistake!
3:46
mCoding
Рет қаралды 39 М.
25 nooby Python habits you need to ditch
9:12
mCoding
Рет қаралды 1,7 МЛН
СҰЛТАН СҮЛЕЙМАНДАР | bayGUYS
24:46
bayGUYS
Рет қаралды 416 М.
顔面水槽がブサイク過ぎるwwwww
00:58
はじめしゃちょー(hajime)
Рет қаралды 118 МЛН
Como ela fez isso? 😲
00:12
Los Wagners
Рет қаралды 18 МЛН
For loops are redundant
4:14
mCoding
Рет қаралды 137 М.
Python dataclasses will save you HOURS, also featuring attrs
8:50
Why I prefer attrs over dataclasses
6:21
mCoding
Рет қаралды 60 М.
Actually, you CAN divide by zero.
3:52
mCoding
Рет қаралды 252 М.
Python's sharpest corner is ... plus equals? (+=)
7:51
mCoding
Рет қаралды 256 М.
Python's collections.abc | InvertibleDict
14:00
mCoding
Рет қаралды 44 М.
super/MRO, Python's most misunderstood feature.
21:07
mCoding
Рет қаралды 210 М.
5 Good Python Habits
17:35
Indently
Рет қаралды 323 М.
Diagnose slow Python code. (Feat. async/await)
9:57
mCoding
Рет қаралды 208 М.
Приехала Большая Коробка от Anker! А Внутри...
20:09
РасПаковка ДваПаковка
Рет қаралды 82 М.
📱 SAMSUNG, ЧТО С ЛИЦОМ? 🤡
0:46
Яблочный Маньяк
Рет қаралды 1,5 МЛН
🤔Почему Samsung ПОМОГАЕТ Apple?
0:48
Technodeus
Рет қаралды 377 М.