How Much Memory for 1,000,000 Threads in 7 Languages | Go, Rust, C#, Elixir, Java, Node, Python

637,891 views

ThePrimeTime

A year ago

Recorded live on twitch, GET IN
/ theprimeagen
ty piotr!
pkolaczk.github.io/memory-con...
MY MAIN YT CHANNEL: Has well edited engineering videos
/ theprimeagen
Discord
/ discord
Have something for me to read or react to?: / theprimeagenreact

Comments: 1,100
@jonathan-._.-
@jonathan-._.- Жыл бұрын
comparing actual threads with async tasks seems kinda weird
@ccgarciab
@ccgarciab Жыл бұрын
And workers and a plain event loop. Terrible all around.
@MikyLestat
@MikyLestat Жыл бұрын
They are not the same, but having async tasks is a powerful functionality that isn't available in all languages. It is correct he wasn't comparing the same, but you could argue that he was comparing how you would achieve the same thing if you wrote it in each language
@lozanov95
@lozanov95 Жыл бұрын
​@@MikyLestat Depends, because with Python you will run on a single thread, but with go for example you will use multiple threads. If you are actually computing anything this will make a significant difference.
@MikyLestat
@MikyLestat Жыл бұрын
​@@lozanov95 Exactly. I think that the reason for the comparison is to get an indication of how much memory (minimally) each programming language will use to achieve the same thing. Achieving the same thing in each language is translated to using the features and constructs of each language. Python is a great language, but it isn't the fastest. The global-interpreter lock (in addition to Python being interpreted in CPython) causes it to be slow. Just because Python doesn't really have multi-threading, it doesn't mean we shouldn't use multi-threading/tasks in other languages and then profile the memory footprint.
@davidstephen7070
@davidstephen7070 Жыл бұрын
@@MikyLestat I think it's the wrong way to compare languages: measuring the memory required for the same tasks in a single-threaded runtime vs a multi-threaded one. The garbage collector can queue work when threads are overloaded, so a faster process often means lower memory. And for tasks with a wide size range (say the first task needs 20KB and the 70th needs 1MB), a larger initial heap gives a better response than starting at 50KB and re-allocating. It all comes down to the user's hardware, whether to spend CPU or memory: if memory is cheaper than CPU, spend memory; if CPU is cheaper, choose something like Go or Rust that re-allocates frequently.
@thedoctor5478
@thedoctor5478 Жыл бұрын
Using Python's asyncio for this test was the wrong thing to do. It's similar to what was done with NodeJS. Asyncio is an event loop, not a thread. Python has threading libs for threads.
@Kobrar44
@Kobrar44 Жыл бұрын
multiprocessing xD no need for a benchmark, it would be just atrocious
@nikonyrh
@nikonyrh Жыл бұрын
@@Kobrar44 Yeah just run "multiprocessing.Pool(int(1e6))" and you are good to go :D Argh I hate python, but it is still my main language.
@just_a_random_
@just_a_random_ Жыл бұрын
​@@nikonyrhJust curious, why do you hate Python ?
@magicbob8
@magicbob8 Жыл бұрын
But asyncio is faster because Python's multithreading is so bad, so it's what people use. And it accomplishes the same things
@ibrahimaba8966
@ibrahimaba8966 Жыл бұрын
this is an IO task, so asyncio is a good solution!
@nunograca2779
@nunograca2779 Жыл бұрын
If I'm not wrong, C# uses a thread pool behind the scenes when using async/await, and what it does is recycle threads. That's why in the first test it was way higher than the others. I think that was the thread pool being initialized with a bunch of threads.
@dziarskihenk8798
@dziarskihenk8798 Жыл бұрын
this.
@3ventic
@3ventic Жыл бұрын
Yup. It always allocates a fixed size pool of managed threads depending on the system it's running on, unless you set the size yourself, which is possible and would be separately interesting for this benchmark.
@MikyLestat
@MikyLestat Жыл бұрын
@@3ventic The ThreadPool default is much smaller, it shouldn't take 120 MB at idle. I'm betting he wasn't distinguishing between allocated and committed memory.
@GabrielSantAna-sm9zh
@GabrielSantAna-sm9zh Жыл бұрын
as far as I know, C# also compiles the async methods to stateful classes, so it generates the states of each “step” of processing beforehand, when you create that amount of tasks you are basically creating a list of super small instances in a queue to the threadpool to consume until the next state (await) and throw again in the end of the queue
@3ventic
@3ventic Жыл бұрын
​@@MikyLestat I was a bit mistaken, but there is a fixed minimum number of threads (ThreadPool.GetMinThreads). On my system it's 32 by default and the equivalent program on my system (1 task) takes up 195M RES 108M SHR while a million tasks is using 52 threads and 472M RES 23M SHR.
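For anyone who wants to reproduce the probe 3ventic describes, a rough sketch might look like this (hypothetical standalone code, not the article's benchmark; ThreadPool.ThreadCount and ThreadPool.PendingWorkItemCount require .NET Core 3.0+):

using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

class PoolProbe
{
    static async Task Main()
    {
        var tasks = new List<Task>(1_000_000);
        for (int i = 0; i < 1_000_000; i++)
            tasks.Add(Task.Delay(TimeSpan.FromSeconds(10)));

        // Sample the pool while the million delays are pending: the thread count
        // stays small because a timer, not a thread, backs each waiting Task.Delay.
        for (int s = 0; s < 5; s++)
        {
            Console.WriteLine($"pool threads: {ThreadPool.ThreadCount}, queued work: {ThreadPool.PendingWorkItemCount}");
            await Task.Delay(2000);
        }

        await Task.WhenAll(tasks);
        Console.WriteLine($"GC heap after: {GC.GetTotalMemory(false) / (1024 * 1024)} MiB");
    }
}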
@hansenchrisw
@hansenchrisw Жыл бұрын
As a Java apologist, it first got virtual threads in 1997 with version 1.1 (edit: later removed and recently re-added in v 19). Also, Java (and presumably .NET) pre-allocates a bunch of memory by default. Hence how mem looks high for small numbers of threads and it doesn't increase until you hit bigger numbers.
@Talk378
@Talk378 Жыл бұрын
Yep, rare prime L
@elraito
@elraito Жыл бұрын
Yes, but I ran the same code AOT compiled for C# and it's only a 5MB baseline. The blog author misrepresented C# badly
@hansenchrisw
@hansenchrisw Жыл бұрын
@@elraito no doubt, but I don’t expect someone to be proficient at all those langs/runtimes.
@giuliopimenoff
@giuliopimenoff Жыл бұрын
That's why they should have used Kotlin coroutines
@mishikookropiridze5079
@mishikookropiridze5079 Жыл бұрын
​@@elraito That's the variation introduced by running it locally.
@casperes0912
@casperes0912 Жыл бұрын
There's also the memory vs. speed tradeoff. Sometimes keeping more things in memory can also make it faster: a managed environment with a higher starting point in memory usage may already have a bunch of kernel threads lying dormant in a thread pool, which takes up memory but speeds up spawning of threads.
@cakedon
@cakedon 5 ай бұрын
if my hello world doesnt use 27 gigabytes of ram i wont write it
@maximumcockage6503
@maximumcockage6503 5 ай бұрын
Yeah. Bun.js was priding itself on being faster than Rust in it's beta. Then when it came out and people started benchmarking it was slightly faster than rust by like a few percent, but used 40 times more memory on average.
@devotiongeo
@devotiongeo 11 ай бұрын
Creating a million concurrent "tasks" (or spawning processes as we call them in Erlang/Elixir) and allowing them to remain idle is one thing, while making those processes actually do something, such as each one of them having a persistent connection to a client and feeding it, is something entirely different. In practical terms, when it comes to real-time apps, the BEAM (Elixir/Erlang) outperforms all other languages by a significant margin. This is precisely why Brian Action and Jan Koum chose Erlang for WhatsApp after years of experience with Yahoo Messenger and Yahoo Chat Rooms. If someone hasn't had the opportunity to work with any BEAM language, the above statement may appear to them as an empty boast, and I can't blame them for that.
@ThugLifeModafocah
@ThugLifeModafocah 11 ай бұрын
But then that example needs to be done and shown to the world, the way the Primeagen is reacting to this one. I'm surprised by Elixir's performance here... in a bad way.
@xbmarx
@xbmarx 11 ай бұрын
@@ThugLifeModafocah I'm not. Erlang processes are completely isolated. COMPLETELY. Every "task" has a separate GC, memory space, everything.
@szymonbaranowski8184
@szymonbaranowski8184 11 ай бұрын
@@xbmarx so if things crash, only those things crash; that's a feature in itself
@Aaku13
@Aaku13 11 ай бұрын
The BEAM is pretty quick, but it won't "outperform all other languages by a significant margin". Ran several huge elixir services in production with lots of traffic and our Go services were much more performant.
@osazemeusen1091
@osazemeusen1091 9 ай бұрын
@@Aaku13 I can agree, but only for CPU-bound tasks. For IO-bound tasks, Golang doesn't come close in performance to Elixir
@Deemo_codes
@Deemo_codes Жыл бұрын
Each Elixir process spawns with a 50k heap, and garbage collection happens on a per-process level (you don't stop the world, you stop a process). This is because processes in Elixir are used the way microservices are used: each process does a small amount of stuff and then sends a message on to another service. The Erlang VM that Elixir runs on will launch 1 scheduler per CPU and does pre-emptive multitasking. So if you had 1M processes doing stuff, each process would execute for a few ms, then be switched out and added back into the queue that the schedulers pull from. So if you have more cores you get more parallelism; if you only have 1 core you still get concurrency. Whereas async runtimes tend to be cooperative and require some form of explicit yielding from a running task, Elixir will just swap stuff out. That makes it good for soft realtime stuff; if you want to do CPU-intensive things you can delegate to NIFs (native implemented functions) written in C or Rust. The Rust ones tend to be safer since panics are caught and raised as errors in Elixir, whereas a panic in C will crash the whole VM.
@Overminddl1
@Overminddl1 Жыл бұрын
You can also specify the memory usage of a process on the BEAM VM, significantly reducing the amount of memory something will use when it's spawned and doesn't really allocate anything, like in this case
@madlep
@madlep Жыл бұрын
And to do a test closer to what some of the other runtimes are doing, just call :timer.send_after(10000, :done) a million times, and then do a loop to receive :done 1 million times. Takes about 200mb instead.
@genericjam9866
@genericjam9866 6 ай бұрын
Elixir / Erlang processes have far less memory by default. More like 256 bytes but depends on word size on your system iirc.
@nyahhbinghi
@nyahhbinghi 5 ай бұрын
really smart GC model! Elixir was very well designed
@nyahhbinghi
@nyahhbinghi 5 ай бұрын
I wouldn't compare it to microservices. I would just say Elixir processes are independent and don't share memory. Which really makes it unique (I don't know of another runtime like this except Node.js webworkers).
@bryanenglish7841
@bryanenglish7841 Жыл бұрын
You forgot the extra Rust thread it takes to track all the bullshit drama in the Rust community
@Marhaenism1930
@Marhaenism1930 Жыл бұрын
oopsy! is it new feature of crablang in 2023?
@BlackistedGod
@BlackistedGod 11 ай бұрын
dammit why did I laugh so hard on this
@JensRoland
@JensRoland 11 ай бұрын
The Rust forums are just clogged with unproductive / outdated discussions that lead nowhere and make it harder to get anywhere as a community. The mods should simply go through all the threads once in a while and nuke the ones that are no longer relevant or helpful so the good stuff can get more space and everything would run smoother. Maybe they could even automate this with an LLM agent? They could call it “RustScheduledGarbageRemover”
@juniuwu
@juniuwu 4 ай бұрын
@@JensRoland Garbage Collector? BAN
@JensRoland
@JensRoland 4 ай бұрын
@@juniuwu banning people is just garbage collection for communities ;-)
@ThePhoenixProduction
@ThePhoenixProduction 6 ай бұрын
Where is c++?
@ErickBuildsStuff
@ErickBuildsStuff Ай бұрын
None cares😅
@SowTag
@SowTag Ай бұрын
@@ErickBuildsStuff Ah yes, no one cares about one of the most important and influential programming languages of all computing history
@InternetExplorer687
@InternetExplorer687 Ай бұрын
@@SowTag I'd argue that C is more influential, but yeah, saying no one cares about the language most used in performance-critical applications that also need low-level access to memory is a really big stretch.
@jstro-hobbytech
@jstro-hobbytech Ай бұрын
This guy reminds me of yongyea. Parrots other's work and makes more than the authors combined. He has no insight or original opinions or educated insight (from experiences academic or otherwise). I hate how people raise this guy up. Agreed on c++. That's my personal preference as I like the syntax being I learned it the same term I took cobol, Java (when it was new), visual basic and oop was still being defined. I've never worked in industry as a programmer but keep up to a middling ability. One thing I do know is that bullshit always smells like bullshit and this dude is full of it. People that talk during react videos do so only to fall under fair use, I see the same here transposed to a topic he is novice. Want for choice as mediocrity's excuse is no less evident than an untrained hand on display for no person's betterment or an opiate of excuse to be subject for one not turning to their purpose. I'm as wrong as apt to be right so there's that as well.
@idkwhatcouldbeavalable
@idkwhatcouldbeavalable Ай бұрын
​@@jstro-hobbytech I personally use Rust as it keeps some of the cpp syntax and adds on top of it to prevent common mistakes.
@shreyassreenivas4786
@shreyassreenivas4786 11 ай бұрын
Go reserves 4K of memory for each thread's stack so you could do quite a bit of work on each of those threads without incurring further costs.
@demyk214
@demyk214 10 ай бұрын
Makes sense
@-rate6326
@-rate6326 4 ай бұрын
goroutines aren't threads.
@tablettablete186
@tablettablete186 2 ай бұрын
@@-rate6326 Yeah, Go actually creates all threads at startup and just assigns goroutines to them. All of this to say: it's a thread pool lol
@TanigaDanae
@TanigaDanae Жыл бұрын
An information that has not been said in the video is that: async functions in C# are State Machines and Tasks (are part of the Task Parallel Library and) are automatically run in thread pools. So the only internal state these async functions have is the time they need to wake up, and all Tasks could theoretically have the same wakeup time. I would've loved to see a C# Thread implementation. I suspect the C# compiler is optimizing redundant Tasks away since they lack any side effects.
@vitskr1
@vitskr1 Жыл бұрын
The thread pool has like 512 preallocated threads, hence the high memory usage at idle. The tasks are actually running, but the max degree of parallelism is 8 (an 8-thread CPU), so there is practically nothing to allocate.
@q1joe
@q1joe Жыл бұрын
@@vitskr1 you can tune this, knowing your workload though. Some languages I feel didn't get the best showing here as the author isn't an expert in each one, which is understandable
@monad_tcp
@monad_tcp Жыл бұрын
@@vitskr1 Exactly what I suspected kzbin.info/www/bejne/jZuugoR3e9yliJY . Its using the Server tuning, I think on Desktop the default is Number of Cores * 2 .
@monad_tcp
@monad_tcp Жыл бұрын
@@vitskr1 512 threads * 512Kb = 256MB . Its not that big of a deal for servers with lots of cores.
@bangonkali
@bangonkali 8 ай бұрын
@@monad_tcp i agree. and irl if you plan to launch 1M concurrency your probably have the RAM to match. i still don't think many people do these in a single process anyway. probably better to distribute workload to multiple servers. i recommend orleans 7 for c# devs. 😅
@diadetediotedio6918
@diadetediotedio6918 Жыл бұрын
C# was the winner, clearly everybody was expecting this
@sanampakuwal
@sanampakuwal Жыл бұрын
yes
@shreyasjejurkar1233
@shreyasjejurkar1233 10 ай бұрын
Of course, kudos to .NET runtime team! 😎
@mattymerr701
@mattymerr701 10 ай бұрын
Clearly they fucked their setup [Insert cope here] To be fair, they did fuck it but...
@cnikolov
@cnikolov 10 ай бұрын
Running as AOT has even smaller footprint
@FilipCordas
@FilipCordas 9 ай бұрын
Also he wasn't using ValueTask; they reduce the memory consumption considerably. But I hate tests like this, because when the code isn't doing anything a compiler could remove everything.
@Hallo503
@Hallo503 9 ай бұрын
C# has the lowest memory usage because it is using the threadpool, that recycles blocking threads, like when calling Task.Delay. So there aren’t actually a million threads created but rather they are queued into the threadpool. To avoid this create the threads explicitly
@user-qu5cc5oe2h
@user-qu5cc5oe2h 7 ай бұрын
pff... everyone knows that c# offloads 50% of tasks on Azure servers
@dieSpinnt
@dieSpinnt 7 ай бұрын
@@user-qu5cc5oe2h ROTFL. As a first time viewer I asked myself if ThePrimeTime is always on that level of cocaine? Well, its something different than other coding channels. A fresh breeze, so to say .... **g**
@muaathasali4509
@muaathasali4509 6 ай бұрын
@@user-qu5cc5oe2h free compute hack
@qendrimimeri8561
@qendrimimeri8561 4 ай бұрын
​@@user-qu5cc5oe2h😂
@gregorymorse8423
@gregorymorse8423 3 ай бұрын
No shit, Sherlock, all of the languages were using threadpools except Java and Rust with real worker threads. So you've failed to uniquely qualify C# altogether.
@markusn4614
@markusn4614 Жыл бұрын
That C# method has 2 extra layers, the code inside the for loop should just be tasks.Add(Task.Delay(TimeSpan.FromSeconds(10)));
@Eirenarch
@Eirenarch Жыл бұрын
This 👆 They created threads to run their threads inside
@PetrVejchoda
@PetrVejchoda 11 ай бұрын
@@Eirenarch No it should not. If you did it the way you describe, the work (in this case represented by Task.Delay) would not be scheduled on the TaskScheduler and would instead be done on the thread this code is running on, thus blocking it and not using the CPU cores to their fullest. If anything, it should be Task task = Task.Run(() => Task.Delay(TimeSpan ...)); tasks.Add(task); This would save some memory while still scheduling the work on worker threads. I am not sure whether there would be any benefit to using TaskFactory and a Scheduler directly, whether it would be more performant, but I highly doubt it. A Task itself is a glorified coroutine and job child. It's just a promise of an action that can wait for other actions to complete. Task.Delay does not do anything with scheduling or threading. It just writes a timestamp and deposits the Task to run later, when the proper time has come. But it would not start a new thread/virtual thread/Task/coroutine. Since they are trying to figure out how costly scheduling a new thread/virtual thread/Task/coroutine is, this would not do the work.
@manpt123
@manpt123 9 ай бұрын
c# and you are the 2 most useless stuffs
@FilipCordas
@FilipCordas 9 ай бұрын
Also I don't see value tasks and the list doesn't have a buffer set.
@taqial-faris6421
@taqial-faris6421 8 ай бұрын
I was looking for this comment. Guy who created that blog clearly knows nothing since he is using chatGPT and chatGPT also knows nothing if it outputs that kind of code... But hey, even my 'senior' coworker used to write async code like that so who am I to judge.
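To make the disagreement in this thread concrete, here are the two variants side by side as a hedged sketch; the Task.Run wrapper is presumably what the blog's code did, per the comments above, but neither line is quoted from it:

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class DelayVariants
{
    static async Task Main()
    {
        var tasks = new List<Task>(1_000_000);
        for (int i = 0; i < 1_000_000; i++)
        {
            // Variant A: keep the Task that Task.Delay returns. The delay is
            // timer-driven, so no thread sits blocked while it waits.
            tasks.Add(Task.Delay(TimeSpan.FromSeconds(10)));

            // Variant B: wrap the delay in Task.Run, which queues an extra work
            // item to the thread pool just to start awaiting the same timer.
            // tasks.Add(Task.Run(() => Task.Delay(TimeSpan.FromSeconds(10))));
        }
        await Task.WhenAll(tasks);
    }
}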
@wlockuz4467
@wlockuz4467 Жыл бұрын
It should've been "To infinity and NaN" as an homage to JavaScript.
@SirBearingtonSupporter
@SirBearingtonSupporter 11 ай бұрын
You actually pointed this out early on. In the Java and C# version, he uses "ArrayList" without specifying the size. ArrayList in both these languages hold an actual Array object. It's why the lookup time for "get" is a memory address lookup time. When Java needs to expand the array size, it creates a larger array that is twice the size of the current array size. I believe the default is 10. Java also doesn't run the garbage collector unless it needs to be run or specifically invoked with System.gc. Because the JRE doesn't plan ahead for your bad code, it just looks for a new place to put the object in memory, leaving all the old references that need to be deleted alone - because the GC will deal with it as needed. Just to recap there are several arraylist objects each holding an array of size n (below) in memory - and if the JVM is given enough memory, all 11 of these will still be there. So that means there are 20510 threads in memory on the test. While his approach to joining all the threads was barbaric, it's also the accepted answer on StackOverflow, we are not measuring the speed of the execution, just the memory of it. If you were not trying to measure the memory performance of threading on difference languages, I would actually give java more threads to manage the threads (parallelize stream). Finally thoughts, We aren't concerned about thread space in production equipment, we are concerned about execution time and if my entire program hangs because one calculation couldn't be done, I'm missing out on something important - it could be a trade, moving servo for a robotic (self driving cars) or producing an input for a chess game. Collecting the information that I can allows me to implement an algorithm that is capable of making educated guesses based of what was calculated. If we do care about thread space, we would be better off doing single threaded applications since we don't have an overhead associated with the effing cost of the thread. TL;DR Something something short equal something something int because the JVM go fast blah blah addresses blah blah blah 4. (primitive array blah blah addresses, blah blah)
@W1ngSMC
@W1ngSMC Жыл бұрын
To be fair, Elixir is spawning new processes with their own memory and PID (inside the VM).
@isaacyonemoto
@isaacyonemoto Жыл бұрын
And also providing stuff for graceful restarts and an entire message queue
@BosonCollider
@BosonCollider Жыл бұрын
And preemptive scheduling, if any one of them fails or blocks indefinitely it cannot take the rest down with it.
@sukidhardarisi4992
@sukidhardarisi4992 4 ай бұрын
Task.async in Elixir comes with a lot of boilerplate wrapped on top of GenServer. If the test has to measure concurrent tasks, one could go with primitives like spawn, send and receive in order to see the true potential. Just my opinion on why Elixir used a lot of memory.
@gregorymorse8423
@gregorymorse8423 3 ай бұрын
It's not doing anything. The erlang process concept has nothing to do with threading. Sure it explains the memory usage, but there are ways to pool it so a maximum amount of processes could be spawned at any time.
@igordasunddas3377
@igordasunddas3377 Жыл бұрын
Man I am allergic to empty catch blocks in Java - always. After looking for exceptions that have never been rethrown or really handled, I am really on the fence. Empty catch blocks should not exist or even be allowed...
@gregorymorse8423
@gregorymorse8423 3 ай бұрын
You are allergic to using your brain, yes we know. Maybe if you knew what checked and unchecked exceptions are and stopped making dumb comments. This is why you should stop the drugs and go back to school, fool
@albertmagician8613
@albertmagician8613 Ай бұрын
I have no problems with empty catch blocks, as long as my compiler is allowed to optimize them away.
@NameyNames
@NameyNames 9 ай бұрын
As likely already pointed out, C# uses a thread pool, and will definitely not create a gazillion threads in this test, and the memory required to house all of these insignificant tasks will be very small, which is apparent in the test results. I tried it out in LinqPad, but with one additional task whose only purpose was to keep track of the number of simultaneous threads actually in use. For 1 million tasks, the actual active thread count peak never even exceeded 50 on my system (usually much lower). No wonder, when all that the tasks are "doing" is async-waiting on a delay. This benchmark is broken in the sense that it doesn't really do what the author thinks it does, i.e. it does NOT create a lot of threads (virtual or otherwise) in all languages/runtimes, and measuring the memory usage is thus close to pointless.
@bahtiyarozdere9303
@bahtiyarozdere9303 5 ай бұрын
Thank you for sharing and commenting on this one. I would love to see C# with AOT compile. I believe it would make a huge difference.
@chigozie123
@chigozie123 5 ай бұрын
The go results are not surprising. It's a well-documented feature that each goroutine starts with an initially pre-allocated stack size. Prior to go 1.2, it was 4kb, then it went to 8kb, and I believe it's now at 2kb for go 1.4+. So 2kb × 10k means an additional 20mb on start. At 100k, it means a minimum of 200mb on start. The math seems pretty consistent with the results we see for go, although they seem to suggest that the initial stack size may be closer to 2.7kb than 2kb. We also have to keep in mind that there is a garbage collector running in there, and we didn't account for how much memory it requires to keep track of everything going on.
@Mentox2
@Mentox2 Жыл бұрын
9:30 - In the 19th century the German mathematician Georg Cantor proved that there must be more than one kind of infinity, such as the infinity of the natural numbers, the infinity of the real numbers and so on, and that some infinities are larger than others. The smallest infinity is that of the natural numbers, and it's called Aleph Zero. So yes, Buzz can indeed go to infinity and beyond, so long as it is mathematical infinity.
@ko-Daegu
@ko-Daegu Жыл бұрын
pretty cool, i remember studying this part of set theory and Aleph (the first letter of the Arabic alphabet). the idea is that the set of natural numbers (1, 2, 3, ...) has the smallest cardinality and is denoted as Aleph Zero (ℵ₀)
@JamieNeubertPedersen
@JamieNeubertPedersen Жыл бұрын
Thanks. Was thinking the same.
@user-zt7gj5ff8n
@user-zt7gj5ff8n Жыл бұрын
Nothing "and so on". That is not clear. In fact it can neither be proven not disproven with standard mathematics. It is called the continuum, hypothesis
@mykhailonikolaichuk6392
@mykhailonikolaichuk6392 11 ай бұрын
@@user-zt7gj5ff8n The continuum hypothesis is that there are no intermediary infinities between "infinity of integers" and "infinity of reals". It is, indeed, but an axiom. However, the cartesian product of a set with itself ALWAYS yields a set with higher cardinality, so infinitely many distinct infinities can be constructed by the repeated usage of it.
@d7ffab979
@d7ffab979 11 ай бұрын
@@mykhailonikolaichuk6392 That is just wrong. Infinite cartesian products of natural numbers, for examples, are "just" rational numbers.
@peppybocan
@peppybocan Жыл бұрын
So this article is definitely comparing apples to oranges - light threads/proper threads and runtime limitations. Go has support for parallelism, but it will only allocate as many threads as there are CPU processors (see GOMAXPROCS env variable) and on those the runtime scheduler runs these tasks. Python with its notorious GIL (Global Interpreter Lock) is the main bottleneck, though not visible in this flawed benchmark, as the threads themselves are not doing anything, this looks fine until you actually need to run some code. So Python would very likely burn in throughput benchmark, regardless of the number of threads. (See Python's sys.setswitchinterval). NodeJS, as The Prime mentioned, again, massive event loop and timers on it. If you do a computationally heavy work on it, your one poor CPU will go into early retirement....
@daasdingo
@daasdingo Жыл бұрын
The article was using the single-threaded event loop in Python.
@peppybocan
@peppybocan Жыл бұрын
@@daasdingo still wrong though.
@mennovanlavieren3885
@mennovanlavieren3885 Жыл бұрын
I concur. With IO-heavy tasks the NodeJs event loop is okay, and keeps your programming model simple. With computational work you need to use workers on NodeJs, as per the NodeJs documentation itself. And even with IO tasks you should not use one Node process on a gazillion-core machine. Also, not all light thread implementations (hate the word green in this context. Green, in practice, means illogically wasteful in the name of virtue signaling) offer the same features out of the box.
@ddomen9488
@ddomen9488 Ай бұрын
@@daasdingo also in nodejs, since promises are not actual threads
@Lyynx92
@Lyynx92 Жыл бұрын
.Net pre-allocates a thread pool at startup, though the memory shouldn't be quite that high. Pretty sure it also utilizes a work-stealing scheduler under the hood for continuations and its async/await behavior. Also, if you want to further optimize for memory, the ValueTask struct will do some caching cleverness to dodge Task allocations if the work is either already done or can be done synchronously. Given how simple the test is, the GC probably won't kick in as it can recycle a lot of those Task objects.
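To illustrate the ValueTask point with a small, hypothetical example (the cache and its names are made up, not from the article): a ValueTask-returning method can hand back a cached result synchronously, and only pays for a Task allocation when the work is genuinely pending.

using System.Collections.Concurrent;
using System.Threading.Tasks;

class PriceCache
{
    private readonly ConcurrentDictionary<string, decimal> _cache = new();

    public ValueTask<decimal> GetPriceAsync(string symbol)
    {
        if (_cache.TryGetValue(symbol, out var cached))
            return new ValueTask<decimal>(cached);                 // synchronous path, no Task allocated
        return new ValueTask<decimal>(FetchAndCacheAsync(symbol)); // async fallback allocates a Task
    }

    private async Task<decimal> FetchAndCacheAsync(string symbol)
    {
        await Task.Delay(100); // stand-in for real I/O
        return _cache[symbol] = 42m;
    }
}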
@Trekiros
@Trekiros Жыл бұрын
Intro: let's not compare apples to potatoes The rest of the video: compares making threads with maintaining an event queue
@stevenhe3462
@stevenhe3462 Жыл бұрын
Elixir reserves 4kiB of RAM for each of its processes. Each process in Elixir has its own separate heap to eliminate the possibility of stop-the-world-GC.
@llothar68
@llothar68 Жыл бұрын
Each Linux kernel thread needs 32kb (28kb of it are non swappable physical kernel stack space) + 1kb for kernel structures.
@metaphysicalconifercone182
@metaphysicalconifercone182 Жыл бұрын
I wonder why Kotlin wasn't included. I guess it does share similarities with Java and Go, but its implementation of coroutines is supposed to be different from that in Go. I guess testing it would also have to include both JVM and Native compile targets, because you never know.
@avalagum7957
@avalagum7957 Жыл бұрын
If you include kotlinx library, you should add Scala Actor, ZIO ... too.
@DeliOZzz
@DeliOZzz Жыл бұрын
@@avalagum7957 The suspend keyword and channels are part of the standard Kotlin library. The coroutines package includes coroutine builders and stuff like flows. For some reason Prime just ignores Kotlin whatsoever :/ But I'd really like to watch some quality Kotlin roast.
@sharkpyro93
@sharkpyro93 11 ай бұрын
@@DeliOZzz cause it's not a popular choice for backends, a lot of people still think Kotlin is only for Android, I'm afraid this stigma will stick around for the time being
@AlanPCS
@AlanPCS 6 ай бұрын
It runs in the same VM. At most it would be equal to a competent implementation in Java only.
@madlep
@madlep Жыл бұрын
The Elixir solution has a LOT of room to squeeze out. I can get it running in about 990mb with some tweaks. Main thing is the default heap size. Passing `+hms 1` as part of `erl` options sets default size to 1 4-byte word. Also, using plain spawn calls instead of Task (which accumulates results, and adds extra memory and GC and processing overhead) reduces it further.
@mennovanlavieren3885
@mennovanlavieren3885 Жыл бұрын
True, but as long as the "threads" don't actually do anything it is a useless comparison. The constructs on these platform all provide a different feature set, so comparing performance is bogus. I mean a C# Task is just one or a few objects waiting in several queues to be invoked by native threads in the thread pool with a job stealing algorithm. NodeJs and Python are single threaded with a single event loop. I don't know what the others do and give you for free, but this isn't apples to apples. (Edit: I automatically type thread with a capital T)
@madlep
@madlep Жыл бұрын
@@mennovanlavieren3885 Yup. The comparison is pretty meaningless. The "cheap", non-idomatic Elixir way to do this, would be to start 1,000,000 timers, and wait for them to finish. Effectively doing the same thing as some other platforms. I just tried that - uses about 200mb in total of memory. If all it's doing is starting something that sits there idly for 10 seconds, there isn't much difference. No point carting round a whole isolated separate stack and heap for each process, and associated house keeping. Elixir processes are cheap, but they're not *that* cheap.
@woolfel
@woolfel 11 ай бұрын
back in the JDK 1.3 days, the JVM would allocate 1MB per thread, but it was changed around 1.6/1.8, I forget exactly which release they fixed that. It's also important in Java to get the memory used, not memory allocated. The biggest issue with java for me is once the JVM allocates memory, it doesn't release it until you stop the JVM process.
@pinoniq
@pinoniq 11 ай бұрын
If you want node to actually use multiple threads, you need to tell libuv to use multiple threads. There is an env variable for this: UV_THREADPOOL_SIZE. Like you said, node has an event loop. That's not multi-threaded. It's single threaded with callbacks. That's why setTimeout is more a 'minimum' guideline and not precise at all (under heavy loads). Just make a busy-wait program in node and you'll see it only filling up a single core on your CPU
@Bourn77
@Bourn77 Жыл бұрын
C# master race. Lets go. .NET team is optimizing the fu*k out of the stack for a few years. Hands down the best api backend language to work with. 🥰
@reddragon2358
@reddragon2358 Жыл бұрын
I hope it becomes so good that it can be used perfectly as a full-stack language.
@BosonCollider
@BosonCollider Жыл бұрын
@@reddragon2358 It does work fairly well together with HTMX
@reddragon2358
@reddragon2358 Жыл бұрын
@@BosonCollider Oh, glad to hear. But Java, for example, can be used for full-stack development with the help of Java frameworks.
@mishikookropiridze5079
@mishikookropiridze5079 Жыл бұрын
@@reddragon2358 That produces horrendous UI. Could be future using WASM.
@reddragon2358
@reddragon2358 Жыл бұрын
@@mishikookropiridze5079 I heard that C# has UI frameworks. I hope that they get better with time.
@andzagorulko
@andzagorulko Жыл бұрын
C# has threads. Benchmarking Tasks instead is just confusing, because those aren't threads.
@pavelyeremenko4640
@pavelyeremenko4640 Жыл бұрын
As you may have noticed, he's benchmarking green threads(tasks in c#, goroutines in go, etc.) across the languages.
@carlinhos10002
@carlinhos10002 Жыл бұрын
C# does not have green threads. Tasks are not green threads
@pavelyeremenko4640
@pavelyeremenko4640 Жыл бұрын
​@@carlinhos10002 Now that I've re-read the definition of green threads, I'm not sure how they aren't. They are not OS managed. They are lightweight thread-like primitives managed by the runtime. What are they missing? Wikipedia also lists them as such on en.wikipedia.org/wiki/Green_thread Not sure if this is as important though, every language in the lists was using their concurrency primitive built on top of some managed pool anyway.
@metaltyphoon
@metaltyphoon Жыл бұрын
@@pavelyeremenko4640 he’s just making things up. Most implementations are using some abstraction over OS thread. Only one of Java and Rust versions dont do that.
@zephyrprime
@zephyrprime 21 күн бұрын
C# tasks use a threadpool to execute. But one thread can have multiple tasks waiting simultaneously and the code this guy used had each thread sleeping for several seconds
@TizzyD
@TizzyD Жыл бұрын
🤔 I concur with you Big P... let's look at some more real use cases. Going outside of the process itself will complicate analysis with other elements (e.g. DB, ORM, etc.) that should be held constant; however, there are good use cases to eliminate as much of the 7 layer stack as we can:
1. Storage - with the good old random file manipulation, etc.
2. Network - doing something more like a UDP listener to eliminate possible contamination with socket handling
3. Memory - malloc, multi-threaded data manipulation, release (to watch garbage collection)
4. Compute - not all compute operations are math-based, but do some string parsing, concatenation, etc.
I'm thinking we want to eliminate math computations because most of those operations will come down to the underlying math implementation vs. actual performance (e.g. Fortran being fast, etc.), but network issues could have the same impact. Consider the history of Java IO vs. NIO.
@smallfox8623
@smallfox8623 Жыл бұрын
i'm ready for the C# arc let's go, it has a really bad reputation that is totally undeserved these days
@reddragon2358
@reddragon2358 Жыл бұрын
True.
@MH_VOID
@MH_VOID Жыл бұрын
My personal hate for it came from the pain of trying to use it in my SW dev course on linux compared to those windoze fags who have first class support for everything, and from missing a bunch of the things I love about Rust when doing C# (e.g. immutable by default, f, u, i (though byte is fine and I guess using "long", "short", etc. isn't really bad. more just personal preference and more efficient), match, traits, enums, macros! True some of these stuff are to a decent extent available in C#, but the.. culture doesn't use them primarily like Rust does). But the language itself genuinely looks pretty nice, and has some nice features and shit even over Rust. I'm definitely comfortable calling the language "better Java", and would be okay programming in it professionally or even hobbyistically.
@reddragon2358
@reddragon2358 Жыл бұрын
@@MH_VOID Yeah. Rust is very intriguing language (excluding the dramas and BS). Also things should be a lot better than before. Although there still is some windows/Microsoft bias in the language.
@sohn7767
@sohn7767 Жыл бұрын
I think C# is great honestly. Not the best in anything, but it’s good in many areas
@reddragon2358
@reddragon2358 Жыл бұрын
@@sohn7767 Yeah agree. And I think that it is its main strength. That it can be used for everything.
@baxiry.
@baxiry. 11 ай бұрын
There is some important information not mentioned in the article. Goroutines are comparable to threads, either real or virtual; they are not comparable to an event loop, and Go has event loop libraries. Since the author of the article used an event loop in other languages, he should have used one in Go as well for the comparison to be unbiased. Other information: the advantage of goroutines over threads is that they are portable. They do not depend on the operating system. If your application needs to run on the metal, such as chips or microcontrollers that have no operating system, a goroutine can still run. With threads that is not possible, because it is not the language doing the job but the operating system, and where there is no operating system, there are no threads. One last thing: when an application uses system threads, the system will reserve memory. The question is: did the author of the article count the memory reserved by the system?
@Jmcgee1125
@Jmcgee1125 Жыл бұрын
15:11 Python, by default, only uses one worker thread. When writing asyncio code you do need to be careful that you don't block. My understanding is that each event loop may have only one worker, but I'm not experienced enough to be confident in saying that.
@ShaneFagan
@ShaneFagan 3 ай бұрын
To expand on this a little more for people:
1. They used asyncio which is just an event loop, there is no threading, just a loop that does the tasks in FIFO. The memory usage would be just the amount that stores the task information/statuses, it wouldn't have overhead from spawning threads
2. Virtual threads in Python are in the threading module. They are limited to one core but can run in parallel and independent as you would expect from a thread.
3. For proper hardware threads you have to use multiprocessing and it works very similar to other languages that use fork but with the added stuff like the ability to spawn a thread pool for batch processing and maybe limit the amount of threads to a number that wouldn't cause stability issues on the system.
Also in Python 3.12 there are some interesting changes related to the GIL which change how concurrency works in general, with the ability to run code in basically another instance of Python. That will change mega high performance Python concurrency quite a bit in the future, but as of right now it's one of the 3 above I described. Just note the blog post he is talking about is 1, which isn't parallel.
@dipi71
@dipi71 Жыл бұрын
Erlang, a language used in telecommunications, still seems to be the concurrency champion (according to a book by Röhrl and Schmiedl called »Produktiver programmieren«, I've read it in German a while ago).
@robfielding8566
@robfielding8566 11 ай бұрын
Go is definitely not a memory hog, at least for IO-intensive tasks. The main thing is that the Go libraries are always very careful to stream large inputs rather than buffer them in memory. Java itself doesn't really have major memory issues beyond spawning threads, but in any large Java project the code will be full of things being buffered into arrays rather than being streamed. I tried rewriting netty to make it stop doing dumb things, and just switched (permanently) to Go. Part of Java's problem is also the legal issue of shipping a JVM, and the existence of Oracle thumb-breakers and lawyers who come punish you for shipping.
@misterkevin_rs4401
@misterkevin_rs4401 10 ай бұрын
C# uses a thread pool behind the scenes with a default number of threads depending on the system it's running on; it was usually 20 if I remember correctly from my .NET days. What's interesting to me is how it can spin up more if required and scales correctly.
@FilipCordas
@FilipCordas 9 ай бұрын
Should be equal to number of cores you have available on the machine.
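A minimal sketch of inspecting (and, if you want, raising) those pool limits, using only the standard System.Threading.ThreadPool API; the value 64 below is an arbitrary example, not a recommendation:

using System;
using System.Threading;

class PoolLimits
{
    static void Main()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        Console.WriteLine($"min: {minWorker}/{minIo}  max: {maxWorker}/{maxIo}");

        // The pool grows past the minimum on demand (with a ramp-up delay);
        // raising the minimum pre-warms threads at the cost of memory.
        ThreadPool.SetMinThreads(64, minIo);
    }
}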
@3x10.8_ms
@3x10.8_ms Жыл бұрын
crab is fast and fox is slow
@ThePrimeTimeagen
@ThePrimeTimeagen Жыл бұрын
do a barrel roll
@autismspirit
@autismspirit Жыл бұрын
tbh the C# number kind of makes sense, it scales incredibly well, especially in later .NET versions. Some C#-based fancy Unity optimizations can beat out GCC in raw speed and memory.
@autismspirit
@autismspirit Жыл бұрын
Granted, there is probably some optimization going on in Release mode, since it's not doing anything. I'd expect the memory consumption to be higher, but not 4GB high.
@marcossidoruk8033
@marcossidoruk8033 Жыл бұрын
What do you mean by "beating GCC" last I checked GCC was a compiler.
@CorvinhoDoMal
@CorvinhoDoMal Жыл бұрын
​@@marcossidoruk8033 yeah, the optimizations are made by the compiler. He meant the C language, but specifically with GCC. If you used the microsoft compiler or other options you would have different performances.
@marcossidoruk8033
@marcossidoruk8033 Жыл бұрын
​​​@@CorvinhoDoMal No way C# is going to beat carefully written C code in any imaginable benchmark ever, its just impossible. Plus what he said makes no sense, "unity optimizations" how do you compare C# unity performance with C unity performance if you can't do unity scripts in C? Am I going crazy or what. And if he means the engine that is written almost in its entirety in C++
@janus798
@janus798 Жыл бұрын
@@marcossidoruk8033 Google the Unity Burst compiler. Faster than GCC in fibonacci and NBody simulation.
@quachhengtony7651
@quachhengtony7651 Жыл бұрын
C# fan bois are eating good these days
@reddragon2358
@reddragon2358 Жыл бұрын
Yup
@tofaa3668
@tofaa3668 8 ай бұрын
The issue with the Java threads, I feel, is not preallocating the ArrayList: every time an ArrayList is appended to, it checks its size and, when full, allocates a new backing array. Which in this case would be a whole lot of arrays in memory for the GC to collect.
@iforgot669
@iforgot669 Жыл бұрын
C# now has Native AOT, which would have significantly improved the memory footprint of this
@SurvivalGamingyt
@SurvivalGamingyt Жыл бұрын
Yeah, 7,4mb for just a standalone release mode app.
@sgbench
@sgbench Жыл бұрын
Also trimming
@FilipCordas
@FilipCordas 9 ай бұрын
@@sgbench ValueTasks and adding a buffer size to the list will help.
@CeleChaudary
@CeleChaudary 4 ай бұрын
@@FilipCordas That's a good point
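For anyone who wants to try the AOT/trimming variant themselves, it mostly comes down to a couple of project properties plus a publish command; this is a generic .NET 7+ example, not the exact setup the commenters used:

<!-- in the .csproj, then publish with: dotnet publish -c Release -r linux-x64 -->
<PropertyGroup>
  <PublishAot>true</PublishAot>
  <InvariantGlobalization>true</InvariantGlobalization>
</PropertyGroup>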
@om3galul989
@om3galul989 11 ай бұрын
yea node example is not spawning threads, it's just placing tasks on the timeout callback queue of the eventloop to be executed later using the main thread.
@casperes0912
@casperes0912 Жыл бұрын
I will most likely need to use C# as my primary language at my next job
@reddragon2358
@reddragon2358 Жыл бұрын
Wish you all the best
@dziarskihenk8798
@dziarskihenk8798 Жыл бұрын
c# is life, c# is love
@reddragon2358
@reddragon2358 Жыл бұрын
@@dziarskihenk8798 XD
@ghevisartor6005
@ghevisartor6005 11 ай бұрын
dont use maui
@blowfishfugu4230
@blowfishfugu4230 7 ай бұрын
just for fun, did creating threads in c++ in a similar fashion:

static std::atomic<int> toInc = 0;
{
    std::vector<std::jthread> threads;
    for (int i = 0; i < 1'000'000; ++i) {
        threads.emplace_back(std::jthread{ []() { toInc++; } });
    }
}

running on a cpu providing 8 cores it took endless (we're talking about 15 minutes) to allocate the thread handles; resulting max memory consumed was 75MB. deallocating the thread handles took the same amount of time as creating them. so, this testcase highly depends on what kind of platform/OS is in use. Also it's not advised to use more threads than your hardware can handle on native cores; on my system the highest multithread performance was on 32 threads (including an if < 1'000'000 inside each thread's lambda), and the peak performance for this simple task was single-threaded (guess because no locking on the atomic was necessary) --- everything just observations and measurements
@jonstewart5525
@jonstewart5525 3 ай бұрын
Since this is a Linux system it's using the completely fair scheduler (CFS), which means each thread runs at the same priority (as opposed to the MLFQ (multilevel feedback queue) that Windows uses). The issue then is that the OS is processing at the same priority as each of the threads created, so the computer just freezes up. There's also a minimum time spent in each thread, so you rarely get to execute an action.
@R4ngeR4pidz
@R4ngeR4pidz 11 ай бұрын
You're 100% right about the complexity of the task. But also, I would have stopped reading after they said they used ChatGPT to come up with the code. You need to have these contributed by people that actually write this language and that actually understand this language. The ambiguity between what the code was actually doing in all of these was horrible, as other commenters have also pointed out.
@nikolalukovic2593
@nikolalukovic2593 Жыл бұрын
My man hates C# so much, it's hilarious! To be fair though I agree with everything you said and would love to see your benchmarks about this topic.
@sanjayidpuganti
@sanjayidpuganti Жыл бұрын
@@cethien I love C# but hate MS. I use Rider and Linux to code in my personal time and I like it. I think it's very good for API development.
@DaddyFrosty
@DaddyFrosty Жыл бұрын
@@cethien VS sucks, Rider rules. I do also hate Microsoft but it’s a good language nonetheless
@pavelyeremenko4640
@pavelyeremenko4640 Жыл бұрын
@@cethien I've been developing c# on linux and macos for a couple of years now using Rider (I just like it more but the Visual Studio is also fully cross platform). I don't personally enjoy the language as much nowadays but the tooling is great whatever platform you pick.
@DaddyFrosty
@DaddyFrosty Жыл бұрын
@@pavelyeremenko4640 last time I used visual studio on mac it was only for Xamarin
@ko-Daegu
@ko-Daegu Жыл бұрын
@@cethien I loooove writing Razor components 🤓

// MyComponent.razor
@using Microsoft.AspNetCore.Components

@Title
@Message

@code {
    [Parameter] public string Title { get; set; }
    [Parameter] public string Message { get; set; }
}

the fuck is this shit
@nelsonoussahsigha1300
@nelsonoussahsigha1300 Жыл бұрын
yes, he could've used workers to create threads for concurrent tasks; by using setTimeout you're still single-threaded, so all those setTimeouts will be queued inside the callback queue
@Overminddl1
@Overminddl1 Жыл бұрын
I'm also curious how OCaml's task library would go, as well as rust using a future joiner instead of full tasks just for curiosity, lol
@erickmoya1401
@erickmoya1401 Жыл бұрын
My wife says you yell too much. I tried to prove she is wrong. My argument didnt last a second.
@SharunKumar
@SharunKumar Жыл бұрын
I wanna see Nick Chapsas's reaction on this 🤣
@rahulagarwal968
@rahulagarwal968 11 ай бұрын
For building the backend for a Flutter application or any frontend, which server-side language would you prefer: Go or Node.js?
@OlegSkalozub
@OlegSkalozub 8 ай бұрын
Go
@sciencefirefly837
@sciencefirefly837 6 ай бұрын
Does it also not depend on the type of task which is executed? Usually, it should be some validations and a CRUD in DB.
@boredstudent9468
@boredstudent9468 Жыл бұрын
He said he launched 1 Task; as soon as you start one async task, C# (in .NET 6) already sets up all the thread pool stuff and access control. For such simple instances you should use threads in C#. Afaik it greatly improved with .NET 7. But in exchange you are prepared to scale incredibly; also, yeah, the .NET runtime does some incredibly smart magic in the background, e.g. have a look at LINQ performance in .NET 7.
@metaltyphoon
@metaltyphoon Жыл бұрын
CAS is not a thing anymore in dotnet core world.
@sgbench
@sgbench Жыл бұрын
@@metaltyphoon CAS?
@rroscop
@rroscop Жыл бұрын
Can you really run 1 million C# threads?
@boredstudent9468
@boredstudent9468 Жыл бұрын
@@rroscop on my hardware no problemo, remember that they are way more like go routines than like hardware threads, so only a dozen is actually working in parallel, the rest is just queued.
@rroscop
@rroscop Жыл бұрын
@@boredstudent9468 nice. Are you talking about System.Threading.Thread's? Or tasks run via Task.Run()? my understanding was that Task.Run() used a thread pool under the hood, but real Threads were more heavyweight. I'm not a C# developer though, just dabbled
@c4ashley
@c4ashley Жыл бұрын
The name is the C-sharpagen.
@mattymerr701
@mattymerr701 10 ай бұрын
C# uses loads of thread pools and I think the issue is they likely didnt trim the assemblies etc so it kept a bunch of unused crap
@paklenizmaj
@paklenizmaj Жыл бұрын
I believe that in the java example, the program will "block" on the first unfinished thread, and when that thread finishes and the dispatcher returns execution to the main thread, the for loop will "fly" to the next unfinished thread and then hand over execution to the next thread. As the dispatcher flags the thread when it is finished, the join method simply switches (do not block, just switch) the thread if the finished flag is false. So there is no penalty.
@RichardKures
@RichardKures Жыл бұрын
The code in java could be done much better:

try (var executor = Executors.newVirtualThreadPerTaskExecutor()) {
    for (int i = 0; i < 1_000_000; i++) {
        executor.submit(() -> {
            try {
                TimeUnit.SECONDS.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
    }
}
@paklenizmaj
@paklenizmaj Жыл бұрын
​@@RichardKures Thread pools are great if you don't need long running tasks, if you need long running sockets or drawing gui in a loop you need to use raw threads. It's not just for Java but for any language. Thread pools create a small number of threads and when a task completes, the new task merges with the previous one, so there is no execution on the new task until the first task completes. Thread pools are for (parallel) computation, not for long-running tasks.
@insylogo
@insylogo Жыл бұрын
AOT and tree shaking business has come a long way with c#. I would assume actual minimums an order of magnitude or less, but he did say default release configurations.
@vighnesh153
@vighnesh153 Жыл бұрын
More interested in seeing Nodejs 20 with worker threads as they claim that there is a lot of perf improvements in Node 20
@awilliamwest
@awilliamwest 9 ай бұрын
I'm sad for F#. Interesting to see PrimaGen and others re-excited about OCaml, and perhaps the 5.0 release is one reason, but I was an F# fanatic for several years, and just returned to F# for a recent small project. (I *try* to choose Rust for new projects, but got frustrated with Rust's lack of a REPL and wanted to use IonIDE in VS Code for my small project (involving parsing XLS and zips of text files); sometimes it's more about the tooling/IDE than it is the language...) C#'s good performance here makes me think F# might also perform equally well; .NET has continued to make impressive optimizations.
@indramal
@indramal 11 ай бұрын
So what is the final choice for high traffic? Is memory the only consideration? The number of concurrent connections also matters.
@quachhengtony7651
@quachhengtony7651 Жыл бұрын
Let's rewrite Elasticsearch, Kafka, and Cassandra in C# and get free performance
@reddragon2358
@reddragon2358 Жыл бұрын
Wohooo. Let's go
@jeremiahgavin9687
@jeremiahgavin9687 Жыл бұрын
Look up ScyllaDB as a Cassandra replacement. It's written in C++
@tecoberg
@tecoberg Ай бұрын
Where is C++?
@zolniu
@zolniu Ай бұрын
In C# when you use Tasks with async/await, the default implementation creates a state machine that uses the pre-existing thread pool to schedule execution of your tasks on the threads in the thread pool. Not only that, but it can even detect whether the task is small enough to be executed synchronously - in that case it won't even end up in the thread pool - it will just execute and return as a normal function call. To test how much memory threads consume in C#, you can't use Tasks with async/await - you have to use the Thread class directly - that way you circumvent all of the optimizations done in the runtime and in the Task scheduler.
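A hedged sketch of what that looks like: real Thread objects instead of tasks. The count here is deliberately far below a million, because that many OS threads will exhaust memory or hit OS limits on most machines.

using System;
using System.Collections.Generic;
using System.Threading;

class RawThreads
{
    static void Main()
    {
        const int count = 10_000; // deliberately far fewer than 1,000,000
        var threads = new List<Thread>(count);
        for (int i = 0; i < count; i++)
        {
            // maxStackSize is only a hint; the OS still reserves at least its own minimum.
            var t = new Thread(() => Thread.Sleep(TimeSpan.FromSeconds(10)), maxStackSize: 256 * 1024);
            t.Start();
            threads.Add(t);
        }
        foreach (var t in threads) t.Join();
        Console.WriteLine("done");
    }
}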
@bentels5340
@bentels5340 Жыл бұрын
Quick correction regarding the Java remark: virtual threads are not a preview in 21, they are done. What *is* a preview is structured concurrency, which handles thread-spawn and rejoin more elegantly.
@reddragon2358
@reddragon2358 Жыл бұрын
Let's go C#
@joejazdzewski
@joejazdzewski Жыл бұрын
Prime will now worship at the altar of Anders (creator of C# and Typescript) /s
@reddragon2358
@reddragon2358 Жыл бұрын
XD
@TheSwissGabber
@TheSwissGabber 8 ай бұрын
In Python there are asyncio, threading and multiprocessing, ordered according to their overhead. If you want to use multiple cores you need multiprocessing.
@nyahhbinghi
@nyahhbinghi 5 ай бұрын
If you are creating a new Elixir "process" per task it will scale up pretty linearly with the number of tasks, hence why it's high. High memory usage is not really a bad thing, per se. Likewise, the same goes for Go and goroutines, whereas other runtimes with a fixed threadpool, or Node.js with its single event loop, won't keep climbing linearly. I would be more interested in CPU usage. You're welcome for this insight! 🤜🤛
@pdgiddie
@pdgiddie 4 ай бұрын
This. The BEAM VM was designed to prioritise latency and predictable scalability. Copy-on-write and other memory consumption optimisations can produce latency spikes.
@remrevo3944
@remrevo3944 Жыл бұрын
12:30 By default tokio creates worker threads equal to the number of CPU cores. Though thinking about it, if you only use timers, having a single threaded runtime would likely be just as fast and more efficient.
@llothar68
@llothar68 Жыл бұрын
Not a good choice. You often have long running threads that also do block. In fact all the systems where the kernel is not controlling the worker threads sucks. This means: Linux,Android and the BSDs. The other systems have kernel driven thread pools for much better handling making sure that IO blocks don't prevent utilisation.
@remrevo3944
@remrevo3944 Жыл бұрын
​ @llothar68 I explicitly meant that for the case of using only timers, which are neither cpu intensive nor use blocking APIs. When using a async runtime like tokio you shouldn't use blocking APIs anyway and if you have to there is tokio::spawn_blocking, which spawns a thread/uses a thread pool.
@shayvt
@shayvt Жыл бұрын
C# Task is an abstraction using the threadpool. He should use the Thread class which instantiates a real thread.
@DarkOoze123
@DarkOoze123 Жыл бұрын
*managed thread
@LuaanTi
@LuaanTi Жыл бұрын
No, C# Task implies no threads whatsoever. It uses the thread pool by default for CPU work, yes, but that can easily be just the part of the job that says "this task is finished" (e.g. handling the async I/O response). Creating an explicit thread (_not_ a hardware thread, _not_ an OS thread - you don't have control over those natively in .NET) is something completely different, and very rarely used in modern C#. It negates the whole point of using asynchronous I/O in the first place, which is avoiding the overhead of threads that do nothing but wait for something to complete (whether that's a timer or a HTTP request). Which, let's not forget, was part of the point of the original article - showing how expensive "real" threads are, and that different approaches to handling asynchronous code have vastly different results. But that article is very flawed anyway. It would make sense to compare multi-threaded code with other ways of doing asynchronous I/O... but instead, we get an arbitrary choice of one or the other for each platform. You can have promises in any language. Many have commonly used or outright built-in APIs for that. Seeing the difference between, say, Java threads and Java Futures would be a bit illuminating, at least... though it still needs to be noted that you have a lot of control over things that absolutely crush this comparison anyway. The default stack size of a new thread on modern .NET is usually 1 MiB. Windows doesn't really allow you to go very small with thread stack sizes (you're supposed to use a few threads, not thousands). Linux is designed around multiple processes/threads using the same memory for as long as possible, so a thousand threads each with 1 MiB memory can actually occupy just a few megabytes (until you actually start to modify the memory). Every performance benchmarks needs to have a goal. This one doesn't really seem to have one, apart from a simplistic "weird that memory usage in async stuff can vary wildly"... I mean, pretty much every platform out there allows you to pre-allocate as much unused memory as you want, but it'd be a weird way to compare different platforms, right?
@ingenium1502
@ingenium1502 Жыл бұрын
Yes we would like to know about socket and tcp connection test. Thx for video😀
@wdavid3116
@wdavid3116 Жыл бұрын
I don't think the thread joins are actually an issue. All that is being measured is memory. The time cost would be real, but if you actually have to wait on all those threads, the order shouldn't be very meaningful, and to get any sort of speedup you'd need an OS that supports joining multiple threads at once, or you'd have to do something more elaborate to make use of some sort of multiple-message capability in the kernel (maybe something with epoll?). If you're waiting on thread 0 and thread 1 quits, you'll be sleeping in thread 0 while other threads use the CPU to finish; then once the thread you're joining on ends, you'll burn through the finished threads and repeat the sleep as needed. Syscalls are expensive, but not *that* expensive.
@_daniel.w
@_daniel.w Жыл бұрын
I'm curious about C, C++ & Zig. Also, I love Go. What happened, why did it end up using so much memory? Kinda sucks
@_daniel.w
@_daniel.w Жыл бұрын
@nósferratu Oh, alright. I was watching chat go by and someone mentioned Go is stack-based or something along those lines. Thanks for the info 👍
@hvaghani
@hvaghani Жыл бұрын
@nósferratu Right, I was going to comment the same and found this
@scotter7663
@scotter7663 Жыл бұрын
The C# implementation is completely bogus compared to the others. It's using a small thread pool (Task.Run) to set a bunch of timers (Task.Delay); that's why it shows low memory usage. This is not demonstrating concurrency. If the implementation did a Thread.Sleep or used real threads, the results would be completely different and probably worse than Java, since C# doesn't have virtual threads. In the real world, Go runtimes will have considerably less memory overhead than C# or Java.
@scotter7663
@scotter7663 Жыл бұрын
@_daniel.w Go's closest equivalent is a timer (time.After), which looks similar to what's used in the C# impl. Rework the Go implementation to use this and I suspect it will perform drastically better.
@tedchirvasiu
@tedchirvasiu Жыл бұрын
Is this the first time in history he turned off the notifications before starting the video?
@ThePrimeTimeagen
@ThePrimeTimeagen Жыл бұрын
Don't tell anyone...
@Zooiest
@Zooiest Жыл бұрын
Well, technically, JS structs can take up as few bytes as any other language, as long as you ignore the sizes of serialization/deserialization definitions and only care about the size of the ArrayBuffer you put data in
@RoccoWocco
@RoccoWocco Жыл бұрын
C# has Parallel.For and Parallel.ForEach for these types of scenarios. You can tell it the degree of parallelism and it'll just do it for you. In no scenario is the way shown in the article correct; that's an anti-pattern in 99% of cases. If you do want to do async in your parallel code, there are async versions of the parallel loops. You could also just manually make threads.
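For reference, a minimal sketch of the loops that comment is referring to, assuming .NET 6+ for Parallel.ForEachAsync; the iteration counts and the DoCpuWork placeholder are made up:

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

class ParallelLoopSketch
{
    static async Task Main()
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };

        // Synchronous, CPU-bound work: the runtime partitions the range across the pool.
        Parallel.For(0, 1_000, options, i => DoCpuWork(i));

        // Asynchronous work (.NET 6+): bounds how many awaits are in flight at once.
        await Parallel.ForEachAsync(Enumerable.Range(0, 1_000), options,
            async (i, ct) => await Task.Delay(10, ct));
    }

    static void DoCpuWork(int i)
    {
        // Placeholder for real per-item computation.
    }
}
```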
@maxharmony6994
@maxharmony6994 Жыл бұрын
Now imagine giving Tom a C#
@ringishpil
@ringishpil Жыл бұрын
Go's starting stack size is (I think) 2 KB per goroutine, and it grows/shrinks as needed. Not sure what's the minimum stack size. Therefore the ~2 GB in Go is not surprising. So in 3 GB of memory, you can put 1mil/10mil and probably even 20/30 million goroutines; they will just shrink in size. With the example from Piotr you can probably do even more, since they're very simple, non-memory-consuming routines. But as I said, not sure what's the minimum stack size that will be consumed by a goroutine - it's less than 4 KB for sure (in your example 2.8GB/1_000_000 = 2.8KB). My guess is that it's not shrinking below this since there is enough memory available. Anyway, you put it nicely: this is not a real-world test, a TCP/WebSocket connection would be much better.
@Rakstawr
@Rakstawr 8 ай бұрын
The Go test here was completely misrepresented by non-optimized garbage collection settings and by not profiling how much of that memory was already marked for collection.
@MikkoRantalainen
@MikkoRantalainen 5 күн бұрын
I agree that the problem was that the benchmark was "running concurrently" when every "thread" was sleeping. If every thread had acquired an accurate timestamp from the OS, formatted it into a string, appended it to a file as a single line, and repeated that 100x before exiting, we would have seen at least some real processing, and you could check the timestamps to see how concurrently things were happening. With a "task" that sleeps and exits, you basically have a queue of tasks that can be processed one after another. Zero parallelism.
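As a rough C# sketch of that suggested workload (the worker/round counts and file name are made up; this shows the shape, not the article's benchmark):

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

class TimestampWorkloadSketch
{
    static async Task Main()
    {
        const int Workers = 10_000;   // scale up/down to taste
        const int Rounds = 100;
        var gate = new object();
        using var log = new StreamWriter("timestamps.log");

        var tasks = new Task[Workers];
        for (int w = 0; w < Workers; w++)
        {
            int id = w;
            tasks[w] = Task.Run(() =>
            {
                for (int r = 0; r < Rounds; r++)
                {
                    // Real work per iteration: read the clock, format a line, append it.
                    string line = $"{id},{r},{DateTime.UtcNow:O}";
                    lock (gate) log.WriteLine(line);
                }
            });
        }

        await Task.WhenAll(tasks);
        Console.WriteLine("done - inspect timestamps.log to see how interleaved the work was");
    }
}
```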
@luvincste
@luvincste Жыл бұрын
Due to some bugs in services at work, they spawned threads and didn't dispose of them, and easily had 100_000 threads on an older Windows, like Server 2008. When it happened I had to restart it, though, or it didn't work well after.
@istovall2624
@istovall2624 Жыл бұрын
C# to the moon! Haven't finished yet. Drum roll.
@reddragon2358
@reddragon2358 Жыл бұрын
Let's go.
@urbanelemental3308
@urbanelemental3308 Жыл бұрын
Yeah, the C# example is not real threads. The code is just adding tasks to the scheduler, similar to "setTimeout" in JS. Which might be fine for most things, but each "Task" is taking up memory and then waiting to run. IMO, these tests are not good overall. I agree the Java one is probably not a good example either, with the synchronous join.
@metaltyphoon
@metaltyphoon Жыл бұрын
Dude… only the Java and Rust ones were real threads. All the others use a pool abstraction. I think Elixir uses actual processes.
@zephyrprime
@zephyrprime 21 күн бұрын
Not full threads, but not just tasks either. Tasks use a thread pool to manage execution, and the .NET runtime will decide how many threads are in that thread pool.
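If you want to see what the runtime decided, a tiny sketch (ThreadPool.ThreadCount requires .NET Core 3.0 or later; the numbers depend on your machine):

```csharp
using System;
using System.Threading;

class ThreadPoolInfoSketch
{
    static void Main()
    {
        ThreadPool.GetMinThreads(out int minWorkers, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorkers, out int maxIo);

        Console.WriteLine($"min worker / IO completion threads: {minWorkers} / {minIo}");
        Console.WriteLine($"max worker / IO completion threads: {maxWorkers} / {maxIo}");
        Console.WriteLine($"pool threads alive right now:       {ThreadPool.ThreadCount}");
    }
}
```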
@sikor02
@sikor02 9 ай бұрын
If C# has memory available, it will swallow a lot for optimizations. Once I experimented with Docker and performance-tested my simple API endpoint with Bombardier (a tool written in Go), bombarding it with thousands of requests. My app used 1.5 GB of RAM (!). But then I started limiting my container's available memory (the -m parameter), and guess what, I went down to 15 MB and it still worked. The Go equivalent required at least 16 MB to work. The C# API with so little memory available performed almost the same as when using 1.5 GB anyway. (The Go one was like 2% faster though, not gonna lie.)
@TertiumAverruncus
@TertiumAverruncus Ай бұрын
Virtual threads are backed by a fork-join pool and basically use asynchronous operations with event loops - but at the VM level. Sigh, not even close to using that right, since it's dependent on the default fork-join pool, which depends on your system's number of CPUs, etc.
@jimiscott
@jimiscott Жыл бұрын
The threads used in C# async/await come from a pool and are different from System.Threading.Thread.
@jimiscott
@jimiscott Жыл бұрын
I wouldn't be surprised if some of this were optimised away in release mode by Roslyn. The test case is not valid.
@baka_geddy
@baka_geddy Жыл бұрын
It specifically said there was no significant difference between debug and release...
@nocturne6320
@nocturne6320 Жыл бұрын
Idk how he compiled the C# program, but I rewrote his program line by line in .NET 6 and running it gave these results (checked with Process Hacker 2, as Task Manager doesn't report all memory):
- 1 task = ~7.3 MB
- 10000 tasks = ~13.5 MB
- 1000000 tasks = ~430 MB
- Compiled with .NET SDK 6.0.408
- CPU: AMD Ryzen 9 7900
- OS: Windows 10 build 19045
I assume that either C# cheats on Windows by having Windows preload the runtime into memory and then reuse it for all C# programs (and it simply doesn't report the memory consumed by the runtime), or some other shenanigans are going on in that article. I also tried building the app in "self-contained" mode, where it includes the whole runtime in the output, not requiring it to be installed, and the footprint didn't change.
@stefano_schmidt
@stefano_schmidt Жыл бұрын
you should try creating the actual threads, instead of re-using ThreadPool (Tasks) as shown in the article
@nocturne6320
@nocturne6320 Жыл бұрын
@stefano_schmidt I tried allocating actual threads with new Thread(() => { Thread.Sleep(10000); }) and that started using *a lot* of memory. A million threads took ~4 GB of memory, and shutting them down with Thread.Join took forever. But considering that Threads are really not recommended by anyone these days, they might be lacking optimizations over the years.
@Alguem387
@Alguem387 Жыл бұрын
You can try to compile AOT
@metaltyphoon
@metaltyphoon Жыл бұрын
@stefano_schmidt It's a useless test. The OS will just spend more time context-switching than doing real work (thrashing).
@LuaanTi
@LuaanTi Жыл бұрын
@stefano_schmidt The whole point of asynchronous I/O is to avoid wasting threads on things that do not need threads. If you're working with well-written modern .NET code, you don't really need more threads than you have logical CPU cores, so why pay the cost? If anything, this shows exactly one of the reasons why spinning up new threads for every task you want to do is painfully wasteful.

The article tries to compare different ways of handling asynchronous code. Threads are just one of those ways, and explicit threads should be really rare in any modern codebase. The article doesn't talk about creating a million threads - it talks about a million _asynchronous tasks_. It's the YT video that claims this is about a million threads, which is silly - there are very few platforms where the overhead from the language/runtime will be remotely comparable to the overhead of having a thread in the first place. The default stack size of a Windows thread is usually 1 or 4 MiB. It will never take less than 64 kiB (the allocation granularity). Now compare that to the ~230 B a C# task takes, or the ~600 B (in a pre-allocated structure of at least 2 kiB) of a goroutine.

When you change the code to create threads... your memory usage comes entirely from thread stacks. Which means... what exactly? We know threads are expensive, that's why we want to avoid them! :D That's where async comes from (mostly). The real failure of the article is that it doesn't even attempt to find async tasks in each of those platforms - though that isn't all that surprising given the code was written by GPT :D
@memespdf
@memespdf 11 ай бұрын
Ironically, I think it would make sense to start all programs by allocating a static 1 GB of memory and keeping it around until the end. This ensures that preallocation tricks can't skew the comparison.
@geraldmaale
@geraldmaale 11 ай бұрын
I am interested in finding out what tool this person used to measure the memory usage for the C# part, as these results appear to be questionable.
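The article doesn't say, but for anyone wanting to reproduce the numbers from inside the process, a sketch of the usual counters (note that working set, private bytes and GC heap will all disagree with each other, and with whatever external profiler the author used):

```csharp
using System;
using System.Diagnostics;

class MemoryProbeSketch
{
    static void Main()
    {
        // ... run the task-spawning benchmark here, then sample the counters ...

        using var proc = Process.GetCurrentProcess();
        proc.Refresh();

        Console.WriteLine($"working set:       {proc.WorkingSet64 / 1048576.0:F1} MiB");
        Console.WriteLine($"private bytes:     {proc.PrivateMemorySize64 / 1048576.0:F1} MiB");
        Console.WriteLine($"GC heap (approx.): {GC.GetTotalMemory(forceFullCollection: false) / 1048576.0:F1} MiB");
    }
}
```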
@kellybmackenzie
@kellybmackenzie Жыл бұрын
I would have loved to see Haskell tested like this, it'd be so good
@FinnBender
@FinnBender Жыл бұрын
It's surprisingly bad :(
1 thread: 5.0 MB
10 threads: 4.9 MB
100 threads: 4.9 MB
1k threads: 8.3 MB
10k threads: 63.1 MB
100k threads: 803.8 MB
@kellybmackenzie
@kellybmackenzie Жыл бұрын
@@FinnBender Aww man! Yeah, that makes sense, Haskell is infamous for its high memory consumption because of thunks and stuff like that. I'm surprised it's that bad for 100k though, damnnn!
@alxizr
@alxizr Жыл бұрын
The Node.js example is off the mark. You need to choose worker threads to stay in line with all of the other examples. The same goes for the Python asyncio example.
@everyhandletaken
@everyhandletaken Жыл бұрын
Agree
@HotakaPeter
@HotakaPeter 7 ай бұрын
Elixir/Erlang have a lot of services running by default. These can be optimised in the Erlang boot script.
@Hector-bj3ls
@Hector-bj3ls Ай бұрын
In Rust, the default stack size for an OS thread on all tier 1 platforms is 2 MB. Not sure if it's allocated up front, but that probably has something to do with where all the memory went.
@nacholopezosa
@nacholopezosa Жыл бұрын
☝🤓Buzz may be going from aleph-zero to aleph-one infinity. So to infinity and beyond could be correct
@TizzyD
@TizzyD Жыл бұрын
Maybe C# is doing something like Julia, that is, postponing execution until it actually needs to do something. Or maybe Roslyn has some under-the-covers optimizations. Any CLR experts care to comment?
@protox4
@protox4 Жыл бұрын
`Task.Run` uses the ThreadPool by default, which is very conservative when spinning up new threads. The benchmark would pretty much make the ThreadPool never spin up new threads since each task completes immediately. It waits a good long while before deciding it actually should spin up a new one, which is why you see the memory increase at 1 million.
@my_yt666
@my_yt666 Жыл бұрын
It creates n (depending on the CPU) managed threads for the default scheduler. If he wants to optimize for memory allocation, he should have used ValueTask and reduced the max managed threads of the default scheduler. But then again he should have measured threads instead of a higher level concept.
@mikestiver9000
@mikestiver9000 Жыл бұрын
Task.Run(() => {}); does not create a thread, but will instead schedule work on the thread pool. Task.Delay() halts execution, and 'await' returns the thread to the thread pool. The benchmark is pretty much useless for C#, since all you are doing is juggling the same handful of threads back and forth, starting a task and then doing no work until the delay is up and the Task is discarded. You don't need many threads when your Task doesn't actually do any computation or I/O work.
@TheTim466
@TheTim466 Жыл бұрын
@mikestiver9000 Wouldn't that be the same for any language with async capabilities?
@LuaanTi
@LuaanTi Жыл бұрын
@@TheTim466 It's true for any true asynchronous I/O. You can do it in Windows with a C program, no need for fancy async languages. I/O doesn't need threads, and `Task.Delay` is just I/O - you get a notification from the system timer at a given time in the future, then a threadpool thread is used to handle the continuation (which in this case essentially just signals that the task is completed). That's also why the C# version doesn't need much space for the tasks at all - just a few pointers, a cancellation token and a tiny state machine. It fits in a few dozen bytes on x64 per task. You could trim it even lower if you wanted.
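A small sketch of the point being made here - many Task.Delay waits in flight while the pool keeps only a handful of threads (the exact counts vary by machine and runtime version, and the task count is illustrative; ThreadPool.ThreadCount needs .NET Core 3.0+):

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class ManyDelaysSketch
{
    static async Task Main()
    {
        const int Count = 1_000_000;

        // Each Task.Delay is a timer registration plus a small state object;
        // no thread sleeps while it waits.
        var tasks = Enumerable.Range(0, Count)
                              .Select(_ => Task.Delay(TimeSpan.FromSeconds(10)))
                              .ToArray();

        Console.WriteLine($"pool threads while {Count:N0} delays are pending: {ThreadPool.ThreadCount}");
        await Task.WhenAll(tasks);
        Console.WriteLine($"pool threads after completion: {ThreadPool.ThreadCount}");
    }
}
```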
@thekwoka4707
@thekwoka4707 Жыл бұрын
Why were they using the newest Rust from last month and a Node.js from like 4 years ago? Like, AWS doesn't support the version they used, or the 3 major versions after it.
@Gennys
@Gennys 4 ай бұрын
It looks as though C# is creating a thread pool by default instead of actually launching threads.