You could even see the reflection of both of them in that monitor behind them!
@brod5156 жыл бұрын
@@TheRandomSpectator I mean that comes for free
@STRAIGHTBOXMUSIC6 жыл бұрын
@@TheRandomSpectator u noticed too
@dipi715 жыл бұрын
Best Computerphile video ever, I'd say. Congrats! But was it also the most difficult Computerphile video to produce?
@taylorfaucett82825 жыл бұрын
Who? You forgot to finish your sentence.
@Nayus6 жыл бұрын
When he handed the token to the clone I was really impressed. This was great
@OpenGL4ever4 жыл бұрын
The force is strong with this one.
@yato33353 жыл бұрын
Oh wow I didn't even notice that 🤣 It was so seamless, I forgot it was the same person
@BillBodkin6 жыл бұрын
this is next level editing
@atlantic_love2 жыл бұрын
What was the video about?
@bilalzahory77656 жыл бұрын
Just a reminder that the point of the video wasn't to optimize the code for finding the sum of all the numbers from 1 to 1,000,000, but to break down how multi-threading works and some solutions you would use for more common scenarios. Great video!
@nullvoid35453 жыл бұрын
thanks for this. My first thought was that making "a" into 2 variables and then adding the results would be faster at the expense of some memory. Am I right in assuming this?
@bernardoborges85982 жыл бұрын
There is a closed-form formula for adding numbers in order, so this is really just for demonstration purposes. Sum(1 to N) = N*(N+1)/2
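For anyone who wants to see the difference concretely, here is a minimal sketch in C (the video's language; the bound and variable names are just illustrative) comparing the closed form with the brute-force loop:

    #include <stdio.h>

    int main(void) {
        long long n = 1000000;
        long long closed = n * (n + 1) / 2;    /* Gauss: sum of 1..n */

        long long loop = 0;                    /* brute-force sum, like the video's loop */
        for (long long i = 1; i <= n; i++)
            loop += i;

        printf("closed form: %lld\n", closed); /* 500000500000 */
        printf("loop:        %lld\n", loop);   /* 500000500000 */
        return 0;
    }

Both print 500000500000; the closed form is constant-time, which is why the loop only makes sense as a demonstration of threading rather than as a way to compute the sum.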
@Lightbringer77342 жыл бұрын
Wouldn't that be optimized away by the compiler with optimization flags on anyways? You should just write your code to be readable, then only optimize if you're sure there's a bottleneck (by profiling)
@nickwilson3499 Жыл бұрын
@@Lightbringer7734 no one cares. This isn't a video about compilers. Sounds like you're scared of having to think for yourself
@Lightbringer7734 Жыл бұрын
@@nickwilson3499 wut?
@VredesbyrdNoir6 жыл бұрын
I get the impression the University of Nottingham once ordered 5000050000 boxes of that old style continuous feed printer paper back in the 80's and still don't know what to do with it all.
@kemfic6 жыл бұрын
VredesbyrdNoir they feed it to the grad students
@pewp436 жыл бұрын
I wouldn't recommend using their washrooms.
@kemfic6 жыл бұрын
@@pewp43 new channel: physicsphile - where physicists solve problems with single-ply toilet paper
@derpmarine2166 жыл бұрын
you literally get it in giant crates for cheap.
@MrLampbus5 жыл бұрын
Although they did get an Acorn A5000 out of the deal (lowest machine in the stack on the desk behind). I had (have in store still) one - great ARM machine but a bit light on RAM.
@gracejenna97872 жыл бұрын
My husband works from home. He is a very intelligent programmer, and he talks to me every day. This stuff is one of the things he talks about. I am just drinking coffee while listening to him, with no idea what he is talking about every single day. Thanks for the helpful video. I will watch this every day.
@merlingallagher44846 жыл бұрын
What still amazes me. This is "just" an educational tech related channel, where a professor/teacher is explaining stuff about computers and IT. Yet it has 1.2 MILLION subscribers and around 20k views in a few hours. Great job guys, you show that learning (tech) stuff is really enjoyable! In a way that apparently a lot of people can understand! Thumbs up!
@menachemsalomon6 жыл бұрын
A good Part 2 would be explaining how the mutex works, because implementing the mutex also runs into the read-modify-write problem. Kudos for the visual effect of two Steves, especially handing the disk back and forth. Steve gets credit for his acting skills, and bonus points for having a 3 1/2" floppy disk just lying around on the desk.
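In the meantime, a minimal sketch of the idea (not how pthreads actually implements its mutex, which puts waiters to sleep via the kernel rather than spinning): a lock ultimately bottoms out in a hardware atomic read-modify-write, here C11's atomic_flag test-and-set:

    #include <stdatomic.h>

    /* usage: static spinlock_t lock = SPINLOCK_INIT; then spin_lock(&lock); ... spin_unlock(&lock); */
    typedef struct { atomic_flag locked; } spinlock_t;
    #define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

    static void spin_lock(spinlock_t *l) {
        /* test_and_set atomically sets the flag and returns its previous value,
           so only the thread that flipped it from clear to set leaves the loop */
        while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
            ;  /* busy-wait until the current holder clears it */
    }

    static void spin_unlock(spinlock_t *l) {
        atomic_flag_clear_explicit(&l->locked, memory_order_release);
    }

Real mutexes add fairness and put waiting threads to sleep (on Linux via futex) instead of burning CPU in that loop, but the atomic read-modify-write at the core is the same trick.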
@UweEichel6 жыл бұрын
That was some great editing! Good job and great video!
@lawrencedoliveiro91046 жыл бұрын
Not just editing, but also choreography of Steve’s movements.
@aylazer2310 ай бұрын
Wow, finally after a while I understood how threads and mutexes work. The visuals are great
@B20C03 жыл бұрын
The first time I felt smart watching a Computerphile video. I saw the code and the fact that both threads use the same variable and thought "That's not gonna work as intended.".
@slpk6 жыл бұрын
The floppy disk transition was pretty clever
@simontay48516 жыл бұрын
Yes, I want to know how that was done. Steve 1 seamlessly passed it to Steve 2 as if he was real.
@ais41856 жыл бұрын
@@simontay4851 Are you trying to say he isn't real?
@snooks56075 жыл бұрын
Never done any video CGI, but if I had to guess: someone was sitting there accepting the disk initially, then they switched places and had a real-time split-screen of the previous footage running on a screen so they could verify the disk aligned on the second pass.
@carlosmspk4 жыл бұрын
@@snooks5607 Still, how did he deliver the disk at the precise height each time? You'd expect some difference in height that would make the floppy disk "snap" from one position to another (and there is, but very, very little, which is impressive).
@nightshadefns3 жыл бұрын
@@carlosmspk Any imperfections in position are masked by the outline
@MirkWoot5 жыл бұрын
I want 8x Steve solving all my tech problems and projects.
@Gnarrrl6 жыл бұрын
Having the multiple Steves talking over each other and interfering with each others task to explain what happens in multithreaded tasks was quite brilliant!
@nazoj64623 жыл бұрын
Steve literally divided himself like a cell just for this example. That's some true dedication
@gravity631611 ай бұрын
steve just forked himself
@snooks56075 жыл бұрын
3:00 For completeness, there's also a thing called "green threads", where the OS is cut out of the resource management and it's all done by the user-space application. Reasons for this vary, but for one it might avoid context switching in compute-intensive cases.
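A minimal sketch of the idea in C on Linux, using the (obsolescent but still available) ucontext API to hand control between a "green" task and main without involving the kernel scheduler; real green-thread runtimes build a proper scheduler on top of this kind of primitive:

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, green_ctx;
    static char stack[64 * 1024];             /* the green thread's private stack */

    static void green_task(void) {
        printf("green thread: step 1\n");
        swapcontext(&green_ctx, &main_ctx);   /* cooperatively yield back to main */
        printf("green thread: step 2\n");
        /* returning resumes uc_link, i.e. main_ctx */
    }

    int main(void) {
        getcontext(&green_ctx);
        green_ctx.uc_stack.ss_sp = stack;
        green_ctx.uc_stack.ss_size = sizeof stack;
        green_ctx.uc_link = &main_ctx;        /* where control goes when the task returns */
        makecontext(&green_ctx, green_task, 0);

        printf("main: starting green thread\n");
        swapcontext(&main_ctx, &green_ctx);   /* run the task until it yields */
        printf("main: green thread yielded, resuming it\n");
        swapcontext(&main_ctx, &green_ctx);   /* run it to completion */
        printf("main: done\n");
        return 0;
    }

No pthreads, no kernel threads: the switches happen entirely in user space, which is the whole point of green threads.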
@Peds0136 жыл бұрын
Insta-liked for multiple Steves :-)
@maxtaylor35316 жыл бұрын
Bagley is exactly like this in lectures. The only difference is he has awesome, black, heavily animated keynotes.
@recreant3596 жыл бұрын
Max Taylor I woulda stayed in college if these guys were my professors.
@MatthewTaylor866 жыл бұрын
That sounds amazing
@maxtaylor35316 жыл бұрын
He’s honestly a legend. Gutted I haven’t got any modules with him next semester.
@kemfic6 жыл бұрын
The numberphile and computerphile people have seemingly endless amounts of feed printer paper and butcher paper for some reason
@gmc97533 жыл бұрын
Maybe back in the 1980's, some company offered the university a great deal for a huge number of boxes of green bar paper.
@ko.pi.pe.6 жыл бұрын
The `time` process is not well equipped at all for measuring performance at the microsecond level, as you found. Many things skew the results, as a whole process needs to be spawned and torn down, and this stuff is included in the total; context switching also will bite you. I only use `time` for benchmarking when the benchmarked process is expected to run for longer than one half second. Also, remember that there's a Bash built-in also named `time`, which is even worse than /usr/bin/time. `gprof` or `perf stat` would have been a much better choice.
@sugarbooty6 жыл бұрын
Would making the set of numbers being added together larger to make it last longer increase the accuracy of the time process?
@snooks56075 жыл бұрын
@@sugarbooty sure, you could also measure the difference with a physical stopwatch but why make things hard for yourself. right tool for the right job.
4 жыл бұрын
Do we not have access to a low level high res timer?
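On Linux there is one: clock_gettime with CLOCK_MONOTONIC gives nanosecond-resolution timestamps, so you can time just the loop instead of the whole process. A minimal sketch (loop bound and names are illustrative; you would still want to repeat the measurement many times for stable numbers):

    #include <stdio.h>
    #include <time.h>

    int main(void) {
        struct timespec t0, t1;
        volatile long long a = 0;              /* volatile so the loop isn't optimized away */

        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (long long i = 0; i <= 1000000; i++)
            a += i;
        clock_gettime(CLOCK_MONOTONIC, &t1);

        double ns = (t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec);
        printf("sum = %lld, took %.0f ns\n", a, ns);
        return 0;
    }

Compile with cc timing.c (older glibc may need -lrt). This measures only the region you care about, which sidesteps the process start-up and tear-down costs that `time` includes.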
@billykotsos46426 жыл бұрын
Merry Christmas Computerphile!!!!
@nem646 жыл бұрын
I've always had love for parallelism and multi threading. This video explained it even more. Thank you!
@TheSiddh1236 жыл бұрын
The method of explaining how a thread works and how we can use mutex for the solution was exemplary. Great editing.
@ShujathHussain06 жыл бұрын
The explanation and visual presentation are just out of this world. The kind of teaching every CS student wishes for
@jangambler99986 жыл бұрын
Not sure if this is an actual educational video or just the editor and Steve flexing on their editing and acting skills respectively.
@oysteinsoreide43235 жыл бұрын
I found the content clear, but I have already been dealing with locks and programs not always giving the same result. And of course the video production itself can be educational at the same time. Win-win situation.
@BassHelal6 жыл бұрын
Learned more in 15 minutes than I did in my entire university Concurrency module. Awesome stuff as always :)
@RoccoBruehl6 жыл бұрын
nice profile pic ^^
@TheUglyGnome6 жыл бұрын
You went to Patriot University?
@BassHelal6 жыл бұрын
@Zero Cool Of course I was exaggerating. The course was bad and the lecturer didn't care, so I had to take things into my own hands, but yes, they did somewhat go over this stuff in that course, just extremely poorly
@Toopa886 жыл бұрын
@Zero Cool my professor in databases was from Russia and couldn't speak English or the language spoken in my country. I don't know how he hasn't been fired yet, but I assure you that you'd learn more in 1 hour of research than in the entire course.
@vitfirringur6 жыл бұрын
This is your problem. You're in a university. You're not supposed to be spoonfed by the professor. You're supposed to learn to study on your own. If you expect everything to be spoonfed to you, you'll have a hard time finishing your education.
@TheChondriac6 жыл бұрын
This is one of the best videos posted to this channel in a while. Really really great upload! I like that we actually got to see some code. I'm finishing my second year toward my degree in computer science and this is right at the level that's great for me. Cheers!
@mikumikuareka4 жыл бұрын
Oh, what a great video. I wish I had been introduced to it when I was learning about multithreading, because it makes everything look so easy despite the fact that I learned all the things in this video the hard way.
@sethnuzum6 жыл бұрын
One of the highest value channels on the platform, thanks so much for the fantastic content!!
@poke_champ6 жыл бұрын
Needed this months ago for my exams, but it was a great reminder! Your videos always help me and other programmers understand the concepts visually and practically!
@philippetrov48816 жыл бұрын
My vision for education in programming is that the students should learn multithreading right from the start - in the introduction course. That way we can form a "divide and conquer" approach to solving problems from the beginning and not struggle to fight their "linearity" in thinking afterwards. The problem is that most of the people tend to use the methods of programming they know best, they know best whatever they used most and they used most whatever they learned first. So when we introduce parallel programming say in third or fourth course in the Bachelors degree it's already too late. Nowadays the mainstream programming is over multicore computers and parallel thinking in problem solving is a must.
@mrrdelorenzi84786 жыл бұрын
Scratch has multi-threading (it is taught in primary schools).
@theblackwidower6 жыл бұрын
I spent three years studying programming in college, we never once covered multi-threading. Really annoying.
@justgame55086 жыл бұрын
T Duke Perry I stumbled upon multithreading myself during my final college assignment (not university). It was a cloud storage app programmed in Windows Forms (C#). I had to use multithreading for the networking aspect, i.e. keeping the user interface responsive while downloading/uploading files, and on the server allowing multiple clients to access data simultaneously. It was never formally taught, however
@philippetrov48816 жыл бұрын
@@theblackwidower that's very sad to hear...
@philippetrov48816 жыл бұрын
@@justgame5508 what I do in my courses is that we start directly from GUI programming (Java Swing in NetBeans). That way we directly hit the responsive interface issue in the second lecture. We usually make some simple game - like a bad guy chasing a hero around the board so the hero must be in one thread controlled by the human and the bad guy in another. We then follow the path down towards low level programming.
@patrickgoetz2 жыл бұрын
The best explanation of threads that I've ever seen -- great job! Oh, and of course brilliant video production.
@jayyyzeee64094 жыл бұрын
Gauss' shortcut: The sum of positive integers from 1 to x, where x is even is equal to: (1+x) + (2+[x-1]) + (3+[x-2]) + ... (x/2+[x/2+1]) = (1+x) + (1+x) + (1+x)..., where there are x/2 terms, or (1+x)*(x/2). The example in the famous story about Gauss was finding the sum from 1 to 100 which is equal to (101)*50 = 5050. In this case, x=1,000,000, x/2 = 500,000 => The sum is (1+1,000,000)*(500,000) = 500,000,500,000.
@AbdulFourteia6 жыл бұрын
Hats off to the editor and the presenter, and one more time to the editor.
@TruthNerds5 жыл бұрын
1:46 (about processes) "If you changed one memory location, it updated in the other side […]" - *NO! THAT IS WRONG!!!* The defining feature of processes was memory isolation. No changing memory in other processes unless both processes asked for it (look up shared memory). It is true that the process' memory is not necessarily copied during the fork(). The CPU's support for virtual memory allows using page faults and what is known as copy-on-write, or COW, for short. This means both processes continue to use the same memory, but set as read-only in the page tables managed by the OS, until one of them tries to write to that memory. This causes a page fault by the CPU (writing into a read-only page), but this is transparently handled by the OS which allocates a new (physical) page, copies the data and attaches it to the writing process as read/write, then the "offending" process is resumed. But this is an optimization by the OS and does not change the concept of memory isolation. I got worked up over this, sure, people make mistakes, but memory isolation versus common address space… that is basically the *defining* difference between processes and threads. E.g. here is an excerpt from the documentation of the GNU/Linux function clone(): Unlike fork(2), clone() allows the child process to share parts of its execution context with the calling process, such as the virtual address space, the table of file descriptors, and the table of signal handlers. (Note that on this manual page, "calling process" normally corresponds to "parent process". But see the description of CLONE_PARENT below.) One use of clone() is to implement threads: multiple flows of control in a program that run concurrently in a shared address space. Note how they *define* threads as "multiple flows of control … *in a shared address space* ."? I'll admit that this is tricky stuff, e.g. the distinction between threads sharing the table of file descriptors whereas fork() effectively copies this table. This means that, while after both a fork() or a clone(), both processes will have the same files opened with the same file descriptors (these are integer IDs usually), *changes* to the FDT will only be seen by a clone, not by a forked process - e.g. if one forked process closes a file descriptor, it will yet remain open in the other. TL;DR: Your explanation of the difference was like saying "a bike is different from a trike in that it has three wheels", which is horrible, which in turn should explain my reaction.
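A tiny experiment that makes the isolation visible (a sketch, not from the video): the child writes to a variable after fork(), and the parent still sees the old value, because copy-on-write gives the child its own page the moment it writes:

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void) {
        int x = 1;
        pid_t pid = fork();

        if (pid == 0) {                      /* child: gets its own (copy-on-write) copy of x */
            x = 42;
            printf("child:  x = %d\n", x);   /* 42 */
            return 0;
        }
        wait(NULL);                          /* parent waits for the child to finish */
        printf("parent: x = %d\n", x);       /* still 1: the child's write was not shared */
        return 0;
    }

With threads the equivalent write would be visible to both, which is exactly the distinction the comment above is drawing.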
@toxicdesire88115 жыл бұрын
10:57 I need a 10 hour version of this.
@vovakarma78856 жыл бұрын
Great to see Steve programming! Very educational to see how he organizes the code and writes it!
@kisame_53315 жыл бұрын
Best 15 minutes I've ever spent today
@PhilipBlignaut6 жыл бұрын
One of the best multithread examples ever! Well done!!
@pmiron4 жыл бұрын
Everyone learning programming should look at your video. Great explanations as usual.
@OzHarrrry6 жыл бұрын
The production quality is Insaneeeeee
@nitishgupta68025 жыл бұрын
Find me a better channel on YouTube, I'll wait. Every time I come to Computerphile, it blows my mind open.
@MihaiStoian16 жыл бұрын
This is so educational, please do more videos with multithreading subject! ❤️❤️❤️
@RodrigoVzq6 жыл бұрын
This video was just brilliant. Great production
@ianbryant3 жыл бұрын
Passing the token between threads effectively makes it a single-threaded execution, because only one thread can work on the problem at a time. The efficient version of the problem presented in this video (adding numbers up to 100000) using two pthreads is to make each pthread responsible for adding up one half of the numbers up to 100000, and make it so each pthread is modifying a separate integer variable. Then, after pthread_join'ing the two threads, add the two result integers together to get the final answer. So, in summary, thread1 handles adding 0 through 50000 to variable a, while thread2 handles adding 50001 through 100000 to variable b, then after joining the two threads, the final answer is a+b.
@Keithfert4903 жыл бұрын
Congratulations, you summarized the video.
@r0b3rtdq3 жыл бұрын
@@Keithfert490 if you use answer = a + b then you don't have to use mutex_lock anymore, because every thread uses its own variable. Is the use of mutex_lock better because of the memory allocation?
@OrbitalCookie3 жыл бұрын
@@r0b3rtdq A general rule of thumb is to synchronize all shared variables, even if the possibility of threads modifying them in the wrong order seems unlikely. The final value "a" is updated from multiple threads once, yes, and it is unlikely that threads finish execution at the exact same time to update the final variable. However, what is unlikely on one machine can become very likely when distributed to 10000 users and used multiple times per day. If this bug caused crashes or data corruption it would be very hard to find and diagnose.
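For reference, a minimal sketch of the per-thread partial-sum approach described at the top of this thread (in C with pthreads; the ranges and names are illustrative, not taken from the video). Each thread writes only to its own slot, so there is nothing to lock:

    #include <pthread.h>
    #include <stdio.h>

    struct range { long long lo, hi, sum; };    /* each thread gets its own slot */

    static void *sum_range(void *arg) {
        struct range *r = arg;
        for (long long i = r->lo; i <= r->hi; i++)
            r->sum += i;                        /* private to this thread: no race */
        return NULL;
    }

    int main(void) {
        struct range halves[2] = { {1, 500000, 0}, {500001, 1000000, 0} };
        pthread_t t[2];

        for (int i = 0; i < 2; i++)
            pthread_create(&t[i], NULL, sum_range, &halves[i]);
        for (int i = 0; i < 2; i++)
            pthread_join(t[i], NULL);

        printf("total = %lld\n", halves[0].sum + halves[1].sum);  /* combine after join */
        return 0;
    }

Build with something like cc -pthread sum.c. One practical wrinkle: the two slots sit next to each other in memory, so padding each to its own cache line avoids false sharing if you care about the last bit of speed.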
@TheSulross4 жыл бұрын
Even though C++ (since C++11) has added threading to its standard library, in a lot of situations I'm still preferring to design around forking, where a parent process supervises child processes to do the real work. With copy-on-write optimization of forking, the cost is the page mapping of the process address space having to change upon context switching per kernel scheduling. But the forked child process has a private heap that doesn't have contention with other threads for access.

In the age of big data processing, this tends to be more crucial - plus it's a more fault tolerant design approach for implementing self-healing software services that need to run robustly 24/7 (child processes are vastly easier to scavenge successfully than a failing thread of execution, which can tend to destabilize the entire program).

I prefer this so much to multi-threading that I wrote a custom Java launcher called Spartan that enables (easily) breaking my Java programs into child processes - just as with C and C++, I can write a single Java program that splits itself into a parent and (one or more) child processes for logic execution.

And the Java heap gets problematic (perf-wise) when dealing with more than 4 to 6 GB of memory objects - it's really better to run concurrent, heavy data-centric processing in multiple child processes instead of one Java process using threading in a single heap that starts going well beyond 6 GB of memory utilization. Oh, and here's another optimization of this approach - the Java child process just does a System.exit() when it's done so that there's no attempt to garbage collect multi GB of objects - in fact with Spartan, it's feasible to launch the Java child process such that garbage collection is turned off - just allocate memory until done but never bother to reclaim it - the OS does that when the process exits.

In Golang, it's a different story - you pretty much have to go with the model that's baked into the language. It's lightweight compared to other language threading models, and garbage collection these days now works much, much better.
@Wolves23146 жыл бұрын
This was a great video. The visualizations were awesome and I love seeing actual code.
@Bowhdiddley6 жыл бұрын
The presentation and editing in this video is the best I've found on KZbin when it comes to explaining and demonstrating multithreaded execution. Amazing job Dr. Bagleys and Computerphile. I'm a computer engineering student and one of the concerns I had when learning about and experimenting with multithreaded programming was the _real_ performance gain. As was shown in the video, the summation example didn't benefit a ton from being multithreaded. I assumed that a lock like a mutex creates a major bottleneck in your code that really makes it no faster than the single-threaded, sequential case. Moving the lock to the end of the function to minimize its usage made a lot of sense and, at scale, I could definitely see how multithreading would be the way to go!
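For anyone who wants to see that "lock once at the end" structure spelled out, here is a minimal sketch in C with pthreads (the ranges and names are illustrative, not the video's exact listing): each thread accumulates into a private local, and the mutex is taken exactly once per thread to fold the local result into the shared total:

    #include <pthread.h>
    #include <stdio.h>

    static long long total = 0;
    static pthread_mutex_t total_lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg) {
        long long *bounds = arg;              /* bounds[0] = first number, bounds[1] = last */
        long long local = 0;

        for (long long i = bounds[0]; i <= bounds[1]; i++)
            local += i;                       /* no lock needed: local is private */

        pthread_mutex_lock(&total_lock);      /* the contended section is now tiny */
        total += local;
        pthread_mutex_unlock(&total_lock);
        return NULL;
    }

    int main(void) {
        long long a[2] = {1, 500000}, b[2] = {500001, 1000000};
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, a);
        pthread_create(&t2, NULL, worker, b);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("total = %lld\n", total);      /* 500000500000 */
        return 0;
    }

Because the lock is acquired only twice in total, its cost stops mattering and the speedup comes from the two loops genuinely running in parallel.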
@jesseokeya6 жыл бұрын
Getting more in-depth knowledge on parallelism and multi threading. Thanks!
@bobplested556 жыл бұрын
Please do a video on how you had two Steves, apparently motionless wrt each other, while using a hand held camera, and interacting with each other. That was too cool for School.
@MrSonny61555 жыл бұрын
This summarizes the difficulties in group work. All parties are working on the same thing, slowing down efficiency due to mutex locking. Of course, that assumes that there is even a mutex, because most of the time it's just a single guy doing all the work, with the other parties blackmailing that thread to credit them equally.
@Cybeonix6 жыл бұрын
Steve-P-U Huh? Sounds like a RISC-y piece of hardware ;)
@NeilRoy6 жыл бұрын
Very nicely done! We can never have enough Dr Steve Bagleys. ;)
@nickbarton31916 жыл бұрын
When I was first at work 40 years ago, we used to call operations that interrupts or other threads can't interfere with "atomic operations". These days nobody has a clue what I'm talking about.
@X_Baron6 жыл бұрын
Atomic operations are used in real-time computation, when you can't afford mutexes slowing things down or making response times unpredictable, and to prevent priority inversions. There is std::atomic now, and std::memory_order. Of the PC applications that people use daily, games are the most performance-critical and hence require these kind of features.
@eotsdev5 жыл бұрын
love the two Steves!
@fomx27535 жыл бұрын
Anybody else impressed by Brady's guess of 500 billion? .. 500 000 000 000 ~= 500 000 500 000. That's like 0.0001% error
@xSCOOTERx26 жыл бұрын
Steve and other Steve complement each other very well.
@robertboran62345 жыл бұрын
Nice explanation. We mastered sequential processing a long time ago. We have very good mathematical models of sequential processing, but even now in 2019 we lack a good mathematical model of concurrent processing. Having a huge complex program and trying to split it into multiple threads is a nightmare. I think because we have only two hands our minds are not good at concurrent thinking.

I would really like to see a programming language where the source code file has "normal code - handled by programmers" and on top of it there is "synthetic code - handled by AI"; the "synthetic code" is invisible to the programmer and it is in the same source code file. So the human job is to write the code and the AI job is to split the same code automatically, and as humans we don't need to know that the code was run in a complex concurrent fashion. This programming language could have multiple layers (Human layer + many AI layers). It is quite difficult to make sequential-looking code (text) act concurrently (it just makes the code look quite complicated).
@LuLeBe Жыл бұрын
the two hands thing doesn't even hold up. We actually can't do more than a single thing consciously. Everything else consists of routines we learned and that only need our attention when they're being stopped, started, modified etc. Take walking down the street while talking on a phone. We can walk without consciously thinking about it, we are decent at observing our surroundings without it and we can produce speech without it. We only need to think about where we want to go and the content of our speech, all the muscular functions are completely automatic. Once you think "a quick brown fox" you don't need to actively think about how to produce that sound. And when learning to play an instrument, for example, you can quickly see how we are completely unable to coordinate both hands as long as we have to actively think about their movement.
@JasonL2206 жыл бұрын
Very well explained! (I know this is an aside... but) DON'T use mutexes in production code without thinking about whether they are the best solution. They scale very badly (as the number of threads increases, and as the amount of code the lock is held over grows), and if you are not careful they can deadlock your process (i.e. every thread is waiting for another thread to relinquish the lock). In this example C11/C++11 atomics should be used instead of the lock to add the final results; then there is no overhead from the context switch of the mutex system call, and deadlock is never an issue.
@mrrdelorenzi84786 жыл бұрын
Yes, a mutex is a way to use threads, to create single threaded code.
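To make the atomics suggestion above concrete, a minimal sketch in C11 (names and ranges are illustrative): each thread still accumulates privately, but the final read-modify-write on the shared total is a single lock-free atomic_fetch_add instead of a mutex-protected add:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdio.h>

    static _Atomic long long total = 0;

    static void *worker(void *arg) {
        long long *bounds = arg;            /* bounds[0] = first number, bounds[1] = last */
        long long local = 0;
        for (long long i = bounds[0]; i <= bounds[1]; i++)
            local += i;
        atomic_fetch_add(&total, local);    /* atomic read-modify-write of the shared total */
        return NULL;
    }

    int main(void) {
        long long a[2] = {1, 500000}, b[2] = {500001, 1000000};
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, a);
        pthread_create(&t2, NULL, worker, b);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("total = %lld\n", atomic_load(&total));
        return 0;
    }

Whether the fetch-add is truly lock-free for a 64-bit value depends on the target, but on mainstream 64-bit platforms it compiles to a single locked instruction, so there is no mutex and no possibility of deadlock.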
@trestenpool90453 жыл бұрын
One of my favorite vids by yall guys. Thank you !
@artit915 жыл бұрын
It was the optimiser. It sees you are adding a fixed amount of numbers to a variable so it converts to a + sum(stuff) form and puts the mutex around it auto-magically.
@longlostwraith51065 жыл бұрын
Wait... there's something very wrong here. Since the variable is volatile, the compiler must (if it wants to adhere to the volatile flag) add the local variable into the global memory address directly, and in x86 most operations that alter memory lock that memory address until they're done. Why isn't gcc doing it here? Does it have to be compiled with optimizations turned on?
What gcc's probably doing:
    mov eax, [i]
    cdqe
    mov rdx, [a]
    add rdx, rax
    mov [a], rdx
While it should be doing:
    mov eax, [i]
    cdqe
    add [a], rax
@abdalhadifitouri1315 жыл бұрын
I like this guy's videos because they're a little more advanced.
@RemcoStoutjesdijk6 жыл бұрын
Mutexes are for non-parallelizable problems. A large sum is easily broken into a few subsums, and that would teach the viewer more about design vs troubleshooting.
@dhkatz_5 жыл бұрын
I think this is obvious if you think about it. Yeah, this is going to be slower than a normal loop because you're essentially still only adding one at a time. It's just a very easy example.
@Keithfert4903 жыл бұрын
Did you watch the video? That's what he goes on to do...
@Guitarm4n996 жыл бұрын
Awesome special effects, and really good topic!
@DiddlySquids6 жыл бұрын
I did a project rendering 2D fractals using multithreading. This video would have been SO helpful. I'm gonna share it with everyone who hasn't done the project yet.
@adilkhatri74753 жыл бұрын
Great video!! He explained the following things:
- the race condition problem
- solving the race condition problem using a mutex
- the code inside lock and unlock is called the critical section
I have read so much about this but never seen a program which explains all these concepts!! AMAZING VIDEO!!!!!
@simontay48516 жыл бұрын
How the hell was there a second identical Steve in the video at the same time, and how did Steve 1 on the right seamlessly pass a real floppy disk to Steve 2 on the left as if he was there? That is some very clever filming and editing by Sean Riley. Sean, if you read this, please reply.
@Computerphile6 жыл бұрын
Hah, it took five times longer to edit than any other, may have to do a behind the scenes on how.... Sean
@bsharpmajorscale5 жыл бұрын
It must be the work of an enemy stand
@sqyttles4 жыл бұрын
Visuals were sick!
@Shiunbird3 жыл бұрын
Very helpful. I'm proficient in shell scripting and powershell overall for my work life, and recently started learning C. This helped me sort out a problem with one of my projects, so thank you for that. =)
@hrnekbezucha6 жыл бұрын
Great job! I can see somebody had a bit of fun with this one
@stonent5 жыл бұрын
To me it would seem the most efficient way to deal with this problem is to have one thread do A and one thread do B, and when they both finish, A += B, so no locking is required.
@ardiansyah67946 ай бұрын
You could build a university with this content quality
@x3ICEx6 жыл бұрын
Not much faster? .06 / .07 = .8, that's like 20% improvement... Anyway, as a rule of thumb: turn off optimizations when benchmarking intentionally bad code... Also, a much more trivial solution (sum the first half into A, the second half into *new variable* B and add them together in the end) distracted me from the actual point of threading, mutex, etc.
@helsontaveras79615 жыл бұрын
20% improvement sure, but he only ran the program a handful of times. we don't know if that result is statistically significant. i think he was looking for a more dramatic improvement right off the bat.
@funprog5 жыл бұрын
Some compilers might optimize the sum with the Gauss formula for sums, S = (n/2)(a1 + an), so there would be no difference in speed, or precompute the result at compile time! One has to be aware of these optimizations before drawing conclusions about speed etc.
@dewsjievpdav65575 жыл бұрын
I was thinking the same idea about having 2 separate variables and then adding them together, but you can only do that if you are running some sort of sum calculation; you can't apply that same method to more complicated code that isn't just a calculation of a sum
@mrbane20005 жыл бұрын
I just had the thought that god uses multiple threads for generation learning, which we call alternate universes, and this video pops up. Nice one, YouTube, nice one. And this multi-Steve nailed it
@andrez765 жыл бұрын
Great video as usual, though this one gets a prize for cool editing effects. I was wondering; how about using two separate accumulators, one for each thread, and adding them together after the join?
@emkaix936 жыл бұрын
I really enjoy Dr. Bagley's explanations. Keep up the good work
@Finsternis..2 жыл бұрын
14:15 I feel like this point isn't stressed enough. It is the very reason multi-threading works and is efficient. Because the lock is needed only once, which itself helps performance, this change makes the thread+lock concept usable in the first place. Otherwise you'd only have hard-to-debug code that is at best as efficient as a single-threaded application.
@drawapretzel60036 жыл бұрын
Wouldn't the fastest and easiest way to write this program and utilize two threads be to have each thread write to its own variable: thread "A" gets variable "a" and thread "B" gets variable "b", so they can run entirely independently? Then at the end, set another value to true or something to show that you are finished running (if there's an easier way to show it is finished, which I'm sure there is, substitute that instead), and after both are finished, just add a and b together into c. That way you would always be 100% accurate, because you avoid the memory overwrite problem by storing each thread's variables in its own memory. I know that means it would eat more memory, but you could divide the process up into as many threads as your processor would allow: 4, 6, 8, whatever, each adding up 1/4th, or 1/6th, or 1/8th of the million digits, then at the end add up each finished value into one value, and perform the task 4x or 6x or 8x faster than just single-threading. In fact, isn't that the whole point of parallel processing? That it can take more RAM because each thread needs its own memory, but it accomplishes the task x times faster, where x is the number of threads?
@DeusExAstra6 жыл бұрын
Yes, you could do that in this example, so that probably wasn't the best, but it was easy so people would understand the concept of a mutex.
@edilgin3 жыл бұрын
wowww the editing was superb! nice job computerphile nice job
@neerajk_105 жыл бұрын
Visual effects were superb
@dasten1236 жыл бұрын
Interesting topic, well explained, nice editing. Thumbs up!
@yaBoyDreamer5 жыл бұрын
Can someone please explain the origin of &aLock as an argument passed to the mutex lock function?
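A guess at what's being asked, since the full listing isn't visible from here: in C with pthreads the mutex is just a variable of type pthread_mutex_t, and pthread_mutex_lock takes a pointer to it (hence the &) so that every caller operates on the one shared mutex rather than on a copy. A minimal sketch, reusing the aLock name from the video (the static initializer shown is one common option; pthread_mutex_init is the other):

    #include <pthread.h>

    static pthread_mutex_t aLock = PTHREAD_MUTEX_INITIALIZER;  /* the shared lock object */

    void critical_section(void) {
        pthread_mutex_lock(&aLock);     /* int pthread_mutex_lock(pthread_mutex_t *mutex) */
        /* ... code that touches the shared data ... */
        pthread_mutex_unlock(&aLock);
    }

Passing by value wouldn't work: each thread would then be locking its own private copy and nothing would actually be protected.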
@homomorphic4 жыл бұрын
Threading is never beneficial on a single core as thread context switching is far more computationally expensive than a coroutine.
@oscargarciahinde42475 жыл бұрын
Two Steves is always better than one.
@keyboard_toucher5 жыл бұрын
Problem with the demo was that adding two numbers takes around the same amount of time as acquiring a lock. Giving each thread some real time-consuming work before needing to get the lock would show a significant speedup.
@Kavukamari Жыл бұрын
very helpful explanation for why mutex is necessary, I better understand now
@__jan6 жыл бұрын
holy moly that editing
@francescomartella14411 ай бұрын
No need to disturb the mutex when the 2 threads sum up the numbers in the second case. The 2 partial sums can be stored in 2 public variables, partialSum1 and partialSum2, and finally be summed up into a variable Result.
@martijn31516 жыл бұрын
I’d (strongly) advise against using volatile for variables that are potentially accessed from different threads. I’ve seen this in a lot of code and it seems like people deem it the correct way of handling those variables. It is not and I’ve encountered multiple agonizing hard-to-debug bugs as a result. Let me elaborate. Volatile is archaic, was invented for a whole different purpose and is currently just a compiler hint: it gives no guarantees on whether a variable is accessed atomically and whether or not memory barriers are properly set. In effect: it doesn’t do much anymore and worse: it gives a false sense of security. For concurrent threads the accessing behavior of volatile variables is actually undefined. Most of the time it works fine. But so would a simple non-volatile declaration. The problem arises when it is not most of the time :) If for instance two separate cores decided to access that variable at exactly the same time, you just entered a world of pain. To handle this properly use std::atomic for instance or some other locking mechanism such as a mutex.
@gustavomagana69965 жыл бұрын
There's no std::atomic . I might be mistaken but I think he's writing C and not C++.
@martijn31515 жыл бұрын
Gustavo Magaña I merely used std::atomic as an example. Also in C he shouldn’t use volatile for this. Instead he should use a proper locking mechanism.
@TruthNerds5 жыл бұрын
@@gustavomagana6996 C11 has _Atomic though, e.g. _Atomic int i; and a host of functions that go along with it, e.g. atomic_load, atomic_store, atomic_fetch_add… basically the same functionality as in C++ under slightly different names.
@taragnor5 жыл бұрын
All volatile does as far as I know, is that it makes the compiler check the value of the variable each time from memory. So for instance, normally in this case you'd just store the variable to a register, the counter to another register, and do the addition all with registers, then store the variable back to memory once you're done. But if it's volatile, the compiler will do the "load from memory, perform op, store to memory" after each execution that's described. It's notably slower, because there's a lot of extra instructions involved. But it grants no inherent atomic operation capability, all it really does is prevent the compiler from taking shortcuts and keeping the variable in a register the whole time during the loop. Honestly I think volatile is one of those keywords that's rarely actually ever needed unless you're doing something extremely low-level working with hardware. You can't count on it at all to help you with parallel programming. There may be a time when it's needed, but honestly I have yet to encounter one. I expect it's few and far between.
@giginemsadze99264 жыл бұрын
Please make more videos on threads 🙏 this was great ✨✨
@martinparidon90567 ай бұрын
What magic is this with the floppy disc :D:D Amazing!
@MatheusOliveira-xw7vf3 жыл бұрын
Thank you so much, this channel is helping me tremendously.
@disk0__6 жыл бұрын
Who are you? _I’m you, but threaded_
@Omnifarious06 жыл бұрын
This is an effective way to illustrate the problem and make it accessible to the audience. But it should be made clear that the situation in modern CPUs is far more complex. Cache coherency and compiler optimizations are big issues. So if you follow this simple model and make something that would work if it were true, it will still not work correctly.
@deckluck3725 жыл бұрын
Keeping a shared memory location coherent is the classic read-modify-write problem; read-modify-write instructions are used extensively in symmetric multiprocessors, for example to implement semaphores.
@tyrgoossens6 жыл бұрын
I'm just here because I really enjoy Dr Bagley's pronunciation of 'mooltipple'
@salsamancer5 жыл бұрын
Great explanation of race conditions. Multi threading is complex stuff.
@amrtcpp62034 жыл бұрын
On the Windows machine… I do the following to get the correct results:
    volatile long long a = 0;  // not just long, because of the machine architecture
and increase the loop to 10000000 :) In every thread function I used a lock_guard lock(); thanks very much for the wonderful explanation :)