Just Buy More Cores (1min to 1sec no optimization)

31,252 views

Tsoding Daily

Days ago

Comments

@nulligor 11 months ago

The most underrated programmer on earth (right after Terry Davis). Shoutout fr fr!

@bebre_2288 11 months ago

Where's the p in your nickname?

@nulligor 11 months ago

@bebre_2288 This is not even my final form.

@rubyciide5542 11 months ago

Terry daviS Over DosING

@nephew_tom 11 months ago

00:41:49 - Learning Math. I would say that what Tsoding said is the right way to learn math, physics and programming (and probably anything). I took lots of advanced math and physics courses at engineering university, but since I didn't care much about them (just enough to pass the exams), I really didn't learn that much. Learning through something you care about a lot improves the learning by a huge amount.

@dkaaakd 11 months ago

In order to split the workload more evenly, you can increment the chunk size for the first N % T threads. This way the number of samples per thread will differ by at most 1 (instead of at most T - 1).

@ryan-heath 11 months ago

I was looking for this comment.
Splitting with remainder:
9: 3,3,3
10: 3,3,4
11: 3,3,5
12: 4,4,4
Smooth remainder more evenly:
9: 3,3,3
10: 4,4,2
11: 4,4,3
12: 4,4,4
The effect is even greater with more processors.

@tekno679 11 months ago

Or you could just do a ceil division, so `n_chunks = ceildiv(n_tasks, n_cores)`.

@berndeckenfels 11 months ago

10:30 Not sure if you meant this, but there is no guarantee that the first sample in a partition will show up in the first k merged samples. (But you can abort the merge after k.)

@Kul3Kow101 11 months ago

First of all, I absolutely love your content. Recreational programming ftw! When splitting the samples, my first thought was to distribute the remainder among the first few chunks. So if rem=1, the first chunk gets an extra sample. If rem=2, the first two chunks get an extra sample each. That way the distribution is more even. You can also express this in an elegant way: when iterating over the range 0..nprocs and spinning up the threads, you can take the final chunk size to be (chunk_size + (i

@Blubb3rbub 11 months ago

This would also be a great problem for dabbling in OpenMP parallelization. To get more balanced chunks you can use `thread_index*(n/thread_count) + min(thread_index, n % thread_count)` as the start index and `n/thread_count + (thread_index < n%thread_count ? 1 : 0)` as the chunk size. This will add 1 extra element to the first few chunks, so you don't end up waiting for a single thread at larger thread_counts because of work imbalance. Another, even easier, approach is to just increment by the thread_count instead of 1 and offset the start indices by the thread_index. But then you don't have contiguous chunks, which might be slower?

@AntonioNoack 11 months ago

If you have fewer than 2^64 items * cores, you can also use `i0 = (threadIndex * numItems / numThreads)` and `i1 = ((threadIndex+1) * numItems / numThreads)`. That gets rid of the error-prone ifs.

@Blubb3rbub 11 months ago

@AntonioNoack Ah true! It puts the extra elements on the later cores, but is a lot easier to calculate. Nice! Thanks for sharing.

@anon_y_mousse 11 months ago

Just so you know for the future: htop has a help screen that you can bring up by pressing F1, C11 also added threads, and Pascal does have pointers. If you meant that it doesn't have untyped pointers, like C has the void pointer, well, it does have those too. For general types: `foo: type; bar: ^type; bar := @foo; bar^ := value;` and there's the generic `pointer` type. Pascal is actually why my language has a with clause and uses := for assignment in conditional headers. So there's no confusing = and == in an if; you can assign or compare with impunity.

@pedropesserl 11 months ago

he said pascal didn't have pointers *originally*

@anon_y_mousse 11 months ago

@pedropesserl Okay, except it always had them.

@pedropesserl 11 months ago

@anon_y_mousse oh

@cobbcoding 11 months ago

im surprised urmom could handle 8 cores at once ngl

@AntonioNoack 11 months ago

Usually the nodes with lots of memory would/could be called urmom though. (They're called fat.)

@re_detach 11 months ago

This topic is so fascinating. You are literally showing the relationship between compression and learning/intelligence/pattern detection, for both humans and AI.

@nephew_tom 11 months ago

1:00:10 "cool memory management, who needs garbage collectors if you have aren-aaasssss" - that deserves a tweet! 🤣

@BboyKeny 11 months ago

On the topic of learning math through programming: I think it's a really good approach, since math can be hard and abstract when you don't have a context or application to use it in. When you learn the math because you need to apply it, you already have an application. Also look up how math symbols map to programming. For example, Σ(xs) means `for x in xs: sum += x`, where xs is a list of numbers.

@warguy6474 2 months ago

It's a really bad way to learn real math lol, real math is about proofs. Computation is not math. Computation is for computers.

@dnkreative 11 months ago

If the best-compression option makes the compression algorithm depend on the content, it's better not to use it, since it can mess up the results: the individual and concatenated versions of the same data might be compressed at different ratios, which will affect the classification result.

@adamjasinski1463 11 months ago

Ada took a lot of inspiration from Pascal; it also has a great type system and the 'use' clause. My professor told me that I should try it out. It was painful at first for a beginner programmer like me who had used Go most of the time, but it's so delightful to be productive in this language. I guess the same goes for Pascal.

@sago27 11 months ago

Yo, this type of programming content is something else. 🔥

@diegorocha2186 11 months ago

Pretty impressive how you achieve amazing stuff with code that's both readable and simple!!!

@0msdev 11 months ago

The solution you have as of timestamp 40:54 is good enough but not elegant. For instance, if the `count` is 27 and the `chunks` is 7, your solution will divide it into 7 chunks of 3, 3, 3, 3, 3, 3, 9, which is okay. But the difference between the smallest and largest chunk still comes out to 9 - 3 = 6. What you're truly looking for in this case (in my opinion) is regularization, where you divide the `count` into `chunks` of nearly equal size, i.e. minimizing the size difference between the smallest and largest chunk. Here is how I would've done it:
    int remaining_count = count;
    int remaining_chunks = chunks;
    while (remaining_chunks > 0) {
        int chunk_size = remaining_count / remaining_chunks;
        remaining_count -= chunk_size;
        remaining_chunks -= 1;
        printf("%d ", chunk_size);
    }
    printf("\n");
The output of the above logic when `count` = 27 and `chunks` = 7 will be "3 4 4 4 4 4 4", which minimizes the size difference between the smallest and largest chunk.

@AntonioNoack 11 months ago

Not very elegant either. I prefer `i0 = (threadIndex * numItems / numThreads)` and `i1 = ((threadIndex+1) * numItems / numThreads)`.

@spacewad8745 11 months ago

great video to pair with a relaxing evening... or so i thought

@bukitoo8302 11 months ago

You can add an option to store the classification result in a file so you can reuse it from run to run, without the need to re-parse everything.

@jakestewart5915 11 months ago

You should try doing shorts. You explore interesting ideas and give funny/worthwhile takes on them. I, and I imagine many others, rarely have the time to watch 1 1/2 hour videos. I imagine you doing shorts would do quite well.

@AndarManik 11 months ago

Perhaps next up could be ANN (approximate nearest neighbor).

@xulitol 11 months ago

43:36 This thing you said inspires me a lot. I'm always afraid of doing dumb things and writing shitcode, but I want to improve. Also, your English is so Russian that I feel like a native English speaker listening to you.

@KellyMurphy 11 months ago

I wonder: if you were to classify a bunch of articles and then add them to the training set as they are classified, would it get more or less accurate?

@egk_nix 11 months ago

yandex geepeetee can answer English questions in English, but not all of them, and you have to explicitly ask in Russian for a reply in English, which is really weird. At one point it started giving me answers in English to questions in Russian.

@yagamilight2166 11 months ago

Similar to Jai's context, you could make the temp allocator buffer thread-local.

@vvarhand3985 11 months ago

The Azozinator delivers, simple as

@dnkreative 11 months ago

The tail is very easy: just add 1 to each work batch's count while the remainder is > 0.

@berndeckenfels 11 months ago

I am more of a calloc guy for struct members in arrays

@Maik.iptoux 11 months ago

20:06, you could check if the amount is zero.

@AntonioNoack 11 months ago

As a recommendation for your next video, analyze the scaling behavior (#cores -> #speedup?), and look whether you can find out how many cores your system has without using a dedicated system call (would be quite educational, I think).

@kevinnguyen163 11 months ago

Don't bring my mom into this

@mhamdmarch8709 11 months ago

The one who always gives me a reason to do some weird projects. Keep going, especially with mathematics and operating system stuff ❤❤❤🎉🎉🎉🎉

@DanelonNicolas 11 months ago

`:set paste` will help to paste code into Vim 😊

@anon_y_mousse 11 months ago

This might also be a terminal problem. I use Konsole and regularly paste into Vim without putting it in insert mode and it works without issue.

@matteovalentino4890 11 months ago

Zap hosting is down at the moment? Can't seem to successfully donate

@Mozartenhimer 11 months ago

The ugliest hack would have been making the temp buffer thread-local. That would have been gross.

@ecosta 10 months ago

I came for the programmer... I love his programming skills... But the chat jailbreak was the best part of this video.

@int_i 11 months ago

omg tsoding uploaded

@yousefsayed6380 11 months ago

what about parallelizing on the gpu for poor people like me who can't buy a better cpu

@millieno 11 months ago

Then you use OpenCL or CUDA or HIP.

@AndrieMC 11 months ago

hello tsoding

@TsodingDaily 11 months ago

Hi!

@namefreenargrom5694 11 months ago

I think you can get rid of the sort. ??

@lizardy2867 11 months ago

This is why we have GPU compute cores.

@TheMelopeus 11 months ago

Just throw computation and it will fix everything

@thomasziereis330 11 months ago

yes until Amdahl's law kicks in
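
For reference, Amdahl's law in its usual form, where p is the fraction of the run time that can be parallelized and N is the number of cores (both symbols are introduced here, not taken from the video):

```latex
S(N) = \frac{1}{(1 - p) + \frac{p}{N}},
\qquad
\lim_{N \to \infty} S(N) = \frac{1}{1 - p}
```

As a purely hypothetical example, with p = 0.95 and N = 8 the speedup is 1 / (0.05 + 0.11875) ≈ 5.9, and no number of cores can push it past 1 / 0.05 = 20.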

@God-i2 11 months ago

Can you stream on weekends? I cannot watch during work hours. I go to work.

@NetBSDToaster-u8e 11 months ago

I would enjoy these videos so much more if I had any clue what is going on ....

@Recreman 11 months ago

Love you! MmmA!

@leifmessinger 11 months ago

Imagine getting an instance of 8 shared cores on the cloud rather than having an 8+ core computer

@blastygamez 11 months ago

Try visual programming languages for a video

@undefinedchk 11 months ago

this man using google chrome?

@mindasb 11 months ago

Wait - you don't consider parallelization as a type of optimization? LOL

@AndrieMC 11 months ago

no, it's just doing two things at once

@mindasb 11 months ago

@AndrieMC *just* sure bro. Keep thinking that.

@forayer 11 months ago

Khello!

@berndeckenfels 11 months ago

1:11:01 23/25s real vs. user is not very parallel, but I guess the poor machine is struggling

@LBCreateSpace 3 months ago

Now, your mom has 8 cores! 😂

@luijia 11 months ago

Caspal

@xulitol 11 months ago

Hello

@-rya1146 11 months ago

1:04:26 lmao