I ain't gonna lie, having influencer-style videos in the coding world, just entertainment, non-educational, helps me stay on track when I don't feel motivated to study. Keep it up bro.
@Lewini7 ай бұрын
Hard agree. Keeps my algorithms in check
@cyrilemeka69877 ай бұрын
Agreed.
@isthatjakeАй бұрын
It'll help with work, too
@DjoumyDjoums7 ай бұрын
My guess on why 3 is slower than 2 : index bound checks. Compilers can very often eliminate them when you loop on the actual length of the array, not a pre-cached numerical value.
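A sketch of the two loop shapes being compared; the function names and data are illustrative, not from the video, and whether the cached-length form actually defeats bounds-check elimination depends on the compiler version:

```go
package main

import "fmt"

// sumDirect keeps len(a) in the loop condition; the compiler can usually
// prove i is in range and elide the bounds check on a[i].
func sumDirect(a []int) int {
	total := 0
	for i := 0; i < len(a); i++ {
		total += a[i]
	}
	return total
}

// sumCachedLen caches the length in a variable first; as the comment
// guesses, this may make the bounds-check proof harder for the compiler.
func sumCachedLen(a []int) int {
	total := 0
	n := len(a)
	for i := 0; i < n; i++ {
		total += a[i]
	}
	return total
}

func main() {
	a := []int{1, 2, 3, 4, 5}
	fmt.Println(sumDirect(a), sumCachedLen(a)) // 15 15
}
```

Comparing the two with `go tool compile -d=ssa/check_bce` would show which accesses still carry a bounds check.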
@grifon977 ай бұрын
Sounds like a very good guess to me! :D
@Nenad_bZmaj7 ай бұрын
@@chigozie123 Not true.
@Pipe04817 ай бұрын
@@chigozie123That's just completely false lol
@chigozie1237 ай бұрын
@@Pipe0481 I should clarify that I'm comparing it to what some C/C++ compilers do. If you look up optimizations done by the Go compiler, you will find that the only optimization even worth mentioning is the conversion of "zeroing out a slice", to a simple memclr instruction. In C, we get that for free using `memset`. Maybe also function inlining, but that's nothing given that in C, the `inline` keyword is free to use. That's just the tip of the iceberg. Compare that list to what GCC does, and you will see why it's difficult to claim that the go compiler is an optimizing compiler. The main reason why I said that go doesn't do optimizations is that in C, there is no reason why the second and third code snippets should not run at the same speed. The compiler should have optimized the len call out of the second, to make it pretty much the same as the third.
@Nenad_bZmaj7 ай бұрын
@@chigozie123 The for loop with the range clause is clearly better optimized by the compiler compared to the standard for loop with the condition clause, as he has shown. I agree about the len call - it should have optimized it, but he measured with the time package, not benchmarked it, and did not show us standard deviations. In Go you can opt out of inlining. I simply disagree with the statement that the Go compiler doesn't do exhaustive optimizations, as long as they are worth doing and as long as they would not bloat the compile duration.
@ColinFox7 ай бұрын
Go uses symbol lifetimes to determine whether to allocate on the stack or on the heap. If the object survives after the function returns, it's on the heap. Pretty straightforward.
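A minimal illustration of that rule; the function names are made up, and running `go build -gcflags=-m` on it shows the compiler's escape decisions:

```go
package main

import "fmt"

// sumLocal's buffer never outlives the function, so escape analysis
// can keep it on the stack.
func sumLocal() int {
	buf := make([]int, 8) // does not escape
	for i := range buf {
		buf[i] = i
	}
	s := 0
	for _, v := range buf {
		s += v
	}
	return s
}

// makeBuf's buffer survives after the function returns, so it
// escapes to the heap.
func makeBuf() []int {
	buf := make([]int, 8) // escapes to heap
	return buf
}

func main() {
	fmt.Println(sumLocal(), len(makeBuf())) // 28 8
}
```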
@susiebaka33887 ай бұрын
"The more you write standard code, the more likely you're generally doing well." - ThePrimeagen
@samsqu4red7 ай бұрын
Found the channel just a few weeks ago. Coming from a background in Webflow, with only a grasp of HTML, CSS, and a bit of JavaScript, I see myself as a puppy. Recently, I've decided to take coding more seriously because it's interesting, and I also have a web app idea that I'm genuinely determined to bring to life. Though 90% of what the man is saying still flies right over my head, I feel like my understanding is expanding. I've never attended school, worked in-house with a team, or studied with guidance. It seems like I found a semblance of a mentor who's bombarding me with jargon I can't possibly comprehend, but ultimately is helping me see through the fog in a way. Feels like little by little I'm starting to grasp how everything is connected... weird.
@andygarfield65297 ай бұрын
There’s all sorts of things he could have done here. Changing the order of the loops can give you wild improvements because it can change the proportion of cache hits. Also loop unrolling. Also using arrays in Go (which are a real thing) instead of slices.
@CjqNslXUcM7 ай бұрын
the loop order is correct, it goes through the innermost slice first.
@SaHaRaSquad7 ай бұрын
If you do something like manual loop unrolling always benchmark it, because most likely it will actually make your code slower by blocking much better automatic optimizations.
@Nenad_bZmaj7 ай бұрын
In my experience, array gives me worse performance than slice. After I checked that, instead of declaring a := [5]something, I always do: a := make([]something, 5, 5)
@Zullfix7 ай бұрын
Agreed. His single-array test could have been much faster if nearby coordinates had been stored contiguously.
@Yawhatnever7 ай бұрын
Loop unrolling might be one of those things where you end up fighting the compiler (would need to measure), but loop tiling might have a real chance of increasing cache hits
@HairyPixels7 ай бұрын
The compiler writes better code than you but what if you're writing a compiler? BRAIN MELTING
@ninocraft17 ай бұрын
if you write your own compiler, its probably gonna be even slower xD
@Patterner7 ай бұрын
just compile the compiler with the compiler. repeat. ... profit
@OkAtheistExplainVim7 ай бұрын
if the compiler compiles your compiler is it really your compiler?
@HairyPixels7 ай бұрын
@@Patterner In Soviet Russia compiler compiles you.
@joranmulderij7 ай бұрын
@@OkAtheistExplainVim If you wrote the compiler that wrote the compiler then yes
@jagc22067 ай бұрын
There is a big distinction between arrays and slices in Go. This video uses slices, which are pointers to arrays. A static array will have a compile-time length and allows the compiler to do a lot more optimization
@Nenad_bZmaj7 ай бұрын
My experience with performance is the opposite, so I stick with slices.
@eni4ever7 ай бұрын
blue hair no?
@dejangegic7 ай бұрын
Old video
@sparrow2437 ай бұрын
He's getting bald
@greendsnow7 ай бұрын
@@sparrow243 hahahahahah
@yobson7 ай бұрын
no
@XDarkGreyX7 ай бұрын
Brain no?
@philipp047 ай бұрын
A year ago my prof had us write Gaussian elimination, but the algorithm we had to implement was slightly different for each of us. I had to find the inverse of a matrix using Cholesky decomposition, which is designed to be fast for large symmetric matrices. I'll cut to the chase: the code I wrote was slow as hell and I couldn't figure out what was going on, though I could tell that I had tons and tons of cache misses. The professor, though, could tell what I did wrong immediately: the matrix was stored as a 1d array row by row, but I was traversing it columnwise. I transposed the matrix and swapped all indices in my code, and the total runtime decreased by a factor of 20 (on 8000x8000 matrices). Moral of the story: go through your data sequentially, it makes a huge difference.
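The effect described above can be reproduced with a small sketch; the matrix size here is arbitrary, and both functions compute the same sum, only in a different visitation order:

```go
package main

import (
	"fmt"
	"time"
)

const n = 1024 // illustrative size

// sumRowMajor walks the 1D-backed matrix with stride 1: cache friendly.
func sumRowMajor(m []float64) float64 {
	var s float64
	for r := 0; r < n; r++ {
		for c := 0; c < n; c++ {
			s += m[r*n+c]
		}
	}
	return s
}

// sumColMajor walks the same data with stride n: cache hostile.
func sumColMajor(m []float64) float64 {
	var s float64
	for c := 0; c < n; c++ {
		for r := 0; r < n; r++ {
			s += m[r*n+c]
		}
	}
	return s
}

func main() {
	m := make([]float64, n*n)
	for i := range m {
		m[i] = 1
	}
	t0 := time.Now()
	a := sumRowMajor(m)
	rowTime := time.Since(t0)
	t0 = time.Now()
	b := sumColMajor(m)
	colTime := time.Since(t0)
	fmt.Println(a == b, rowTime, colTime)
}
```

On large matrices the column-major walk is typically several times slower, which is exactly the transposition fix the commenter describes.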
@samhughes17477 ай бұрын
EDIT: this is wrong. Only the comprehension-inlining bit made the boat in 3.12. Hey. Python doesn't have a loop problem anymore. There's still the "it's python" problem, but Python 3.12 introduced loop unrolling.
@ruroruro7 ай бұрын
source?
@samhughes17477 ай бұрын
@@ruroruro, oof. Yeah, that didn't get approved for 3.12, and I got that mixed up with the comprehension-function inlining that did get released in 3.12.
@g.c9557 ай бұрын
he could do a follow-up by profiling the code to see where CPU and memory are spent, then analyze the assembly output of the code to understand why.
@rt15177 ай бұрын
The video author should have disassembled the resulting binary to see if he would be able to figure out what the hell is going on.
@swedishpsychopath87957 ай бұрын
I'm planning to throw a SIMULA-67 party in 3 years to celebrate its 60 years as the first OOP language. Hate it or love it, the Norwegian SIMULA-67 invented Object Oriented Programming (OOP) with classes, inheritance, encapsulation, overloading, overriding and all the other good stuff. It is crazy to think that this paradigm shift in programming is 60 (SIXTY) years old already!! Press LIKE if you would like to get an early invitation!
@0dsteel7 ай бұрын
Slices are just references to an array with set length and capacity. They are reallocated if the values wouldn't fit said capacity. On the other hand, arrays *may* be stack allocated, if the compiler is certain they don't escape the given scope and, well, fit in the stack frame. (this is not legal advice, please ask your friendly neighbourhood debugger)
@miroslavhoudek70856 ай бұрын
A guy watches a guy who doesn't know what's going on, while also not knowing what's going on, while people in chat suggest what might be going on, but they can't be all right. Good fun overall, would recommend, but don't know why.
@defeqel65377 ай бұрын
Optimization videos like this, I'd always want to know the size of the data, and the theoretical memory bandwidth, etc. to have some actual context. No point trying to optimize things if you are already limited by the hardware (or the optimization would need to be radically different at least).
@shapelessed7 ай бұрын
Omg GO... I only had a week with that language when I tried it but I love the errors as values concept. To the point that I started using them all over my JS.
@0dsteel7 ай бұрын
haters gonna hate, but you mention their uncaught runtime exceptions and they'll be in shambles :)
@max34467 ай бұрын
I like errors as values but Go would be so much nicer if they had monadic operations on errors (or at the very least, some sort of shorthand for `if err != nil { return err }`).
@debemdeboas7 ай бұрын
reading contiguous memory is generally faster because the compiler may not have to deal with paging and segmentation for every index
@Satook7 ай бұрын
The 2nd form in part 1 is pretty easy for a compiler to optimise with constant propagation. The x and y don’t change so the sub-expressions can be moved up out of the innermost loop.
@Nenad_bZmaj7 ай бұрын
The difference of cca. 250 ms for about one billion elements is to be expected if additional GC sweeps are triggered. He should have isolated the three scenarios and benchmarked them separately, with the SD, not consecutively in one test. It seems that an additional GC sweep was turned on sometime after the second test. Also, package "time", in my experience, may interfere somehow with either GC or the execution. There are better tools for benchmarking, although I do use "time" to give me a sense.
@freedomextremist79217 ай бұрын
It's a slice not a list. And Go does also have arrays which are contiguous regions in memory.
@Leeway44347 ай бұрын
yes, in Go arrays are stack allocated and are created with a fixed length. [5]int is an array of length 5 on the stack, and []int{1,2,3,4,5} is a slice of length 5 on the heap, with a pointer to it on the stack. I believe there is a size limit for stack-allocated arrays; at some point the compiler puts them on the heap to avoid stack overflows
@Nenad_bZmaj7 ай бұрын
@@Leeway4434 All true.
@Leeway44347 ай бұрын
@@Nenad_bZmaj I actually looked into it some more and it is not guaranteed for slices to be put on the heap. The compiler has heuristics for whether or not something should end up on the heap or the stack. Large values usually end up on the heap. Small values (including slices of sufficiently small constant length) can often be allocated on the stack.
@Nenad_bZmaj7 ай бұрын
@@Leeway4434 Thanks. I didn't know that.
@rumplstiltztinkerstein7 ай бұрын
The notion of "code optimization" is mostly popular because of Javascript. We invented compilers so we don't have to worry about looking at everything the cpu is doing in every clock cycle.
@yugioh88107 ай бұрын
using only index with range is usually faster especially if the array is long or has heavy structs since it eliminates the copy operation to the variable in the for loop. for i := range x is faster than for _, val := range x
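A sketch of the two range forms the comment compares; the struct size is arbitrary, chosen only to make the per-iteration copy noticeable:

```go
package main

import "fmt"

// big is a deliberately heavy struct to make element copies expensive.
type big struct {
	payload [1024]byte
	id      int
}

// sumByIndex ranges over indices only: no element copy per iteration.
func sumByIndex(xs []big) int {
	s := 0
	for i := range xs {
		s += xs[i].id
	}
	return s
}

// sumByValue copies each 1KB+ element into v on every iteration.
func sumByValue(xs []big) int {
	s := 0
	for _, v := range xs {
		s += v.id
	}
	return s
}

func main() {
	xs := make([]big, 100)
	for i := range xs {
		xs[i].id = i
	}
	fmt.Println(sumByIndex(xs), sumByValue(xs)) // 4950 4950
}
```

Both return the same result; only the index form avoids the copy, which is the speedup the comment describes for heavy structs.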
@miracleinnocent26497 ай бұрын
Go's allocations all depend on its escape analysis; sometimes even interfaces can be stack allocated, and I think the limit for stack allocation is 32kb
@michaelrobb95427 ай бұрын
You can't put syscalls in for loops. Calls like time.Now are calls to the OS; this will produce inconsistent results on each run.
@Nenad_bZmaj7 ай бұрын
True. There is a bigger for loop in this particular test, encompassing three tests.
@MattBolt357 ай бұрын
There are ways to pass flags to Go's compiler to show which values escape. Slices are typically heap allocated. You can stack allocate an array if you use a finite length (as he mentions).
@Nenad_bZmaj7 ай бұрын
But only small sized arrays.
@AlexnderMisuno7 ай бұрын
Once I implemented a Minesweeper game in Go using the "normal" approach to store data about the field, cell state, etc. And afterwards I did the same but using bit-wise operations (just for fun). To my surprise, not only was the performance the same, but also the memory consumption
@georgehelyar7 ай бұрын
I'm not a go expert but it sounds like the reason 3 is slower than 2 and reverse is faster than forwards is because of bounds checking.
@defeqel65377 ай бұрын
which I think is something go should probably be able to optimize away in cases like this
@georgehelyar7 ай бұрын
@@defeqel6537 is it a jagged array though? If len(a[0]) can be different to len(a[1]) then it could potentially go out of bounds, but again I'm not a go expert. C#, for example, has multi-dimensional arrays of contiguous memory, indexed with a[x,y,z] where the lengths of each dimension are guaranteed, or you could make a jagged array by making an array of arrays where they aren't, and then you'd have to use a[x][y][z]. Reverse being faster in several languages always seemed like it could be optimised though. Reverse is faster because it always knows that the lower bound is 0 regardless of the array so if it starts inside the bounds and then never goes below 0 then it doesn't have to check bounds for each iteration. And of course dynamic arrays could change size at runtime Rust probably does bounds checking at compile time
@Nenad_bZmaj7 ай бұрын
@@defeqel6537 I believe that, too. The very typical difference of cca 200 ms for arrays of this size tells me that probably an additional GC sweep was triggered.
@haraldfielker46357 ай бұрын
I think the range works better with CPU caches (in that scenario).
@Gusto200007 ай бұрын
I’ve done the same thing a week ago after 1.22 released, benchmarking with testing.B and measured exactly the same result for each case… So I decided simple range is good enough. Bill Kennedy talks about “mechanical sympathy” in “Mastering Go” about this exact problem - it’s all about low level caching and prefetching by cpu and compiler
@Nenad_bZmaj7 ай бұрын
Regarding some comments: I did try arrays instead of slices of length known at compile time, but performance was always slightly worse. In my experience, Go programmers can forget arrays, since slices seem to be very well optimized. And I like using slices when I have to do something differently in the last iteration or in distinct subranges. So, instead of for i := 0; i < k; i++ {...}; for i := k; i < len(a); i++ {...}, I can write: b := a[:k]; c := a[k:]; for _, el := range b {...}; for _, el := range c {...}.
@flga_7 ай бұрын
I'm not sure what you did, but when switching to arrays you need to be careful to avoid copies. There is no inherent reason for arrays to perform slower, other than that.
@Nenad_bZmaj7 ай бұрын
@@flga_ I passed them by reference. Very small arrays, for instance, small hash tables with short slices of bytes as elements, or arrays of ints. But passing pointers to them apparently always triggers a GC sweep. When I turn them into slices, the execution is faster overall by the duration of one GC sweep, which on my comp., for several programs I wrote, apparently (inferred) takes about 200 ms on average.
@flga_7 ай бұрын
@@Nenad_bZmaj I don't think you were measuring things correctly then, I highly doubt gc is a factor. Passing a pointer to an array will require a nil check, that's the only difference and can be done by you, upfront, instead of by the compiler per iteration. Put in other terms, using arrays is asymptotically equivalent to using a 1d slice. Tho for small sizes the distinction is unlikely to be meaningful as the chance of it all being cached in either scenario is high.
@Nenad_bZmaj7 ай бұрын
Small arrays can be on stack, but slices are always allocated on the heap (in Go).
@RenThraysk7 ай бұрын
His problem is that Go compiler doesn't perform loop unrolling, this is a deliberate decision by the go devs. If you manually loop unroll he'd get an improvement.
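Manual unrolling, as suggested, might look like the sketch below; the factor of four and the data are arbitrary, and any gain must be benchmarked rather than assumed:

```go
package main

import "fmt"

// sumUnrolled processes four elements per iteration using independent
// accumulators (which can help instruction-level parallelism), then
// handles the leftover tail one element at a time.
func sumUnrolled(a []int) int {
	var s0, s1, s2, s3 int
	i := 0
	for ; i+4 <= len(a); i += 4 {
		s0 += a[i]
		s1 += a[i+1]
		s2 += a[i+2]
		s3 += a[i+3]
	}
	for ; i < len(a); i++ { // tail: fewer than four elements remain
		s0 += a[i]
	}
	return s0 + s1 + s2 + s3
}

func main() {
	a := make([]int, 1001)
	for i := range a {
		a[i] = 1
	}
	fmt.Println(sumUnrolled(a)) // 1001
}
```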
@Jarikraider7 ай бұрын
Personally, I thought a 100ms improvement on 1 billion elements was pretty good.
@MethodOverRide7 ай бұрын
Write smooth easy code? Sounds like something that genius Tom would do with JDSL.
@sotiriosmantziaris60977 ай бұрын
The compiler decides based on escape analysis where to put things.
@joyboricua37217 ай бұрын
@6:22 somewhere a hacker says "Oh yes we could"
@awesomedavid20127 ай бұрын
My assumption is that if you create a slice with a fixed compile time size and it doesn't outlive the function, Go will simply implement it as an array on the stack. The compiler knows its size and it knows how long it lives. Notice that even if you call append, it constructs a new slice. So the original could still be an array on the stack in implementation as long as you don't pass it outside of the function.
@Nenad_bZmaj7 ай бұрын
Most arrays and ALL slices are allocated to the heap in Go.
@Nenad_bZmaj7 ай бұрын
Also, when you call append, the Go runtime doesn't necessarily create a copied array of twice the capacity. If there is enough capacity in the original array, appending to a slice that points to it doesn't create a new array. (On the side, a new slice is not declared with reassignment: a = append(a, el).)
@what1heh3ck7 ай бұрын
@@Nenad_bZmaj I thought append always creates a new slice (still pointing to the original array if it has enough capacity), am I missing something here?
@awesomedavid20127 ай бұрын
@@Nenad_bZmaj I agree about the append thing. Though I still think the compiler could theoretically optimize a slice to be on the stack if you never call append, give it a constant time size, and don't return it. When you say it's always on the heap in go, do you mean that is how the main Go compiler implements it or is that part of the Go standard?
@Nenad_bZmaj7 ай бұрын
@@awesomedavid2012 I remember that this was the standard, although not in the language specification, since this doesn't belong there, but this behaviour may change in the future without changing the language specs. Somebody just told me that he looked at this more closely and that the current compiler does keep smaller slices that have the properties you describe on the stack.
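The append behaviour debated in this thread can be observed directly; a small sketch comparing backing-array addresses:

```go
package main

import "fmt"

func main() {
	a := make([]int, 2, 4) // len 2, cap 4
	b := append(a, 99)     // fits within capacity: reuses a's backing array
	fmt.Println(&a[0] == &b[0], len(b), cap(b)) // true 3 4

	c := append(b, 5, 6, 7) // exceeds capacity: a new backing array is allocated
	fmt.Println(&b[0] == &c[0], len(c)) // false 6
}
```

So append always returns a fresh slice header, but only allocates a new array when the existing capacity is exhausted, matching what the thread converges on.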
@OnFireByte7 ай бұрын
really interesting that the flattened slice is a bit slower than the actual 3d slice
@CjqNslXUcM7 ай бұрын
the slice is a pointer with a type and a len, so every time you move from the innermost slice it jumps to a cache miss. the flat slice is one fat blob of memory. no cache misses.
@OnFireByte7 ай бұрын
@@CjqNslXUcM that’s why im surprised that flatten one is slower
@Nenad_bZmaj7 ай бұрын
Something I noticed: you'll get slightly worse performance if you write involved expressions instead of simple indexes. Hint: assign such an expression to a variable first, use that as the index, and let the compiler optimize it. Simpler expressions are better put as an index directly, e.g. a[i+3], but write: ind := b[i]; do something with a[ind], rather than: do something with a[b[i]]. Not a big deal, though, since the difference is marginal.
@metropolis107 ай бұрын
I liked that he said the 100ms improvement wasn't worth it. How often do people admit the technically faster thing isn't worth the headaches.
@blenderpanzi7 ай бұрын
I'm at 1:53 - I think the 2nd one could be the fastest, because an intelligent compiler could inline len() and eliminate bound checks in that case. It can't eliminate bound checks in the last version. Let's see if my intuition is right.
@RandomGeometryDashStuff7 ай бұрын
08:39 "a.length = 0" doesn't create new object so less stuff to gc
@adambright54166 ай бұрын
The fact that he used time in one function instead of builtin benchmarks is just... Wow
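For reference, a minimal sketch of using Go's built-in benchmark machinery instead of time.Now; the function name and slice size are made up. testing.Benchmark lets you run it from a plain program, though the idiomatic home is a `_test.go` file run with `go test -bench=.`:

```go
package main

import (
	"fmt"
	"testing"
)

// sink defeats dead-code elimination of the benchmarked loop.
var sink int

// benchRange measures a plain range-loop sum; the harness picks b.N
// automatically until timings are statistically stable.
func benchRange(b *testing.B) {
	a := make([]int, 1<<16)
	for n := 0; n < b.N; n++ {
		s := 0
		for _, v := range a {
			s += v
		}
		sink = s
	}
}

func main() {
	r := testing.Benchmark(benchRange)
	fmt.Println(r.N > 0, r.NsPerOp() >= 0) // true true
}
```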
@RandomGeometryDashStuff7 ай бұрын
08:55 "a[a.length]=1" is slower than "a.push(1)" in v8
@rosehogenson13987 ай бұрын
go has escape analysis which can decide to place slices on the stack if it can guarantee they don't escape the current scope
@Nenad_bZmaj7 ай бұрын
True for some types of variables, not sure if this is true for slices. I think they are always on the heap.
@Bluesourboy7 ай бұрын
These complications from loops are why I believe C/C++ will be around FOR a WHILE longer.
@reactjs19007 ай бұрын
I think slices in Go are like vectors and stored on the heap
@bbrodriges7 ай бұрын
Slices can be stack allocated. In fact they are by default
@metaltyphoon7 ай бұрын
No, they're not. The slice header, which is a pointer + length, will be stack allocated, but the data won't be on the stack.
@antoniusupov7 ай бұрын
All pointer values in Go will go to the heap. Go check out the runtime package in Go and see what is inside. Only objects of fixed length live on the stack; in Go even strings contain pointers and go to the heap.
@Nenad_bZmaj7 ай бұрын
The so called slice-header will be on stack (basically 3 numbers). It always points to the memory allocation on the heap.
@lukekurlandski76537 ай бұрын
These are dynamic arrays, right? I feel like you should be able to call a .contiguous() function or something that moves all of the elements of the tensor into a contiguous arrangement. Then again, I’m guessing there is some other data structure in Go that is closer to a traditional array or something…?
@Nenad_bZmaj7 ай бұрын
Yes, an array: a := [k]something, as opposed to a slice: a := make([]something, k). But in my experience, slices perform better.
@huge_letters7 ай бұрын
wouldn't it be faster to access the underlying slice at each loop layer? Instead of

for x { for y { for z { t += arr[x][y][z] } } }

you do

for x { xa := arr[x]; for y { ya := xa[y]; for z { t += ya[z] } } }

that way you access x and y only when needed and not every iteration? If you're saying it's likely memory access that's the issue
@Nenad_bZmaj7 ай бұрын
No difference. You still have to access the elements of the array via xa. xa and arr[x] are both just pointers. And the array is the same one. The compiler will transform your code into the one in the video anyway, because xa and ya are just transient variables of slice (pointer) type.
@huge_letters7 ай бұрын
@@Nenad_bZmaj I mean that in option 1 you do access by x x*y*z times while in option 2 you do it only x times. Same for y - y*z times vs y times
@Nenad_bZmaj7 ай бұрын
@@huge_letters Oh, I wish. But simply not possible. No matter how you write it, the array contains x*y*z elements, and you need to access them all.
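For reference, the hoisted version under discussion looks like this in Go; whether it measurably beats direct `arr[x][y][z]` indexing should be benchmarked, since the compiler may hoist the inner headers itself:

```go
package main

import "fmt"

// sumNested caches each inner slice header (xa, ya) once per outer
// iteration instead of re-evaluating arr[x][y] in the innermost loop.
func sumNested(arr [][][]int) int {
	t := 0
	for x := range arr {
		xa := arr[x]
		for y := range xa {
			ya := xa[y]
			for z := range ya {
				t += ya[z]
			}
		}
	}
	return t
}

func main() {
	// Build a small 2x3x4 nested slice filled with ones.
	arr := make([][][]int, 2)
	for x := range arr {
		arr[x] = make([][]int, 3)
		for y := range arr[x] {
			arr[x][y] = make([]int, 4)
			for z := range arr[x][y] {
				arr[x][y][z] = 1
			}
		}
	}
	fmt.Println(sumNested(arr)) // 24
}
```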
@k98killer7 ай бұрын
Go will store slices in stack memory if they are small enough (less than 2^16 bytes). Allegedly.
@masterchief15202 ай бұрын
Thought it all depended on escape analysis?
@k98killer2 ай бұрын
@@masterchief1520 I think it is a mix, but it has been a while since I last looked into this, so idk for sure.
@FunkyELF7 ай бұрын
The python range() is not like ranges in other languages. Python range just gives back numbers, that's it. I think go's range is more equivalent to general iteration in Python and the "in" keyword... `for thing in things: ...`
@davidzwitser7 ай бұрын
Meanwhile in APL languages you just write ‘matrix + number’ and it’s fully optimised
@justbrad_v39067 ай бұрын
yeah, that sounds like buffer overflow hell lol
@arkie877 ай бұрын
numba allows you to write fast python loops by compiling using llvm also, am I the only one who pronounced numpy in their head as num-pee for the longest time
@GaborGyebnar7 ай бұрын
Case 5 is fastest because the code is buggy: y is never used. It's just reading the same row multiple times, cache hits make it fast but the result is incorrect.
@RandomGeometryDashStuff7 ай бұрын
09:19 "while" is usually worse enemy
@Arwahanoth7 ай бұрын
I guess the range statement avoids bounds checking.
@someman77 ай бұрын
Whoa, his parentheses are different colors. It only took us what, 30 years to see "such" progress in syntax highlighting?
@Zullfix7 ай бұрын
Rather than trying to optimize iterating, it sounds like his array needs to be split into partitions (or chunks if you play minecraft) to allow for multithreading his calculations.
@ErazerPT7 ай бұрын
Not everything can be parallelized. It all comes down to whether or not your next step depends on the results of the previous step. If no, you can; if yes, you can't. Simple example: a loop that steps through a vector adding the previous element to the current one in place. You can't parallelize it because you need the previous step's result before you execute the next.
@Zullfix7 ай бұрын
@@ErazerPT You don't parallelize by computing steps out of order, you parallelize by partitioning your space into chunks and computing each chunk on a separate thread, then combine the results. Obviously nothing is that simple, but at a surface level this is *the* technique used by professionals to create multi-threaded physics simulations and it very likely can be applied to whatever the guy in the video is trying to compute.
@ErazerPT7 ай бұрын
@@Zullfix Read the example i gave you. It CAN'T be parallelized. You CAN'T partition things if ANY of the subsequent operations depend on the results of its predecessors. As for physics simulations, be it rigid body dynamics or fluid, search around and you'll find a simple "some parts can, some parts can't" answer. Think like this: in a perfectly elastic collision of billiard balls in a perfect line, only the cue ball and the last ball will move. But the last ball can't know it will move before the "one before last" knows it should try to move. And the one before last won't know... etc. Things like for example "grass simulation" can be done IF you take some "artistic license", ie, for example assuming wind interacts with grass but grass blades don't interact with each other. THAT is when you can partition things, because wind is a common entity and each blade of grass does its own thing.
@atijohn81357 ай бұрын
@@ErazerPT for your example, you could still optimize it by adding up parts of the array in parallel, then for each part adding the final element of the previous one. This would be faster for large enough arrays, since you can further parallelize the subsequent block adds no problem. Many smaller dependencies like that can be eliminated by experienced parallel programmers; it's really more common than you seem to think. You can't just dismiss parallelizing an algorithm by saying "not everything can be parallelized"
@ErazerPT7 ай бұрын
@@atijohn8135 Jesus Mary, you people really don't give up... YOU CAN'T parallelize that example (and be more efficient). THINK. If one thread is adding a[0] to a[1], another CAN'T add a[1] to a[2] because it DOESN'T know what a[1]'s final value will be before the first thread finishes. But let's say we ignore that, and have a cpu with unlimited (real) threads. So, one does a[0]+a[1],while another does a[1]+a[2], etc until a[n-1]+a[n]. When they all finished, you now have to run another batch, look at differentials from the previous ops, and do it all over again for a[1] to a[n]. But... it has a "carry effect", so, you have to do it again for a[2] to a[n], then for each element upto a[n-1]+a[n]. Congratulations, you just parallelized something and ended up being less efficient than a single thread, because you still had to run n-1 steps, do more work per step and waste more memory because you had to keep track of both old values and new to calculate differentials. Great job.
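For the case both sides of this thread agree on (a plain reduction with no cross-element dependency), the chunking approach mentioned earlier can be sketched like this; the worker count and data are arbitrary:

```go
package main

import (
	"fmt"
	"sync"
)

// parallelSum splits the slice into contiguous chunks, sums each chunk
// in its own goroutine, then combines the partial results.
func parallelSum(a []int, workers int) int {
	partial := make([]int, workers)
	var wg sync.WaitGroup
	chunk := (len(a) + workers - 1) / workers
	for w := 0; w < workers; w++ {
		lo, hi := w*chunk, (w+1)*chunk
		if lo > len(a) {
			lo = len(a)
		}
		if hi > len(a) {
			hi = len(a)
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			for _, v := range a[lo:hi] {
				partial[w] += v // each worker writes only its own slot
			}
		}(w, lo, hi)
	}
	wg.Wait()
	total := 0
	for _, p := range partial {
		total += p
	}
	return total
}

func main() {
	a := make([]int, 1000)
	for i := range a {
		a[i] = i
	}
	fmt.Println(parallelSum(a, 4)) // 499500
}
```

A running prefix sum, the example argued about above, needs a different (scan-based) decomposition and does not fit this simple chunking.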
@martinpata28997 ай бұрын
I like the idea but he should be using Go benchmarks for these tests
@dejangegic7 ай бұрын
I guess Go really is K.I.S.S. from the bottom-up
@ninocraft17 ай бұрын
S.S.I.K. ?
@AlexanderNecheff7 ай бұрын
I used to believe the compiler knew best and especially that any bug in the code was my fault, never the compiler's fault: then I started having to use the Intel compilers... And oh boy howdy - some of the worst compilers ever. Poor standards compliance, lots of weird bugs in both the front and back end, horrible diagnostics to boot. Still generally think the compiler usually knows best, but now I have a disclaimer associated with that rule of thumb.
@mannycalavera1217 ай бұрын
We Gorupies now
@yoloswaggins21617 ай бұрын
Can't you just disassemble the binary and look at what they're actually doing?
@GermanClaus7 ай бұрын
Try out some Kotlin ;)
@alexpyattaev7 ай бұрын
1 billion elements... The problem is likely memory bound. Use a smaller data type; it will run faster.
@Nenad_bZmaj7 ай бұрын
Yeah. Also the ratio of used to total memory influences how frequent GC will run. (kinda; it is more complicated). GC sweep for this size typically takes 200 ms (on my old comp with old single core Athlon CPU).
@alexpyattaev7 ай бұрын
@@Nenad_bZmajGC does not traverse an array of integers. As far as GC is concerned, it is one "leaf" object in the object graph, so it does not get traversed, no matter how big it is. Now if it contained other objects like strings the GC would have to walk it.
@Nenad_bZmaj7 ай бұрын
@@alexpyattaev I am not talking about GC traversing an array, but about the GC mark-and-sweep cycle occurring during the execution at some time points, at which times "the world is frozen". The way he structured this test - you don't know when the GC was active, so benchmarking three test-cases consecutively in one outer loop is not going to be informative for comparisons... I mentioned size dependency, because the program runs longer on larger arrays and GC runs more times. I see that in my programs. Although these programs involve quite complex data processing, the mere size of input array dictates how many times GC will run or for how long in total it freezes the world. When my input slices point to array of 50 Mbytes, one GC sweep takes about 50 ms on average, and when the array is 250 Mb - 250 ms. (on my very old comp.) I assume that when each innermost slice is done its array can be marked for sweeping since it is not going to be used anymore, and so on upwards.
@alexpyattaev7 ай бұрын
@@Nenad_bZmajOh I see, with more memory allocated in general it just runs GC more often. That is crazy, so the more RAM you use the slower your code runs (as it does more pauses)? Brrr....
@Nenad_bZmaj7 ай бұрын
@@alexpyattaev It is highly optimized, though, so it uses the least number of cycles possible for the amount of garbage it needs to collect. Plus, Go gives you options for GC so you can optimize it for your specific program during the build. If your program approaches the memory limit, the Go GC will take that into account when calculating how often it must sweep, and adjust its cycles so you don't run out of memory. You can opt out of this behavior. When you have enough memory, GC doesn't take a large fraction of total time even if you have a lot of garbage. If you have a multithreaded CPU, then you almost won't notice the GC.
@mohamedaityoussef99657 ай бұрын
Blue hair gone nooooooo
@lancemarchetti86737 ай бұрын
Nice
@dixztube7 ай бұрын
Man, the new React/Next stuff sucks! I just popped out an htmx and Go project. Silky smooth, fast and sexy. Been bumming around this React nonsense for two months now after getting decent at the last version of React
@jony17107 ай бұрын
ITT nobody knows what the stack is
@gavipk6 ай бұрын
solution: c++
@ofekshochat99207 ай бұрын
arr[a][b][c] = flat[a * second_dim * third_dim + b * third_dim + c]. this shouldn't have hops...
@leeroyjenkins03 ай бұрын
They're slices so you can append to them and thus they can move. They're not contiguous memory. If you want contiguous memory you use fixed-size arrays.
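A sketch of flat 3D indexing, where the stride for each x step is dimY*dimZ and for each y step is dimZ; the dimensions here are made up:

```go
package main

import "fmt"

const dimX, dimY, dimZ = 3, 4, 5

// idx maps a logical (x, y, z) coordinate into the flat backing slice.
func idx(x, y, z int) int { return x*dimY*dimZ + y*dimZ + z }

func main() {
	flat := make([]int, dimX*dimY*dimZ)
	for x := 0; x < dimX; x++ {
		for y := 0; y < dimY; y++ {
			for z := 0; z < dimZ; z++ {
				flat[idx(x, y, z)] = 1
			}
		}
	}
	s := 0
	for _, v := range flat {
		s += v
	}
	fmt.Println(s) // 60: every cell written exactly once
}
```

Because the whole structure is one contiguous slice, the innermost loop walks memory with stride 1, avoiding the pointer hops of a `[][][]int`.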
@bloomingbridges7 ай бұрын
🐼
@nevokrien957 ай бұрын
Pandas is slow use numpy
@BlitzDjingga7 ай бұрын
This is a really shocking result, but I do believe the second one is the better approach most of the time, while the third one is so super-optimized that it is tiring to code and maintain. Too bad the video doesn't show memory performance for this. I believe what is happening is that the second and third ones operating on the HEAP makes the performance slower, while the first one is lucky enough not to encounter memory overflow. The thing is, when you do `range` it makes a copy of the object; albeit for a 3D array, maybe the host machine has RAM big enough to account for all of that iterating. It would be more educational if it showed the garbage collection and memory/space performance for each of the methods.
@ryankimeu90137 ай бұрын
i am an arch user
@uzbekistanplaystaion4BIOScrek7 ай бұрын
that's so poggers
@RT-.7 ай бұрын
Here's your trophy
@cyrusturner611111 күн бұрын
,yamv
@pyaehtetaung7 ай бұрын
Black?
@deathzombee7 ай бұрын
First
@adjbutler7 ай бұрын
But Python is strongly typed and JS is weakly typed... why do you hate on Python so much? It's not as slow as it used to be....
@svetlinzarev34537 ай бұрын
Golang is such a crappy language
@spartan_j1177 ай бұрын
Hahahaha Prime! So you DO reupload your old videos! Now that's lame...