Memory, Cache Locality, and why Arrays are Fast (Data Structures and Optimization)

Рет қаралды 305,403

Күн бұрын

Why is the first loop 10x faster than the second, despite doing the exact same work?
Follow me on:
Twitter: / iced_coffee_dev
Github: github.com/simondevyoutube/
In this video we talk a bit about memory, the cpu caches (specifically the L1/L2/L3 cache), hardware prefetching, and how all this comes together for arrays. We'll be covering each of these topics in enough detail for you to get a solid understanding, working through real world examples to help illustrate the point. We'll talk about the most fundamental data structure, arrays, how they work, what situations they're great in, and when they suck. We'll also touch on some algorithmic complexity. Finally, we'll be talking about why understanding this is important and how this leads in to more advanced topics and data structures.
What's covered:
* How memory allocation works, memory addresses
* Contiguous memory
* CPU Caches, L1/L2/L3 cache
* Hardware prefetching
* Array operations, what's fast and what isn't
* Closing thoughts, why this is important to understand, how this relates to more advanced data structures

Пікірлер: 385

@simondev758 3 жыл бұрын

If you enjoyed this, help support me: www.patreon.com/simondevyt As for the question at the beginning, as a first hint, look at how the indices for the arrays are generated in both loops. Write out a few, and think about the pattern in memory. Actual answer below. Both loops touch all memory exactly once. First loop generates indices sequentially (ie. 0, 1, 2, 3, 4, ...), meaning memory accesses will also be sequential. Second loop uses a large stride (0, 4000, 8000, 12000, ...), so cache misses will be common.

@simondev758 3 жыл бұрын

@@heinzerbrew It's the indexing, just unroll the loop a few times to confirm. Sequential, predictable memory access vs jumping around, main topic of video. Php? Don't know much about it, other than it's an interpreted language. Guessing the overhead of that drowns out other things.

@simondev758 3 жыл бұрын

@@heinzerbrew Again, you can't use php to test this. And this has nothing to do with memory allocations. Additionally, please don't use the pinned comment for this.

@hww3136 3 жыл бұрын

would a modern c/c++ compiler be smart enough to detect and optimize it?

@simondev758 3 жыл бұрын

This is a pretty contrived example. I tried to think of the absolute, most basic thing that would illustrate the point without needing to build a lot of extra boilerplate code around it. But in this example, using "gcc -O3" doesn't optimize it, but gcc -Ofast does, because it disables strict standards compliance. So it does sorta catch it, but is held back because you have to explicitly tell the compiler that you don't care much about the order. Remember that floating point operations are NOT associative, so it tends to be the compiler reordering them without explicit permission from the programmer is a no-no. V8 doesn't seem to catch it, but can't say for sure why. So the difference in timings between the loops should still largely come down the difference in memory access patterns. We could make the setup a little more complex with some structs with useless info to pad them out, or an array vs linked list, or something to that effect. Maybe it would have been a better choice? Would be more legit, but harder to parse for beginners. I dunno, I'm learning how to present this as I go heh.

@Victor_Marius 2 жыл бұрын

@@hww3136 what if you intend to iterate that way? You'll get a bug

@xeridea Жыл бұрын

You didn't explain why the original code posted was 10x slower. The reason being that the method in which the array was effectively traversed was not sequential or easily predictable, causing a lot of cache misses.

@User-ty2ml 2 жыл бұрын

Best Director Award goes to Simon, i have never seen such a beautiful explanation without answering question, amazing!!!!

@simondev758 2 жыл бұрын

hah

@vitalino1981 3 жыл бұрын

"let's head to apple store. I don't actually own one, because they are super expensive and I'm super cheap, but I like to look at them" 😂🤣😆 Man this was soooo relatable on so many levels 😂

@WojtekKozowski 3 жыл бұрын

“why the code on the top runs faster? watch the video and find out” ...and no mention of the code again. Otherwise nice video.

@simondev758 3 жыл бұрын

Hah yeah, I feel kinda stupid for not looping around, live and learn. I'll continue the series and make sure I actually answer it next time!

@metagen77 3 жыл бұрын

@@simondev758 Nah you managed to explain without relying on code. If anything cut the intro.

@bonehelm 3 жыл бұрын

One is accessing the array in row major order, and the other is accessing it in column major order. The top one is faster because the memory is being accessed in a contiguous way, which means you'll get a lot of cache hits (very fast, going to main memory much less). The bottom is is NOT accessing memory in a contagious way, meaning every access is a cache MISS. en.wikipedia.org/wiki/Row-_and_column-major_order

@tagg1080 3 жыл бұрын

@@bonehelm maybe I am blind but the only thing he switched was the + and *

@bonehelm 3 жыл бұрын

@@tagg1080 That's exactly the only change. He's using a formula to treat a 1D array as if it were a 2D array. But the order of + and * determines whether it's accessed row by row, or column by column.

@Chadderbox 3 жыл бұрын

I feel like this is going to be an awesome series, especially for people like me who learnt programming by themselves and never really had a formal education teaching them about what parts of your code does under the hood.

@simondev758 3 жыл бұрын

I did study computer science, but none of this was covered. This is mostly from years of experience learning at companies and applying these optimizations.

@carterearles9528 3 жыл бұрын

Me too!

@Ainigma 3 жыл бұрын

@@simondev758 i share this experience. in computer science, we also never covered that. only when i had to do projects and facing performance optimization problems, i was motivated to tackle the fundamentals of data structures and how to squeeze more and more efficiancy out of my code.

@barddz4646 3 жыл бұрын

@@Ainigma wait really? I’m at the first semester of computer engeneering and i’ve alterady learned about this, Idk if the way they teach is different here in Brazil or not

@simondev758 3 жыл бұрын

Different universities teach different curricula. I went to a university that's considered pretty good (ie. often included in top lists), the focus was much more on theory and mathematics. I mentored a few Stanford grads while I was at Google, came off with the same impression. Super strong bases in theory/math. Also, computer engineering != computer science. I can totally believe that an engineering oriented curricula would learn way more about the underlying hardware.

@sukkrad7797 3 жыл бұрын

This cleared up so many things that my professor failed to teach me (or even understand it himself) We truly need more videos explaining how the cpu works when we code Thank you so much!

@simondev758 3 жыл бұрын

Nice, lemme know if there's other subjects I should cover too.

@Rssks 3 жыл бұрын

Wow this video is almost 10 minutes but it felt like 1 minute! I really enjoyed it (Y) will definitely rewatch this later today!

@simondev758 3 жыл бұрын

Awesome!

@Provitia 3 жыл бұрын

The video became x10 faster for you

@psychopompous489 Жыл бұрын

I was jumping around randomly and it felt like 10 minutes

@AllAboutCode 3 жыл бұрын

4:20 you ain't cheap bruh, your content is way better than expensive internet courses.

@simondev758 3 жыл бұрын

ty! If you have suggestions for future vids, let me know.

@AllAboutCode 3 жыл бұрын

@@simondev758I'd love this series just keep doing it

@valencefootball9740 3 жыл бұрын

hahaha weed number

@masterofdizzzaster 3 жыл бұрын

@@simondev758 I hope you will grow so much that you will be able to afford one of those macs without feeling its price, i watch your videos for quite some time now and you deserve it.

@cuteswan Жыл бұрын

I dutifully paused the video at the beginning and took _way_ too long to spot the differences between the code samples. (I've never been much of a programmer.) After getting it I still learned a lot more from your discussion on the caches and all that. Thanks.

@MrLazini Жыл бұрын

This channel is pure gold man. Although I already know most of these concepts, you explain them pretty clearly and in a very understandable way ! Well done friend, subbed!

@osambrojevanisam3575 3 жыл бұрын

This has got to be the best educational video on youtube, fuckimg amazing

@simondev758 3 жыл бұрын

Awesome, glad you enjoyed it!

@miteshkumar5557 3 жыл бұрын

Arrays are also slow when you have multiple threads, each bound to its own CPU, trying to access a global array. Suppose you have an array of size 4 int32_t, and 4 threads labeled 0 to i. The i th thread accesses the i th index in the array and modifies it repeatedly. This will cause major slowdowns as each modification will mark the cache of the other 3 threads/CPUs as modified, and each thread will have to do extra work to update their cache, even though the value that the thread cares about isn't being modified. This essentially makes the parallel code sequential. A linked list would be much better to use here since the memory locations will not be adjacent to each other, or each thread can modify a local variable and only write to the array at the end.

@xeridea Жыл бұрын

I have code that uses multiple threads to generate large arrays using perlin noise, and get near perfect scaling with threads. If you were doing something that multiple threads had tons of modifications, the L3 cache would still be valid, as that is shared between cores, so it wouldn't be that big of an issue. The bigger issue would be multiple threads constantly updating the same data causing race conditions, or bizarre behavior due to how memory reads and writes work.

@thegibusguy4969 Жыл бұрын

Hey, thanks for giving me a good explanation on what cache hits/misses were. I've seen it being tossed around but never knew what it meant until now.

@EpsilonKnight7 4 ай бұрын

Posting this for anyone else out there like me. As someone who knows JUST enough code to be dangerous, I feel dense for not getting it sooner. My brain absolutely looked at the two for loops, saw them moving through the same numbers, and even though I recognized there was a difference in the last line, my thought process collapsed to "alright so you move through all the same array positions eventually," and I spent ten minutes trying to figure out, on my own, why changing the order of operations would make it faster while ignoring, y'know, the actual result of iteratively running the loops. The first comment I saw mentioned row vs column major ordering, and even then, I thought "that's silly, it's a 1D array!" and failed to make the connection. Took me seeing someone list the resulting indices in order to finally click. Despite so much of video focusing on contiguousness. So for anyone else that did the same, we're a special kind of stupid, but at least not a unique kind.

@simondev758 4 ай бұрын

It's hard, which is why optimization is often a specialty.

@kosnk 3 жыл бұрын

An amazing series start, Simon! Thank you for this dive into memory allocation/access details, very helpful! Surely looking forward to a (contiguous?) continuation and your other quick projects & cool tips! Very kind of you to put in so much work and share it. Again, thanks!

@simondev758 3 жыл бұрын

np, glad you enjoyed it!

@katarinazivkovic3102 3 жыл бұрын

This is explained so clearly, I'm now going to binge watch your videos 😃 Thanks to YT for recommending me your channel. Subscribed! 🌟

@eddiemuller3157 3 жыл бұрын

Good Stuff, my man! The explanation is very clear and the dry humor is appreciated. I went into this video not knowing what a contiguous array was and now I have an idea of what it is.

@eddiemuller3157 3 жыл бұрын

Question; I understand that the L1, L2, and L3 caches reside on the CPU itself. Is there a typically a controller on the CPU like a northbridge? Or is it the CPU itself that access the caches hence the low latency; no middle man.

@simondev758 3 жыл бұрын

That's awesome! That's right, caches typically reside on the CPU. They're also built with a different kind of memory (SRAM), faster. Northbridge is the interface between CPU & main memory and resides on motherboard. That's my understanding anyway.

@danishfayaz8077 3 жыл бұрын

Absolutely loved it. Just the right way to explain a concept top to bottom. Keep it up

@tonnose 3 жыл бұрын

I don't remember I watched this video before until I saw the L1/2/3 Cache example. The drawing is making the video so interesting. Now I come back for the new videos after finishing my algorithm final exam!

@nathanp6928 3 жыл бұрын

I have a final tomorrow over this stuff and this video just happened to pop into my recommended. Probably cause I was googling virtual memory, caching, etc. Super informational video, and I was able to study and be entertained at the same time. I’m a fan 👍🏼

@simondev758 3 жыл бұрын

Good luck on your final!

@Greatbubba747 3 жыл бұрын

This is a very great high-level view of arrays and CPU caching! I wish I had this video as an introduction when I was taking my Computer Systems class in college!

@mahima7638 2 жыл бұрын

Loved this video! Will enjoy watching the whole series definitely.

@pist5343 3 жыл бұрын

Awesome! Looking forward to this series :) It's really not easy teaching about memory and data structures without the audience falling asleep, but you've done a great job! Unfortunately JS arrays are messy and not exactly contiguous... :(

@simondev758 3 жыл бұрын

Hah yeah, I actually had a small section on JS arrays in there just to clarify about them, but cut it due to time (and laziness). You're totally right, they're non-contiguous "array-like" objects. The TypedArray is closer to what you'd get in C++ or something.

@pranjalagnihotri6072 2 жыл бұрын

I love how he goes into depth of things, please continue your great efforts sir 🔥

@Dionny 3 жыл бұрын

But why is the code example from the very start of the video faster than the other example? I feel like that wasn't covered or I missed the point.

@simondev758 3 жыл бұрын

Youness covered it well, but I left it open as sort of an "exercise for the viewer". Kinda wanted the video to be in depth enough that you could answer it yourself.

@mike.emery. 3 жыл бұрын

@@simondev758 Leaving as an exercise to the viewer is fine, but as a suggestion for the format it would be good to circle back to the question at the end of the video and call it out. The way it's structured now it seems like the question is just forgotten about.

@simondev758 3 жыл бұрын

Great feedback Mike, 100% agree! Tried it out but next time I'll make sure to close the circle.

@bonehelm 3 жыл бұрын

@TiDuZzO 3 жыл бұрын

@@bonehelm I still don't get it, both FOR starts looping on x and then y. So both should have the same speed. The only difference here is that on the first one you do the multiplication before the addition... And on the second one the other way around... And it doesn't seems to be a continuous problem here....

@aufkeinsten7883 3 жыл бұрын

you are made for creating videos like these, I could watch this for hours, thanks so much for the content

@kik8bg 3 жыл бұрын

A simple and interesting explation of arrays, I came here to watch this video just to make sure I knew exactly how everything works. Having hardware classes with programming coursesreally, does help out a lot to understand computers, both on hardware and software level.

@mahmoudhammmad8089 3 жыл бұрын

AMAZING!!! Waiting to watch the rest of this series

@simondev758 3 жыл бұрын

thx!

@honey661 3 жыл бұрын

Even if I know this topic, your video was so interesting that time has passed unnoticed and I watched whole video. Wow

@pauljordan3064 3 жыл бұрын

Great explanation. Clear and concise. Thanks!

@boxicool 3 жыл бұрын

Man. Your channel (even i don't use js so much) is freaking awesome!! I would buy your course or book without hesitation! Thank you!

@simondev758 3 жыл бұрын

No courses at the moment, I just put it all online for free :)

@als5321 3 жыл бұрын

This was amazing. THANK YOU. More like this for Javascript. Keep it coming and don't stop!

@mehmedkumali1135 3 жыл бұрын

Very helpful video, you explained everything clearly.. Reminded me of my university teacher back in the day. Thank you for the upload!

@simondev758 3 жыл бұрын

Awesome, if you have suggestions on topics to cover too, let me know

@pinekel8987 Жыл бұрын

Thank you so much for creating this series, this is what ( even now ) I think more programmers ( and sometimes myself ) should pay attention to.

@sva3338 3 жыл бұрын

Holy f. This is so good, youtube finally recommend me something that worth my time. Please, release more episode!!!

@simondev758 3 жыл бұрын

Second video in the series is the linked list one.

@shaurya9055 3 жыл бұрын

If you'd have been my tuition teacher I'd have been a topper in my class Your way of explaining is so cool man👌 I'm inspired

@BipinOli90 3 жыл бұрын

Loving this series man. Please keep going. Would love to see more of real world optimization like you did previously. But yeah fundamentals need to be cleared up first. Btw I remember there was one cpu level hack which exploited this eager loading behavior in cpu. Some private part of memory was located by measuring the access speed, if the access was faster that meant the memory location just after current location was already cached by the cpu even though later on it was restricted to actually use due to other privacy constraints.

@simondev758 3 жыл бұрын

More real world optimization? Sure, I can probably swing that. Sounds a lot like Meltdown: en.wikipedia.org/wiki/Meltdown_(security_vulnerability)

@BipinOli90 3 жыл бұрын

@@simondev758 ah yeah, its Meltdown. Thanks man. Love your videos.

@nissieln 3 жыл бұрын

Superb series! thank you for the content!

@JNunoCoast 3 жыл бұрын

Already knew these concepts but still an amazing video, definitely helpful to cs students!

@tomi.mocnik 3 жыл бұрын

Got it as a recommendation, instantly subscribed.

@RobertAlexanderParraCiro 3 жыл бұрын

Problably the best vide have ever watched about how arrays works

@bakerfx4968 3 жыл бұрын

I wasn’t ready for it to be over!

@shivankchopra8552 2 жыл бұрын

Thanks for sharing this amazing tutorial!

@amimulahsan1289 3 жыл бұрын

Wow, man. I mean literally you just made it simple; not like other people who go through so much technical stuff and use buzz words.

@simondev758 3 жыл бұрын

I mentored a junior guy recently who did that, spoke entirely in jargon and buzzwords. I always wondered if he specifically practiced doing that.

@closery 3 жыл бұрын

Sir, your channel is a gem! thx for these videos

@binaryburnout3d Жыл бұрын

Here king, you dropped this. . .

@djflugel79 3 жыл бұрын

8:24 the animation made me smile

@legendary9145 3 жыл бұрын

your videos deserve a way bigger audience

@Chapali9a 3 жыл бұрын

Learned a lot. Thank you

@stLmpp 3 жыл бұрын

Your videos are amazing! Thank you for this content

@afsarzan 3 жыл бұрын

love your explanation and examples..

@ezequielgarrido51 3 жыл бұрын

Great series!

@xicheung7867 3 жыл бұрын

Your content PLUS the video production = 🙌 SUBSCRIBED

@yourfutureself4327 Жыл бұрын

6:02 snipe'ns a good job mate

@yac7571 Жыл бұрын

Great job, I like your style. Easy like and sub 💯

@_DD_15 2 жыл бұрын

Nice video, thanks for that!

@rewrittenbytes1616 3 жыл бұрын

Arrays are fast, except when they aren’t ;) Great video!

@simondev758 3 жыл бұрын

Well said!

@inconito7281 3 жыл бұрын

This video was amazing, you have my subscription

@perelmanych 3 жыл бұрын

For all those of you who are wondering why he accessed 1d array like if it is 2d, here is the real world example. Assume that in C++ you write a class that represents a matrix. You can go with 1d array or something like vector< vector< int > >, which you can think of as being 2d array. So why the hack then to use 1d array. There are many pros of using this approach. One of them is much faster work of constructor and destructor. Assume that you need to allocate memory for NxM matrix m. With 1d array it is one call of malloc (which asks the system at allocate a chunk of memory for you). With 2d array you should call malloc N times, once for each row. The same logic applies when you want to free memory. So now once we chose to represent our matrix as 1d array, assume that the user of our class wants to sum all of its elements. The user has no idea how did we choose to represent the matrix under the hood, he only knows that he can access an element of the matrix by using operator() as m(x, y). Your operator() in turn will have to calculate the position of an element by evaluating y * M + x and in the essence this will be exectly the same code as he showed in the video.

@simondev758 3 жыл бұрын

Awesome explanation, exactly what I had in mind when making the example.

@AviPars Жыл бұрын

Nice video ! Thanks

@raimanish4843 3 жыл бұрын

this was very helpful. thank you

@simondev758 3 жыл бұрын

np!

@bwanamaina 3 жыл бұрын

Well explained 👏👌👍

@IrwinRodriguez 3 жыл бұрын

excellent explanation! new sub!

@IgorSadovskii 3 жыл бұрын

Omg is very useful, thanks a lot !

@NamLe-to2db 3 жыл бұрын

I love this. Thank you for sharing.

@simondev758 3 жыл бұрын

thx

@hamsam789 3 жыл бұрын

This was really clear and easy to understand, thanks

@giorgibatsiashvili4270 3 жыл бұрын

i found this channel and its briliant

@hardwired89 3 жыл бұрын

more js content sir love it❤️ thank you

@almazmusic 3 жыл бұрын

Okey, you got me :) awesome videos!

@calebkrauter4027 7 ай бұрын

Great video!

@trieutruong2410 3 жыл бұрын

Great video. I’m your fan for now

@piyushjaware 2 жыл бұрын

Hi Man. Very nice explanation. I was curious how you made the animations in your video.

@jonasls 3 жыл бұрын

I want more!! Can't wait

@simondev758 3 жыл бұрын

thx!

@obsoletepowercorrupts 3 жыл бұрын

There are pros and cons to this worth noting. It can be a valid form of redundancy to have an array as a 'look-up' to speed up data coming from a query in a relational database (rather than hitting the query over and over). That helps performance sometimes. However, the cache vulnerabilities can be exploited. An array does not have the same benefits of a linked-list in terms of being able to specify pointers for pass by address versus pass by reference. Also though, in some languages a linked list can be a problem as it needs objects, and that means garbage collection (Java) can happen at anytime, thereby making performance hard to predict and sometimes rather stuttering.

@prodsefv Жыл бұрын

thanks simon

@DARKCOP2011 3 жыл бұрын

This is super nice!

@gligoradrian784 3 жыл бұрын

Took me a good minute to find the difference between the 2 codes.

@agsystems8220 Жыл бұрын

Great video. You may have forgotten to revisit and explain in detail the code at the start though...

@simondev758 Жыл бұрын

Yes... yes I did heh. Answer is pinned.

@ahn_buguei 3 жыл бұрын

Thanks! I couldn't understand cache misses or contiguous arrays before your video. It felt like 3Blue1Brown of Computer Science

@ahn_buguei 3 жыл бұрын

I just applied this on my internship and I solved a serious bottleneck!!! Thanks Simon

@simondev758 3 жыл бұрын

That's really cool! What kind of problem were you solving?

@ahn_buguei 3 жыл бұрын

I optimized some existent numerical C++ code (for signal processing) from 420ms per iteration to 45ms, but the goal was 30ms per iteration. There is a giant nested for loop that was taking 44ms to run. So I applied the change specified in the video to reduce cache misses. Now it's taking 22ms. Maybe I could further optimize it by writing a special data structure (probably just a linked list), but that's good enough for the manager by now and he has other priorities for me.

@mfam98 2 жыл бұрын

Had a chair on advanced computer architecture, this covered some weeks on a 10 min video. Wish I had this kind of resources back then. Also had to make a simulator with several caching sizes and strategies, good times 😂

@simondev758 2 жыл бұрын

Ooh that simulator sounds kinda fun to play with.

@irfanukani 3 жыл бұрын

Awesome content ❤❤

@nkululekomthimkulu1248 2 жыл бұрын

I just subbed. I will be checking out your videos.

@harshitjoshi3082 3 жыл бұрын

Great video 🔥👍

@simondev758 3 жыл бұрын

thanks!

@HersonBagay 3 жыл бұрын

My only formal training in programming is an elective course in C in college. I've loved it so much I end up writing device drivers for baremetal code running on top of ARM Cortex M processors as a hobby. There is no substitute for learning the fundamentals. It pains me to see some universities starting Computer Science students with a course in Java...

@simondev758 3 жыл бұрын

My first exposure to programming was Java in my intro to computer science class heh :P

@HersonBagay 3 жыл бұрын

Some people are just naturally good at logic and programming. We can count ourselves in that group

@abdelrahman5094 3 жыл бұрын

Great explanation, thanks for your effort I hope to see future videos on how javascript garbage collection work behind the scenes

@simondev758 3 жыл бұрын

Ooh that'd be an interesting subject.

@BipinOli90 3 жыл бұрын

I think it uses the reachability idea. If there is no way a part of memory can be reached it will be garbage collected. They didn't use reference count approach because that doesn't handle 2 pieces referencing each other but are completely isolated and not reachable.

@tuckerbeauchamp8192 3 жыл бұрын

Your videos are awesome

@vishwaratna 3 жыл бұрын

I liked this channel so far. Keep making please...,🤩

@tomg0 3 жыл бұрын

Awesome Stuff!

@simondev758 3 жыл бұрын

ty!

@apoorvaranjan9921 3 жыл бұрын

This blew my mind

@FullThrottleTaco Жыл бұрын

This video kicks ass.

@lilwage6910 3 ай бұрын

How does the processor look up data in the L-caches? I thought it was looking not for specific data, but for whatever data is in a specific data address. Do the L-caches keep the data alongside the address of that data in memory, where it came from?

@aj35lightning 3 жыл бұрын

Are we working our way towards ecs? ...I'm getting those vibes haha. This also reminded me of a channel 9 talk I recently saw. "Native Code Performance and Memory: The Elephant in the CPU" Blew my mind the importance of cache misses and structuring your data in continuous chunks

@simondev758 3 жыл бұрын

Oh yeah definitely want to cover ecs at some point, why major companies are going that route, etc. Lot of ground to cover between beginner and that level though heh.

@abdelrahman5094 3 жыл бұрын

Esc stands for what?

@simondev758 3 жыл бұрын

@@abdelrahman5094 Entity-Component-System

@Skyefaux 3 жыл бұрын

More of this please

@simondev758 3 жыл бұрын

Working on the next one right now actually.

@staplepin8099 3 жыл бұрын

Amazing channel

@simondev758 3 жыл бұрын

Thanks!

@dibaterman Жыл бұрын

Hur hur hur, I actually understand this stuff now. =P. I've never used Hash before though. In my use case (targeting enemies near the player), I followed Unity's design for this; they have a new array each frame that grabs all the colliders on a specific layer around the player. What I did was change that to a list and cache'd the list also giving it the size for 50 since I didn't see the game having more than 20 enemies in the scene at once within a detectable range. Lists are slower than arrays, about 10x to 100x slower depending on what I'm doing (Or so I read). I'm hoping doing it the way I did it by initializing the list in memory with 50 spaces for things makes it faster... The result is it did, changing over enemies and the player to this system gave a marked increase in performance. =D Happy Ending. Why use List over the built in Unity array? Well Lists are easier to manipulate, and later on I want to put things in order of their vector 3 magnitude with the player. So plucking things from one list and adding it to the other while possible with Array to List allows for a bit more backwards checking if things are alright as I'm developing.

@Otakutaru 2 жыл бұрын

why does the code on the top run 10 times faster? I thought you were going to answer it in this video. By looking at it I can deduce that it has to do with the nested loop. The first code goes in contiguous 4.000 data points, then moves onto another 4000 batch, which should be already cached. The second skips between batches, grabbing the first from all of them, then the second, etc. I don't see why it wouldn't be prefetched if it's repeteable.

@clearsky4049 Жыл бұрын

Yup, the bottom code block is leapfrogging by 4000 positions in every iteration of the nested loop but the top code block accesses the array sequentially. Its basically the difference between grabbing one sandwich from the plate and eating it completely before moving onto the next VS taking a bite from a different sandwich every time until your plate is clean.