This video really helped me with all the naked humans with braces who are not bees.
@tristanboyle4450 18 days ago
🥱🥱🥱
@grimvian 23 days ago
Interesting video. Eskild Steenberg addresses a lot of these topics in an 8-year-old video, How I program C: kzbin.info/www/bejne/amWWhoGbfNd5pa8
@r_j_p_ a month ago
Basically write your code in Fortran.
@eggmeister6641 a month ago
Where can I access the slides from this talk?
@b1zzler a month ago
This is all great, but I can't help but think of how much more painful it would be to extend and refactor code like this without breaking everything.
@dreanwave a month ago
Throughout the whole talk I was thinking: What about bit fields?
@anonymousperson420 a month ago
Worst thing about the internet, it makes me feel old. This guy started on the sega genesis? My goodness that came out ages after my first game.
@Izopropilen a month ago
So after all these stand-ups, Andrew Kelley sits on stream talking about how Zig has so many bugs because its compiler is slow. DoD didn't help?
@wegvS 2 months ago
I'm a web developer and will probably never use any of this stuff, but it's super interesting!
@ZarviroffSerge 2 months ago
Brilliant!
@EmberMage8192 2 months ago
One thing I noticed about booleans is that in some cases people use them to duplicate implicit information, which means such booleans can be replaced with a function. For example, with the monster struct, you can actually detect whether a monster is alive by simply checking if HP > 0 (unless your game logic allows a monster to be alive otherwise, of course). Sometimes booleans duplicate the implicit state of a nullable field. For instance, I've seen a user account record with an email and a flag called activated. But an account was only considered activated if it had an email, so you can just delete the boolean and use an email != null check.
@masterchief1520 a month ago
My previous company had that lmao. Mobile verification and email. It was a fucking integer field 🥸
@petevenuti7355 2 months ago
How do i get a system to treat a hard drive or flash drive AS main memory, fool it into thinking the storage device is RAM, then treat the ram as l2 cache? Can it be done with a driver or does it have to be done in the firmware?
@rangeramg 2 months ago
isn't this just swapping/paging? why would you want to do this because it will be insanely slow
@petevenuti7355 2 months ago
@@rangeramg No, it absolutely would not be swapping/paging. I am trying to avoid paging altogether and just use a block device directly AS RAM! I'm trying both to bypass motherboard limits on supported RAM and to make programs believe I have more physical RAM than I would even have room to hold the page file for when swapping! Asking if it was the same thing makes me wonder if you watched the video? Sorry for the sarcasm, but I am tired of people repeating that same response when I'm explicitly saying that's not what I want to do; it's very frustrating.
@rangeramg 2 months ago
@@petevenuti7355 If that's the case, I think the only way to accomplish it is by modifying the operating system itself. Not sure if L2 can even be changed (I doubt it, since it's built into the CPU itself). The system that enables paging essentially treats it like extra memory in the address space, so I'm not really sure what else you want beyond that. You can just allocate a whole device as swap space and use it as memory in Linux, so there's that, and you can probably limit usable memory to almost nothing to emulate what you want to achieve. And yes, I did watch the video in full.
@petevenuti7355 2 months ago
@@rangeramg Yeah, I'm not sure how the MMU on the motherboard and the memory management of the operating system work together or are capable of overriding each other. I was hoping somebody would know. I remember really old systems where you could put RAM on a card on the bus. As for the actual L2 cache, you are most likely right, as it's in the CPU chip; however, if the operating system has anything to do with it, such that it's not just microcode on the CPU that handles it, it would be cool if any real RAM that was available could be used in that role.
@EvilTim1911 2 months ago
Man, this guy is pretty good at Zig, he should join the team developing the language
@emmanuelbyiringiro7207 27 days ago
He's the creator of Zig hahahaha
@Multigor96 19 days ago
@@emmanuelbyiringiro7207 r/whoosh
@shadyabhi 11 days ago
@@emmanuelbyiringiro7207 That was the joke here. :D
@AhmadAli-kv2ho 2 months ago
Can someone correct me, I'm new to this: is he saying that it's 1+4+3+8+3+4+1, which adds up to 24? Or is it not 1 and 3 for new lines?
@khatdubell 2 months ago
The most important takeaway from this is that he doesn't know what OOP is.
@maskettaman1488 3 months ago
As neat as some of these tricks and techniques are, I am absolutely rejecting your PR if you do pretty much anything talked about here. Code quality is wayyyy more important than technically saving a few bytes
@rusi6219 2 months ago
@@maskettaman1488 you mean dividing perfectly readable functions into 5 classes and 25 methods LOL
@maskettaman1488 2 months ago
@@rusi6219 No. I mean not jumping through hoops and writing arcane code so that you can have a bool and an int share a byte. There are very, very few cases where that kind of optimization is necessary, considering the quality of the code suffers greatly as a result. But also, yes, you should be breaking up functions into smaller units. If you have 1 function that does 20 things... you should probably look at breaking it up and organizing things a bit.
@rusi6219 2 months ago
@@maskettaman1488 hahaha "quality of code" is about the last thing a customer is thinking about as he's staring at a frozen screen due to fifty destructors destructing other destructors
@maskettaman1488 2 months ago
@@rusi6219 I can only assume you're new to development in general if you think "fifty" destructors would even be perceptible to an end user. Have you never seen a stack trace from a modern web server or something? Also, destructors have literally nothing to do with cryptic bit tricks to save a few bytes on your data structures like the video is discussing You really seem to not know what you're talking about
@rusi6219 2 months ago
@@maskettaman1488 I can only assume you're new to development in general if you think "cryptic bit tricks to save a few bytes" has no effect on how well software runs at scale You really seem to not know what you're talking about
@onsearchfocus 3 months ago
Great talk. No fluff. Just gold!
@Ariccio123 3 months ago
I'm not a zig programmer but I was very familiar with llvm for a while.... This is gold!!
@RobertGardnerEngineer 3 months ago
"I programmed my compiler like an ECS based video game and it dramatically improved performance" was not the talk I expected to listen to today. That is a cool take. It makes me wonder if there is value in a language where this is how it stores things and processes under the hood. Like, can you write a language where ECS is not *a* paradigm, it's *the* paradigm.
@pawello87 3 months ago
I have doubts whether the monster example is relevant in the context of DOD. Let's say I want to update the position of all bees. I would have to iterate through the whole set and use an if condition to pull out only the instances I am interested in. That is still clogging the cache with data I am not interested in (all non-bee monsters). In the DOD approach this would be split into two (or more) separate systems, which would automatically eliminate the need for encoding/tagging.
@nicolas393 3 months ago
great talk
@Marco-sz6mq 3 months ago
Nice talk. I am just wondering if someone could suggest a book that goes into more detail on hardware-related stuff, maybe to better understand compilers and low-level features. For example, like the slide at 5:20
@ChimiChanga1337 3 months ago
Patterson and Hennessy for comp arch; the Dragon Book for compilers
@Marco-sz6mq 3 months ago
@@ChimiChanga1337 thank you
@yapayzeka 4 months ago
6:16 very important gotcha! "math is faster than every memory operation. maybe you should repeat the math instead of memoizing the result."
@michaeln.8185 4 months ago
Great talk! I look forward to using some of these strategies. Open question: 18:21 In languages with compilers that support #pragma pack, is there any advantage to storing booleans out-of-band? Assuming non-sparsity of the boolean field would you still not incur cache line misses regardless of whether the bool is in the structure or an array outside, since the cpu is forced to load the same # of bytes?
@PassifloraCerulea 2 months ago
The whole point of storing struct fields at their natural alignment is so the CPU doesn't have to do extra work. Back in the 90s, top-of-the-line RISC CPUs might even crash your program or trigger some OS interrupt to handle unaligned access. I believe x86 tends to be more forgiving about this, but I'm not sure. It is likely that your compiler will go to great lengths to avoid unaligned memory access, using multiple smaller reads or writes instead. As always, looking at assembly output and benchmarking your particular program will be necessary.
@tacticalassaultanteater9678 4 months ago
Right, I guess my intuition about performance can go out the window now, because all of those steps sound incredibly slow
@zxuiji 4 months ago
3:09 nice :)
@tordjarv3802 4 months ago
Damn, I should learn Zig
@yash1152 4 months ago
30:10 the plug i was looking for
@yash1152 4 months ago
20:27 The speaker image covered the content, and I don't understand this example either.
@yash1152 4 months ago
20:52 Ohhh, it's about a language feature where the lang allows creating an array for each member of the struct.
@torgnyandersson403 4 months ago
About the lexer and retokenizing: that's unnecessary work for strings and other tokens where the length isn't the same for all tokens of that type. Instead, just have one token for "begin string literal" and one for "end string literal". That's what I've done in my own projects. You get a tiny token struct and no retokenizing.
@sadhlife 4 months ago
16:03 Zig has them now
@Fedor_Dokuchaev_Color 4 months ago
Bit fields? You don't have to use multiple arrays and enums, which make the code less readable. struct Monster { unsigned long long life_points : 63; unsigned long long is_alive : 1; }; That's it. You can pack a lot of data if you know what range of values is maximum/minimum in each case.
@ratakor 2 months ago
Or in this case just check whether life_points is greater than 0, but this is a very simple example. For the Zig compiler I think Andrew's strategy is good.
@minastaros 4 months ago
Hello, bitfields?
@YmanYoutube 4 months ago
Why the fuck does a talk about an abstract CS concept have 67k views in 3 weeks? This field is GIGA oversaturated.
@rt1517 4 months ago
In my opinion, this approach has an impact on code simplicity and maintainability. On one side you have a structure that contains what you are interested in. On the other side, well, it is more complicated. And I would even say more bug-prone and harder to debug.
Also, he focuses only on the size of the data, while the way you access the data is also very important. For example, it is supposed to be way faster to visit all the elements of an array than to visit all the elements of a linked list. With the array, the memory accesses are very close to each other, so you stay in the cache and the processor can even predict which memory address will be accessed next. With a linked list, you are accessing all over the place, so most of the time the next item won't be in the cache.
Don't get me wrong, it is very nice that Andrew is trying to make the compiler fast. And there are a lot of slow compilers out there. But I am not sure that many of us mortals need that level of optimization.
@Nors2Ka 4 months ago
I would say Andrew misunderstood the meaning of DoD, and what he showcased here is straight-up optimization, a step most mortals do not have to take, as their code is 1000x slower for other reasons entirely, i.e. using pointers from a linked list to refer to individually heap-allocated items. There is also this funny thing where he optimized the most irrelevant part of the compiler; LLVM will gobble 98% of the compile time regardless (unless Zig can use its own fast backends for debug builds). But I do want to say that the way you refer to "readability" sounds like you don't know what you're talking about, and the "readable" code you would spew out would be anything but actually readable.
@rt1517 4 months ago
@@Nors2Ka For me, readability is about how easy it is to understand the code. There is an infinite number of ways to implement a feature in a program, and some of them are better than others according to some criteria, for example readability or performance. Regarding readability, I would judge it by the amount of time and effort an average developer needs to understand a piece of code. At 22:29, on the left, I instantly know and understand what a Monster is and what it is capable of from its struct definition. But on the right, if I look at the Monster struct, I don't know that it can hold items. And if I find the held_items hashmap, I am not sure about the key of the hashmap. Is it the Monster index in the Monster list? Is the hashmap dedicated to the Monsters, or is it shared with the NPCs or other possible item holders? Should I clean something up in the hashmap when a Monster is deleted? For me, the left version seems a lot easier to maintain and add features to (for example, looting).
@Average-Lizard 4 months ago
I feel the same way. I know in my typical projects (large web and console apps), the code being easily read and maintained is much more valuable than saving KB of memory or a fraction of the time of a single web request. I guess these kinds of optimizations are for specific use cases, but nonetheless it's interesting to hear about and have in the tool belt.
@toby9999 a month ago
I think so. I've spent the last 20 years working on high-performance software where these techniques make the difference between being the best or lagging behind. For those who are developing web front ends or bloatware, or coding in Java or Python, etc., not so much, if at all, in my opinion.
@TheOnlyFeanor 4 months ago
The example at 20:00 doubles the number of memory accesses for each struct (one with the animation id, another one for the monster type), unless the reduction in footprint makes it all fit into a single cache line.
@jaredcramsie182 4 months ago
I was going to comment this as well. Splitting the data like this will always require two cache reads on the initial load.* If the out-of-band technique is applied to more data, then each one incurs an additional cache read. Looked at as an individual access, each application of the technique comes with a guaranteed cache miss.
Data for arrays is stored contiguously in memory. In this example, if you expect to access several elements in a row, then much of the data will already be in cache, because each struct is smaller than one cache line. If you expect to go through all the data in any order (and it is small enough to fit in cache), then this results in fewer cache misses in total, because there are simply fewer cache lines to access.
These benefits can artificially inflate the performance savings of the technique when people benchmark it by iterating over an entire array of data, or test it in a limited context that doesn't factor in the accumulation of external data in cache from other processes, if the data / procedure is not used often.
* For completeness, there are some special cases where you don't have two cache reads:
One cache read: I assume this is what you were referring to in your comment. If you happen to be storing less than 64 bytes of total monster data, 7 or fewer monsters in this example (assuming the 'animation' data is cache-aligned to start with and the 'kind' data is not), then you can fit 3 extra monsters into the cache line, where it could only store 4 monsters per cache line previously.
Three or four cache reads: In this example his data size is a divisor of the cache-line length; if it weren't, there could be a chance of having to perform another cache read for each access.
@Norfirio 4 months ago
This technique is very useful for instances where you do multiple operations on the Monster and you only need one (or a few) fields at a time. For example, in a game you will have an update function and a render function. If you iterate with the padded struct, you waste time going over the anim field in the update function and then waste time going over the kind field in the render function. You just have to be careful of your use case, as with any optimization methods.
@Oler-yx7xj 4 months ago
6:50 using light speed as a measure of how fast an operation executes is funny. It also makes you understand how fast modern computers are: having memory as fast as L1 cache 1 meter away from the CPU is literally impossible
@policyprogrammer 4 months ago
Avail yourself of bitfields. They're part of the standard and they can really save your bacon space-wise. In particular, I think they're better than encoding multiple boolean values into an enum because of maintainability. Imagine you have these encoded enums, and you have to write a function like is_human, and it's working just fine, but then someone adds a new variable and re-encodes all the enums so that you have to include some more, and you forget to update the function. Another advantage of bitfields is that you can take advantage of your knowledge of how much of a type's size you actually use. Like if you know you only need 24 bits, you can allocate just 24 bits and use the next eight for something else.
@Kenbomp 4 months ago
Nice, but Lisp coders have known this for decades (McCarthy).
@ktappdev 4 months ago
When he made alive_monsters and dead_monsters I was smiling so big my gf started investigating me
@qy9MC 4 months ago
So DoD not only saves memory but also makes programs faster!? I'm in!
@tbird-z1r 4 months ago
So awkward seeing a nerd swear when doing a speech.
@LNTutorialsNL 4 months ago
Can someone link the video he mentioned by the guy in the red shirt?
@ChimiChanga1337 4 months ago
kzbin.info/www/bejne/qImTeqeMerudfsU - Mike Acton DoD talk
@Tomyb15 5 months ago
Before even watching, I'm gonna predict that the solution is basically using an ECS or similar techniques to make objects of arrays of data instead of arrays of objects.
@shikutoai 5 months ago
In Allen-Bradley/Rockwell PLC world, every atomic data type has an alignment of 4 bytes. You can initialize a BOOL and it takes up 32 bits of memory space. Way more efficient to index into individual bits of a DINT. Unfortunately, it’s way more of a pain in the ass to see magic numbers everywhere, only to find out they’re binary masks. Edit: slight correction: AB states the DATETIME data type to be a 64 bit REAL that’s a “special use atomic type,” whatever that means.
@SumoCumLoudly 5 months ago
what an advantage in life being put onto coding at such a young age
@thewhitefalcon8539 4 months ago
good for your career, very bad for your s-x life
@Eng_Simoes 4 months ago
An advantage in coding
@GabrielxC 3 months ago
Yeah, I started at 65 years old, too late 😢
@avwie132 5 months ago
His backstory is 1:1 my backstory… I feel very unique
@_mrgrak 5 months ago
Great talk, but at 20:00 Andrew is talking about memory layout and how 7 bytes are wasted, assuming the enum is a byte! In many languages an enum defaults to an int, which would be 4 bytes. IMO, the enum backing type should be shown, just for clarity.
@greasedweasel8087 4 months ago
But even if the tag is a u64 or such, those 7 bytes are still “wasted” in the sense that they don’t provide any information. They will always be 0, since the tag will only take values from 0 to 4 it never touches those more significant bits. In practice the exact type of the tag doesn’t matter because it will never take values high enough to get use out of those bytes.
@jotch_7627 4 months ago
also what languages are you using that automatically and truly use 32 bits for an enum tag lol
@_mrgrak 4 months ago
@@jotch_7627 Well, there's this ISA called x86
@_mrgrak 4 months ago
@@jotch_7627 The ISA would determine word size, not the language. Languages can target many ISAs. Thanks for playing
@greasedweasel8087 4 months ago
@@jotch_7627 I mean Rust uses an isize by default, which is ssize_t in C, aka a pointer-sized integer, so it's not that far out of the realm of possibility, although 1. you can easily change this and 2. that's only for debug builds; when optimisations are turned on it immediately collapses to the smallest size possible