In the intro to this talk... I was thinking "is he going to put the shirt on or not?!" ... I wasn't disappointed.
@N1c0_792 10 days ago
Great talk, interesting and very understandable throughout
@Bvngee 8 days ago
Great talk! Very engaging, enjoyed it.
@karlohlen9320 11 days ago
Nice talk! Very well done.
@sadhlife 11 days ago
Where can I buy the shirt?
@CoolestPossibleName 8 days ago
Are you planning on doing a talk on DOD too?
@SimGunther 11 days ago
There's CPU-local performance, which these DOD talks cover pretty well, and then there's inter-coprocessor and inter-computer performance. While DOD can help with data layout to make the most of each cache line, it can also get people stuck in the weeds. It's best saved for when you know the problem requirements and the bare minimum data that must be returned at each step, from the DB all the way down to the (co)processors. Then you can see whether that data is too little to make good use of coprocessors that make bulk data processing super fast, whether the loops mean a more "OOP" layout makes sense, or whether SoA/AoSoA layouts are more practical for maintenance and runtime performance.
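For anyone unfamiliar with the acronyms, here is a minimal sketch of the three layouts being contrasted; the `x`/`y` fields and the block width are hypothetical, chosen only for illustration:

```python
# AoS ("array of structs"): each element keeps its fields together,
# which is the classic "OOP" layout mentioned above.
aos = [{"x": 1.0, "y": 2.0}, {"x": 3.0, "y": 4.0}]

# SoA ("struct of arrays"): one contiguous array per field, so a loop
# touching only "x" never pulls "y" bytes into the cache.
soa = {"x": [1.0, 3.0], "y": [2.0, 4.0]}

# AoSoA ("array of structs of arrays"): fixed-width SoA blocks chained
# into an array; here one block of width 2.
aosoa = [{"x": [1.0, 3.0], "y": [2.0, 4.0]}]

# All three encode the same logical data, just laid out differently:
assert aos[1]["x"] == soa["x"][1] == aosoa[0]["x"][1] == 3.0
```

The trade-off is exactly the one the comment raises: SoA/AoSoA favor bulk per-field processing, while AoS keeps each logical element together.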
@morglod 9 days ago
Do you have a comparison of how many cache misses you get with SoA versus without it? Because in my mind it should cause more cache misses: usually you check the tag and then go to the data, so you first load tags into cache, then load the data into cache, then load the next tag, etc.
@mlugg5499 9 days ago
It's not a problem that the accesses jump back and forth between those regions, because the CPU has lots of cache. If you imagine a hypothetical CPU with two cache lines for data accesses, then one line could store a block of tags, the other a block of data; neither gets evicted until we move onto the next block of tags/data. Of course, in reality, CPUs have orders of magnitude more cache lines than this -- so there's no problem here at all, the chance of those 2 lines getting evicted while you're e.g. in a hot loop over all your instructions/nodes/etc is basically zero. So, data fragmentation of this form isn't really an issue; it only matters once your data is fragmented enough that you can't make full use of a cache line before it's evicted.
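The point about blocks can be made concrete with a toy model (all sizes here are assumptions for illustration, not measurements of any real CPU): walking parallel tag and data arrays only ever touches a handful of distinct cache lines, and each line is fully consumed before the walk moves past it.

```python
# Toy model: count the distinct 64-byte "cache lines" touched while
# walking parallel tag and data arrays, as in the hypothetical CPU above.
LINE = 64
TAG_SIZE, DATA_SIZE = 1, 8  # assumed sizes: 1-byte tag, 8-byte payload

def lines_touched(n):
    touched = set()
    for i in range(n):
        touched.add(("tags", (i * TAG_SIZE) // LINE))   # line holding tag i
        touched.add(("data", (i * DATA_SIZE) // LINE))  # line holding data i
    return len(touched)

# Walking 64 elements touches one line of tags plus eight lines of data;
# with more than two cache lines available, the current tag line and the
# current data line never evict each other.
assert lines_touched(64) == 1 + 8
```

So the alternating tag/data access pattern costs one miss per line, not one miss per element.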
@morglod 8 days ago
@@mlugg5499 okay, thank you!
@morglod 8 days ago
@@mlugg5499 I can't find any info about "two L1" caches anywhere. Everywhere it's described as one L3 containing several L2s, each containing several L1s. Do you have any link about it? Something very detailed?
@hemerythrin 8 days ago
@@morglod The comment doesn't say anything about L1 caches, so I assume there's some sort of misunderstanding. Caches are divided into "lines" (usually 64 bytes), where each line shares a single tag and is loaded/evicted as a unit.
@morglod 8 days ago
@@hemerythrin Okay, so it actually gets a lot of L1 misses, but L2 is OK. Understood, thank you!
@ukyoize 11 days ago
13:38 one could just have a single "flag" tag, with the data describing which kind of flag it is
@mlugg5499 10 days ago
Do you mean collapse `plus` and `minus` into one tag which uses `data` to encode which "actual" tag it is? If so, yep, that's sometimes a helpful strategy! We don't do it for the AST, since 1 byte gives us enough tags anyway, so there's no point. However, for ZIR, one similar thing we do (which I didn't cover in the talk for simplicity/time reasons) is a tag called `extended`, which essentially sacrifices 2 bytes of our 8-byte payload for a second tag enum. The reason is that, broadly speaking, a small number of ZIR instructions are used very frequently, and a lot of instructions are used very rarely. So, for frequently used ones, it's helpful to have the whole 8-byte payload to minimize the need for `extra`; but for the rarely used ones (think instructions for things like inline asm), we can happily sacrifice a bit of memory.
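A rough sketch of that scheme; the tag values, field widths, and helper names here are hypothetical illustrations, not Zig's actual ZIR encoding:

```python
import struct

# Each instruction is a 1-byte primary tag plus an 8-byte payload.
# For the hypothetical EXTENDED tag, the first 2 bytes of the payload
# hold a secondary tag, leaving only 6 bytes of data.
COMMON_ADD = 0   # hypothetical "hot" instruction: full 8-byte payload
EXTENDED = 255   # hypothetical escape tag for rare instructions
SUB_ASM = 7      # hypothetical secondary tag (e.g. inline asm)

def encode_common(tag, payload):
    # <B = 1-byte tag, <Q = 8-byte little-endian payload
    return struct.pack("<BQ", tag, payload)

def encode_extended(sub_tag, payload):
    assert payload < 1 << 48  # only 6 payload bytes remain
    return struct.pack("<BH", EXTENDED, sub_tag) + payload.to_bytes(6, "little")

def decode(insn):
    tag = insn[0]
    if tag == EXTENDED:
        (sub_tag,) = struct.unpack_from("<H", insn, 1)
        return tag, sub_tag, int.from_bytes(insn[3:9], "little")
    (payload,) = struct.unpack_from("<Q", insn, 1)
    return tag, None, payload

# A hot instruction keeps all 64 payload bits...
assert decode(encode_common(COMMON_ADD, 2**64 - 1)) == (COMMON_ADD, None, 2**64 - 1)
# ...while a rare one trades 16 of them for a second tag enum.
assert decode(encode_extended(SUB_ASM, 123)) == (EXTENDED, SUB_ASM, 123)
```

Both encodings stay 9 bytes per instruction; the rare instructions just spend part of that fixed budget on a bigger tag namespace.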
@Bobbias 9 days ago
@@mlugg5499 I was wondering why you guys did that. Wasn't expecting to just stumble into the answer in a YouTube comment, but I'm happy I now know.