if anyone is confused about 15:17 the padding is required for struct alignment, since the 1 byte tag is being stored with a type that has larger alignment requirements (u16/u32)
@narutosimas2 ай бұрын
Man, I am confused about everything here
@genericuser15462 ай бұрын
@@narutosimas I'd be happy to help but youtube isn't the best platform. Do you have a discord?
@steffahn Жыл бұрын
5:19 - the padding is actually just because the u64 is 64-bit aligned. The tag is just a single byte, Rust doesn’t make these bigger than necessary. But for the u64 to be 64-bit aligned, now the enum Foo must also be aligned this much, which is where the padding comes from.
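For anyone who wants to check this locally, here is a minimal sketch. The enum `Foo` and its variants are made up for illustration, and the numbers in the comment assume a typical 64-bit target where `u64` is 8-byte aligned:

```rust
use std::mem::{align_of, size_of};

// Hypothetical enum for illustration: a 1-byte tag plus a payload of up to 8 bytes.
#[allow(dead_code)]
enum Foo {
    Big(u64),
    Small(u8),
}

fn main() {
    // The enum inherits the alignment of its most-aligned field (the u64),
    // and its size is rounded up to a multiple of that alignment.
    // On typical 64-bit targets this reports size 16 and alignment 8:
    // 1 byte of tag, 7 bytes of padding, 8 bytes of payload.
    println!("size = {}, align = {}", size_of::<Foo>(), align_of::<Foo>());
}
```

The tag itself is still only one byte; the extra bytes come from rounding up to the `u64` alignment.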
@zactron1997 Жыл бұрын
I love Rust, but I agree I'd love Zig's comptime system as an alternative to macros. Rust macros are vastly better than C, but there's definitely utility in comptime.
@FlanPoirot Жыл бұрын
The con of zig's comptime is that writing an LSP for it is just harder. Types and fields can change based on the platform you are, type definitions can be created through a comptime thing, so your LSP has to be able to run arbitrary Zig code on the fly to be able to give you accurate autocompletion.
@comradepeter87 Жыл бұрын
@@FlanPoirot I don't see how that'd be different than running arbitrary Rust code for proc-macros.
@zzzyyyxxx Жыл бұрын
Too bad the person working on compile time macros in Rust was chased away...
@FlanPoirot Жыл бұрын
@@comradepeter87 in rust every compile time thingy is easily marked as such, they're either const or a macro, zig allows arbitrary code at compile time
@chainingsolid Жыл бұрын
@@zzzyyyxxx If you're looking to generate code at compile time, D has string mixins. You write some mostly regular D code that generates a string at compile time, and it gets treated like source code the compiler read off disk.
@kuhluhOG Жыл бұрын
8:20 This reminds me. At some point Gamers Nexus was also benchmarking compile speed of CPUs because of common request. But they noticed something: it effectively just "benchmarks" how much cache a CPU has (more = faster), with minimal differences when two CPUs have the same amount of cache (except ofc if we talk multiple generations' difference, but that's to be expected). So at some point they stopped, because they could just as well have given their viewers the datasheets. (They tested this by compiling Chromium from scratch.)
@yannick5099 Жыл бұрын
Zig is super cool. It's a modern version of C with lessons learned. Language, tooling and community are great. I have high hopes for the language.
@pmoohkt Жыл бұрын
So sad I'm allergic to the letter 'Z' nowadays for... reasons. I abandoned ZIO (a functional layer on top of Scala), and didn't buy a ThinkPad Z13 for exactly the same reasons. Call me crazy on that one !
@AlFredo-sx2yy Жыл бұрын
@@pmoohkt I know where your Z phobia is going and i just can't help but wonder what does that have to do with programming. Is it just me or are you Rust folk overly political?
@yevgenydevine Жыл бұрын
@@pmoohkt you could've said you are retarded in fewer words.
@pmoohkt Жыл бұрын
@@AlFredo-sx2yy Sorry, I'm not a Rust folk at all, I work in Scala. Yes I 100% agree that it has nothing to do with programming and it's a personal feeling... just seeing the letter Z inside a logo makes me want to look elsewhere... like when I see the German symbol of war last century, I tend to forget that it was (and still is !) a representation of sun and well being for more than 2500 years. Nothing political, gut feeling... I may very well be the only one 🙄 Question: how would it be if you put a sticker of the Zig language (that big Z in brackets) on your laptop and go to sit in a coffee shop? Maybe OK in US, not sure how it will work in Eastern Europe however, and you will have lots of new friends in Russia for sure !
@vikhr Жыл бұрын
@@pmoohkt I think you need to stop being addicted to politics and mainstream media, it looks like it is causing irreversible brain damage.
@FrankHarwald Жыл бұрын
21:49 custom bit width integers is something I've always wanted to do but never could - outside of VHDL, Verilog & custom Clang intrinsics.
@yaksher Жыл бұрын
@2:00 I believe Rust uses the smallest integer size that can store the number of enum variants for the tag (so, u8 almost always), not a word.
@MetroidChild Жыл бұрын
This is explained later in the video, but in order to not tank performance, types need to stay aligned to their size. This means if you have an 8 byte union (i64, i32 etc) you still need another 8 bytes for the type tag if it's stored right next to the union in memory (to keep the upcoming 8 byte union aligned), this means 9 bytes of storage becomes 16 bytes.
@yaksher Жыл бұрын
@@MetroidChild I mean, it depends, but yeah I guess it does mean that the tag is usually a word in practice. But if all of the types in the enum have 1, 2, or 4 byte alignment instead of 8 byte alignment, then it's relevant. That said, yeah, if you have a u64 in your enum somewhere, it'll probably end up requiring the tag be padded to a word yeah.
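A rough sketch of how the payload alignment drives the total size, using made-up enums. The sizes in the comments are what current rustc produces on common 64-bit targets; the default layout is not formally guaranteed:

```rust
use std::mem::{align_of, size_of};

// All three enums are hypothetical; only the payload types differ.
#[allow(dead_code)]
enum Bytes {
    A(u8),
    B(u8),
}

#[allow(dead_code)]
enum Shorts {
    A(u16),
    B(u8),
}

#[allow(dead_code)]
enum Words {
    A(u64),
    B(u8),
}

fn main() {
    // 1-byte alignment: tag + payload = 2 bytes, no padding needed.
    println!("Bytes:  size {}, align {}", size_of::<Bytes>(), align_of::<Bytes>());
    // 2-byte alignment: tag + 1 byte of padding + u16 = 4 bytes.
    println!("Shorts: size {}, align {}", size_of::<Shorts>(), align_of::<Shorts>());
    // 8-byte alignment: the tag is effectively padded out to a full word.
    println!("Words:  size {}, align {}", size_of::<Words>(), align_of::<Words>());
}
```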
@wliaputs Жыл бұрын
@@MetroidChild Why does that tank performance?
@MetroidChild Жыл бұрын
@@wliaputs "safe" programming languages have to add more pointer checking if a data type is unaligned, if you use "unsafe" languages without dynamic pointer checking performance won't take a hit, at least not on modern processors (10 years ago it was a different story).
@JoseColonTV Жыл бұрын
Learn Zig or Ocaml? Go for Zig man, it's the future of systems programming. I predict (crystal ball moment here) that when Zig is mature enough (post 1.0) Linus will prefer it over Rust for the Linux Kernel. Rust is great and provides many safety guarantees, but if you're a C kernel hacker, it's definitely more natural and painless to switch to Zig.
@jmickeyd53 Жыл бұрын
Learning Zig is clearly the more practical choice, but I think the real benefit of learning a new language is to learn a new paradigm. It forces you to think about solving the problem in new and different ways. And for that, I think OCaml has more value, even if you never write another line of OCaml after you're done.
@sprenten1 Жыл бұрын
In this instance It's even more painless to write the same thing in C than either Rust or Zig.
@shrin210 Жыл бұрын
@@ITSecNEO Keep your opinion to yourself. Zig is safer than C. Unsafe code in C != unsafe code in Zig. Because Zig is so similar to C, it will be easier to integrate if the existing codebase is in C. Unlike Rust, which is totally different to code in. Many reasons to use Zig. Whether they adopt it or not, time will tell.
@shrin210 Жыл бұрын
@@ITSecNEO Atleast make a convincing point before posting jargon like facts(opinions). You could have included points related to systems programming about why not zig? Your facts was just basic blog about memory unsafe vs memory safe 🫤
@sean_r11 ай бұрын
@ITSecNEO dude i dont think zig will be integrated either but jesus theres no need to get so aggressive about it. on top of that, you have hardly said anything that would indicate your knowledge on the subject, and particularly with your comments on zig's safety, i find it hard to believe you know what you're talking about when it comes to zig. aside from all of that, your comments come off as just kinda bitchy and arrogant
@blinkybool Жыл бұрын
16:00 that’s the field_map array in the code snippet you skipped. The field-to-clusters bookkeeping is handled internally in the data structure, and it provides an interface for appending, getting and removing.
@simonfarre4907 Жыл бұрын
Which makes me wonder? Why not write this in C++ where this would be fairly trivial. More trivial than Rust, certainly, but even so, more trivial than Zig?
@lorenzodrea4640 Жыл бұрын
Idk why people sometimes present zig as an opposite to rust when they are amazing complementary languages. You want memory safety -> rust. You want memory control -> zig. They both have a lot of features in common (can't wait for zig 1.0 so i can start learning it)
@baxiry. Жыл бұрын
You want memory safety -> go. You want memory control -> zig
@lorenzodrea4640 Жыл бұрын
@@baxiry. waiting for the garbage collector to de-allocate your comment /jk
@nathanfranck5822 Жыл бұрын
I'm learning now - It's fun! ZLS has an option to buildOnSave which helps me fail faster. It'll be nice when Github Copilot isn't 2 years out of date for all the stuff that's changed
@mattymerr701 Жыл бұрын
@@lorenzodrea4640 now it'll never happen because you made a reference to it
@swapode Жыл бұрын
TBF it's really rather rare edge cases where Zig offers better memory control. Like Rust's approach to ADTs is exactly what you want in over 99.9% of cases, because you'll generally wanna save CPU cycles over a handful of bytes. And it's not that Rust doesn't allow you to do the same thing, it just doesn't help you there. I'm doing some embedded stuff in Rust right now, and I'd rather write a bit more Rust code than introduce another language (besides, the need for the Zig solution simply hasn't come up yet). I suspect I'd see it the same way if I was writing a compiler. I'd take a couple of hours to write the packed array data structure for that specific purpose and be done with it. Sure, it's not generic, and it'd be cool if it was, but a couple of hours saved once for a big project is actually utterly meaningless - especially when there are way more hours that Rust will save me in the overall project. But maybe that's unfair, I spent quite a bit of time to get truly comfortable in Rust - and it seems unrealistic to expect the same from other developers. Not because it's a bad idea, but because most just accept that Rust is hard and give up before they're even halfway comfortable in safe Rust.
@jacknguyen5220 Жыл бұрын
As someone who has implemented indexing on segmented arrays using trees, I can confirm dealing with indexes can become a nightmare. For example, 1 optimization that could be considered here is optimizing the index bookkeeping for long contiguous segments with the same size or type, which when I last did it I basically had trees on top of my trees.
@JDalmasca Жыл бұрын
I think the TS equivalent to Rust enums is like passing type arguments into generics and then using a type union like the one you showed. Maybe something like this?: type Foo = undefined | Bar | Baz Doesn't seem like quite the right syntax, but probably close.
@disieh Жыл бұрын
I don't remember where I read it, possibly some CppCon video. My understanding of why Rust chose such explicit and painful integer conversions is that it comes from lessons learned in C and C++. People running analyzers on C and C++ code that found overflow issues often found it next to impossible to deduce after the fact which underflow or overflow behaviour was the correct choice. It was so tricky that sometimes even the original authors did not know. Rust tried to preempt this by being much more strict about conversions and over/underflows.
@ruanpingshan Жыл бұрын
A lot of the time, whenever I include any math/scientific C++ library, the compiler output is absolutely flooded by numerical conversion warnings, making it very difficult even to find error messages.
@bloody_albatross Жыл бұрын
That makes sense for converting e.g. u64 to u32, but why do you need to be explicit when converting u32 to u64? What possible problem can there be? Which is what he asked in the video.
@disieh Жыл бұрын
@@bloody_albatross my guess, they made it explicit so whoever is writing the code is forced to think about, hopefully reducing unintentional mistakes. Only time will tell whether this system was a good choice or not. In some ways same theme as borrow checker, attempt to turn hard-to-debug issues into compile-time issues.
@bloody_albatross Жыл бұрын
@@disieh Yeah, but what mistake can there be when converting from u32 to u64? If you need to do the cast even when there is no reason, then doesn't casting become like clicking away error messages? You're just doing it without thinking?
@disieh Жыл бұрын
@@bloody_albatross Only reason I can think of is if you do widening conversion somewhere, you realistically end up doing narrowing conversion somewhere later. Explicit conversions force you to type stuff that can be searched later --- or analyzed by a tool for that matter. Is there a less invasive way to achieve same thing? I don't know, probably. Also I think they probably thought its easier to first make something that works and is strict, only later loosen it up when its more clear what can be loosened without sacrificing correctness.
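For context on the u32 to u64 question, this is roughly what the explicit conversions look like in Rust today. The helper names `widen` and `narrow` are made up for this sketch; `u64::from` and `u32::try_from` are the standard library conversions:

```rust
use std::convert::TryFrom;

/// Widening: every u32 fits in a u64, so `From` is implemented and cannot fail.
fn widen(x: u32) -> u64 {
    u64::from(x)
}

/// Narrowing: the value may not fit, so only the fallible `TryFrom` exists.
fn narrow(x: u64) -> Option<u32> {
    u32::try_from(x).ok()
}

fn main() {
    let wide = widen(7);
    println!("{}", wide);
    println!("{:?}", narrow(wide));
    println!("{:?}", narrow(u64::MAX));
    // `as` also compiles for narrowing, but it silently truncates,
    // which is the kind of cast that is hard to audit after the fact.
    println!("{}", u64::MAX as u32);
}
```

So the widening direction is still explicit, but it is infallible and easy to grep for, while the narrowing direction forces a decision about what happens on overflow.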
@itellyouforfree7238 Жыл бұрын
5:20 no, the padding is there to satisfy alignment requirements of the subsequent field
@apestogetherstrong341 Жыл бұрын
"comptime ... is like you write zig that writes zig" Can't wait for him to learn about lisp in like 10 years
@SaHaRaSquad Жыл бұрын
It sounds like you assume the only reason people don't use lisp is because they don't know about it.
@adamm450 Жыл бұрын
@@SaHaRaSquad arguably one of the reasons why people dont use lisp is because they know about it
@apestogetherstrong341 Жыл бұрын
@@SaHaRaSquad it is
@apestogetherstrong341 Жыл бұрын
@@SaHaRaSquad or actually, the reason they don't use it is "no i don't like the parens". that's it
@apestogetherstrong341 Жыл бұрын
@@adamm450 do you know about it?
@wesleydahar7797 Жыл бұрын
The memory saved by creating an array of vectors of arrays (AoVA) comes at a cost: dynamic memory allocation, non-trivial member access/iteration, poor cache locality, and memory/performance overhead for maintaining the complex AoVA structure. Unless you're working with tens of thousands of elements in single (not multiple) collections, the memory savings are not going to outweigh the performance cost and the data access limitations, even from a syntax standpoint. I'd imagine that wide references into that collection will immediately negate the saved memory, especially for tightly coupled data structures like ASTs. There will be dynamic memory allocation for a mutable AoVA (even if it is never resized), because in the worst case many of the elements are set at runtime to the type with the largest size and alignment, so the total worst-case space has to be allocated at some point. I think it's better all around to simply pre-allocate the worst-case space, accept that alignment will incur at worst an 8 byte overhead for the tag and structure alignment, and handle exceptional, large types as pointers/references instead of by value.
@ryanseipp6944 Жыл бұрын
I wonder what the performance impact of all this effort would be. On one hand, you have much less memory utilized for padding, and can theoretically fit more into a cache line. On the other hand, iterating through it, or using it in some way, seems to require quite a bit of indirection by looking up indices and swapping between arrays in the SoA. The latter maybe kills the benefits of more tightly packing the memory layout due to cache lines getting swapped out constantly? I don't know the answer, but interesting to think about.
@aoeuable Жыл бұрын
The answer to your thoughts is: Stop thinking and benchmark. But to contribute to the speculation: SOA like this will be slower for random access, but faster to iterate through, in both cases only if memory is the bottleneck in the first place otherwise they're going to be the same (indexing maths is negligible compared to what a CPU can do in a memory load).
@MetroidChild Жыл бұрын
Really depends what you need. Splitting out the tags into their own array is basically free though if you just need element-sized unions: it's only like 3-5 extra ops per access, and prefetching on modern processors handles any remaining latency.
@evtimpapushev7137 Жыл бұрын
On the question at 15:29 - tags match the size of the fields because of memory alignment. Each data type aligns at addresses divisible by the size of the data type.
@itz_premium Жыл бұрын
I have a new favorite quote now... "If you wanna just raw dog some strings in there.."
@dealloc Жыл бұрын
The equivalent in TypeScript is to define a type which has a unique tag (e.g. an object with a "type" field, that has a unique value amongst the other variants with the same field, but different values). A union of those types would be a tagged union. You can add additional data onto each variant, even with the same values as other variants, as long as the tag is unique. Undefined, null and whatever does not need to be part of this at all. They would also be serializable the same way enums are in Rust with serde_json. You'd still need to specify to serde _how_ to serialize and deserialize, it's no different than doing it in JS.
@balen7555 Жыл бұрын
#[derive(Serialize, Deserialize)] Damn, I had to specify to serde *how* to indeed.
@FlanPoirot Жыл бұрын
let's not pretend that you're doing anything relevant with typescript, all those type definitions are an illusion, whether they're there or not does not really matter for memory alignment or layout, those definitions are just there for your own comfort, they're not really changing anything.
@andrueanderson8637 Жыл бұрын
@@FlanPoirot Of course TypeScript types don't change memory layout, what would be the point of using a high level language if I needed to worry about memory layout? Might as well just write the C code at that point (and, e.g., compile to WASM)
@dealloc Жыл бұрын
@@balen7555 You miss the point. Those are just proc macros that result in additional code in your build; that code is generated at compile time. The point is not what the API looks like, but that the steps to serialize and deserialize those objects to conform into and from those types are equivalent no matter the language.
@dealloc Жыл бұрын
@@FlanPoirot This was specifically pointing to Prime's example of using "undefined | string" union, which are not tagged unions (in the traditional sense). The point was to illustrate how you can create tagged unions equivalently in TS. The post itself was talking about enums, not tagged unions specifically. All tagged unions in any language is "an illusion" because they are concepts in type theory, and are not compiled into machine code. I specifically didn't mention enums, because they are specific to the implementation of a compiler.
@bloody_albatross Жыл бұрын
OT (because one of the chat messages): If you double click you can select a whole word. Keep the mouse button pressed to select word by word instead of character by character. This way you don't need mad FPS skills to quickly select text. Also triple click selects whole paragraphs.
@mage3690 Жыл бұрын
I thought triple click was for lines, unless HTML just considers everything between tags to be "lines" for some reason.
@bloody_albatross Жыл бұрын
@@mage3690 Well, it spans wrapped lines, but doesn't span over . Whatever one might call that unit.
@mage3690 Жыл бұрын
@@bloody_albatross honestly, not shocked by that. It's a wrapped line. It is a really weird unit to explain to anyone who doesn't know what wrapped lines are, though; and it gets even more confusing when you start considering certain text editors like Quark and WordPerfect (there's a Wiki article on this, shocker). Most environments do agree that triple click is for one line, wrapped or not. WordPerfect says it's for an unwrapped line, and Quark says it's for a sentence--and adds quad and quintuple clicks for paragraph and document, respectively.
@wliaputs Жыл бұрын
Wow I didn't know I can drag after double clicking
@saultube44 Жыл бұрын
@16:00 For a long time Scientists couldn't figure out how plants could grow up hundreds of meters and still get water up there, without pumps or special physical properties/mechanisms; the answer was many years later when we discovered Fractals, that are how increasingly complexity is solved: you just have to find the right Fractal Equation for it and recursively apply it, or a number of them in an algorithm; that's how Nature solves it, that's how we should: the mathematically smart way
@JoeTaber Жыл бұрын
We never got an answer on how to actually index into this (memory efficient) rats nest. Also, this is basically an ECS right? Has anyone tried ASTs with Bevy's ECS?
@SimGunther Жыл бұрын
Dlang SumTypes are a marvel in every way for how powerful, but frustrating they are to use, especially for sum types that include classes with at least two levels of hierarchy. Once you figure out how they work to shape the design of the solution to your problem, it's a glorious thing to witness how effortless maintenance for these parts of the program are.
@FlanPoirot Жыл бұрын
people still care for D in 2023? D blew their one opportunity to get popular by shipping with a garbage collector enabled by default. and from what I've heard the better C mode of D is kinda pointless if you want to use external libraries anyways cause they all rely on the GC.
@SimGunther Жыл бұрын
@@FlanPoirot Proving that Dlang is maintained by engineers not smart enough to make a sensible language for high quality business, but clever enough to make some neat conveniences for engineers who still care to use Dlang in _insert current year_
@HypoceeYT Жыл бұрын
@@FlanPoirot A GC that's guaranteed to only ever run when you allocate memory and there is none free to allocate, which the compiler will check for you in your hotspots with one keyword; that's mitigated or eliminated by the same things you do for performance in any manual memory language; and which has been used in financial trading code, the most latency-sensitive programming imaginable. Carry on.
@timseguine2 Жыл бұрын
@@HypoceeYT Dlang's garbage collector performance is quadratic in the heap size. That makes it by far the slowest GC compared to any other GC language out of the box.
@HypoceeYT Жыл бұрын
@@timseguine2Cool story. I very much doubt that next to Python, given....Python...but even if so, so what? We're assured that GC performance doesn't matter because all GCs are poison; we're following off a comment that Dlang destroyed itself by putting a GC in by default, ewww blech mah performance, it's on my hands, get it off. A GC that's guaranteed by the compiler not to run has no performance impact. You rough out your program as in (GC-tainted crippled languages that nobody uses since they blew their one chance at popularity by putting in a GC) Python or C# or anything on the JVM including Kotlin, then tag your performance hotspots nogc and put in object pools and such to tweak them, as if you were in C or C++ but sane and with a bunch of other conveniences and lovely idioms. How is doing all that flow in the same language, with a single keyword that enforces the rules at the compiler level (much like a subset of Rust's philosophy) and tells you where to look if you hit a memory problem, worse? And then there's the crazy team that needed to go big and fast and not have big GC pauses, needed GC -enabled algorithms, and accomplished their performance goal by running the GC _more often_, placing a mathematical upper bound on the amount of memory it could possibly need to clear.
@cloudsquall88 Жыл бұрын
if you have such large discrepancies between enum variants in rust, couldn't you just use a heap allocated type so that all of them would be the same size (ie the size of a reference plus whatever)?
@AttilaSzenczi Жыл бұрын
Exactly, I think this is an overcomplicated solution which does more harm than good. An interesting, but useless idea. If you need to keep track of the ordering (which I would expect from a Vec), then the memory required for bookkeeping might far exceed the memory you win by this magic. Also any algorithm on top of that, like sorting, is a nightmare. It might be useful in some very specific cases, but definitely not something I would like to randomly see in a codebase. I'd rather say that if you have an actually big type amongst the enum variants, put that on the heap. Don't put everything on the heap, because you don't want to destroy the cache friendliness of your code. That's mostly unrelated to enums tho, big types don't belong on the stack.
@chainingsolid Жыл бұрын
I may have to sit down and try out Zig's comptime. I wonder if it can finally beat D's compile time abilities (see std.traits, mixins, and ctRegex to start) the last one writes the code to implement a given regex at compile time.
@SaHaRaSquad Жыл бұрын
comptime is amazingly intuitive to use imho. You just put "comptime" in front of a function call and that function will be executed by the compiler and the code is generated from its return value. Precompiling a regex would work exactly the same.
@JorgetePanete Жыл бұрын
You can't deny that Rust has SotA throughput in drama.
@garanceadrosehn9691 Жыл бұрын
Very well-written and obviously well-researched article. Definitely interesting and worthwhile. Thanks for picking this to review. I suspect you've already figured this out, but 'AoVA' was defined earlier in the article: _"I don't think there's a name for this collection, so I call it array of variant arrays (or AoVA). "_
@nikkehtine Жыл бұрын
Quick question: should I try Zig? I started with JavaScript, now I write mostly in Go, sometimes I try C but I can't really wrap my head around it a lot of the time...
@vyyr Жыл бұрын
You should probably stick to one language and "master" it first before moving to another one, language hopping is not good for you. As for learning zig over c, I wouldn't. Zig is great but its still too immature. And, not trying to be a dick but if concepts in C are not clicking for you, try some different tutorial, different lecture, different book, something because the complexity of C is actually understanding the basic concepts of CS and not the language quirks.
@gagagero Жыл бұрын
What's the plugin that auto-generated the match statement at the beginning? Copilot?
@VBB9999 Жыл бұрын
what plugin are you using to get such suggestions as on 2:11? it looks close to cs suggestions in visual studio.
@thedelanyo Жыл бұрын
For the next decades I guess it's gonna be Rust vs Zig
@steveoc64 Жыл бұрын
"When Zig outshines Rust" - man, any article that covered that question would be a very very looooooooooooooooong list of use cases
@curlyfryactual Жыл бұрын
13:10 yeah, of course that approach is hard ... that's almost identical to glibc small bins heap allocation implementation.
@boy_deploy Жыл бұрын
Yeaaaaah! been waiting for another Zig video ❤
@aoeuable Жыл бұрын
I don't think Ocaml has particularly much to offer when you know Rust and Haskell, practically speaking. I would say loot it for ideas: Have a look at functors and modules and variants and how you can put modules in arrays and stuff which is all very neat and nice. Otherwise it's simply a functional language, you already know GC, HM types, and strict evaluation, just (maybe) not all in one place. As to Zig... why not unsafe Rust? That's a question I have in general. If you really want to bend your mind go for Maude, something dependently typed (pick one of Coq, Idris, Agda), or maybe Prolog.
@njnjhjh8918 Жыл бұрын
Especially functors, for quite a while I got annoyed by C++. "How dare you call a functional object a functor! These OCaml functors are real functors."
@complexity5545 Жыл бұрын
Yep. Haskell > Ocaml. But evidently the newer generations are changing ocaml for the better. Zig is good because it primarily uses the LLVM language/compiler. That's a good foundation and reason to use it. You can (probably) output to LLVM then use Haskell/megaparsec to convert it to any other language/parser (that LLVM and haskell knows about). Especially if they keep adding features and the safety that C lacks. I might look into Zig. Why not unsafe rust? Because the community might annex your project and start making it "safe", slow, and not acceptable by the rust community.....actix. Prolog was a good laugh...LOL.
@AlexanderHyll Жыл бұрын
As a reference to Primes "small optimization". The default memory layout already optimizes this? That's why it's not guaranteed to be laid out in the same order as the enum unless using for example repr(C).
@MetroidChild Жыл бұрын
Struct layout optimization just means shuffling the elements within the struct to save space while still keeping members aligned to their size, for example making the padded 32 byte data structure {i8, i64, i8, i32, i16} into a 16 byte data structure {i64, i32, i16, i8, i8}. This has nothing to do with splitting off enum unions into their own parallel data structures.
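The 32 vs 16 byte numbers can be reproduced with `#[repr(C)]`, which keeps fields in declaration order. This is only a sketch: the struct and field names are made up, and it assumes a target where `i64` is 8-byte aligned:

```rust
use std::mem::size_of;

// Declaration order preserved: padding after `a` and `c`, plus tail padding.
#[allow(dead_code)]
#[repr(C)]
struct DeclaredOrder {
    a: i8,
    b: i64,
    c: i8,
    d: i32,
    e: i16,
}

// Same fields sorted from largest to smallest: no padding needed at all.
#[allow(dead_code)]
#[repr(C)]
struct SortedOrder {
    b: i64,
    d: i32,
    e: i16,
    a: i8,
    c: i8,
}

// Default representation: the compiler is free to reorder fields itself.
#[allow(dead_code)]
struct DefaultRepr {
    a: i8,
    b: i64,
    c: i8,
    d: i32,
    e: i16,
}

fn main() {
    println!("declared order: {} bytes", size_of::<DeclaredOrder>());
    println!("sorted order:   {} bytes", size_of::<SortedOrder>());
    println!("default repr:   {} bytes", size_of::<DefaultRepr>());
}
```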
@comradepeter87 Жыл бұрын
Pretty sure that didn't even have an effect. The size of an enum is not even dependent on how the variants are arranged, it's only dependent on the largest variant, no matter its position.
@broken_abi6973 Жыл бұрын
This would be an example where C++ outshines Rust. It is ok. I will see my way out.
@sprenten1 Жыл бұрын
Absolutely not you can't leave, we need to create a C and C++ party here, because this is kind of how good C programs have handled memory, used void pointers, structs, and arrays of structs as kind of a generics or variants. Also there is a way to pack the data in C that the Zig example doesn't use which would further reduce memory and increase speed even with a pack and unpack tradeoff and I know it can be done in C++ too.
@sean_r11 ай бұрын
well c++'s lack of static reflection would make this completely impossible
@codeman99-dev Жыл бұрын
2:11 - You don't "lift" with a match binding. A lift is something like JavaScript's Array.of function.
@PaulWalker-lk3gi8 ай бұрын
"I have no idea what he is talking about, but that sounds terrible." Thank you.
@ArachnidAbby Жыл бұрын
hah, I saw this a while ago and was just waiting for you to make a video about it. Much easier than reading the article myself lol.
@replikvltyoutube3727 Жыл бұрын
The thing missing from article is that, even though there are different size arrays, each element will also need to have a type information. That's how you do match over elements
@TheSkepticSkwerl Жыл бұрын
I like how your montanian accent stands out when you say the word tag because you're pronouncing it just like they pronounce the word bag
@saultube44 Жыл бұрын
@5:30 IIRC, 16 Bytes is a Cache Line, so is fed into the Execution Engine of the CPU through the Cache, way faster than reading 1 element at the time, in theory; I'm skeptical about it; tests needed. *This is how Software gets bloated and wasteful*
@maamardli Жыл бұрын
Is it blazingly fast though?
@ssssssssssama Жыл бұрын
So, the point of the article is that zig maintains a special data structure to hold both the tags and the type in two different arrays thereby eliminating the need for padding, but in a roundabout way.
@vantagepointmoon Жыл бұрын
No clue as to what is being talked about here, but it's very fascinating. I'll probably stay comfortably stupid about topics like that, but I appreciate people burning their brains thinking about these things
@MDarkus3 Жыл бұрын
Not related but I would love to see you talk about WebAssembly
@maksymiliank5135 Жыл бұрын
I've read this article a while ago and I thought to myself that it would be cool if Prime read it because it shows why SOA is great for caches
@debasishraychawdhuri11 ай бұрын
Have a union of Boxes in Rust. So, instead of the variant Bar(String), have Bar(Box<String>). Anyway you had to put some kind of indirection somewhere, just use the allocation made by Box.
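A small sketch of what boxing the large variant buys you. The enums and the 128-byte payload are made up for illustration; the exact sizes depend on the target, so the comments only state bounds:

```rust
use std::mem::size_of;

// Hypothetical enum with one oversized variant stored inline:
// every value pays for the 128-byte array, even the `Small` ones.
#[allow(dead_code)]
enum Inline {
    Small(u8),
    Big([u64; 16]),
}

// Same enum with the big payload moved behind a Box:
// every value is now at most a tag plus a pointer.
#[allow(dead_code)]
enum Boxed {
    Small(u8),
    Big(Box<[u64; 16]>),
}

fn main() {
    println!("inline: {} bytes per element", size_of::<Inline>());
    println!("boxed:  {} bytes per element", size_of::<Boxed>());

    // The cost is one heap allocation and one pointer hop per big value.
    let v = Boxed::Big(Box::new([0u64; 16]));
    if let Boxed::Big(data) = &v {
        println!("big payload has {} words", data.len());
    }
}
```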
@redcrafterlppa303 Жыл бұрын
11:20 but now instead of 56 bits padding per element we now have an additional 64bit pointer + fragmenting the varient tag from the data. In what world is that more efficient?
@user-py9cy1sy9u Жыл бұрын
Both tag and element arrays are the same size so you can access them both with one array index. No need for pointers. tags[3] and elements[3]
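Here is a minimal Rust sketch of the tags[i] / elements[i] idea. Everything in it (`Tag`, `Value`, `SplitVec`) is a made-up illustration, not the article's Zig code, and it only handles variants whose payload fits in 8 bytes:

```rust
// One-byte tag stored in its own array.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum Tag {
    Int,
    Float,
}

// The ordinary tagged union that callers see.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Value {
    Int(u64),
    Float(f64),
}

// Two parallel arrays sharing one index: 1 + 8 bytes per element, no padding.
#[derive(Default)]
struct SplitVec {
    tags: Vec<Tag>,
    payloads: Vec<u64>,
}

impl SplitVec {
    fn push(&mut self, v: Value) {
        match v {
            Value::Int(i) => {
                self.tags.push(Tag::Int);
                self.payloads.push(i);
            }
            Value::Float(f) => {
                self.tags.push(Tag::Float);
                // Store the raw bits so both variants share the u64 array.
                self.payloads.push(f.to_bits());
            }
        }
    }

    fn get(&self, i: usize) -> Option<Value> {
        let bits = *self.payloads.get(i)?;
        Some(match *self.tags.get(i)? {
            Tag::Int => Value::Int(bits),
            Tag::Float => Value::Float(f64::from_bits(bits)),
        })
    }
}

fn main() {
    let mut v = SplitVec::default();
    v.push(Value::Int(42));
    v.push(Value::Float(1.5));
    println!("{:?} {:?}", v.get(0), v.get(1));
    println!(
        "split: {} bytes/element, enum: {} bytes/element",
        std::mem::size_of::<Tag>() + std::mem::size_of::<u64>(),
        std::mem::size_of::<Value>()
    );
}
```

Variants of different sizes would need one payload array per size class, which is where the extra bookkeeping discussed elsewhere in this thread comes in.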
@redcrafterlppa303 Жыл бұрын
@@user-py9cy1sy9u that's fair, but I'm not sure the wasted cache locality of tag and value would be worth the space savings. At this point you could better pack the enum and strip the padding: from |tg|padding 56|u64| (128 bit/element) to |tg|u64| (72 bit/element). Sure, the cpu needs to copy the unaligned u64 value to work with it. But it likely will be more efficient than the extra allocation stress and fragmenting. I would love to see a benchmark testing both approaches against each other.
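The packed |tg|u64| layout can be sketched in Rust with `#[repr(packed)]`. The struct names are made up, the tag is a plain `u8` rather than a real enum, and the 16-byte figure for the padded version assumes a target where `u64` is 8-byte aligned:

```rust
use std::mem::{align_of, size_of};

// Tag immediately followed by the payload: 9 bytes, alignment 1.
#[repr(C, packed)]
#[derive(Clone, Copy)]
struct PackedEntry {
    tag: u8,
    payload: u64,
}

// The same fields with normal alignment, for comparison.
#[allow(dead_code)]
#[repr(C)]
#[derive(Clone, Copy)]
struct PaddedEntry {
    tag: u8,
    payload: u64,
}

fn main() {
    let entries = [
        PackedEntry { tag: 0, payload: 1 },
        PackedEntry { tag: 1, payload: u64::MAX },
    ];
    for e in entries.iter() {
        // Copy the fields out by value (an unaligned load); taking a
        // reference to a packed field is rejected by the compiler.
        let tag = e.tag;
        let payload = e.payload;
        println!("tag {} payload {}", tag, payload);
    }
    println!(
        "packed: {} bytes (align {}), padded: {} bytes",
        size_of::<PackedEntry>(),
        align_of::<PackedEntry>(),
        size_of::<PaddedEntry>()
    );
}
```

Whether the unaligned loads beat the split-array approach is exactly the kind of thing that would need a benchmark.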
@user-py9cy1sy9u Жыл бұрын
@@redcrafterlppa303 Cache locality matters if you only do random access to this thing. If you do a foreach over these elements, the cache fragmentation problem is nonexistent. It all depends on your use case. The author of this blog post is talking about optimizing a compiler, and in that case you go over each element in order and are limited by memory speed, so any savings on that front are helpful.
@redcrafterlppa303 Жыл бұрын
@@user-py9cy1sy9u that's pretty true. The often forgotten fact applies that inefficient code is fine if it isn't the bottleneck of the program. If your program is super fast but crashes with out of memory, it's not useful 😂👍 Jokes aside, this is pretty interesting on both sides for me personally, because I'm currently writing a compiler and am therefore working both on making the compiler work (efficiently) and on the language design. Unions are especially interesting because my language makes even heavier use of them than Rust does, and if they need a variety of optimizations for different use cases I need to think about that in the language design. In case you are interested, the partial compiler is on GitHub: "RedIODev/Chorus". There aren't many examples, documentation or overview yet though.
@wesleydahar7797 Жыл бұрын
Compilers almost never iterate over collections of tagged unions. Random access is used by ASTs that want references to elements that contain RTTI and are allocated in pools of the appropriate 'Expr' or 'Token' type. Storing references into that complex fragmented structure has overhead which basically negates the value of the AoVA collection. The real issue with cache locality has to be the fragmented nature of the collection, but caches today are large so it's not the end of the world.
@milckshakebeans83568 ай бұрын
Is it possible to use the virtual memory algorithm for this? Maybe I didn't understand, but it sounds like a very similar problem to RAM fragmentation.
@adrian_franczak Жыл бұрын
What about performance if you don’t have padding?
@SvetlinTotev Жыл бұрын
So what they are really trying to do is pack the memory and assume this will improve performance. Which isn't a good assumption. Having multiple variable-size arrays to store the different sized elements often *guarantees* memory fragmentation. Even with very small arrays. In virtually every use case, there are better solutions. If you are going to go through the array sequentially, you could just tightly pack the data when storing it and then unpack it when reading it. And if you need random access, then the proposed solution only helps at all in the relatively small range of array sizes where the tighter packing would significantly decrease the cache misses. For very big arrays either solution would almost always miss, and for very small arrays the simple solution would never miss, while the proposed solution will never miss only if all the separate arrays happen to be in cache. P.S. The reason for the padding after the tag is that all primitive types should be aligned to their size, otherwise loading them into and storing them from CPU registers would be a bit slower. Which would be a significant problem if most of the data is misaligned.
@b4rti89 Жыл бұрын
Daddy Prime, more Zig please! :)
@equivocator7727 Жыл бұрын
Zig's comptime is so beautiful because it does generics imperatively where most other languages do them declaratively. In fact, I'd go so far as to say declarative programming constructs are almost always a sideways way to think about a problem space.
@phazechange3345 Жыл бұрын
Everyone always talks about generics, but comptime is actually not even mostly used for that. Comptime can reify *any* code, not just generics. This makes it extremely easy to, for example, present a unified API to your users, but individually package optimizations that their use case needs. Efficient inversion of control is trivial. Comptime is used for all sorts of things that has nothing to do with generics. Comptime is one of those things that might just turn all of programming on its head.
@redcrafterlppa303 Жыл бұрын
23:00 I love proc macros. I wish we would get more integration with the compiler to get more information about the code and how it would be compiled, so proc macros could adjust what they generate. Things like layouts of argument types etc. Just like he said. I feel like it should be more cyclic. The compiler should parse all the way, pass the result to the proc macro, and then parse the macro's output from the beginning. This way you get the full information the compiler pulls out of the syntax (like type lengths, offsets etc.) while you simply write Rust code in that macro, which the compiler then parses again from scratch with the macro expanded.
@TheWrapperup Жыл бұрын
Having compiler info would be very nice for sure, you could do so much more cool stuff at compile-time. This is why Jai has insane metaprogramming capabilities, they just give you imperative access to compiler information as a language feature. A lot of macros don't require it though, so maybe multiple stages for macros make sense. For Rust, I think the API for it would have to revolve around passing span information around.
@SaHaRaSquad Жыл бұрын
Comptime is the kind of feature that makes you wonder why this isn't how we've always been doing it. I am certain it will inspire many future languages.
@matheusjahnke8643 Жыл бұрын
Actually... the memory layout of tagged enums is undefined.... It means it is generally like it was said... but there are exceptions. Like Option<&T>.... because we know valid null pointers don't exist. Or Option<NonZeroU8>... because we know NonZeroU8 can't have a 0 value.
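The exceptions mentioned above can be checked directly with `size_of`. A small sketch, where `sizes` is a hypothetical helper and the standard library types are real:

```rust
use std::mem::size_of;
use std::num::NonZeroU8;

/// Returns (size of T, size of Option<T>) so the two can be compared.
fn sizes<T>() -> (usize, usize) {
    (size_of::<T>(), size_of::<Option<T>>())
}

/// True when wrapping T in Option costs no extra bytes, i.e. the compiler
/// reused an invalid bit pattern of T (such as null or zero) for `None`.
fn option_is_free<T>() -> bool {
    let (plain, wrapped) = sizes::<T>();
    plain == wrapped
}
```

With this helper, `option_is_free::<&u8>()`, `option_is_free::<Box<u8>>()` and `option_is_free::<NonZeroU8>()` all hold, while a plain `u8` has no spare bit pattern, so `Option<u8>` needs a second byte for the tag.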
@yevgenydevine Жыл бұрын
“When Zig outshines Rust” Always and in everything.
@AK-vx4dy9 ай бұрын
I have no idea what I'm talking about, but I think there is some comparing of apples to oranges in this article. Rust enums are not meant to be stored in vast arrays; they are meant for cases where the contained variant changes frequently at runtime, so the tag is unavoidable there. I know nothing about Zig, but it seems that in Zig it is not possible to check the contained type at runtime, which is a result of the tag being absent from memory. As for the rest, the ease of constructing special efficient data structures of this kind through some kind of metaprogramming is, at the moment, clearly a win for Zig. I wish the best for both languages, and I think their niches and specializations rarely conflict. Rust's type system and its ability to guide future contributors down the right path is insane (in a good sense). Zig's almost native interoperability with C, its ability to compile C, to replace the build system and program it in Zig, and to easily compile a Windows exe on Linux are insane (in a good sense).
@bloody_albatross Жыл бұрын
"Why is the tag matching sizes?" Because of alignment. 16 bit values have 16 bit alignment, 32 bit values have 32 bit alignment, and 64 bit values have 64 bit alignment. Yes, modern processors allow unaligned memory access, but usually with a performance penalty, so compilers align values (unless you specify `#pragma pack` in C or `#[repr(packed)]` in Rust). So I gather the tag is actually only a 8 bit value if it doesn't need more bits (speculation), but if there are wider fields in the enum there is padding put in between. That's why you should sort the fields of your structs from biggest to smallest in order to minimize alignment padding. (Maybe also local variables? Those might be reordered by the compiler.)
@peter9477 Жыл бұрын
If you don't specify a primitive or C repr for your struct, the compiler chooses how you lay it out (at least that's my understanding), so it shouldn't be necessary to order fields that way in the general case in Rust.
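A small sketch of the field-ordering point from this exchange. The three structs are hypothetical, and the byte counts assume a typical 64-bit target. With `#[repr(C)]` the declaration order is kept, so sorting fields by size matters; with the default Rust representation the compiler is free to reorder.

```rust
use std::mem::size_of;

/// C layout, declaration order kept:
/// u8, 7 bytes padding, u64, u8, 7 bytes padding.
#[allow(dead_code)]
#[repr(C)]
struct DeclaredOrder {
    a: u8,
    b: u64,
    c: u8,
}

/// C layout, fields sorted largest first: u64, u8, u8, 6 bytes padding.
#[allow(dead_code)]
#[repr(C)]
struct SortedOrder {
    b: u64,
    a: u8,
    c: u8,
}

/// Default Rust layout: field order in memory is unspecified.
#[allow(dead_code)]
struct RustDefault {
    a: u8,
    b: u64,
    c: u8,
}

/// Returns the sizes of the three structs in declaration order.
fn struct_sizes() -> [usize; 3] {
    [
        size_of::<DeclaredOrder>(),
        size_of::<SortedOrder>(),
        size_of::<RustDefault>(),
    ]
}
```

In practice current compilers lay out `RustDefault` as compactly as `SortedOrder`, but the exact default layout is not a documented guarantee.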
@NomoregoodnamesD8 Жыл бұрын
Alignment is important because generally speaking the processor only loads addresses aligned on those boundaries. So, on 32-bit, to read a 32-bit integer straddled across a boundary, two reads have to take place with some bit manipulation to combine the number into a single register. The space tradeoff of an aligned read is usually always worth it, so it's better to instead find ways to pack data where no padding is introduced. Even reading an 8-bit integer that isn't fully aligned may require a bitshift and a sign extend or bitwise-and. The exception? Sending/receiving data over the network. In order to save bandwidth, it may be worth considering the tightest packing and doing those bit manipulations. Especially since the latency of sending/receiving a network packet is so much greater than doing an unaligned read. It may even be worth considering to compress data before it goes onto the wire. For local variables, the order may not matter as the compiler most likely will optimize all/most of them into registers, and the CPU at run time may additionally optimize stack operations into register reads/writes.
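For the network case described above, a tightly packed buffer can be decoded without ever forming an unaligned pointer by copying the bytes out first. A minimal sketch, assuming a hypothetical helper `read_u32_le` and a little-endian wire format:

```rust
use std::convert::TryInto;

/// Reads a little-endian u32 starting at any byte offset, aligned or not.
/// Returns None if fewer than 4 bytes are available at `offset`.
fn read_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    let end = offset.checked_add(4)?;
    let bytes: [u8; 4] = buf.get(offset..end)?.try_into().ok()?;
    Some(u32::from_le_bytes(bytes))
}
```

For a 5-byte packet with a 1-byte header followed by a `u32`, the value is read with `read_u32_le(&packet, 1)`, even though offset 1 is not 4-byte aligned.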
@bloody_albatross Жыл бұрын
@@NomoregoodnamesD8 "Even reading an 8-bit integer that isn't fully aligned may require a bitshift and a sign extend or bitwise-and." How can an 8 bit integer be not aligned? At least on all platforms I know you can only address bytes (i.e. 8 bit steps) and 8 bit integers are byte aligned. So they're always aligned.
@vikingthedude Жыл бұрын
Why do we need the padding? I don't think the article brought that up. I'm not familiar with systems level programming so could someone help out?
@barterjke Жыл бұрын
Because a union (or enum in Rust) is basically just the "any" type of dynamic languages. You have to keep the padding, because even though you are currently storing, for example, 1 byte, you may want to store 16 sometime in the future.
@barterjke Жыл бұрын
If you are referring to 15:20, it's because you can't store 1 byte right next to 8 bytes; they have to be aligned. It's the same reason a bool is stored as a byte, even though it only needs one bit. It's just how memory works. You can store it separately, the same way you can store a bitmask instead of different bool variables, but it's more complicated, and in this case is considered unsafe by the author.
@YourNerdyJoe Жыл бұрын
If you're referring to the extra padding from the tag, it's to keep alignment. Aligned memory accesses are faster (it has to do with how the hardware works, without getting too detailed). For example, the memory location of a u64 should be a multiple of 8. In a tagged union where the tag is 1 byte, it needs padding because a total of 9 bytes means the u64 won't be aligned to 8 bytes, but it will be if the total size is 16 bytes.
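The 9-to-16-byte arithmetic above is a round-up to the next multiple of the alignment. A short sketch, where `round_up` and the enum `Foo` are illustrative names and not taken from the article:

```rust
use std::mem::{align_of, size_of};

/// Rounds `size` up to the next multiple of `align` (a power of two).
const fn round_up(size: usize, align: usize) -> usize {
    (size + align - 1) & !(align - 1)
}

/// A tagged union with a 1-byte tag and a u64 as its largest payload.
#[allow(dead_code)]
enum Foo {
    Byte(u8),
    Word(u64),
}

/// Size predicted from "1-byte tag + largest payload, rounded up to the
/// enum's alignment".
fn predicted_size() -> usize {
    round_up(1 + size_of::<u64>(), align_of::<Foo>())
}
```

On a target where `u64` is 8-byte aligned, `predicted_size()` gives 16, which matches `size_of::<Foo>()`.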
@parkerhuntington1360 Жыл бұрын
To add a little more on the previous answer, some architectures don't support unaligned memory accesses. For example on a 64-bit riscv system, depending on the implementation, an unaligned load of a u64 might cause an exception. Even when this is supported it tends to be a little slower. This is because rather than passing around 1 byte at a time the cpu is passing around 8. Once you get to the cache, the system may be passing around blocks of 32 or 64 byte cache lines at a time. If you support unaligned access, then it may stretch two cache lines which needs extra logic to implement, which translates into more chip area and a higher price / more power dissipation / slower speeds. Even worse, an unaligned access could straddle page boundaries. So you then need to do two page table lookups to determine if the memory location can be accessed. In this case, even though the tag only needs to be a u8, since the payload that follows is a u64, it is padded out so that the u64 is aligned.
@ymi_yugy3133 Жыл бұрын
These optimizations are super useful, but given the tradeoffs need to be weighed on a case by case basis and that they don't impact the API surface of the container, I wonder why this isn't something that can be done by the compiler with profile guided optimization. Is it just that the maintainers don't want to overburden the standard library with the complexity of these optimizations or are there other reasons I'm missing?
@cancername Жыл бұрын
Unfortunately, the compiler isn't magic. Especially memory layout optimizations are really hard to do. They might come to Zig in very specific cases, some day.
@JaconSamsta Жыл бұрын
Fantastic question! I'd say that generally with niche optimizations, you always need to be careful that they don't bleed out of their niche. The last thing you want is for an optimization to get toggled on or off, due to some (potentially distant) change in another part of your code which affects the validity of the optimization. C++ has a rather famous example with std::vector<bool>, where they introduced a niche optimization that packs 8 bools as single bits into a byte, leading to a bunch of headaches (though this isn't as much a niche compiler optimization as it is a specialization problem). In my opinion it's way more important for a language to offer you the tools to do these optimizations yourself, when you need them. So while I love Rust, I think this is a *very* compelling feature for Zig.
@LetYourLightShine5218 Жыл бұрын
You're forgetting that one of the guiding principles of Zig is that there are no hidden costs, no behind-the-scenes resource usage. Give it some time and people will build wrapper libraries to handle all the optimization for different use cases.
@ikhlasulkamal5245 Жыл бұрын
Bro used the First Chat wisely
@ragectl Жыл бұрын
This article feels like an exercise in optimising for performance and potentially getting optimisations in the wrong places, while making it harder to track values. In which case just keeping the original structure and wearing some memory cost might be better. The performance could vary greatly based on system hardware factors.
@k98killer Жыл бұрын
Primeagen skipped the one sentence that explained keeping pointers created at insertion in the vector implementation.
@knifefest Жыл бұрын
I think upcasting ints can't be implicit because big-endian systems require an actual move in memory.
@YomiTosh Жыл бұрын
All the time, you have to leave the space. - F. Alonso
@weeb3277 Жыл бұрын
total zigger victory
@0oShwavyo0 Жыл бұрын
Thumbnail backwards? Just a heads up I think someone put the rust/zig logos in the wrong place when slapping that together
@aanchaallllllll Жыл бұрын
0:00: 📚 Enums in Rust can lead to significant memory fragmentation due to their varying size. 7:45: 📚 Reducing memory fragmentation in ASTs can improve compiler performance and memory usage. 10:19: 📝 The video discusses small string optimizations and the issue of fragmentation in container types. 13:47: 🧩 The speaker discusses the trade-off between collapsing data structures based on size and losing type safety when co-locating different variants of the same allocation. 17:13: 🔒 Rust proc macros have limitations in accessing type information and conditional implementation of generic containers. 20:30: 😕 The speaker expresses frustration with the need to constantly specify types in Rust and the limitations it imposes. Recap by Tammy AI
@necoji4910 Жыл бұрын
your recap sucks .
@davidbriggs8109 Жыл бұрын
I want to use ocaml, but when I do open Base, it throws errors
@victorpinasarnault9135 Жыл бұрын
I ❤ Zig
@CakeIsALie99 Жыл бұрын
Why does a static string have an into() method
@wedoscao Жыл бұрын
Because Rust implements Into<String> for it (through From<&str> for String)
@peter9477 Жыл бұрын
Do you mean why as in "for what purpose"?
@ahdog8 Жыл бұрын
When Prime does .into() in the video, he is copying the `&static str` (stored like a C string literal) into a `String`, which is stored on the heap. The into method is useful because you might want to mutate the static string, so you copy it into a not-static string.
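A tiny sketch of what that `.into()` call does, using a hypothetical function `shout`. The literal itself stays untouched in the binary's read-only data, and only the owned copy is mutated.

```rust
/// Copies a static string slice into an owned, heap-allocated String
/// and then mutates the copy; the original literal is left as it was.
fn shout(greeting: &'static str) -> String {
    // Same effect as String::from(greeting) or greeting.to_string().
    let mut owned: String = greeting.into();
    owned.push('!');
    owned
}
```

Calling `shout("hello")` yields a new `String` containing `hello!`, while the `"hello"` literal can still be used afterwards.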
@avishjha4030 Жыл бұрын
What happened to the Like and Subscribe agen or Zig outshine Rust agen?
@dezlymac Жыл бұрын
😭 Man I understood like 2% of this... I need to learn a low level language like Rust or Zig. JS ain't it
@_start Жыл бұрын
Learn C
@dezlymac Жыл бұрын
@@_start I'm looking for something more modern
@0netom Жыл бұрын
it's starting to look like it's already okay to learn Zig as the 1st low-level language... though learning C shouldn't take longer than a month or so, including the core of libc. it's hard to make such recommendations though, because i had decades to learn these things, since i was growing up in parallel as the computer industry :) i was born in 1977 and wrote my 1st for-loops on a ZX81 with 1kilobyte RAM, when i was 6 years old.
@dezlymac Жыл бұрын
@@0netom 😲 6 years old! Whoa that sounds freaking badass. As you mentioned you had decades to learn. I'm already studying Typescript, and Solidity, so realistically I would only be able to add one low level language to my arsenal. And to master it to a proficient. It's a bet which one would pay better dividends in the future. 5 Years experience with Zig, or 5 years experience with Rust...
@mgord9518 Жыл бұрын
Zig is a pretty incredible language, but it's still fairly unstable at this point which might be confusing at first. I had a bit of a rough start learning Zig coming from Go as it's a bit harder to find articles explaining it. Most of the learning is gonna have to come from official documentation (which is quite good) and reading the stdlib source code. Although I haven't looked back since. Zig's advantages greatly outweigh the issues of it being a relatively less common language. Zig's C interop is literally so good that it's easier to use C libraries in Zig than it is to use them in C.
@iaconst4.08 ай бұрын
good video , many persons say that zig is the fastest language program, is that true?
@ablobofgarbage Жыл бұрын
shoutouts to Orzgorth
@4sat564 Жыл бұрын
ghostty terminal emulator presentation on zig showtime shows lots of interesting zig patterns
@computationaltrinitarianism Жыл бұрын
If you're curious about Zig's comptime and OCaml, I would recommend a cursory glance at Lean 4. It is mostly focused on theorem proving, but is quite interesting as a general purpose language (though not as a systems language).
@noredine Жыл бұрын
This part 5:31 read like that "Retro Encabulator" video
@neo3290 Жыл бұрын
My only problem with comptime is when people go crazy with it. Every time you want to read someones Zig code, you are gonna need to understand the way they used comptime to reinvent their own tiny language with their own types, memory layout, footprint etc. OTOH Macros in Rust are hard to write and is a last resort. That effectively discourages people from using them without a firm use case.
@balen7555 Жыл бұрын
What? Comptime is essentially just any other function. So your entire comment translates to "you're going to need to understand their code". Zig's comptime by design can't introduce new syntax or any of the shenanigans you can with proc macros in Rust.
@neo3290 Жыл бұрын
I get what you're trying to say. But most people never have to understand how macros are implemented in order to use them. It's usually a hidden detail that is not needed to know unless you're building something along that line specifically. With comptime, sure its just Zig and that's nice. But you cannot directly use the knowledge and patterns that you learn from other codebases without making sure that the comptime hasn't been used to write it differently. So you start from scratch and make sure someone hasn't overridden how the enum packs the data and so on. The code symmetry can be lost easily.
@TheSulross Жыл бұрын
there's a great YT video on a shell program (e.g., something like bash) being written in Zig where the author has some select areas where comptime and comptime-generated types is used. Very beneficial overall to the program and its clarity/maintainability/efficiency, but the comptime code itself is definitely for Zen masters - opps - meant Zig masters
@balen7555 Жыл бұрын
@@neo3290 This just isn't true for comptime. You seem to have misunderstood as to what comptime is. If I have a comptime function "MultiArrayList" that takes in an argument (a type) and spits out another type, then how is that any different than a normal function "Foo" that takes in an argument (value of type X) and spits out a value of type Y? The knowledge is transferable from codebases the same way any other construct in any other programming language is. What you're quoting applies to macros that introduce new syntax, which leads to people having to learn DSLs specific to a codebase with often bad "intellisense". It doesn't remotely apply to comptime. What you're quoting directly goes against the zens of Zig, and if it were true for comptime, there would be no comptime.
@neo3290 Жыл бұрын
@@balen7555 That's exactly the problem right. The whole paradigm where I have to question whether a type is a derived version or is pure adds another level of complexity that can vary based on how crazy the author has gone wrt to the type generating code. Sure its nice, and I like it for a lot of use cases. But its also something that is far too powerful for people not to misuse in the wrong way. In this case of Rust enums. I'm aware of their memory layout, footprint and what all is needed to keep them efficient. But I cannot say the same about an enum type in Zig and will have to look out for any comptime code that has modified it in a way. Like I said the problem is only when people go "Crazy" with it.
@BosonCollider Жыл бұрын
This is also something that Julia has (and is significantly better at than Zig since it also has lisplike code generation which is *allowed* to use type information, and on top of that it has a low barrier of entry to write compiler extensions). The fundamental reason why Rust can't easily have it is because it has return type polymorphism, which means you can't treat the types as arguments and run arbitrary functions on them. Zig has some level of compile time evaluation on types, while Julia can run arbitrary code at any time and use it to generate code. I've written an archetype-based ECS system (struct of arrays, packs archetypes efficiently but falls back to hierarchical bitmaps if there are too many variants, and allows new type defs at runtime with the Julia JIT) in ~50 lines of Julia. It used exactly these kinds of tricks (compile time type reflection) combined with one generated function (i.e. generated AST depends on type input).
@olekollo7875 Жыл бұрын
Julia is only good for date science
@pacotrim Жыл бұрын
So... Prime spent the first few minutes explaining Monads?
@nobleradical2158 Жыл бұрын
The idea of a monad is more abstract than a tagged union, but yes, they are monads.
@yakman8681 Жыл бұрын
Fine I’ll try zig…finally got me🙄😭😭
@footballdudes33942 ай бұрын
why not just use pointers for enums, then they'll all be the same size? the compiler could handle the optimization so it doesn't act like a pointer?
@bryanleebmy6 күн бұрын
You'll then have to add a level of indirection which isn't ideal i.e. instead of accessing your data with one memory access, you have to access the pointer in the array, then access the data in another memory location. Keep in mind that we're dealing with dynamically sized arrays, so collocating the pointer with its data is not trivial (and is the point of the article). Without colocation, it's likely you'll have cache misses, which would incur roughly 100x access speed degradation, stall pipelining, and worsen memory bottlenecks.
@TremereTT Жыл бұрын
comp time in delphi/free pascal is super fast ...its a single pass compiler
@corley-ai Жыл бұрын
Props on that thumbnail.
@rainerwahnsinn3262 Жыл бұрын
3:17 shasawn? SHASAWN!
@Ben_EH-Heyeh Жыл бұрын
Zig is cool. Elixir is an abstraction of Lisp. 😂
@thekwoka4707 Жыл бұрын
Not everything is 16 bytes in js. A boolean is 1 byte
@vxmediaxv Жыл бұрын
While the article sounds interesting, it doesn't provide any evidence or benchmarks to prove that this has a real impact on memory
@CielMC Жыл бұрын
Me who enjoys writing proc macros: wait I'm the weird one?
@ThePrimeTimeagen Жыл бұрын
You're not a weird one, you're just one of 16 people that know how to do it
@iamworstgamer Жыл бұрын
dude how your nevim intellisense is so much much better than mine, I am also using lsp
@Kabodanki Жыл бұрын
I ❤ zig.
@marcsh_dev3 ай бұрын
Comptime is LISP macros (which is why I like them). I'd love it if C# (my daily driver for various reasons) stole... borrowed rather, comptime
@AK-vx4dy9 ай бұрын
I thought one of Zig's main founding points was *no implicit*.... "you were the chosen one..." Or maybe this was in another new language? 😅
@FrankHarwald Жыл бұрын
@ThePrimeTime Elixir is mostly a modernized Ruby, but at least it's trying to be a compiled language (not interpreted) and trying to allow efficient and simple use of multi-threading (never seen that done well in Ruby). However, I don't like Ruby and its core language philosophy in the first place, so I wouldn't recommend it. I used to be a Python guy until I found Nim, which is basically also Python, but it compiles to native code and allows easy and efficient multi-threading (which I haven't seen in Python either without bending the spoon backwards). Plus: Nim is more mature than Elixir, has outstanding cross-platform and cross-language support, and already has existing real-world applications.
@newsofthenerd11 ай бұрын
He said Lift and pattern match but not Monad. Everything he just showed there looks like it was taken straight out of Haskell.