Tsoding has finally heard us, Gentiles, giving us the content that's on our level. The Lord has heard my prayers, amen. 😭😭
@pinniporker5 ай бұрын
What do you mean, Gentiles? Is he Jewish? ✡️
@maxmustermann55907 ай бұрын
I was asking myself some time how he got this good, but when implementing hash tables in C reminds you of your childhood that's probably the answer
@snw40411 ай бұрын
"Implementing hashtables in C kinda remind me my childhood" 💀💀
@caiodavi9829 Жыл бұрын
finally back to C
@anon_y_mousse Жыл бұрын
My goto hash table implementation is to use an array of arrays to handle the collisions, but I've also used an array of trees and an array of sorted lists and all manner of other combinations. I also always use a power of two for the table size so I can just do a `bitwise and` to normalize the hash and store the unnormalized hash with the keys to make collision checks faster. I also like to use a handy trick to speed up deletions, if that's an operation you even need to support, whereby I swap the key to be deleted with the last key in the table and just fix up the indices for that key then decrement the length of the table. It only reorders two elements with each deletion so it's a lot faster if you don't need the data in a sorted order after every modification, or for that matter need to preserve the order. I've used a DJB hash function every time I've needed string keys and I kind of flip flop on whether *31 or *33 is better, but I'd say play with it for whatever data you're using and either it won't matter or one will be just a bit better. Though, I always initialize it with 5381 as I figure that constant is what designates it as a DJB hash. Maybe I'm just weird.
@ElPikacupacabra Жыл бұрын
At 43:43, the real reason is that the subtraction trick only works for integers less than around half the value of the maximum absolute allowed. If you subtract `SIZE_MAX` from `-SIZE_MAX`, you will get overflow and your comparison will fail. In your case, it's irrelevant because there are not that many items in the hash table. `SIZE_MAX` is very large, practically. A better way of coding the logic without risking overflow and without too much branching is `(b < a) - (a < b)`.
@jadetermig2085 Жыл бұрын
Branch != cmp. There are zero branches in (b < a) - (a < b). There are two extra comparisons but no branch. A branch is a conditional jump. Another example would be cmov. That allows the compiler to compile something like if(a < 0) a = 0; without branches / conditional jumps.
@ElPikacupacabra Жыл бұрын
@@jadetermig2085 Good point! I misused the word 'branching'.
@divingeveryday Жыл бұрын
I am now subscribed. I just found these streams and absolutely love them. The rant at the end sealed the deal.
@jolynele2587 Жыл бұрын
i learnt to make a hash table in harvard's free cs50 course and it's pretty similar to this so it's pretty cool. writing collections is pretty fun
@i007c Жыл бұрын
you can also use the clock() function for cpu time. it will give you the total cpu cycles that took between the operation
@robmorgan1214 Жыл бұрын
Wow, nob looks pretty freaking cool! Make sure your clock is just a wrapper for rtdsc with calibration. Also to preserve precision subtract prior to conversion to double and then do your multiplication*. (Edit*)It's a minor detail that rarely matters but it's best to be in the habit of maximizing both accuracy and precision while ALSO maximizing performance (fewer FP ops = less rollover error and less fp mantisa mismatch error eg subtract a very small fp from a big fp will always = big fp no matter how many times you do it)
@anon_y_mousse Жыл бұрын
Using rdtsc is specific to x86 platforms, and can be locked by the kernel, and thus must be accounted for beforehand on such platforms. The subtraction won't work prior to conversion because he's attempting to express the value in seconds and the tv_nsec field expresses the number of nanoseconds, so it has to be converted first.
@robmorgan1214 Жыл бұрын
@@anon_y_mousse the conversion is trivial it will work with or without the subtraction. As for x86 etc., any generic code you're gonna write in c or c++ will require define guards all over the place to do anything useful and platform independent. This isn't Java or python. Either embrace the metal or or not but the critique stands: when using floating point you need to have good habits or you won't get the answer you want. The reason you use rdtsc or it's local HW equivalent instead of a clock is to avoid measurement jitter or to detect runtime jitter and other non deterministic things happening to your code. This can be very important in many applications as core hot behavior is very different than one shot performance with cold cache. Interrupts can also make important performance tests non deterministic. C is an abstraction that let's you decide when it's time to program the machine, not the abstract machine. It goes without saying that you should pin and isolate your process or execution thread to a specific core if you care about accurately benchmarking a key section of a program... especially in C/C++ where memory performance and cache strategies often create the biggest bottlenecks unless it's unusually mathematically complex (rtdsc is local to your core)
3 ай бұрын
I know it's not your goal. But with every video I get the idea that you're getting more and more into engineering standards.
@AlmogD Жыл бұрын
This is actually an exercise in The C Programming Language by K&R
@harleyspeedthrust4013 Жыл бұрын
perfect timing, i have been implementing some based data structures in c++ for fun these past few days
@puzzlinggamedev11 ай бұрын
I like the exploratory style. Very interesting lessons learned!
@araggohnxd Жыл бұрын
gotta love this man's passion
@rodelias9378 Жыл бұрын
Great stream and really good pieces of advice at the end!!
@blackhaze385611 ай бұрын
This is a masterpiece of programming.
@afterschool2594 Жыл бұрын
Let's Go we're back to C again
@s4n7r0 Жыл бұрын
you absolutely murdered that semicolon backseating guy
@kossboss2 ай бұрын
lololol
@loic.bertrand Жыл бұрын
43:20 also subtration can cause an overflow, for exemple if you're doing (-big_number) - (big_number)
@kossboss2 ай бұрын
thought -- what do you all think -- time delta in seconds using double might introduce some precision issues. maybe long double would be better or your original implementation in int nanoseconds. in the end you still convert to seconds; but its better to save the converting to the end. also another factor is that potentially these dont run long enough to counter act the precision.
@mire6134 Жыл бұрын
Are you going to participate in the AoC event this year?
@sanjaux Жыл бұрын
Keeping my inspiration up with these vids 🫡
@felixbilodeau-chagnon4781 Жыл бұрын
Love how the djb2 hash function is basically the same thing he did except it starts at 5381 instead of 0 and multiplies by 33 instead of 31
@monad_tcp Жыл бұрын
That was probably exactly how the writers of djb2 came up with those numbers, just by throwing shite at the wall and seeing if it does better or worse. Which ironically, if you document it , is the scientific method. And thats literally why it's called computing science .
@demolazer7 ай бұрын
When asked why you are coding something, "Because I can" is an acceptable response.
@slava6105 Жыл бұрын
Поставлю лайк, наверняка в универе пригодится
@TsodingDaily Жыл бұрын
Лайк тебе никак не поможет в универе. Надо смотреть.
@lievenpetersen Жыл бұрын
28:15 all hail the CLITERAL! (although the extra parens are not necessary afaik)
@u9vata Жыл бұрын
Awsome t-shirt. I have some like that too - just regionals though, and much better I am at home too because I am more like a slow water makes interesting algs than always in competitive environment ;-)
@RandomGeometryDashStuff Жыл бұрын
2:00:46 because terminated string?
@agx1397 Жыл бұрын
From where does actual implementation of hash table start?
@RandomGeometryDashStuff Жыл бұрын
1:43:44 why cap in parentheses when assign but not multiply?
@monad_tcp Жыл бұрын
Multiply doesn't suffer from priority inversion of token precedence. Or probably just a bug. The idea of parenthesis is that if you put an expression in the macro instead of a single token, you don't break the precedence rules or even the syntax. Makes the macro a bit safer, not by much , it's still CPP , C preprocessing macros .
@RandomGeometryDashStuff Жыл бұрын
@@monad_tcpwhat is priority inversion of token precedence?
@hbobenicio Жыл бұрын
Just curious about the performance of these algorithms when compiling with optimization flags (like -O3 -march=native)
@datadriveAshura Жыл бұрын
Next pointers in c pls
@RandomGeometryDashStuff Жыл бұрын
43:14 big count overflow?
@alexloktionoff68338 ай бұрын
How stb hash does perform compared to this DIY?
@leewei9117 Жыл бұрын
I like your channel very much. Can you upload English subtitles? Is it convenient to translate into my mother tongue? Thank you very much!
@caiodavi9829 Жыл бұрын
👎
@jamesphilemon801011 ай бұрын
The auto-generated captions are actually very good quality (probably because of his enunciation), so just turn them on.
@yrpsa Жыл бұрын
can anyone tell me how do you mark the text like that in 2:38
@anon_y_mousse Жыл бұрын
It's just highlighting the thing he's searching for. I don't know how it's turned on in Emacs, but in Vim you'd set hlsearch in your ~/.vimrc or if you just want it for the current session by typing it in directly at the command prompt.
@Ледимир Жыл бұрын
Great video as always!
@jannemyllyla1223 Жыл бұрын
sum makes it's distribution very non-uniform. (basically it is very similar to average)
@glowiak3430 Жыл бұрын
Great, but why complicate it so much? My approach is to just have two arraylists, one with strings and one with objects with 1 to 1 correspondence.
@Exatio Жыл бұрын
Optimization. In hash tables, the index of an element is created from itself, so when searching, you just have to hash the element you're searching, and then check the index that was generated. This is O(1). In an arraylist, youd have to use bilinear search which is O(n)
@pierreollivier1 Жыл бұрын
@@Exatio his approach can be more performant, because you have two different compact data structures for different things, such that you have less risk of cache eviction on certain operations. It's called data oriented design, you make sure to have compact data structures, to maximise the use of the cache. Which is by far one of the most important performance predictor in almost any programs.
@jacksonlevine9236 Жыл бұрын
@@pierreollivier1How does this invalidate hash table as a data structure though, it sounds like you are just poking holes in his arbitrary use case he chose. Hash tables are a thing, they have their place, no knock on data oriented design, but one doesn't invalidate the other.
@pierreollivier1 Жыл бұрын
@@jacksonlevine9236 It doesn't that's why I've said that "his approach can be more performant" It's not necessarily the case, a good old hash table might be faster, but a DOD hash table might also be faster depends on the use cases, but both are valid variation of the same model.
@kasuto-no-machi Жыл бұрын
Because having two 1:1 arrays/lists is linear time lookup, insertion and retrieval. That’s fine if you have very few keys, but for a hash map those operations are all constant time, and for large datasets it’s far more performant.
@j.u.g.y Жыл бұрын
Wouldn't it be accurate to lowercase first.
@ailphaune Жыл бұрын
so the smallest string (using only lowercase letters) that has a djb2 hash of 69 is "glidrn" Edit: i'm pretty sure it's the smallest string even if you allow the full range of ascii characters
@arabusov Жыл бұрын
А переполнение беззнакового в сях разве не UB?
@DeathSugar Жыл бұрын
Nob release when?
@MatthisDayer Жыл бұрын
How does one write "azozin"?
@HuntingKingYT Жыл бұрын
Missing so many semicolons... really feels like python
@caiodavi9829 Жыл бұрын
😮
@opsJson_ Жыл бұрын
now i'm happy.
@0ne87 Жыл бұрын
delta_secs sounds like something that happens at super elite masquerade parties.
@rian0xFFF Жыл бұрын
I wish I had time to fully watch your videos, they are so cool.
@tigranmartirosyan7111 Жыл бұрын
Thank you for the excellent content! I would like to suggest recording another session that delves into consistent hashing, a technique actively employed in network load balancers.
@jadetermig2085 Жыл бұрын
I think you missed an opportunity to call it nobs = no build system
@UlvicanKahya Жыл бұрын
The guy who suggeted 31 as the multiplier is 100% a Turkish guy
@vishwanathbondugula4593 Жыл бұрын
Hey a quick question, why don't you use make or cmake as your build too
@cobbcoding Жыл бұрын
go rebuild yourself is still the best macro name i've seen
I recently downloaded the musializer project, compiling with nob is pure delicacy ./nob :) I have been experimenting with nob.h, we hope to have an official version of that promising project. Thanks tsoding
@DelgardAlven Жыл бұрын
at the dead end, semicoloned code much more readable and much more brain computable, from my conservative point of view. but yeah, for "shitting-on-the-go" type of activities python is pretty nice 👍🏼. Good stream, thanks!
@ludwintor4986 Жыл бұрын
I love how you after the message from chat "you forgot a semicolon" just go and remove the only one semicolon. great troll
@Bambuchaa5 Жыл бұрын
SAIAGO FEZ ANTES DE VOCÊ !
@Vulto1665 ай бұрын
I still using nob.
@xspager Жыл бұрын
I'm glad Python exists.
@marcelijankowski9593 Жыл бұрын
Perfect timing! I just finished implementing hash table in x86-64 asm for my brainfuck interpreter. Although there is a bug in automatic resizing, slots are populated with some random gibrish after couple resizings. Gonna have to look into that. Edit: Ha! I managed to solve it, I had a label "ht_address" with address of hash table (statically stored, cause only one table is ever instantiated in that project), and I had statements like this one: leaq ht_address(,%rax, SLOT_SIZZE), %rdi The thing is that here the address at which the value of ht_address label is stored is used, and not the value itself! Silly me!
@Shywizz Жыл бұрын
And here i am wondering if ill ever even be on half that level.
@Bluesourboy Жыл бұрын
You will be on that level as long as you try to be. The hardest problems are always solved with persistence.
@kingfrenchtoes5769 Жыл бұрын
drop the github link
@marcelijankowski9593 Жыл бұрын
@@Shywizz With enough effort you'll definitely get there. There's nothing special about tsoding guy, me, or anyone else for that matter (mostly).
@marcelijankowski9593 Жыл бұрын
@@kingfrenchtoes5769 this interpreter is literally the first thing that I ever wrote in assembly, so I didn't initially put it on github. I definitely will once it's finished (around 2 weeks from now most likely). I'll post the link here, if I won't forget. Now with the hash table, memory allocator, linked list, and so on so fourth implemented I just gotta finish the main interpreter part and it'll be over.
@keremardcl6759 Жыл бұрын
Yet you dont like PHP just having the feature natively :D
@AlameenAdeyemi8 ай бұрын
Bro said he his not a computer science person 😂 btw how can i join the discord server?
@davydorynbaev Жыл бұрын
hello hello zozzing
@orzklv Жыл бұрын
FINALLY! Some normal fucking content!!!
@orzklv Жыл бұрын
@TigranK115I’m not about tsoding only, I mean, all dev channels combined.
@samuraijosh1595 Жыл бұрын
Yeah finally some content I'm hoping us mortals can understand.
@trashcan3958 Жыл бұрын
Coding in C is so painful...
@adriansyahkadir1548 Жыл бұрын
1w upload?😂
@boogly3716 Жыл бұрын
Вот это возвращение к истокам... Кстати, насчет программирования в молодости: Недавно смотрел интервью 46-летнего Мурыча, где он рассказал об одержимости каждого программиста из 90-ых написать свою виртуалку. Не было ли у тебя такой шальной молодости?))
@brissance Жыл бұрын
i have no programming socks .
@realfootball338 Жыл бұрын
Hello
@____r72 Жыл бұрын
Gluggalog vandierbloggagglen
@soyitiel Жыл бұрын
Ayo, if the compiler can tell us where we forgot to put a semicolon then it shouldn’t need us to put a semicolon, it should just infer it, I mean js already does it
@jadetermig2085 Жыл бұрын
This is a nonissue to C programmers. I never forget semicolons and type them without thinking about it / noticing it. It's an issue to people that regularly switch between C and Python/Go/JS. AFAIK semicolons are optional in js but it's idiomatic for people to use them. Does no semicolons even work with minified javascript?
@soyitiel Жыл бұрын
@@jadetermig2085 Having acquired the habit, I guess that's alright Still, as you said, it'll always be an issue for polyglots Also I don't think there should be any merit in developing the habit to always do something that could and should already be done, it'd be like developing the habit to always putting the toilet seat down or closing the fridge door because someone else in the house always leaves those open, it just shouldn't be necessary. In any field, the goal should always be to alleviate unnecessary tasks so processes may be optimized. PD: About js and semicolons, that's exactly the point I'm trying to make: Do you want to write semicolons? Knock yourself out. Do you not want to? I got you, hun
@jadetermig2085 Жыл бұрын
@@soyitiel Sure but js syntax is already C-like and (like C) very sprawling and not elegant or minimal. For something so "trivial" as semicolons it's a tradeoff between familiarity and "correcting mistakes of the past". I'm not even sure semicolons were a mistake for the context C was designed for (a PDP-11 in the 70s). JS syntax was obviously inspired by C so an argument could be made that having to unlearn a habit is just as counterproductive. Semicolons are kind of special since they're optional so a C programmer in the 90s didn't have to unlearn it to use js. But he still has to adapt to it not being there even though the syntax otherwise feels like C. For example nobody complains about the lack of semicolons in Haskell because the syntax is nothing like C. All in all I feel it's no big deal. I still type semicolons in js (as do most people in my experience) and for languages that are nothing like C I don't even think about it because everything is different.
@balijosu4 ай бұрын
The point time semicolons ever annoy me in C/C++ is with struct/class defs. Feels so unnecessary.
@xmoex6393Ай бұрын
1 hour of talking before actually going into the hash-table details, mixing up the "build system" code with the application, not impressed... The comparison between linear search and the hash-table was kinda cool though...
@cronosmain Жыл бұрын
C# of a normal person
@vbachris Жыл бұрын
very dumb hashtable, 100% time improvement!
@Purkinje90 Жыл бұрын
Have you accepted Rust as your lord and savior?
@desertfish74 Жыл бұрын
Knob
@careneso6502 Жыл бұрын
Хоршо что Rust существует , в его стандартном крейте уже есть хеш таблицы. Ни один гомосексуалист не будет оскорблен тем, что он вынужден писать свою хеш таблицу. Воистину лгбт-френдли язык, не то что С\C++
@ariabk Жыл бұрын
rust is better :3
@enclave2k1 Жыл бұрын
As someone who loves rust, why not just view languages as tools to complete a task? Hammers aren't better than screwdrivers; use the best tool for the job at hand. People treat languages like political affiliations these days. (sorry to go off, I know this is likely a tease/troll comment for tsoding - just see a lot of this kind of stuff, which confuses me)
@dupdrop Жыл бұрын
@@enclave2k1 Programming languages are just tools, but C is like that hammer with a handle the stubs your hand with nails. Friends don't let friends use bad tools. But hey, if you like it, go ahead and use it :) If there is one language that isn't going to ever disappear, it's C.
@jadetermig2085 Жыл бұрын
@@dupdrop To ppl that have coded in C for decades, Rust is far more likely to feel like getting your hands stubbed with nails. When you have mastered a tool you begin to see it as a means to an end and you become frustrated when another tool gets in your way. I.e. I know what I mean and I know exactly what I want the machine to do, so don't get in my way. All languages influence the way you think about programming and problem solving through their syntax and semantics but rust is especially onerous in this regard.
@ariabk Жыл бұрын
@@enclave2k1 hi, im really not trolling. im a fan of rust, but this is kind of a joke. if you saw me as trolling, im sorry i made you feel that way. also tsoding has made really great apps in rust (noq, seroost). anyways, that was a joke, and a language is just a tool, it's just that rust is the tool that i enjoy using most. rust is obviously not objectively better! anyways, thank you for your reply!
@enclave2k1 Жыл бұрын
@@ariabk Cool! Thanks for clarifying. Sorry for reading too much into your comment. I did love those series, I'm actively following noq in particular.
@0x80-linux_user Жыл бұрын
Nice video, but i can better
@monad_tcp Жыл бұрын
Actually GCC should be part of the shell so it can run C code directly #! /bin/gccsh Even better, someone should propose an RFC for POSIX to have GCC as a shell.
@monad_tcp Жыл бұрын
I was a dummy. There's this thing called C shell , they tried it , not good enough thou
@munawarcheema899111 ай бұрын
go_flags="-g -Wall -include allheads.h -O3" alias go_c="c99 -xc - $go_libs $go_flags" where allheads.h is the aggregate header you’d put together earlier. Using the -include flag means one less thing to think about when writing the C code, and I’ve found that bash’s history gets wonky when there are #s in the C code. On the compilation line, you’ll recognize the - to mean that instead of reading from a named file, use stdin. The -xc identifies this as C code, because gcc stands for GNU Compiler Collection, not GNU C Compiler, and with no input filename ending in .c to tip it off, we have to be clear that this is not Java, Fortran, Objective C, Ada, or C++ (and likewise for clang, even though its name is meant to invoke C language). Whatever you did to customize the LDLIBS and CFLAGS in your makefile, do here. Now we’re sailing, and can compile C code on the command line: go_c