How do dictionaries (hashmaps) actually work?

  Рет қаралды 94,363

A Byte of Code

A Byte of Code

Күн бұрын

Пікірлер: 118
@proloycodes
@proloycodes 2 жыл бұрын
i know quite a bit abt hashmaps, and i have to say, you did a very good job explaining it. except the part abt dealing with hash collisions, which it seems you made it look like that's the only way, when in reality it's only one way of dealing with the problem. for starters, it can be a tree or even not any structure at all: whenever a hash collision occurs, it puts the value in the next free slot, or it can do some moving items around before putting the value. in fact, there are many algorithms to decide how to move stuff around and even how to get a free slot to put the value in. see for example, Robin hood hashing.
@AByteofCode
@AByteofCode 2 жыл бұрын
Very true, I guess I kind of tunnel visionned into just one possibility in order to finish the video as soon as possible. I'm really gonna need to improve my fact checking, the only video of mine I i think doesn't have holes is the flask one :| I'll pin this so others can see the nuance.
@proloycodes
@proloycodes 2 жыл бұрын
@@AByteofCode i don't think the "tunnel visioning" is a problem as long as you make it clear that this is just one possibility.
@AByteofCode
@AByteofCode 2 жыл бұрын
What can I do to make my future videos better? Any feedback (even negative) is greatly appreciated!
@zyansheep
@zyansheep 2 жыл бұрын
_Blazingly Fast_
@AByteofCode
@AByteofCode 2 жыл бұрын
@@THEROOT1111 Yep, sorted that one out lol. Monad and future will be at least not "whispering and reading an erotic novel at the same time" (some other commenter)
@proloycodes
@proloycodes 2 жыл бұрын
@@AByteofCode perhaps increase the sound levels. Reducible might be a good reference (speaking for myself, dunno abt others)
@proloycodes
@proloycodes 2 жыл бұрын
@@zyansheep yeah im not sure its actually _that_ fast in practice
@AByteofCode
@AByteofCode 2 жыл бұрын
@@proloycodes I was speaking pretty quietly in those videos, from monads on I'm gonna be nearly shouting at my mic, it sounds a lot better!
@bash0985
@bash0985 2 жыл бұрын
This is really helpful! Very reminiscent of Fireship's 100 seconds videos, I can totally see you becoming big like that. Also the animation is great.
@AByteofCode
@AByteofCode 2 жыл бұрын
Thanks man! Yeah I'd started this channel after watching his second channel video on how he made content, and I wanted to give that style a shot (omg the ctrl-z trick is hot as f*ck tho).
@abdelhakimkhabir
@abdelhakimkhabir 5 ай бұрын
and i'll see you in the next one 😄😄
@chudchadanstud
@chudchadanstud 2 жыл бұрын
Hashmaps and Dictionaries are not the same thing. Yes a hashmap is a dictionary but a dictionary is not a hashmap. You can implement a dictionary with just about any data structure that lets you insert and remove data. e.g. an array of key-value structs. You can simply traverse using linear searching find the key you're looking for. It's not efficient but it works and is a dictionary. A hashmap on the other hand uses a hash function with the key is a parameter to compute a hashcode. The code is used to decided where to store the data in your chosen data structure.
@AByteofCode
@AByteofCode 2 жыл бұрын
That is an excellent point! I had cut the difference out of my script but I should have left it in
@samarthjain5015
@samarthjain5015 Ай бұрын
A dictionary is implemented using a hashMap. Here, the keys are considered like elements of a hashMap, which then fetch the value. You get the desired key from dictionary in O(1) using hashing, then you fetch the corresponding value in O(1). I'm talking about the dictionaries in python. I name my dictionary variables "hashMap", if I can't find anything else.
@erikwg3814
@erikwg3814 2 жыл бұрын
I'm a bit tired of hearing hashmaps are "O(1)" when any implementation I've seen is dependent on the hash function as well as the collision handling. To my understanding, those two cannot possibly be implemented in O(1) unless you have an internal array large enough to uniquely cover every possible hash value thus avoiding collisions.
@AByteofCode
@AByteofCode 2 жыл бұрын
You are absolutely correct! Then again, our computers and all are turing complete, even though they do not have enough memory to actually satisfy the definition. These classifications are mathematical estimates assuming the algorithm is executed perfectly, but the list thing is just to overcome technical limitations.
@turboblitz4587
@turboblitz4587 2 жыл бұрын
Yeah I stumbled over that one too. It's weird, the computation time is not directly dependent on n but kind of stochastically, although a scenario is possible where it is stochastically 1. I guess one limitation of hash maps is knowing the best size of the internal array beforehand. Too big and you waste space, too small and you decrease runtime. How collisions are handled also play a major role in this.
@AByteOfPi
@AByteOfPi 2 жыл бұрын
ah yes, smoking hash to do fast coding
@Rudxain
@Rudxain 2 жыл бұрын
0:12 lmao, I got the meme references (except Mike, who's Mike?) 0:19 to anyone at this part, please note that he said "interpreters". Compiled langs (like Rust and C) don't need hashmaps to lookup vars, because all the variables that will ever be used are known and declared statically at compile-time, so the compiler can replace vars with raw pointers to memory addresses, completely invalidating the need for a "middle-man". Dynamic langs like Py and JS usually require hashmaps, because the halting-problem prevents the bytecode-compiler from predicting assignments and declarations
@AByteofCode
@AByteofCode 2 жыл бұрын
FINALLY! Out of 8k+ views you're the first person to comment about the meme names! Let's just say Mike's last name is "Oxlong" :) Also second point very good and true.
@Rudxain
@Rudxain 2 жыл бұрын
@@AByteofCode thanks lol
@5014eric
@5014eric 2 жыл бұрын
I had normally imagined they were done with trees and therefore O(log n) for both reading and writing. That allows for scaling up without reorganising as it grow bigger. The next question then is what size is used when creating the new object? Too large and creating a lot of them uses lots of memory. Too small and it has to rejig itself a few times if more and more elements are added. I assume the size used would go up by powers of 2 or 4, putting a limit on the proportion of 'wasted' space. Although that extra space makes collisions less frequent. What is the most efficient algo may depend on the ratio of reads & writes, not to mention knowledge of how many elements there are going to be.
@AByteofCode
@AByteofCode 2 жыл бұрын
I'm actually not too sure about specific implementations but I do know python does the rejigging with different sizes thing
@Slackow
@Slackow 2 жыл бұрын
I remember learning that prime numbers are specifically good at being the size of a hash map, their lack of factors helps avoid collisions, so it's probably not as simple as powers of 2.
@samarthjain5015
@samarthjain5015 Ай бұрын
So, it still performs a linear search?
@WolvericCatkin
@WolvericCatkin 2 жыл бұрын
From my understanding, most practical implementation of hash maps, append additional colliding key-value pairs, as nodes of a linked list...? This has the advantage of not requiring any costly reallocations when inserting a value under a colliding key, whilst still managing to provide a close to `O(1)` time complexity, even with the random memory access...
@AByteofCode
@AByteofCode 2 жыл бұрын
Good point! I didn't do as much research as I should have for this video so I definitively missed a fair amount of details...
@ahjsbkdjhavkcjhvac
@ahjsbkdjhavkcjhvac 2 жыл бұрын
these videos are so high quality, im genuinely surprised you dont have more clout
@AByteofCode
@AByteofCode 2 жыл бұрын
Considering my sub count of two weeks ago (30), I'd say that's getting resolved right now :)
@mjcal96
@mjcal96 29 күн бұрын
Hey these videos are great. What software do you use to create them??
@endemwone
@endemwone 2 жыл бұрын
These videos are great. Keep them coming.
@AByteofCode
@AByteofCode 2 жыл бұрын
Shall do! Also thanks for the kind words!
@matheosmattsson2811
@matheosmattsson2811 2 жыл бұрын
Good content. Only improvement I suggest is mic quality. It sounds a bit like you are way too close to the microphone
@AByteofCode
@AByteofCode 2 жыл бұрын
Yeah I figured out a bit about how to use the mic I have so should be better in future videos (and the monad one)
@MeNowDealWIthIt
@MeNowDealWIthIt 2 жыл бұрын
If you have to iterate through all of the keys that collided at the mod of the hash, then isn't accessing still linear time (albeit with a nice small leading constant)? IIRC when I implemented hashmap for my data structures class I had to use a tree at each index of the internal array for logarithmic time, but that's still not the constant time you mentioned.
@shaileger
@shaileger 2 жыл бұрын
I guess he meant to say that the average complexity is O(1). The case you are describing is the worst case scenario O(n) which shouldn't happen very often if the hashing algorithm is decent and even if its happens the lists wouldn't be that long and that's why is usually considered as O(1)
@MeNowDealWIthIt
@MeNowDealWIthIt 2 жыл бұрын
Ah, I hadn't accounted for, the internal array will usually have more spaces than the number of total items stored.
@AByteofCode
@AByteofCode 2 жыл бұрын
If your hash map is going at O(1) time you're doing something seriously wrong. Most languages either force you to say how big it is or automatically readjust for you when it grows. Needing to loop through the end array is just a contingency plan that even when needed, as Shai Léger said, isn't much overhead.
@vikingthedude
@vikingthedude Жыл бұрын
Can we deal with hash collisions using a hashmap? Like hashmaps within a hashmap
@AByteofCode
@AByteofCode Жыл бұрын
Generally most hash collisions come from just rounding the hash result, so the solution is to have a bigger table. The odds of two different objects with actually colliding hashes is so low that you're better off with a regular list. However, using a different hash function for subtables is a very fun idea :)
@blazingazong
@blazingazong Жыл бұрын
I mean, if my array is the exact size of the largest prime number known to man, it may take up way more storage than necessary, but I think I can avoid most collisions. On another note, doing that might be catastrophic if hackers can find the array size trivially, but at least it’s free of most collisions, right?
@AByteofCode
@AByteofCode Жыл бұрын
I don't immediately see the danger of hackers knowing the size of your hash map, but I'd be interested to know if you don't mind sharing :)
@blazingazong
@blazingazong Жыл бұрын
I just mean that is they know the size then they have a clue as to what the key will be since the key is usually related to taking the remainder of any value with the given prime number value, and if the size is the same as the key so that there are’t any sizing issues, if you can somehow find the array allocation size then you also know a big chunk of the key.
@AByteofCode
@AByteofCode Жыл бұрын
@@blazingazong Worst case scenario, the hacker could know a chunk of the key, which is a hash. Already, if you know a hash, unless the key is tiny or you have a lot of computing power, you won't get anything out of it. So if you only know a chunk of the hash, its pretty much worthless. And that's assuming knowing a key is useful. If you have a password dictionary, the actual password hash would be in the value, not the key, and if you have access to the file anyways, you have all the keys already :)
@blazingazong
@blazingazong Жыл бұрын
@@AByteofCode I see what you’re saying on both of those fronts. I suppose you’d have to run the hash code on the client side in order to see any of the code, and at that point you would be able to see the entire hashing algorithm anyways- and also you’d have the entire array of passwords sent to your client which would be hilarious- but even if you did know the key, you still wouldn’t have access to the array behind the hash map anyways. Thanks for the learning!
@AByteofCode
@AByteofCode Жыл бұрын
@@blazingazong Absolutely no problem! That's why im here :)
@LeonardoPinto
@LeonardoPinto 2 жыл бұрын
It also makes you to subscribe subliminally
@AByteofCode
@AByteofCode 2 жыл бұрын
Shhhh don't let the others know :)
@TheonlyNonlethalninja
@TheonlyNonlethalninja 2 жыл бұрын
Whens next video, one tip it sounds like you're whispering from behind me lol
@AByteofCode
@AByteofCode 2 жыл бұрын
Definitively this week, I've nearly finished the editing for it! And yes, Im whispering a bit less in this one :p
@motonoob-i2d
@motonoob-i2d Жыл бұрын
Thanks Bro, I'm trying for the n+1 time to get into APL
@ArchonLicht
@ArchonLicht 2 жыл бұрын
A dictionary doesn't have to be a hash map. Red-black tree can also be used, for example, as in Java's TreeMap (as opposed to HashMap). Yeah, in Java they call Dictionary a Map - because it's shorter I guess. While C#, Python, Objective-C and Swift call Dictionary a Dictionary - some more limited languages like Perl and Ruby just call it hash, because hashtable is the only kind of Dictionary they have built in (and I don't even know what do you call it in those languages when it's a hash function like SHA1 or MD5). But truly special is PHP which calls a Dictionary just Array. That's it, Array. I mean Dictionary is also not a proper term - proper one is Associative Array. But calling Associative Array just Array is like calling JavaScript Java.
@AByteofCode
@AByteofCode 2 жыл бұрын
PHP always finding a way to make a non sensical decision smh. But yeah you are right that dictionary != hashmap, although the most efficient dictionary algorithm is the hash map
@ArchonLicht
@ArchonLicht 2 жыл бұрын
@@AByteofCode Well, I guess we expect too much from a technology called PERSONAL Home Page :-)
@AByteofCode
@AByteofCode 2 жыл бұрын
​@@ArchonLicht i thought it stood for pile of hot poop 🤔
@ArchonLicht
@ArchonLicht 2 жыл бұрын
@@AByteofCode Yeah, that sounds about right :D
@kelton5020
@kelton5020 2 жыл бұрын
Good video. Only critique is that you skipped the part about how it actually finds the element by its hash value.
@AByteofCode
@AByteofCode 2 жыл бұрын
If the problem is I didn't explain the hash function, tbh I don't understand it yet but when I figure it out I'll make a video about that. If the problem is how it turns a hash into an index, note that a hash is just binary and can be read as a number, which can be modulo'd with the length of the list. That gives an index in the list and it uses the native array[index] method. Would it have helped if the video contained a re-write of dictionaries?
@kelton5020
@kelton5020 2 жыл бұрын
@@AByteofCode Thanks for the reply. Yeah I was referring to how the actual lookup happens. How you find the hash without just looping through until you find a match
@AByteofCode
@AByteofCode 2 жыл бұрын
@@kelton5020 Oh the hash is basically just a fancy mathematical function that takes a binary input and turns it into a semi random output of fixed length. I don't know much more for now (will be researching it soon) but its just a ton of modulos and bit magic.
@TheDuckPox
@TheDuckPox 2 жыл бұрын
​@@kelton5020 I believe you would just plug the result of the (hash % num_of_indices) into the array index and then search the actual id or hash linearly in the inner array/linkedlist. Although, I wish the video also explains the possibility of changing the array size.
@drsensor
@drsensor 2 жыл бұрын
Just food for thought on your next video: "dictionary vs inline cache"
@TomtheMagician21
@TomtheMagician21 2 жыл бұрын
Hold on I'm a bit confused with the collision aspect of it: So you could have like a class Cheese, then have an inherited class called Cheddar, then put it into a hashmap which shows its popularity and maybe with whom it is popular idk. But if Brie and Cheddar collide, idk how to find out which is the correct popularity of each Cheese. I tried Googling but as you might've guessed, I can't find anything answering about Cheese-related Hashmaps 😂
@AByteofCode
@AByteofCode 2 жыл бұрын
I don't understand your question but basically if two keys get the same hash, their values are stored together as a list of tuples, meaning you can look through and every value will be written next to its original key. so if Chedder and Brie both get the location 4, you'd have 4: [(Chedder, 78), (Brie, 52)]
@TomtheMagician21
@TomtheMagician21 2 жыл бұрын
@@AByteofCode Ohh ok thank you that mostly makes sense, do tuples only store 2 of the same datatype though or could it store a class and an int? Wait could you not store the whole array as a list of tuples or is that what a dictionary is? Wait nvm that would be O(n) wouldn't it ok thank you 😂.
@AByteofCode
@AByteofCode 2 жыл бұрын
@@TomtheMagician21 Tuples (in python at least) can store whatever the f**k you want. Other languages may require certain type consistencies, but considering that classes have their own type, you could have (Class, Int) tuples (im guessing, not too sure)
@TomtheMagician21
@TomtheMagician21 2 жыл бұрын
@@AByteofCode Oh that's cool then, is a tuple basically an array that you can't change?
@AByteofCode
@AByteofCode 2 жыл бұрын
@@TomtheMagician21 Exactly!
@zebraturner7103
@zebraturner7103 10 ай бұрын
Still have no idea what hashmap is, but very interesting 🧐
@SumanRoy.official
@SumanRoy.official 2 жыл бұрын
Try speaking a bit louder please, it will help
@AByteofCode
@AByteofCode 2 жыл бұрын
Yep, did that with my monad video and I am very happy with the result so will be doing that in the future!
@abdelhakimkhabir
@abdelhakimkhabir 5 ай бұрын
hey but here how you've came with the length 0:56, why you've set it to 5 1. i thought that it's the current lenght + 1, but you have an empty map. 2. i thought it's just a huge number => lead to collisions maybe, and then we can solve them with the table 3. i thought that the element just appears out of thin air! 🤣
@sanjaux
@sanjaux Жыл бұрын
brb fixing my multiple (lots of) ui elements highlight & select code now
@AByteofCode
@AByteofCode Жыл бұрын
wdym?
@sanjaux
@sanjaux Жыл бұрын
​@@AByteofCode If I could select elements with a selection box and update their positions when dragging them by looking at a hashmap of their positions rather than looping through an array of all the positions finding the i-th index to update it would probably be faster when dragging a lot of elements at the same time right?
@AByteofCode
@AByteofCode Жыл бұрын
@@sanjaux I'm starting to see it, but wouldn't a position be an (x,y) coordinate amongst a large number of coordinates, so I'm not sure going through a hash table of every coord in your selection area is faster than checking every object, unless you have an absolutely insane number of objects per pixel. (or maybe I didn't see it)
@sanjaux
@sanjaux Жыл бұрын
​@@AByteofCode ​ I meant a hash table that stores the coordinates of each object's position and its name/ID *after* the selection was made, not specifically during the selection surrounding it by the selection's coordinates. I'd store the position of the objects afterward so when I need to access them for a position update it would have lower time complexity, at least that was my thought process 🤔
@AByteofCode
@AByteofCode Жыл бұрын
​@@sanjaux I see, but what key would you use for the hash table?
@ouadieelouardy1171
@ouadieelouardy1171 Жыл бұрын
The Next Fireship channel keep up ❤❤
@AByteofCode
@AByteofCode Жыл бұрын
Thank you!
@dan110024
@dan110024 8 ай бұрын
Yeah the 'blazingly fast' comment made me alert and then the presentation style and memes put me off watching it seeing it's pretty much a direct rip of Fireship. Be unique. Make your own style. Don't blatently rip it off someone else.
@bigchungus2667
@bigchungus2667 Жыл бұрын
Did an exam on hashmap while on hash this afternoon
@AByteofCode
@AByteofCode Жыл бұрын
Thematically consistent :)
@jeremydepassorio664
@jeremydepassorio664 6 ай бұрын
more videos on DSA
@AByteofCode
@AByteofCode 6 ай бұрын
In the works ;)
@Alysia.in.Wonderland
@Alysia.in.Wonderland 9 ай бұрын
Love the ethereal sounding voice
@Haru02w
@Haru02w Ай бұрын
Clone of fireship. I liked it
@MaxPicAxe
@MaxPicAxe 2 жыл бұрын
I'm getting NoMagic vibes with this channel
@AByteofCode
@AByteofCode 2 жыл бұрын
Idk who that is so I'll watch some of his stuff and get back to you about it
@MaxPicAxe
@MaxPicAxe 2 жыл бұрын
@@AByteofCode haha. That's awesome that you replied. Great channel! I'm looking forward to new content
@AByteofCode
@AByteofCode 2 жыл бұрын
@@MaxPicAxe I'm looking forward to making new content! Ngl thats a good combo. And thanks for the kind words!
@damonfernandez3051
@damonfernandez3051 7 ай бұрын
thank you that was so helpful
@navidjalilian5087
@navidjalilian5087 2 жыл бұрын
Great video, low voice volume
@AByteofCode
@AByteofCode 2 жыл бұрын
Sorry about that. I fixed it for monads and future though, so shouldn't be an issue anymore!
@TomtheMagician21
@TomtheMagician21 2 жыл бұрын
Wow I actually understand it!!!
@MaximeFRYSOU
@MaximeFRYSOU Жыл бұрын
Great job 👍🏽. Just advice but maybe it's me, you may try to articulate and speak a little louder cause it sounds like you're mumbling
@PecPur
@PecPur 5 ай бұрын
nice
@brandonmrchew
@brandonmrchew Жыл бұрын
twink femboy
@marknn3
@marknn3 2 жыл бұрын
Why you don't have ANY punctuation marks in your script. So tiring to listen to.
@AByteofCode
@AByteofCode 2 жыл бұрын
I had edited the spaces out but yeah won't be doing that any more lol
@rithulkamesh
@rithulkamesh 2 жыл бұрын
Try using innovate...
@AByteofCode
@AByteofCode 2 жыл бұрын
god damn
A brief Introduction to Flask (Python Web Framework)
2:08
A Byte of Code
Рет қаралды 70 М.
Hash Tables and Hash Functions
13:56
Computer Science Lessons
Рет қаралды 1,6 МЛН
How to treat Acne💉
00:31
ISSEI / いっせい
Рет қаралды 108 МЛН
Правильный подход к детям
00:18
Beatrise
Рет қаралды 11 МЛН
Naming Things in Code
7:25
CodeAesthetic
Рет қаралды 2,3 МЛН
Learn Hash Tables in 13 minutes #️⃣
13:26
Bro Code
Рет қаралды 402 М.
Large Language Models explained briefly
8:48
3Blue1Brown
Рет қаралды 943 М.
Hashing Algorithms and Security - Computerphile
8:12
Computerphile
Рет қаралды 1,5 МЛН
Map and HashMap in Java - Full Tutorial
10:10
Coding with John
Рет қаралды 616 М.
LeetCode was HARD until I Learned these 15 Patterns
13:00
Ashish Pratap Singh
Рет қаралды 661 М.
Why You Shouldn't Nest Your Code
8:30
CodeAesthetic
Рет қаралды 2,8 МЛН
How to treat Acne💉
00:31
ISSEI / いっせい
Рет қаралды 108 МЛН