To learn more about Fundamentals to Database Engineering course Head to database.husseinnasser.com for a discount coupon Link redirects to udemy with coupon applied.
@monishchhadwa77711 ай бұрын
Right in the beginning of the video, I had many doubts. But only a genius teacher knows how to drive the lecture so that by the end, all doubts are cleared; and I found one!
@deadvortex80754 жыл бұрын
This video, I watched right before a midterm in my systems security course and it helped me so much with the entire midterm! Thank you!
@hnasr4 жыл бұрын
All the best 😊 glad it helped
@KostasOreopoulos4 жыл бұрын
This was a great article. Going over to the Wikipedia site and over the mathematics behind the Bloom filters is very enlightening, and I would recommend that to everyone, because you get an idea of how many items you need in order to fill this Bloom filter and make false positives very probable
@hnasr4 жыл бұрын
Kostas Oreopoulos thanks Kostas! I agree the math is brilliant
@abhishek-agarwal4 жыл бұрын
Woah!! 🤯....I can use this simple logic in so many places.
@BinaryIgor2 жыл бұрын
Another problem is that we have a state here. When we delete an user, we need to make sure to update bloom filter accordingly and it can get quite messy quickly, because each bucket represents a cluster of users not only one so then you also need to synchronized that, somehow.
@pran7119 Жыл бұрын
If each bucket stores count then this can be solved, but updates has to be synchronized. So I think if you create list of AtomicInteger then this would solve this problem in java
@TheSoulmate24min2 жыл бұрын
Hey. I was looking up what Bloom filters is. I was at a cafe without any earphones, so i was watching it with just subtitle, and your face popped up near the end hahaha. Great video man. Keep up the good work!
@KendrixTermina3 жыл бұрын
thanks this was a whole lot more helpful than whatever my professor was talking about love from germany
@hnasr3 жыл бұрын
❤️❤️
@avogadro1023 Жыл бұрын
I just found Gold Mine. Immediately subscribed. Thanks Man!
@abdocharrade18113 жыл бұрын
Finally someone who could help me in this issue. A video in Comp. Science area which finally can understand fluently :). Thumbs up @Hussein Nasser
@darbyburbidge89762 жыл бұрын
I'm late to the party, but thank you! This is one of the first of your videos I've watched and it really got me interested in Bloom Filters.
@whatsaappstatus6452 Жыл бұрын
Thanks, finally i got the concept cleared.
@尼安德鲁-n6j2 жыл бұрын
I have the feeling it’s pretty easy to make all bits set pretty soon even large number of bits used which would make the bloom filter useless. The video helps me get the big picture, but I still don’t get it why bloom filter is useful
@dipeshdulal22944 жыл бұрын
Wow! Beautiful algorithm, never heard bloom filters but looks great. Great explanation Hussein.
@hnasr4 жыл бұрын
Me 2 I heard about it around a year ago when I was researching Apache Cassandra.
@dipeshdulal22944 жыл бұрын
@@hnasr I thought they used tree data structures of some sort (like characters) to search unique usernames and have logarithmic complexity. But this seems so much clean and great
@ritwickdey974 жыл бұрын
yay... (Still starting ads are running)... I need the video 😅😅😅
@blair96593 жыл бұрын
What a brilliant example! Thank you so much!
@dorcohen35223 жыл бұрын
It's a great and a simple explanation. But a general bloom filter works over k hash functions, not 1 like shown here, as it decreases the false positive rate on existence test
@indp87525 ай бұрын
when you restart the server, do you need to calculate the bitset for all users ?
@praveenmenon36214 жыл бұрын
Thanks for sharing the videos. The topics are really interesting. Keep up the good work.
@hnasr4 жыл бұрын
Thanks praveen! 😊
@bradlanducci90112 жыл бұрын
Thanks for the visual example! This helped a ton.
@rahuldeepattri92444 жыл бұрын
So hashmap in java is/uses bloom filters?
@bhavyabansal11432 жыл бұрын
What happens if machine crashes and we loose the bloom filter in RAM? Do we create it again by reading data from DB? that will be too expensive right?
@harshavardhan99914 жыл бұрын
I find following information related to Bloom and Cuckoo filters from my notes today, ( not sure, where I copied this text to credit that person :) ) - Bloom and Cuckoo filters utilizes multiple hash functions 00:08:06 - Bloom and Cuckoo filters can efficiently save the cache from a significant amount of this storage waste. 00:01:58 Thanks @Hussien for Great Content as usual !!
@MegaArti20002 жыл бұрын
Nice thought path! Very helpful
@skumakerguitar87084 жыл бұрын
Sir why dont you use Local Cache memory?
@yapziyan69182 жыл бұрын
Perfect explanation! Thankss
@Babe_Chinwendum Жыл бұрын
Really so helpful!
@Saurabhandsonu19944 жыл бұрын
I don't get how this could be useful after 64 usernames. Also, where is the bloom data stored? Redis?
@hnasr4 жыл бұрын
Remember the 64 bits will not be necessarily filled after 64 user names. User 1 could hash to bit 17 User 2 could hash to bit 60 User 3 could hash to bit 17 again User 4 could also hash to bit 17 and so on .. The 64 bits can be stored in redis or in memory of the server. Another example , we have a possible 365 birthdays only right? If you brought random 365 people will all of them have unique birthdays? No of course most of them will be born on the same day.
@CharlieArehart14 жыл бұрын
Saurabh, your first question is understandable, but listen again at 7:03. The key is: the bloom filter will only track if a given string, once hashed and mod'ed, does or does not resolve to one of those 64 slots. If it does not, then it gives that as quick confirmation.
@mageshp59982 жыл бұрын
Wondering, how the bloom filter is recreated on the event of application restart with already existing set of db entries?
@ricky999pal3 жыл бұрын
This was brilliant
@manasyebukit4 жыл бұрын
can you make video about mutex and semaphore, thank you
@CharlieArehart14 жыл бұрын
Thanks for sharing that topic, Hussein.
@hnasr4 жыл бұрын
Charlie Arehart sure thing thanks for your comment Charlie!
@vaibhavvashisht79712 жыл бұрын
Great explantation
@oussamaaouinti95003 жыл бұрын
Fantastically explained .. Well done :D
@mohammedsallam98413 жыл бұрын
So they don't handle collision in bloom filter ?
@JaePlay3 жыл бұрын
Do bloom filters effectively delay when you will have to do a hard check every time then? Just thinking about when most or all of the bits are set, then its not doing anything basically for the operation anymore.
@Finn-jp6pn3 жыл бұрын
Thanks, Hussein. But what happens if the entire bloom filter is filled with 1s? We'll be back to querying the dB for each request 🤔 Oh...nvm. Got my answer at the end. I posted this question halfway through the video 😅
@hnasr3 жыл бұрын
Exactly that is the drawback of the filter it helps until it doesn’t, but doesn’t necessarily harm because its very cheap
@user-ej7ss8ei2g3 жыл бұрын
what hash does that
@gurjarc12 жыл бұрын
good explanation. I have a good question. Given in todays world, we have multiple servers behind a load balancer writing into db, how will a write of name="Bob" appear itself in other peer servers. They should through some event notification mechanism, other wise, when the second server (who did not serve the write request of "Bob") gets a query for Bob, its boom filter will say "Bob does not exist. In other words how can we have a bloom filter that is global to all the underlying servers behind a LB?
@saralk182 жыл бұрын
Multiple options, one of them is distributed cache.
@gurjarc12 жыл бұрын
@@saralk18 whole point of bloom filters is to avoid network calls... querying any dB or redis was always there.. then why use bloom filters then
@webtechs87934 жыл бұрын
Are some bits more probable than others?
@hnasr4 жыл бұрын
Web Techs interesting question, I would like to say no. But it really depends on the input and the hashing algorithm used.
@satwikburman68415 ай бұрын
Google has so many users and millions of new users are added almost everyday, how many bits do you think they use for this hashing algorithm? I mean no matter how many bits you use, wouldn't it always end up hitting the database?
@leoantony723 жыл бұрын
how do i implement this in nodejs with redis?
@thearchibaldtuttle4 жыл бұрын
I definitely have missed that class! Looked it up and BAMMM WHAT was invented in 1970!?
@hnasr4 жыл бұрын
Archibald Tuttle lol Didn’t know it was invented in 1970 🤣
@cryptoconnex214 жыл бұрын
Good explained :) Thank you a lot!
@mihaifumarel33494 жыл бұрын
Super clear, thanks
@hnasr4 жыл бұрын
Mihai Fumarel glad it is 🙏
@shankartreeks90734 жыл бұрын
Great explanation bro. I am wondering why u haven't made any video about celery.
@hnasr4 жыл бұрын
SHANKAR TREEKS not sure what celery is ☺️ need to research! Thanks for the suggestion
@elbobinas11 ай бұрын
Very nice
@Dal.alef.4 жыл бұрын
Finally! thanks Hussein
@nashwan88292 жыл бұрын
Thanks a million
@sevvalizmirli73084 жыл бұрын
Thank u for this video !
@hnasr4 жыл бұрын
Thanks for watching 😊
@aviad_hayumi91664 жыл бұрын
What you suggest to do when we more than one service ? (I think that the best way is to use cache in reverse proxy). By the way , love your videos !!
@hnasr4 жыл бұрын
You have multiple solutions .. when you have more than one service you can store the bloom filter in the : 1) reverse proxy it self 2) Or in each server 3) or in a centralized cache (redis or memcachd) All these have pros and cons best one is 1) in my opinion as you suggested
@marsilinouzaky27484 жыл бұрын
Thank you this is the first time to hear about this. But I'm wondering if we speak of systems with thousands of usernames, wouldn't this become useless so fast as this 64 bits array will be filled so quickly even if we double that array size we still eventually going to fill it and now it's more of overhead than it helps?
@hnasr4 жыл бұрын
Marsilinou Zaky correct the 64bit was just an example, with large systems you will probably need something bigger than 64bit. And this is where the challenge lie. How do you know how big should you make the bloom filter? Too big than its just cache, too small than as you said its useless.. you need trial and error Good points!
@uzdik.student4 жыл бұрын
2:49 Why exactly 64 bit? Why not 32 or 128?
@hnasr4 жыл бұрын
No reason you can pick any length
@Kiet13 жыл бұрын
good stuff!
@420_gunna4 жыл бұрын
Whatuuup playa any chance you're interested in making other videos on probabilistic data structures? I'm studying for system design interview stuff and the Count Min Sketch datastructure came up -- halp
@420_gunna4 жыл бұрын
Or even weird stuff like hyperloglog : P
@hnasr4 жыл бұрын
Yikes never heard of them 😅 are they even used in real life projects? Ill add them to the list
@420_gunna4 жыл бұрын
Hussein Nasser i dont know if theyre as widely used as bloom filters, hmm - CMS is for counting the number of occurrences of different elements in a stream of data, which seems applicable - and the HLL is for assessing the cardinality/size of a set of unique elements
@420_gunna4 жыл бұрын
Hussein Nasser by the way, would love to exchange lists! I also keep a queue of stuff that I need to either brush up on or learn outright. Ill post later from my desktop, maybe itll be useful to ya
@420_gunna4 жыл бұрын
* CQRS + Event Sourcing * 2PC, 3PC, Saga * CRDT and other Consistency Patterns * MVCC (Multiversion Concurrency Control) * Different degrees of Isolation… (www.geeksforgeeks.org/transaction-isolation-levels-dbms/) and Consistency * BTree+ and LSM (for log Compaction) * How database replication actually works (From master to slaves) + WAL * Different types of Caching (write-through, write back, etc) (+ “Thrashing”) * Lambda Architecture * VIP + Keep-Alive + VRRP * How do databases manage concurrent connections? Pooling... * Dynamo * Columnar datastores vs Row-Oriented Datastores * How do stream processing “maintain” local aggregation state in the event of a crash? How Kafka uses RocksDB * Bulkhead Pattern (isolating elements of application into pools, so that if one fails, others can still function) * Client-side service registration vs Server-side registration * Semaphore, counting semaphore? * Probabilistic Data Structures: Countmin Sketch, Bloom Filters, HyperLogLog * Linearlizability, Serializability… isolation (dddpaul.github.io/blog/2016/03/17/linearizability-and-serializability/) * How distributed file storage systems work * How object storage like S3 works * Time Series Databases (See InfluxDB)
@PaulFidika2 жыл бұрын
Yes Hussein, I do exist
@syedmuizzahmad2 жыл бұрын
Nice
@davidpaulo33793 жыл бұрын
" paul doesn't exist" Me: guess I'll die
@annalisetrite72813 жыл бұрын
Damn good video!
@reactdeveloper2368 Жыл бұрын
If anyone has worked with mongoose implemeting bloom filters for full text search it will be nice! Thank you in advance
@kurianbenoy24 жыл бұрын
Bloom filter as you mentioned, when hash values for both Jack and Paul are 63. The first is being stored in database and when a collision happen, you retrive it from DB. So to detect duplicate text from a dataset, I feel we can use bloom filters itself.
@JayMikB4 жыл бұрын
First 💥
@ZelenoJabko4 жыл бұрын
This is not a bloom filter. This is a hash set. Bloom filter uses multiple hash functions.
@wugu424 жыл бұрын
No.
@todaytomorrow13524 жыл бұрын
If you can't find it at the first time, then visit the database, it defeats the purpose the bloom filter.
@vishnupv20083 жыл бұрын
The ego in the voice discouraged me from watching this video.