Here's the video which explains how to calculate the optimal value of k and m in Bloom Filters. Please watch the video as an extension to the above video. kzbin.info/www/bejne/q6DVlYWCrbRjrbc
@GOZENIEINNOVATIONSАй бұрын
Helpful
@ashraf_isb9 күн бұрын
good one, thank you Preetam Keshari sir
@preetam.keshari9 күн бұрын
Thank you 😇
@phoneix2488619 күн бұрын
You can use two level indexing (and keep the second level index as a shard key) to make faster searches instead of using boom filters. The only catch is that you got to have a common index for almost all of your data that needs to be joined and picked up.. in that way they should belong to the same shard.
@preetam.keshari19 күн бұрын
This might cause bottlenecks while scaling up. However you can find some practical use cases of bloom filters here redis.io/docs/latest/develop/data-types/probabilistic/bloom-filter/
@PrasunRoyPlus14 күн бұрын
Most modern distributed databases address this issue with a Partition Key, which is nothing but a way of sharding the DB into a number of partitions that are rarely hot. Depending on your application find a suitable Partition Key, such as First letter of Last Name, DOB of users or Department of users, then you would only be searching in that small partition not the entire DB.
@preetam.keshari13 күн бұрын
That’s true. But still it will involve db queries for both true negative and true positive cases. We would like to avoid db queries as much as possible.
@PrasunRoyPlus13 күн бұрын
@preetam.keshari but we are not completely avoiding DB calls by using Bloom filters, infact with a user base of 1 billion it is more likely to get hits than misses with Bloom filter. This would result in DB calls more often than not. A partition key would also result in a deterministic query performance, unlike the probabilistic nature of Bloom filters.
@preetam.keshari13 күн бұрын
Totally agree. However you can always choose a reasonable error rate and determine the values of k and m so that number of false positives are reduced. Please watch part 2 of this video on my channel for this. Also redis has mentioned some of the practical use cases of bloom filters in their website. You can check those out as well. 😇
@mahajveemahajvee8941Ай бұрын
Redis supports bloom filter we can use redis to implement on our backend
@preetam.keshariАй бұрын
That’s right.
@yasararafath352420 күн бұрын
We are using MongoDb Atlas M80 tier, So its so easy to search in this level tier, For bigger Database first you need higher tier version of DB so it can handle any queries, Not only finding user also simple find query also will be slower so API's will be more slower, Ideally if database size is more, then it will be scaled enough to handle any queries, And definitely we should have indexes for them
@preetam.keshari20 күн бұрын
Thanks for the insights
@pritamsarkar993121 күн бұрын
And also if we try to use the partitioning mechanism in the DB to store the usernames, the search cost can be reduced when there is a probabilistic "YES" and direct query is made.
@preetam.keshari21 күн бұрын
Yes, sharding you mean? It can be an effective strategy for reducing search costs.
@pritamsarkar993120 күн бұрын
@preetam.keshari yess sir.
@theunusual456626 күн бұрын
Very informative and something new. Thanks for the video.
@preetam.keshari26 күн бұрын
Thanks, I'm glad it was helpful! 🙏
@theunusual456626 күн бұрын
@@preetam.keshari :)
@ashwinnema062 ай бұрын
pretty informative video. I am not sure why you do not have much views but I have sybscribed to your channel
@preetam.keshari2 ай бұрын
Thank you :) It means a lot.
@thegreekgoat9827 күн бұрын
Bhai maja asigala... Good explanation
@preetam.keshari27 күн бұрын
Dhanyavad 😇
@IAmUsingAndroid17 күн бұрын
I think we achieve it using database sharding along with caching.
@preetam.keshari17 күн бұрын
Without involvement of db queries for both true negative and true positive case, we cannot achieve it using sharding + caching. You can find some of the real life use cases of bloom filters here: redis.io/docs/latest/develop/data-types/probabilistic/bloom-filter/ Hope that helps 😇
@SheetTalkHub20 күн бұрын
Can you talk about some use-cases where it is being used or it can be used ?
@preetam.keshari19 күн бұрын
Hi. You can find detailed use cases of bloom filters redis.io/docs/latest/develop/data-types/probabilistic/bloom-filter/
@damarajusushanth570215 күн бұрын
Wherr does the bloom filter value comd from
@preetam.keshari15 күн бұрын
You mean the has values? If yes, those come from the hashing functions If you mean about the bloom filter bits array, we can implement this bits array using redis
@kesavanarayanaanaparthi642818 күн бұрын
Nice information 👌, thanks for your effort 👏👏, I have a query 😇. What if the user updates their name and also if the user gets removed from the system?, What is your solution in this scenarios? The condition is to maintain the some consistency across the operations such as add remove or update 🙋🙋.
@preetam.keshari18 күн бұрын
Hi. 1. Bloom filters will be used to check let’s say an user id (unique identifier) exists in system or not. So changing names doesn’t have any effect on it 2. If user is deleted, we can pass its user id through the hash functions again to retrieve the indices of the filter and mark it as 0 Hope this explanation helps 😇
@ananychitranshi9408Ай бұрын
Great content bro. Subscribed. Can you please make a video to determine the optimal size of m & k ?
@preetam.keshariАй бұрын
Hi. Video preparation is under progress. Stay tuned.
@preetam.keshariАй бұрын
Hi. Here's the video explaining your query kzbin.info/www/bejne/q6DVlYWCrbRjrbc Let me know if you found it helpful :)
@kartik989221 күн бұрын
If there are millions of users, I believe all indexes can become 1. It seems there is where size of m is to be considered.
@preetam.keshari21 күн бұрын
Hi. That’s correct. I have created a separate video on how to determine the optimal size of k and m. Please go and watch it. Here’s the link: Part 2: Maths Behind Bloom Filters | Efficiently Checking an User Among Billions #systemdesign kzbin.info/www/bejne/q6DVlYWCrbRjrbc
@pranayanahak4708Ай бұрын
Can we use bloom filter to search matching user ids?
@preetam.keshariАй бұрын
No. It can be used only to detect the existence of a user
@jayantgupta531029 күн бұрын
What is the collision factor of it ??
@preetam.keshari29 күн бұрын
Hi. By collision factor do you mean the rate of false positives (error rate)? If so, please watch the Part 2 of this video on my channel, where I have discussed the mathematics behind it. Hope that helps 😇. Let me know if your query is something else. Thank you 🙏
@mohitgupta7613Ай бұрын
what will happen if all the elements in bit_array is set to 1 after multiple insertions... is it going to give "user_exists" for all new users?
@preetam.keshariАй бұрын
Yes. And that’s where you need to determine the optimal size of k and m based on the requirements like number of users in your database, estimated increase in that number etc. I will create a separate video on how to determine the optimum value of k and m soon.
@AdityaRaj-rv5jsАй бұрын
Great Question mohit,I too had this after his explanation. now I have few more queries, where is this bit array came from, who created it and where is it stored for next checks to be made. Now, that's aside what is the size of k and m in general, let say for a billion users of insta and fb, what is the size. Good explanation but left many loopholes. Probably in next video he will clear these doubts.
@preetam.keshariАй бұрын
@@AdityaRaj-rv5jsHi. We can use redis to implement bloom filter. It supports bloom filter. Now coming to the topic of finding optimal value of k and m, please watch Part 2 of bloom filters which I have released already. Thanks 😊
@dakshbhatnagar17 күн бұрын
wont indexing combined with binary search help?
@preetam.keshari17 күн бұрын
That's a great approach, but Bloom filters offer even faster lookup times for large datasets. However you can find some of the real life use cases of bloom filters here: redis.io/docs/latest/develop/data-types/probabilistic/bloom-filter/ Hope that helps 😇
@TouqeerShafi22 күн бұрын
From my pov its useless, eventhough bloom filter can tell you user exists or not. At the end of the day you still have to query the database
@preetam.keshari22 күн бұрын
Not Exactly. Bloom filters produce 100% accurate true negative. It means, if the output of bloom filter is "User doesnot exist in the system", then it's 100% accurate. So it saves us from making an extra DB query. Also, redis supports implementing bloom filters. Hope this explanation helps.
@TouqeerShafi18 күн бұрын
@preetam.keshari redis uses the bloom filter make sense because redis has a job to check if the cache key exsits but if you apply the same bloom filter use case for username exists wont do a job because auth system can be complex specially if you have a billion of user. In my view adding bloomfilter for username or email check is just over engineering.
@whoami5955Ай бұрын
when one fox howls all others join in, this video is a example of it
@sudheerkumar-tp1mg25 күн бұрын
If already billions of users present then how to add all the users to Bloom filters, and where we can store this array in a database or in java memory suppose if a pod is crashed we will loose everything and again we will try to add in to bloom filters how google will use please suggest.
@preetam.keshari24 күн бұрын
Hi. We can implement bloom filters using redis. It supports bloom filters data structures. I will be creating a video on how to implement bloom filters using redis soon. Stay tuned.
@aryabhat_onlineАй бұрын
Good job, any tutorial on how to implement.
@preetam.keshariАй бұрын
Hi. Glad that you liked the video. Redis supports bloom filters. It can be implemented using Redis. I will create a video on it soon. Stay tuned 😇
@chetankomal5016Ай бұрын
please make a video on how to determine the optimal size of m and k and the maths behind
@preetam.keshariАй бұрын
Sure
@rapoliitАй бұрын
I second that. Thanks for the videos bro.
@preetam.keshariАй бұрын
@@rapoliit Thank you 😇
@preetam.keshariАй бұрын
Hi. Here's the video explaining your query kzbin.info/www/bejne/q6DVlYWCrbRjrbc Let me know if you found it helpful :)
@ranuchakravarti3277Ай бұрын
Suppose return index is 0,4,6 Then it means name doesn't exists in db, then how to set index to 1 in bit arry, as 0 index is already 1 in arry????
@preetam.keshariАй бұрын
If a bit is already set to 1 in the bit array.. you don’t need to set it again. You just need to set the 0 bits to 1.
@jeetchheda8916Ай бұрын
I thought here you chose the value of m = k * number of usernames to store 💀 It felt correct for this example you used 😂
@preetam.keshariАй бұрын
No.. it was just random.. there’s a separate algorithm to determine the optimal value of m and k. I will create a separate video on this topic
@RahulkrishnaKurapati28 күн бұрын
"what is bloom filter how can it be implemented in Laravel for checking username while registering a new user" This prompt gave me more clarity than your video.
@preetam.keshari28 күн бұрын
Thanks for checking out my video. The goal of this video is to give the BTS working of a bloom filter and not code approach to implement it. It’s practical implementation differs from tech stack to tech stack. I will be creating a video on how to implement it in nodejs with redis. Hope that will help. Thank you 😇
@RahulkrishnaKurapati28 күн бұрын
@@preetam.keshari Yes I get it, but you are missing something. I follow bytebytego for similar tech stack explanations. I get things in one go from there. Anyway, thanks for allowing me to go a bit deeper about the bloom filter.
@riyansh33962 ай бұрын
Why every youtuber is talking about this suddenly?
@preetam.keshari2 ай бұрын
It’s a quite interesting topic to learn
@sirfinsaanАй бұрын
Every one want viewers
@xtan-ioАй бұрын
@@preetam.kesharilearn something useful, you don't even get access to a platform with a billion users.
@laxmikantsaraswat6319Ай бұрын
You watch one video then the youtube algorithm knows what is your area of interest , that's why 😅😂
@SBamniya24 күн бұрын
Looks like one of the MAANG company asked it in their interview
@deeptansupatro74212 ай бұрын
Damn😵😵
@preetam.keshari2 ай бұрын
😁
@lastbenchers019Ай бұрын
200th subscriber
@preetam.keshariАй бұрын
Thank you
@Tyrus42920 күн бұрын
Not a very convincing solution, only solves one part of the problem, still gives false positives and has to use some other approaches as well to be double sure. By the way, the information u provide was good, thanks
@preetam.keshari20 күн бұрын
It's true that Bloom filters have limitations, but they can be a powerful tool if you compute the value of k and m smartly. Please watch the part 2 of the above video where I have discussed the mathematics behind calculating k and m. Hope that will help. kzbin.info/www/bejne/q6DVlYWCrbRjrbcsi=P4cn99Ub-zvLpEcA
@codecrash_t132Ай бұрын
subscribed👍:349
@preetam.keshariАй бұрын
Thank you 😇
@31shivasilmawala3026 күн бұрын
Impressive video, @preetam.keshari! Adding some code examples would greatly benefit for newcomers by providing clear guidance. Keep up the great work!.
@preetam.keshari26 күн бұрын
I am glad you liked the video. I will consider adding code examples in future videos. 😇