I giggle everytime I see thumbnail for this video xD. Great sense of humor and content.
@treylitefm4 жыл бұрын
This channel is a gold mine
@大盗江南4 жыл бұрын
Thank you for this amazing video. :D This bitmap looks like bloom filter :D
@dataguy70134 жыл бұрын
Naren, DAU and MAU are key metrics online copmanies calculate using Data warehouses. You just simplified the calculations for that significantly! Nice explanation.
@sumonmal0093 жыл бұрын
THIS COMMENT IS FOR MY PERSONAL REFERENCE. TO UNDERSTAND PROPERLY WATCH THE FULL VIDEO -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- concept 2:45 bitmaps array 5:23 active user count 6:56 user login everyday 8:17 different device count 10:07
@10clicks-freesoftwareengin514 жыл бұрын
Good One!
@umalaxmi9723 жыл бұрын
Thankyou Narendra, very good explanation. You always explain DS as though it's a cake walk. Why have you stopped posting new videos? Hope you are doing safe and healthy. I was just thinking, why don't we store the data to day-on-day information in database with 4 bit integer for logged in per day basis or it can even be an entry into a row type DB for the last logged in date with a bit 0/1 per user. Anyways we would need to think of a purging system as we can't retain this for long even in case of bitmap array
@marriagedance86574 жыл бұрын
Great Video Sir, But I have few questions. Where do we store this bitmap? In Db or Memory? If in Db, do DBMS providers provide this functionality inbuilt? If in memory, are there any client libraries providing this? Also, what about persistance/fault tolerance if we are storing in memory?
@b.ravirao16554 жыл бұрын
Hi Narendra , That was very helpful. I was thinking of an improvement in there. How about using multiple bit map arrays instead of one single bitmap array ? Say , we can use 200 bitmap arrays for 200 M users and another array of 200 size which will store count of active users in ith bitmap array. This way, we can reduce time complexity by not traversing ith bitmap array if the active users for that bitmap array is 0. Correct me if I am missing any edge case. Thanks.
@Emmanuel-px9lk4 жыл бұрын
Perhaps I missed this, but how would you actually traverse the array to find all active users? Would you have to do a linear time scan and count how many '1's are there ?
@Chikaros4 жыл бұрын
Yes.
@mitsukiorichimaru45114 жыл бұрын
correct me if i got it wrong, this bitmap array index went with the assumption that we would've users ids as integers right? But that's not the case in most dbs, we use mostly the uuid's then how would we correctly map/say array bitmap index corresponds to this user id? do we use some kind of hash function? if yes, then how we'd go write the unique hash function as it has collision problems ? Thanks
@bishnuagrawal8283 жыл бұрын
how would we handle case where userid is alpha numeric? How would we map a userid into a particular index uniquely?
@chidov4 жыл бұрын
One question about the userId, as we know some system will use UUID as user id, in this case how could we set the order of the user in the bitMap?
@shrutikamboj46073 жыл бұрын
We can have a separate mapping for that.
@daleffi Жыл бұрын
hmmm i don't know about that. Imagine the concurrency problem you would have to update this map as the logins are happening like crazy! Also, How would you efficiently traverse the array to update the bit correspondent to the user logging in?
@shiwang7892 жыл бұрын
1. our user ids might not be auto incrementing integers. in that case, I will need to store the mapping of a user id with auto incrementing integer first and for every request have to check the mapping first to make updates in bitmaps. I guess we can use something like in memory datastore like redis for that. 2. Also, we could have used count min sketch to get approximate numbers for these analytics. Thoughts?
@_modiX4 жыл бұрын
The last ID of your Bitmap is 999.999.999, not 1 Billion. For archive purposes this is storage efficient, but a simple SQL-DB wouldn't take much more. You can make relations and store its statistics this way, you don't have to store the entire user data every day. Also how is the system able to determine if the user is the same user or a different user? Where is the information stored? I still need a solid SQL DB, because I need to store key information to identify the user non the less. But other than that, yes, bit shifting is very fast to compute active users from specific days. As soon as you have to plot those, it gets slower, though.
@TechDummiesNarendraL4 жыл бұрын
Good catch about 999..., Yes you are right we need a persistent db to store who is who, this is about collecting metrics in lighting fast with less overhead. I am sure not many people will use this in there daily life. But some systems do implement. Eg: Routers to find the unique combination of source vs destination IP address for various security implementation. citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.11.724&rep=rep1&type=pdf
@_modiX4 жыл бұрын
@@TechDummiesNarendraL Thanks for reply, yes this is a good reason and would've been a good example in your video as well as the clarification that this is not a replacement for a database to address interested beginners in this environment. Good video nonetheless. 👍🏻
@praveenjain1833 жыл бұрын
Liked the video, 1 question: what if my userIds are not integers but guids? any efficient way for the same?
@patrick79320033 жыл бұрын
You can perform 2 things: 1. Use Row Number if there is any (recommended) 2. Use a Hash Function
@prakashsharma914 жыл бұрын
How come iteration over 1 million bits is taking only 100ms?
@krishnaMurari484 жыл бұрын
we will need to load of all the 125 MB data of last n days in memory to count active users?? can we do it more efficiently??
@patrick79320033 жыл бұрын
Hyperloglog can do this in a more space-efficient way (1-64 KB in size). However, it would sacrifice a little bit of accuracy.
@Rxlochan4 жыл бұрын
Do bitmap have only 0 & 1 ?
@TechDummiesNarendraL4 жыл бұрын
Bitmaps are made of bits and a bit can only have 1 or 0
@a.yashwanth4 жыл бұрын
@@TechDummiesNarendraL *binary.
@girig64214 жыл бұрын
Thanks for the video. Can a system like this account for users deleting their account? If the user was removed from the bitmap, it would mess with the indexing.
@TechDummiesNarendraL4 жыл бұрын
Not really, even if the user is deleted the IDs of the existing users remains same.
@girig64214 жыл бұрын
@@TechDummiesNarendraL If you had 10 users in your bitmap and you wanted to remove the 5th user, your bitmap would a length of 9. However the IDs of user 6,7,8,9 would then become 5,6,7,8.
@a.yashwanth4 жыл бұрын
@@girig6421 🧐 why do you change user id for users whenever some user deleted his account?
@subbucs56744 жыл бұрын
@@girig6421 It is expected that the user-id is auto-incremented. We can keep separate bitmap of user_deleted. So, total-active-users = total-logged-in-users - total-deleted-users. The use-case mentioned by Tech Dummies is for statistics purpose.
@ResonantFractal4 жыл бұрын
@@girig6421 User ids should all be unique or strictly monotonically increasing. Else the design becomes practically unusable, viz., wrt (historical) metrics.
@bowang18254 жыл бұрын
this is useful but how come it can be used in real world example?
@marlegagaming12743 жыл бұрын
Analytics probably, "Get top K items purchased today", you could do all this in Redis, instead of an MR/Spark Job.
@paulhetherington38543 жыл бұрын
Count = French for: I'm already, a damn terroriste! How are we, dummies then? ie = #4, in XP- Orient!