Relational database index vs. NoSQL index

  Рет қаралды 83,877

Gaurav Sen

Gaurav Sen

Күн бұрын

How is indexing done in NoSQL databases? We talk about choosing good partition keys and appropriate sort keys to optimise query times.
The general idea is centred around denormalisation and global secondary indexes. Sort key is based on query access patterns and chosen to optimise their read times.
References:
www.tutorialsp...
www.bmc.com/bl...
www.elastic.co...
• AWS re:Invent 2018: Am...
blog.couchbase...
System Design Website: interviewready.io
System Design Playlist: • System Design Playlist
Become a channel member!
/ @gkcs
You can follow me on:
Facebook: / gkcs0
Quora: www.quora.com/...
LinkedIn: / gaurav-sen-56b6a941
Twitter: / gkcs_

Пікірлер: 89
@prabhatkumarkaushik8124
@prabhatkumarkaushik8124 5 жыл бұрын
Now you are looking like a prefect indian teacher bro❤
@alexmercer5870
@alexmercer5870 4 жыл бұрын
"its going to work out! Just hang in there and stay patient"
@vjar1000
@vjar1000 4 жыл бұрын
Your videos are awesome as always! Hitting "like" specifically since I am a big fan of Pink Floyd! Great reference dude! I really admire the effort you put in for content creation and editing as well!
@gkcs
@gkcs 4 жыл бұрын
Thank you!
@mayank.ranjan
@mayank.ranjan 4 жыл бұрын
We can use BTree as well. That would be faster though. Indexing also internally uses that.
@raghvendramishra9187
@raghvendramishra9187 4 жыл бұрын
That's exactly what Java 8 has done with HashMap implementation which now converts the bucket into a Red Black Tree after reaching the threshold of 8 elements and converts it back to a linked list after reaching 6 elements with better performance.
@kumarc4853
@kumarc4853 3 жыл бұрын
top tip , tq
@santoshbhatnagar2155
@santoshbhatnagar2155 Жыл бұрын
@raghuvendra mishra how can we see the internal implementation of this?
@reallifegambits
@reallifegambits 5 жыл бұрын
Great work Gaurav, Can you try to cover following design pattern topics (not able to find materials on internet like the way you explain): - Chess - Ludo - Elevator design - Playstore/App Store - Parking Lot etc.
@gkcs
@gkcs 5 жыл бұрын
I'll look into them 🙂
@MO-hq4iz
@MO-hq4iz 4 жыл бұрын
In a SQL database,s you can combined index of India, gender, city like so: CREATE INDEX index_name ON table_name(India, gender, city); Then you have the index you will be looking for and your statemen is no longer true.
@akshayakumart5117
@akshayakumart5117 5 жыл бұрын
Screwed up an interview today, that last bit about interviews was all I needed to hear to get back on track.
@gkcs
@gkcs 5 жыл бұрын
😁
@rakshithyadhav7274
@rakshithyadhav7274 5 жыл бұрын
Hey gaurav can you mention few resources to learn system design apart from your playlist. It would be really helpful and thanks
@ohileshwaritagi2971
@ohileshwaritagi2971 5 жыл бұрын
You can use a trie instead of a hashmap and red black tree combination. The value stored in the trie node can be a red black tree. In this way you need not blame the dev who selected the sorting key.
@gkcs
@gkcs 5 жыл бұрын
You can always blame the dev :P
@gomzysharma
@gomzysharma 2 күн бұрын
Ah! Something similar i have been asked in GoldMan Sachs interview and i could come up with the same solution but i didn't know that java internally uses Self binary search tree. So interviewer told me this at the end.
@paperguns115
@paperguns115 2 жыл бұрын
Another excellent, fun, and engaging lesson. Thank you again. I laughed several times and to enjoy the learning is a bonus.
@ladySVV
@ladySVV 4 жыл бұрын
"for example gender.. if u take this channel there will be 90%/10%.." tell me about it, yesterday i had one of the weirdest experience of my life lol: seeing 'balls deodarant' ad on youtube, while watching your system design playlist lol
@gkcs
@gkcs 4 жыл бұрын
Wow, hahaha!
@naveengupta6878
@naveengupta6878 3 жыл бұрын
@@gkcs System Design Video idea for ballwash.com
@박종선-v8i
@박종선-v8i 4 жыл бұрын
Please enable CC button for auto-generated subtitle so that I can understand more.
@cocorita
@cocorita 4 ай бұрын
ah... nothing like blaming co-workers for errors to be a proper Indian developer LOL
@varunmundale4627
@varunmundale4627 4 жыл бұрын
Wow! Why can't we apply same concept on traditional SQL to collapse complexity from O(n) to O(log n) ?
@vinitdantkale4858
@vinitdantkale4858 5 жыл бұрын
Hey, I had kind of same interview question. I wanted to ask if B-Tree or any modification in B-Tree is a good choice?
@gkcs
@gkcs 5 жыл бұрын
A b-tree is difficult to implement in an interview. But if you know how to, it's as good as a red black tree in terms of retrievals.
@harisridhar1668
@harisridhar1668 3 жыл бұрын
Hi Gaurav - thanks for reviewing the hashmaps again in the context of quick memory retrieval; I've seen a lot of emphasis across places on coding a good hash function, but not making sure your keys are randomly uniformly-distributed!
@aryanrawther3758
@aryanrawther3758 21 күн бұрын
hi 90% of males 0:54
@AkhileshPandey-bv5ni
@AkhileshPandey-bv5ni 5 жыл бұрын
Amazing video Gaurav! The part about augmenting it with a red-black tree was amazing! Really it gave me food for thought on other similar augmentations! Good Work man!
@Garentei
@Garentei Жыл бұрын
What is the point of having a balanced BST inside every bucket of a hash map? Why not just have one balanced BST and not use a hash map at all? In the worst case you're still having O(log(n)) time. Is it because in the best case you have O(1) time? That seems kind of weird.
@RisingDad
@RisingDad Жыл бұрын
He's right about hanging in there. I knew a lot of people in college who basically didn't know anything about software development. I mean, they didn't know what a pointer was and all the classes were in c++. They didn't know what a dot did in C++ or Java in " value".trim(). I asked how they were going to get a job and they said if they apply 100 places, someone will take them. Sure enough, they were right.
@alokporwal007
@alokporwal007 4 жыл бұрын
At 1:55 'we are getting all records from INDIA and then filtering on FEMALE is done' - 1. This is true because are using NoSQL DB and we have index only on country? 2. Had it been MySQL, we would have got only refined record sets at the first place? Request anyone to please help me understand this :)
@a-h9068
@a-h9068 Жыл бұрын
OMG I've been asked the hashtable implementation question in one of my interview. If I've encountered this video earlier last week, I would have a better solution for that question :/
@ameetmonty
@ameetmonty 5 жыл бұрын
Hi Gaurav, if you are taking request for new topics, it will be very helpful to cover Concurrency issues. Also Reddit System Design!
@gkcs
@gkcs 5 жыл бұрын
Nice!
@r1jsheth
@r1jsheth 5 жыл бұрын
So, I understood how you used, multi column keys for indexing for a particular use. My question is can we create multiple 'multi column indexing'? What if the next sequence of queries tells me to find all males living in x or y city. How would you select a generalised indexing key?
@gkcs
@gkcs 5 жыл бұрын
We could duplicate the data and make multiple sorting keys, based on the query access patterns. So for finding all males in city x, I would insert the data with two sort keys: country+gender+city and city+gender. Important thing to note is that we make sort keys only if the queries are frequent, which justifies the higher insertion space and time for the index.
@r1jsheth
@r1jsheth 5 жыл бұрын
@@gkcs Got it, thanks :)
@priyankasahoo9346
@priyankasahoo9346 5 жыл бұрын
There is list of offers and it's criteria, which looks like this. Each offer will have list of product ids, if a cart has all these products, then the corresponding discount would be applicable. Now for a given cart, multiple offers may be applicable. In that case, offer which returns the max discount should be returned. Example Lets say the Cart contains these products [1,2,3,5,7,6,9] List of offers are: Offer 1- Product I'd [1,3,8] 10% Offer 2 product ids [1,2,5,9] 15% Offer 3 product ids [2,5,6] 25% Offer 4 product ids [1,2,3,4] 30% Output : 25% For the cart, offer1 is not applicable as product I'd 8 is not there in the cart. Offer2 is applicable as product I'd 1,25,9 all are available in the cart. Offer3 is also applicable Offer 4 is not applicable as product I'd 4 is not present in the cart. Thus output would max discount from offer 2 and offer 3. Which is 25% Question is how should we store these offers so that we should get the offer amount with minimum time. Assume this is for a big retail company, where the request comes in large number. Can anyone please let me know the right data structures for this?
@abbashussain7298
@abbashussain7298 3 жыл бұрын
the wallpaper at the back is upside down and i can’t unsee it
@Rahul-lg1nw
@Rahul-lg1nw 4 жыл бұрын
Sometimes it seems like I'm watching it in 1.25x😂 you're really fast
@thesoftwareengineer17
@thesoftwareengineer17 4 жыл бұрын
Hi gaurav, In a interview i was asked a question- Q1: A file have 1 billion records(key,value)pairs and we have 4gb RAM how do we store in a hashmap, Q2 : A file have 1 billion records(key,value) pairs. if the user have to access a records with key somewhere in middle of file, how can we access it faster?
@masterprattu
@masterprattu 4 жыл бұрын
Commenting for better visibility :) Even I am interested in knowing the answer to this question.
@abisheksoni3354
@abisheksoni3354 2 жыл бұрын
Logical partition can help here .... That will load the partition specific indexes only. But you have to be careful on partition key(on the basic the partition happens)
@shubhamkumar6383
@shubhamkumar6383 Жыл бұрын
still searching ans after 3 years 😂
@cengizandak4241
@cengizandak4241 5 жыл бұрын
For instance we want to partition data based on a user id, but what happens if we want to search something based on the photo id or any id (for instance you shard your date based on block ids, but you want to search for a driver as well not only for blocks) different than user id, do we need to scan through the shards ?
@gkcs
@gkcs 5 жыл бұрын
Interesting practical problem to solve. I'll try and get to this in the later videos 😁
@cengizandak4241
@cengizandak4241 5 жыл бұрын
@@gkcs Thanks :) I have heard that, an interesting way to do it is appending the partition key with photo id (driver id etc), but I am letting you to find the optimized way :p
@AkshayAradhya
@AkshayAradhya 5 жыл бұрын
Can you explain INTRPATH from CodeChef June 2019 ?
@karanpandya7613
@karanpandya7613 4 жыл бұрын
Amazing video.... nice background music..
@sagarkashyap7572
@sagarkashyap7572 4 ай бұрын
nice shirt bro...
@gkcs
@gkcs 4 ай бұрын
Thanks 😁
@cengizandak4241
@cengizandak4241 5 жыл бұрын
I think it is time to make a system design video after that, by using those partition/sorting logic :)
@kaustubhkislay3126
@kaustubhkislay3126 5 жыл бұрын
Yes,C++ also implements its map using self balancing red black trees. One doubt, why not use AVL trees? I looked up for the difference and found an article on geek, here's the link www.geeksforgeeks.org/red-black-tree-vs-avl-tree/ , here they do say that use AVL for DB retrievals and R & B trees for better insertion and deletions.
@gkcs
@gkcs 5 жыл бұрын
Java uses red black trees for it's ordered maps too. I guess the practical speed of a red black tree is higher compared to an AVL tree.
@puneetgarg6069
@puneetgarg6069 3 жыл бұрын
Number of rotations are less in red black tree compared to avl tree. So red black tree is used
@deepsikhakar9166
@deepsikhakar9166 5 жыл бұрын
very good explanation
@PrateekHegde
@PrateekHegde 5 жыл бұрын
How to handle concurrent database requests? For example 5000 requests at a time
@gkcs
@gkcs 5 жыл бұрын
Look at the load balancing video.
@PrateekHegde
@PrateekHegde 5 жыл бұрын
I mean, if 5000 users are signing up at a time, does the database handles all the write request?
@tridipchakraborty89
@tridipchakraborty89 Жыл бұрын
Thanks a lot for the motivation at the end - "Hang in there and stay patient". Made my day. ( wrt to my current job hunting situation)
@gkcs
@gkcs Жыл бұрын
All the best!
@tridipchakraborty89
@tridipchakraborty89 Жыл бұрын
@@gkcs thank you very much !!!
@shivendrapratapsingh15
@shivendrapratapsingh15 5 жыл бұрын
What to do if the interviewer is stuck on why you used NoSql over Sql? (Assume you have given all the differences between them.)
@gkcs
@gkcs 5 жыл бұрын
Has it happened to you?
@shivendrapratapsingh15
@shivendrapratapsingh15 5 жыл бұрын
@@gkcs twice!! Some people consider RDMS as religion.
@nkalra0123
@nkalra0123 5 жыл бұрын
By the way in Java this threshold is very very low.. in fact it is 8 ( static final int TREEIFY_THRESHOLD = 8)
@gkcs
@gkcs 5 жыл бұрын
I don't think so...the map should be going at 100 at least. Interesting find, any reason why it's so low?
@nkalra0123
@nkalra0123 5 жыл бұрын
@@gkcs The smallest table capacity for which bins may be treeified is 64, The bin count threshold for using a tree rather than list for a bin is 8. I don't know the reason. You can check the code here hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/util/HashMap.java#l250
@sayanbiswas9059
@sayanbiswas9059 5 жыл бұрын
How important do you think are the design patterns of object access, like singleton, facade etc? Do you think you can cover something like that in the System Design video if it is relevant?
@gkcs
@gkcs 5 жыл бұрын
Quite important. If I get to low level design, I will pick these concepts first 😁
@sayanbiswas9059
@sayanbiswas9059 5 жыл бұрын
@@gkcs Much appreciated
@genuineprofile6400
@genuineprofile6400 5 жыл бұрын
Have you not figured out the 10 min video length benefits yet ?
@gkcs
@gkcs 5 жыл бұрын
KZbin promotes videos with high watch time. Shorter videos are less likely to be recommended by the algorithm. And so we have the 10 minute min time barrier. I don't care much about all this though. My viewers time and comprehension is more important to me. Hence the video times 😛
@uneeb123
@uneeb123 5 жыл бұрын
Nice touch with Led Zeppelin ;)
@nareshgb1
@nareshgb1 4 жыл бұрын
I thought I heard only pink floyd.
@i_youtube_
@i_youtube_ 5 жыл бұрын
I wonder, what is song 1:12?
@cengizandak4241
@cengizandak4241 5 жыл бұрын
Another brick on the wall
@i_youtube_
@i_youtube_ 5 жыл бұрын
@@cengizandak4241 Thanks
@uneq9589
@uneq9589 5 жыл бұрын
First page of amazon dynamo db tutorial😂. Like the second part though
@gkcs
@gkcs 5 жыл бұрын
That's true, it's heavily inspired by the amazon video in the description.
@uneq9589
@uneq9589 5 жыл бұрын
@@gkcs i have watched it too. It entirely changed my perspective how i viewed nosql databases. That's must watch for everyone. Good job putting a link there.
@puravshah2342
@puravshah2342 5 жыл бұрын
Bro u look like jeetu bhaiya 😎
What is an API and how do you design it? 🗒️✅
15:26
Gaurav Sen
Рет қаралды 735 М.
Database Sharding and Partitioning
23:53
Arpit Bhayani
Рет қаралды 87 М.
An Unknown Ending💪
00:49
ISSEI / いっせい
Рет қаралды 57 МЛН
Every parent is like this ❤️💚💚💜💙
00:10
Like Asiya
Рет қаралды 18 МЛН
РОДИТЕЛИ НА ШКОЛЬНОМ ПРАЗДНИКЕ
01:00
SIDELNIKOVVV
Рет қаралды 3 МЛН
Database Indexing for Dumb Developers
15:59
Laith Academy
Рет қаралды 61 М.
7 Database Paradigms
9:53
Fireship
Рет қаралды 1,6 МЛН
What is DATABASE SHARDING?
8:56
Gaurav Sen
Рет қаралды 930 М.
System Design: TINDER as a microservice architecture
36:41
Gaurav Sen
Рет қаралды 1,2 МЛН
21. Database Indexing: How DBMS Indexing done to improve search query performance? Explained
1:23:52
How do indexes make databases read faster?
23:25
Arpit Bhayani
Рет қаралды 67 М.
An Unknown Ending💪
00:49
ISSEI / いっせい
Рет қаралды 57 МЛН