What makes a DB a DB? The underlying database engine.
SQL is 5% of all the databases in use.

Myths about NoSQL
"Use NoSQL because SQL doesn't scale." It doesn't scale with constraints. If you shard SQL, it will scale. NoSQL is sharded by default, so people claim it is scalable.

SQL DB structure
Myth: SQL uses a B+ tree. You can write your own storage engine. You can store data anywhere. The popular and default engines, MyISAM and InnoDB, use B+ trees. Reason: they give O(log n) lookups.

NoSQL structure
It depends on the use case, as there is no standardization. A few types: NewSQL, in-memory, key-value, columnar DBs, hybrid DBs.

Document DB
It is very close to relational databases, with a change in the modeling layer. MongoDB uses the WiredTiger engine.

The underlying engine can be the same for SQL and NoSQL. If they have the same underlying engine, the difference lies in the guarantees they offer. Some can be distributed, some in-memory, centralized, or embedded.

A database system typically has several abstracted layers that handle different aspects of data management. These layers include:
The Physical Layer: This is the lowest layer and is responsible for managing the actual storage of data on disk or other storage devices. It handles tasks such as allocating space for data, reading and writing data to disk, and managing data files.
The Storage Engine Layer: This layer sits above the physical layer and is responsible for managing the storage and retrieval of data. It handles tasks such as indexing data, managing data structures, and providing an API for querying data.
The Query Layer: This layer sits above the storage engine layer and is responsible for parsing and executing queries. It provides an API for querying data and translates high-level queries into operations that the storage engine can execute.
The Application Layer: This is the highest layer and is responsible for interacting with the user or application. It allows the user or application to interact with the database using a query language or an API.

These layers are abstracted from each other so that changes or updates to one layer do not affect the functionality of the other layers. All of these are plug-and-play. A DB is only as performant as its storage layer. We see JSON at the top, but beneath the layers the data is stored in a highly complicated way.

What does a node of the B+ tree contain?
In a relational DB, it contains the exact row. Rows have a fixed width, so the engine knows how much space one row will require. However, a node does not have to hold a single row; it can hold multiple rows.

Indexing
It works similarly in SQL and NoSQL. It makes reads (lookups) faster.
Sparse index: indexed value + offset. Smaller index size.
Dense index: all the words are in the index.

Why are we not able to do joins in NoSQL even if the underlying data structure is the same?
Joins run on the compute side, not on the storage engine side. The data needs to be on the same machine to join. In a sharded DB, you need to bring the data onto one machine, which is very costly (network overhead), so people say there is no join in NoSQL. Instead, people tend to do approximate or partial joins.

Geo-sharding
Geo-sharding is a technique used to distribute a database across multiple geographic locations to improve performance, scalability, and availability.

Master-slave architecture
This is done to scale reads. Writes go to the master. The slaves pull writes periodically via the replication log. We are more likely to have reads than writes.

Multi-master architecture
Problems: conflict resolution, and how will you handle IDs?
Conflict logic: first write wins, last write wins, concat, or accept none.

Distributed databases
Masters are independent, as each has its own shards.

Joins in a sharded DB
All the relevant data from the databases arrives at a single machine, and then the join happens. That machine computes the result and sends it back to all machines. These queries are good for analytics but not ideal for real-time use cases, as they are very expensive.

Use cases
The strength of SQL DBs is ACID compliance. Some distributed databases claim ACID compliance, which means they run distributed transactions, which makes them slower. If we want strong consistency, we need a single node.
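The note on joins in a sharded DB can be pictured with a small sketch. This is only an illustration under simplifying assumptions: the `shards` list, the `gather` helper, and the `coordinator_join` function are hypothetical names, plain Python lists stand in for network calls, and no real database works exactly this way.

```python
# Hypothetical data: two shards, each holding a slice of the users and orders tables.
# users rows are (user_id, name); orders rows are (order_id, user_id).
shards = [
    {"users": [(1, "asha"), (2, "ben")], "orders": [(101, 3), (102, 1)]},
    {"users": [(3, "chen")], "orders": [(103, 1), (104, 2)]},
]


def gather(table):
    """Pull every row of `table` from every shard to the coordinator."""
    rows = []
    for shard in shards:
        rows.extend(shard[table])  # in a real system this would be a network transfer
    return rows


def coordinator_join():
    """Join orders to users on one machine and count how many rows had to move."""
    users = gather("users")
    orders = gather("orders")
    name_by_id = {user_id: name for user_id, name in users}
    joined = sorted(
        (order_id, name_by_id[user_id])
        for order_id, user_id in orders
        if user_id in name_by_id
    )
    rows_moved = len(users) + len(orders)
    return joined, rows_moved


result, rows_moved = coordinator_join()
print(result, rows_moved)
```

Note that order 101 lives on the first shard while its user lives on the second, so neither shard can answer the join alone. Every row crosses the network before the join runs, which is the cost the notes describe.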
@abhiksarkar9278 1 year ago
If you feel there is some inconsistency in my data, please do comment :)
@hardikmenger4275 1 year ago
I think B-trees, LSM trees, or any other type of index data structure will store the memory location and not the whole row. Can you help me fact-check that information?
@SocraticDebates 1 year ago
Hats off to the effort. Thanks.
@dumbcurious450 11 months ago
This is a pure gold discussion. The right questions were asked during the discussion and great answers were given. Please do such podcasts more frequently.
@Techie007 11 months ago
Thank you so much. Will surely do.
@sumitrawat2037 1 year ago
Really engaging talk. Thanks, Sukhad, for bringing in Arpit!
@Techie007 1 year ago
Thanks for watching. Glad you liked it.
@prtk2329 1 year ago
Wow, what an eye-opening session!
@coderkashif 6 months ago
Amazing... Arpit bhai is a gem.
@shubhammishra1225 1 year ago
Pure knowledge!!! Need more such podcasts.
@Techie007 1 year ago
Coming your way
@mohammadwaseem9267 1 year ago
Very well articulated. Thanks for bringing this.
@Techie007 1 year ago
Thank you. Glad you enjoyed it.
@abhiksarkar9278 1 year ago
Learned more than in my college database course.
@Techie007 1 year ago
Thanks. Glad you found it useful.
@gmmkeshav 1 year ago
Just amazing!!! I got this much knowledge for free, more than I got from very expensive courses. I love this channel. Do bring on more staff software engineers who have such great experience.
@Techie007 1 year ago
Will definitely bring.
@vaibhavhvr 1 year ago
Thanks for the podcast... a very high-quality discussion.
@Techie007 1 year ago
Glad you enjoyed it!
@10renganatha 4 months ago
Good podcast, love to hear Arpit talking about system design. Also, a fun fact: "depends" is the most-used word in the whole discussion 😃.
@soumavabanerjee5925 1 year ago
Quality podcast! Cleared up many of my SQL/NoSQL misconceptions!
@Techie007 1 year ago
Thank you so much.
@prasundas4155 1 year ago
Quality podcast, learned a lot of new things.
@Techie007 1 year ago
Glad you liked it.
@abhiksamanta3424 1 year ago
Really great... learned a lot...
@Techie007 1 year ago
Glad you liked it.
@software-engineer-98 1 year ago
Do more podcasts like this on decoding tech. A setup like BeerBiceps might take you to the next level.
@Techie007 1 year ago
Will do more podcasts.
@chandrachurmukherjeejucse5816 8 months ago
This is what we need ❤ and people are still getting views on DSA vs dev.
@Techie007 8 months ago
Glad you liked the video.
@animeshsingh4290 1 year ago
Good podcast. I've been a fan of Arpit for a long time. His BitTorrent playlist was very interesting. About the podcast, I'd prefer if you put your video and Arpit's side by side; it would give more of a conversation feel than this.
@Techie007 1 year ago
Thank you for the feedback.
@rajeshkumar1149 1 year ago
Would a Postgres master-slave architecture be eventually consistent even with physical replication rather than logical replication at the storage layer? For example, an Aurora Postgres database.
@Techie007 1 year ago
Yes, since in this case too a WAL file has to be written to the destination, which takes some time.
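A minimal sketch of why log shipping with a delay gives eventual rather than immediate consistency. The `Primary` and `Replica` classes are hypothetical toy models, not PostgreSQL or Aurora APIs, and a plain list stands in for the WAL.

```python
class Primary:
    def __init__(self):
        self.data = {}
        self.log = []  # append-only list of (key, value) writes, standing in for the WAL

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))


class Replica:
    def __init__(self):
        self.data = {}
        self.applied = 0  # number of log entries replayed so far

    def pull(self, primary):
        """Replay log entries not yet applied (a real system does this continuously)."""
        for key, value in primary.log[self.applied:]:
            self.data[key] = value
        self.applied = len(primary.log)


primary, replica = Primary(), Replica()
primary.write("balance", 100)
stale = replica.data.get("balance")  # read before the replica has caught up
replica.pull(primary)
fresh = replica.data.get("balance")  # read after the log has been replayed
print(stale, fresh)
```

Between the write and the pull, a read served by the replica misses the new value. How long that window lasts in practice depends on the replication setup.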
@Raju-qi3up 1 year ago
Citus helps distribute a Postgres DB, right? Does that then become eventually consistent?
@Techie007 1 year ago
Yes.
@asranand7 11 months ago
Postgres doesn't store rows in a B+ tree; it stores them in heap files.
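Whether an index holds the row or only a pointer to it depends on the engine. A rough sketch under a simplified model: PostgreSQL keeps rows in heap files and its B-tree indexes hold row locations, while InnoDB's primary-key (clustered) index keeps the full row in its leaf nodes. Dicts stand in for the trees here, and every name below is hypothetical.

```python
# Heap-style layout: rows sit in unordered pages; the index maps key -> (page, slot).
heap_pages = [
    [(42, "chen"), (7, "asha")],  # page 0
    [(19, "ben")],                # page 1
]
pointer_index = {7: (0, 1), 19: (1, 0), 42: (0, 0)}  # stand-in for a B-tree of row pointers


def heap_lookup(key):
    page, slot = pointer_index[key]  # step 1: the index yields a location
    return heap_pages[page][slot]    # step 2: fetch the row from the heap


# Clustered layout: the index leaf itself holds the full row, keyed by primary key.
clustered_leaves = {7: (7, "asha"), 19: (19, "ben"), 42: (42, "chen")}


def clustered_lookup(key):
    return clustered_leaves[key]     # single step: the row is stored in the index


print(heap_lookup(19), clustered_lookup(19))
```

Both lookups return the same row. The heap layout takes an extra hop from index to page, while the clustered layout ties the row's physical placement to its key.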