Amazon DynamoDB - Paper Explained

Views: 53,542

Days ago

Comments: 122

@shoebmoin10 4 months ago

This is the first 90-minute tech video I've watched straight through without taking a break. Thank you for making learning so engaging and enjoyable!

@Muigoku49 4 months ago

Hey Arpit, loved the way you presented the intricacies of this system. Excellent video! It's an absolute knowledge bomb. It would be great if you continued this series and talked about more such systems.

@joobis.b4568 4 months ago

Thank you for doing this. The important thing to note here is that all the concepts explained can be applied to design various types of systems. You are not only learning how DynamoDB is built, but also gaining knowledge that can be used to design a wide variety of systems. I hope you will continue to make such long-form, in-depth content.

@karribalasubramanyam5463 4 months ago

I laughed out loud at "Drawing boxes is easy, Implementing them is hard" So true! Great Work!

@somakchakraborty2830 4 months ago

Excellent breakdown of the paper. The way you conveyed it with examples and diagrams made it so easy to understand. Thanks Arpit. Keep these dissections coming!

@AsliEngineering 4 months ago

Thank you 🙌

@MayankSharma-wx4mf 4 months ago

Greatly simplified 🎉 Another interesting thing to note is that each partition has fundamental maximum limits (3000 RCU, 1000 WCU). No matter how much you provision for your table, these limits always apply per partition. So if a partition ends up receiving a high volume of requests, there comes a point where there is no solution even if we increase the table's provisioned units. Over-provisioning the table can create more partitions, but unless there is capacity associated with them (writes to those partitions), adaptive capacity (from the new partitions) will not come to the rescue. Data distribution becomes the key for hot keys: we somehow have to ensure the data is present on multiple partitions and read from different partitions.
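The hot-key point in the comment above can be pictured with a minimal write-sharding sketch. It assumes a hypothetical shard count and helper names (`SHARD_COUNT`, `sharded_key`, `all_shard_keys`), none of which come from the video or the paper; the 3000 RCU / 1000 WCU per-partition limits are the ones stated in the comment.

```python
import hashlib
from typing import List

SHARD_COUNT = 4  # hypothetical: how many suffixes one hot logical key is spread over


def sharded_key(partition_key: str, item_id: str, shards: int = SHARD_COUNT) -> str:
    """Append a deterministic suffix so items of one hot key use several partition key values."""
    digest = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    suffix = int(digest, 16) % shards
    return f"{partition_key}#{suffix}"


def all_shard_keys(partition_key: str, shards: int = SHARD_COUNT) -> List[str]:
    """Every key a reader has to query and merge to see all items of the logical key."""
    return [f"{partition_key}#{i}" for i in range(shards)]
```

The trade-off in this sketch is that writes spread out, but a reader has to query every key returned by `all_shard_keys` and merge the results.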

@mihirprajapati9854 3 months ago

I found it very interesting and challenging how DynamoDB handles allowed requests at the table level and limits at the partition level. I have to spend some time understanding things in depth.

@cantwaittowatch 2 months ago

It’s my understanding that the storage node will also have a token bucket, and units in it can be used to make some room for long spikes of requests, which is the case Arpit was talking about when requests get approved at the Global Admission Control level. This entity can allow them because it must have visibility into all nodes (their token bucket units) and the requested partition (which would have the burst-specific token bucket and the allocated token bucket units). Please correct my understanding, Arpit! Thanks
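A rough sketch of the bucket layout described in this comment, under my own assumptions: each partition has an allocated bucket and a burst bucket, the storage node has its own bucket, and a request may burst only when both the burst bucket and the node bucket have tokens. The class and function names (`TokenBucket`, `admit`) are hypothetical, and time is passed in explicitly rather than read from a clock.

```python
class TokenBucket:
    """Plain token bucket; refill is driven by the caller so the example stays deterministic."""

    def __init__(self, capacity: float, refill_per_sec: float):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity

    def refill(self, elapsed_sec: float) -> None:
        # Add tokens for the elapsed time, never exceeding the capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed_sec * self.refill_per_sec)


def admit(allocated: TokenBucket, burst: TokenBucket, node: TokenBucket, units: float = 1.0) -> bool:
    """Assumed admission rule: allocated tokens first, then burst only if the node has room too."""
    if allocated.tokens >= units:
        allocated.tokens -= units
        node.tokens = max(0.0, node.tokens - units)  # node-level accounting, clamped at zero
        return True
    if burst.tokens >= units and node.tokens >= units:
        burst.tokens -= units
        node.tokens -= units
        return True
    return False  # throttled
```

With an allocated bucket of 2 tokens, a burst bucket of 1 and a node bucket of 10, the first two requests use allocated tokens, the third bursts, and the fourth is throttled until `refill` is called.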

@mayankkumar7225 13 days ago

By far the most detailed and clear explanation of how the DB works. Thanks a lot.

@cantwaittowatch 2 months ago

Arpit, hats off to you. Excellent talk, and the things you called "the beauty" so many times in your presentation are the key pillars of distributed systems, which you compare to the way humans think :) Awesome!

@amitrastogi1405 2 months ago

Thanks Arpit for covering the DynamoDB paper. Lots of stuff for learning. Concepts around throughput dilation and read write deployments were new to me! Thanks a ton!

@SangeetaJain-fd6pf 4 months ago

Awesome... Awesome explanation... Just remember that replication is at the partition level, not the node level. This is the first time I have been able to understand a high-level paper.

@manikantapunnam1189 2 months ago

Loved it 🥰, please continue posting such dissection videos

@abhisekdas7474 3 months ago

Hey Arpit, the way you explained the internals of DynamoDB is awesome! It was an absolute delight to watch the entire video and absorb so much information. Keep up the great work! 😃

@aniketmahangare8333 4 months ago

I love the way you relate distributed systems with humans. A great video again Arpit. Thank you so much.

@RaviKumar-eq5iw 2 months ago

Hey Arpit, such a mind-blowing explanation of DynamoDB. It would be great if you talked about more AWS storage services like EFS (Elastic File System) and AWS Aurora. Thank you for such amazing content!

@AbhiprayaDash 4 months ago

Loved the content, Arpit. This is really helpful for getting into the internals. Looking forward to more such amazing content.

@vishalmahuli86 3 months ago

This is amazing! So beautifully explained. Thank you so much Arpit!

@krozaine 4 months ago

This is amazing! Thank you for crunching so much information and research into this. I have heard a lot about another data store, Aerospike. Please cover its architecture too if possible, especially the comparisons with Dynamo (and current/future DiceDB), since comparing some of these data stores to Redis seems unfair sometimes, especially around claims of horizontal scalability. Great content!

@sheetalrani2748 8 days ago

loved it, thanks a lot for such amazing content

@sri504 2 months ago

Awesome video.. we need more for other databases as well🎉

@pednekarrs A month ago

Great explanation! Thanks Arpit for the same.

@faaranmohd 27 days ago

Awesome explanation.. Thank you so much for your efforts.

@lokesh2608 14 days ago

I loved your video. I am a Google engineer researching various systems and this was the most in-depth video about DynamoDB. One suggestion: I would encourage you to keep things a little less verbose at times to make your explanations efficient while remaining dense. But just awesome work 😊

@advaitdocvita 4 months ago

Waiting for DiceDb research paper dissection video soon 🤩🥳😁

@AsliEngineering 4 months ago

Hahaha. Soon soon. Preparing a pretty detailed arch video.

@omkrishna7847 4 months ago

This was great! Please do Cassandra and Yugabyte as well

@sarthakgarg5816 3 months ago

Hey Arpit, loved your efforts and content as always.
1. Kindly update the description of the video, as the current description is from the Airbnb HTTP streaming video. Current: "In this video, we understand what HTTP streaming is, how to implement it, and how Airbnb uses it to improve its user experience."
2. It would also be helpful if you could give me a reference for this paper, as I want to dive deeper into adaptive capacity.
Thank you so much. Sarthak Garg

@shubhamgupta1056 4 months ago

Thanks Arpit for such an amazing video , looking forward to more of research paper explanation videos😄

@Amansharma-zj3se 4 months ago

Next Level Arpit, this video brightened my day ❤

@nuthanreddynewton 4 months ago

It's been an absolute delight to watch your explanations. Truly packed with every question and answer that one needs. Very rare for any popular "System Design" channel. You should change your channel name to "Tech Orgasm" 😀

@maheshnagarwal4379 4 months ago

Thanks for making this video. Planning to enrol into masterclass.

@padigelahritik9053 4 months ago

Nicely done. Appreciated 👏

@rishabhgupta4106 4 months ago

Thanks man, Loved it.

@hrithikraj4277 4 months ago

God-level explanation.

@sushants523 4 months ago

Maybe a Cosmos deep dive next.

@DrKratos-lg5zi 3 months ago

Hi Arpit, one small suggestion: can you add chapters to your upcoming videos? Folks like me who have to work long hours don't get time to finish the whole video in one sitting.

@nirmitshah7660 4 months ago

Very Nice Walkthrough for the architecture and design, thanks for the video :) , the video description seems a lil off though.

@ssh22-x7g 4 months ago

Loved it!

@YemaneHadis 4 months ago

Excellent Work

@yashwanth.chaudhari339 4 months ago

Very much needed. Can you do this for some more famous services?

@Binary2044 4 months ago

Excited++ 🚀🚀

@wennwenn1422 4 months ago

I didn't understand how GAC helps with adaptive capacity. Even if GAC okays a request, adaptive capacity would have throttled a partition that is getting hotter. So what does GAC help with in the context of adaptive capacity?

@RONAKLakhwani 4 months ago

How is the Metadata Service implemented? Which data storage engine does it use? Since it cannot use DynamoDB and has to be highly available (in spite of having MemCache), I wanted to know how AWS DynamoDB has implemented the Metadata Service.

@srivathsaharishvenk 3 months ago

Great video. Just a note that Cassandra was the pioneer of this architecture and Dynamo just improvised from that.

@jrajesh11 3 months ago

But Cassandra was developed much later at Fb.

@srivathsaharishvenk 3 months ago

@jrajesh11 The Cassandra paper was published in 2007 and the initial version was in 2008; Dynamo was in 2012.

@priyajaiwal8072 A month ago

Booootifulll❤

@manishrawat5253 4 months ago

this is so good 😍😍

@advaitdocvita 4 months ago

Amazing video 🎉

@imsiddhants_official 3 months ago

Really good content.

@omprakashyadav-hf4pg 3 months ago

Amazing effort to provide a high-level view of the DynamoDB paper. Engaging video overall. Are there any plans for MongoDB?

@HighPulp382 21 days ago

Loved it. How would global admission control work with burst capacity? If the router layer doesn't allow more than 1000 req/sec, why is there a need for burst capacity?

@letsmusic3341 3 months ago

Great video. I have a few questions:
18:30: Would the leader node apply the updates to the table in real time, or asynchronously like the followers in the case of eventually consistent reads? Also, what happens to writes in the case of consistent writes?
Who updates partition data in the Metadata service?
What will happen if my data storage node has a capacity of 1000 IOPS and I have two partitions, P1 and P2, of 400 IOPS each, belonging to two different tables T1 and T2 (of different customers)? Now the client requests 800 IOPS on T1 for P1. The required IOPS are 800 (T1, P1) + 400 (T2, P2) = 1200 IOPS, but the server only supports 1000. How would the server handle this load with adaptive capacity? Also, please consider the case where partition P2 needs at least 400 IOPS at any time. The customer doesn't want any latency, having paid for the capacity.

@__VishalSharma 4 months ago

At 17:55, don't you think waiting for responses from the replicas makes it a little slow? What if the request dies or we don't get the response back?

@ankursao7 4 months ago

We need to wait for replies from at least some replicas to make sure the system is fault tolerant.
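A minimal sketch of the quorum idea in this reply, assuming replicas are modeled as plain Python callables and a 2-of-3 quorum. `write_with_quorum`, `ack`, and `fail` are hypothetical names for illustration, not DynamoDB APIs.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Sequence


def write_with_quorum(replicas: Sequence[Callable[[bytes], bool]], record: bytes, quorum: int) -> bool:
    """Return True as soon as `quorum` replicas acknowledge; slower replicas are not waited for."""
    pool = ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(replica, record) for replica in replicas]
    acks = 0
    try:
        for future in as_completed(futures):
            try:
                if future.result():
                    acks += 1
            except Exception:
                pass  # a dead or failing replica simply contributes no acknowledgement
            if acks >= quorum:
                return True
        return False  # every replica answered or failed and the quorum was never reached
    finally:
        pool.shutdown(wait=False)


def ack(record: bytes) -> bool:
    return True  # stand-in for a replica that persisted the log record


def fail(record: bytes) -> bool:
    raise RuntimeError("replica unreachable")  # stand-in for a dead replica
```

In this sketch a single dead replica out of three does not block the write, which is the fault tolerance the reply refers to; a real system would also put a timeout on the wait.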

@sivaprabhaharan3475 4 months ago

1:20:20 What if connectivity is down for both the readers? Will they still initiate elections? 1:31:00 What if the changes deployed to the read systems (consumers) cannot understand the writes from producers that are not yet deployed?

@SEDHURAJANALLAYAN 25 days ago

7:38 "index value + pk" is a bit misleading/confusing. By the way, reading pk as partition key would be more apt. In the case of an LSI, it's the same pk + a different sk; in the case of a GSI, the pk can be any new column or the old pk (with or without a new sk).

@prashantkaushal4057 4 months ago

Just had a question: how do asynchronous calls to MemDS help it handle traffic in case of failure?

@parasmehndiratta 4 months ago

+1. Can't we keep MemDS over-provisioned? Why do we need to make async calls?

@sarthakgarg5816 3 months ago

@parasmehndiratta How would you know how much load there would be in a real scenario without the async calls?

@bhavyapatel1863 4 months ago

Would love to know how data is transferred internally between nodes when partitioning happens. By the way, love this episode ✨✨

@usmanrahat2913 3 months ago

Hi Arpit, which app are you using to take these notes? Thanks

@divyankpandey3541 3 months ago

How are we splitting the partitions? Is it decided by the partition key logic anyway?

@sumanshverma1234 3 months ago

Quick question: how can I keep fetching live data from an API endpoint and ingesting it into a database? Right now, I use a cron job that runs a script to fetch the newly updated records once a day. How can this be done in real time?

@VisKu-ck1wr 2 months ago

In Python, a variable inside an if-else or loop is treated as a global variable when it is not written inside a function.
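A small illustration of the scoping behaviour this comment describes: `if` and `for` blocks do not create a new scope in Python, so names bound in them at module level are module-level (global) names, while the same binding inside a function stays local to that function. The variable and function names are made up for the example.

```python
if True:
    inside_if = "set inside if"  # bound at module level, so it is a global name

for i in range(3):
    inside_loop = i  # both `i` and `inside_loop` remain visible after the loop ends


def f() -> int:
    if True:
        local_only = 1  # same kind of block, but inside a function: local to f
    return local_only


print(inside_if, inside_loop, f())
```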

@sameerkumarkushwaha8214 3 months ago

It's not clear to me if everything is being fetched from MemDS, what's the point of having metadata store?

@Brahmdecoded 12 days ago

Everything is fetched from the memory of the VM. MemDS is there to avoid a single point of failure while still being fast. Redis is something similar to MemDS. MetaDB provides a persistent view of the metadata.

@adilsheikh9916 4 months ago

The last example...it shows that readers have backward compatibility...means they can process previous messages also...

@himanshuladia9099 16 days ago

You specifically called out that the partition replicas here are not equivalent to a Postgres read replica. How so? I can set up synchronous replication from the primary to standby servers on Postgres too. All the write requests will go to the primary, which returns after a quorum across replicas. That seems to be exactly the case with Dynamo as well. Am I missing anything?

@ronakshah725 15 days ago

You can set it up that way, but I think the intended usage is to separately handle read and writes, when it comes to read and write replicas for Aurora Postgres. Conceptually similar to CQRS, where reads are always routed to read replicas so they can be read optimized. In the DynamoDB scenario, the replicas handle both reads and writes and are there to introduce data redundancy rather than separation of read/write concerns. At least that’s my understanding

@himanshuladia9099 14 days ago

@ronakshah725 Interesting. From what I understood from the video, writes are sent only to the primary in DynamoDB as well.

@ronakshah725 13 days ago

@himanshuladia9099 Nodes are homogeneous, i.e. any of them can be a leader.

@ArjunRMenon-kl2oo 3 months ago

Hi Arpit, I've been following you for a year now. Thanks for all the beautiful content you post. One doubt: why can't we directly over-provision the Metadata store and keep it ready for war in case all the routers go down, instead of having MemDS?

@bhavukkalra1786 3 months ago

Hey there Arpit! Amazing presentation. Is there a way we can download the notes in some kind of pdf if available?

@djinnGamingLive 4 months ago

Hi Arpit, quick question, where do you find such interesting papers ? By the way really love the content !!

@digitalnomad303 4 months ago

Brother, I have already seen 4 advertisements in 26 minutes so far. No, no, not complaining.

@AsliEngineering 4 months ago

I do not control them. it is YT who is showing it.

@digitalnomad303 4 months ago

@AsliEngineering Never mind, it is still worth it given the amount of effort that goes into creating such content.

@daanyaalkapadia 4 months ago

Thanks for posting. It would be helpful if you could share your notes on what you have written.

@nitinchauhan9334 4 months ago

At 59:00 how would the burst technique work if partitions p1 and p2 are on different storage nodes?

@nitinchauhan9334 4 months ago

or is it only possible for partitions on same storage node?

@suryaakella4508 3 months ago

Hey Arpit, can you please make a video about AWS Aurora?

@AjitKumar-zz7kh 22 days ago

Hi Arpit, loved the explanation. Will you please do one on the Netflix engineering blogs? It would be lovely to get your insight on the thumbnail lazy-loading strategy in the Netflix UI. Thanks a ton in advance.

@AsliEngineering 21 days ago

added to my list. thanks for recommending.

@satyamjha68 4 months ago

Amazing video ! Loved it and understood it completely ! Can you explain Google's Zanzibar paper in similar fashion as well?

@MitaliNeerPatel 7 days ago

You mentioned between 17 and 18 minutes that there will be 3 write-ahead log files, one for each of the replicas. Why so? Wouldn't that create too much redundant data on disk, since all 3 write-ahead log files hold the same data?

@RohitSharma-w5k A month ago

Can you please share the link to these notes?

@codefast93 4 months ago

Amazing video!!! Not sure if I understood the real reason why we send requests to MemDS even when it’s cached?

@RudeusGreyrat-q6i 4 months ago

There is a possibility that the router service restarts or fails, which invalidates its cache. In that case the whole load would be transferred to the Metadata service, which could break it; hence MemDS was introduced to optimize range queries.
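One way to picture the "call MemDS even on a cache hit" behaviour asked about in this thread, as a sketch under my own assumptions: the router answers from its local cache but still schedules a refresh call for every hit, so the backing store sees roughly the same traffic whether the cache is warm or cold. `Router`, `fake_memds`, and the queue standing in for asynchronous calls are all hypothetical.

```python
from typing import Callable, Dict, List


class Router:
    def __init__(self, memds_lookup: Callable[[str], str]):
        self.cache: Dict[str, str] = {}
        self.memds_lookup = memds_lookup
        self.pending_refresh: List[str] = []  # stands in for asynchronous refresh calls

    def route(self, key: str) -> str:
        if key in self.cache:
            self.pending_refresh.append(key)  # cache hit: still schedule a MemDS call
            return self.cache[key]
        value = self.memds_lookup(key)  # cache miss: call MemDS inline
        self.cache[key] = value
        return value

    def drain_refreshes(self) -> None:
        """Run the queued refreshes; in a real system these would happen in the background."""
        while self.pending_refresh:
            key = self.pending_refresh.pop()
            self.cache[key] = self.memds_lookup(key)


calls: List[str] = []


def fake_memds(key: str) -> str:
    calls.append(key)  # count every request that reaches the backing store
    return f"node-for-{key}"


router = Router(fake_memds)
for _ in range(5):
    router.route("p1")
router.drain_refreshes()
```

Five requests produce five backing-store calls here (one miss plus four refreshes), so a router restart that empties the cache does not change the load the store already handles.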

@SaumyaranjanMishra-m5n 4 months ago

Can you share the notes that you are presenting?

@utkarsh09raj 4 months ago

How is the router cache updated when partitioning happens?

@mrigankdubey1365 4 months ago

Is DynamoDB still following CAP theorem with its "Strong Consistency" promises?

@jayeshdalal7 4 months ago

I must say, I found this to be really great, and I was able to see and learn about it . I do have a question, though. In the burst implementation, when the capacity is full and we need to utilize another capacity to fulfill the request, what happens in a scenario where the other capacity also doesn't have full capacity (let's say 30), but instead has 15, and another one has 15? Which data structure (DS) would we choose in this kind of use case?

@anuragtiwari3032 4 months ago

Really loved these long-form videos. I had one question: I currently know Java and I want to understand the DiceDB code, so how much Go is required for that? I mean the nuances like the concurrency framework, etc. Will any introductory course be enough, or is it too complex to understand even after knowing Go? Can I learn Go for one month and then be able to navigate the code?

@AsliEngineering 4 months ago

Thanks. Yes. You can totally learn go on the go and contribute.

@RONAKLakhwani 4 months ago

GAC controls rate limiting at the table level. However, if all the requests are going to the same partition (P2 in your example), adaptive capacity will be reactive and will take some time. Does DynamoDB rely on retries before returning throttling to the client?

@bharadwajpoduri3349 4 months ago

That is right, from what I understand. GAC is also needed to know whether adaptive capacity needs to kick in in the first place, since it tracks table-level throughput globally. But since adaptive capacity takes time, GAC can only do some retries.

@deeppande4454 4 months ago

Are some partitions and replicas pre-created, or are they created when a write request comes in?

@saurabh0113 3 months ago

Next 1. Bigtable 2. MapReduce

@charan.sunkara 4 months ago

Can you explain why DynamoDB is called serverless? As per my understanding, serverless is something that is provisioned on demand, but is that exactly what is happening here? Correct me if I am wrong.

@physicsakhada592 9 hours ago

Please do one on Elasticsearch.

@adilsheikh9916 4 months ago

The last example...seems readers have to be backward compatible to process the old messages...
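The backward-compatibility point in this comment can be sketched as a reader that accepts both the old and the new message shape, so readers can be deployed before any writer starts emitting the new format. The message fields (`version`, `key`, `value`, `ttl`) and the function name `read_message` are invented for illustration.

```python
from typing import Any, Dict


def read_message(msg: Dict[str, Any]) -> Dict[str, Any]:
    """Parse a message in either the old (v1) or the new (v2) format into one internal shape."""
    version = msg.get("version", 1)  # old writers never set a version field
    if version == 1:
        return {"key": msg["key"], "value": msg["value"], "ttl": None}
    if version == 2:
        return {"key": msg["key"], "value": msg["value"], "ttl": msg.get("ttl")}
    raise ValueError(f"unknown message version: {version}")
```

With a reader like this rolled out everywhere first, switching writers to version 2 (or rolling them back to version 1) never produces a message that some reader cannot parse.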

@anirbanb1908 2 months ago

Hey Arpit, please share the notes as well

@parthivreddy7989 2 months ago

thanks

@premvishwakarma6409 4 months ago

Please make a video on neo4j.

@shubhammurarka6589 4 months ago

Next on zookeeper!! no proper content available online.

@random4573 4 months ago

1:02:32 NOTED

@paiameya 23 days ago

But why do they limit and then skip? Doesn't make any sense.

@AsliEngineering 23 days ago

Consistent performance at any depth of pagination.

@194sandy 3 months ago

Next Cockroach DB please

@AsliEngineering 3 months ago

Already have a podcast with Ben - kzbin.info/www/bejne/r5nUiGtogMZgndksi=JyFRYVnOMFl34jmq

@himanshukabra4387 4 months ago

The video is great, obviously. But the description is irrelevant. Looks like you unknowingly copy-pasted the previous video's description :)

@akashsharma9674 2 months ago

Hey sir 🙏 Please tell me about yourself: do you live with your family (parents)? Will I ever be able to live with my parents if I continue in software engineering? No one talks about this. I get very depressed by the idea that the average software engineer is not able to live with family.