S3 system design | cloud storage system design | Distributed cloud storage system design

103,679 views

Tech Dummies Narendra L

In this video, let's understand the system design for:
AWS S3 system design
Azure Blob storage system design
Distributed cloud storage system design
Distributed data store system design
#s3systemdesign #cloudstoragesystemdesign #blobstoresystemdesign

Comments: 115
@kumarc4853 3 years ago
I interviewed a candidate recently and he mentioned your channel to me. Thank you for the good content, for teaching a lot of people, and for helping them crack system design interviews.
@metalalive2006 3 years ago
20:28 overview of the design with example
22:04 partition layer
23:40 stream layer
26:34 different partition strategies
27:34 stream layer
28:06 store new file in append-only fashion
29:00 seal file server that is full
31:24 monitor space of all these file servers
32:36 garbage collection performed on sealed file servers
34:30 replication
38:01 health check on the file servers
40:32 block group
45:27 partition layer
48:56 performance improvement tips
@metalalive2006 1 year ago
At 28:06, you mentioned that spinning hard disks were a cheap, feasible hardware option for a scalable storage service like S3 and that SSDs were expensive. I am interested to know whether that is still true in 2022, since I know very little about the detailed architecture and market pricing of SSD storage.
@kumarc4853 3 years ago
A friend of mine got into FB and Apple. He found your channel (and a couple of other SD channels) very helpful in his prep. We can do this! Thank you.
@kunchasaikrishna 4 years ago
Really, your channel's content is no less than that of any top online education platform. I appreciate your content 😊 Thank you so much 🙏
@Siddharth42280 3 years ago
@Tech Dummies Narendra L: Could you please make videos on a centralized logging system and a distributed job scheduler?
@renon3359 3 years ago
Your channel is priceless, brother, thank you.
@bhavyamishra3502 4 years ago
Nice content... keep it up 👍👍
@forgotten225522 3 years ago
Most valuable information ever on your channel.
@rohitsharma-rp2jh 3 years ago
Splendid, tremendous, long may it live!
@abrarisme 3 years ago
This was great, can't wait to see more videos!
@gunhound45 4 years ago
Just want to say that I really love watching these videos. Even though I'm not preparing for system design interviews, it's fun to do these thought exercises on designing a big system.
@asahikitase5398 3 years ago
You got recognized by Kim Jong-un!
@pramodsingh4668 2 years ago
This channel covers a lot of ground and is probably one of the best channels. But, and it's a big but, the videos take 2-3 times longer than needed. There is a lot of duplication and unrelated content, which turns a 20-minute video into an hour-long one. For example, everything in the first 20 minutes could have been covered in just 2-3 minutes. Please keep it short and precise. I appreciate all the hard work you put in and the knowledge you are sharing. Keep going.
@__abhiruchigupta__ 3 years ago
Really appreciate the level of detail provided in this video. Thanks a lot for your hard work and for creating such awesome content!! :D
@sureshnathann8360 4 years ago
Hi Narendra, you're awesome, man! Keep posting! Keep learning!!
@amanpervaiz2843 2 years ago
This channel is gold!
@trybeingakr 4 years ago
Appreciate the drastic improvement in delivery style.
@balakrishnan3725 3 years ago
Thank you, Naren! Nice video. I could feel the effort you put into creating it.
@tanayakarmakar2407 1 year ago
Great content
@ullas06 4 years ago
Thank you for your time and effort. It's very helpful.
@pravaskumar7078 4 years ago
Awesome... very helpful
@fendyuchiha 3 years ago
Really appreciate your video. You explain it very clearly.
@sushantasaha9938 3 years ago
Appreciate the hard work behind it.
@amlanch 2 years ago
Terrific presentation! Love your videos
@prasadg9583 4 years ago
Loved it, mate!! Thanks ❤️
@ramakrishnanvisvanathan3378 2 years ago
Really liked this comprehensive design session. Great, keep it up, and all the very best. I really appreciate the work you have done to bring such wonderful content to us.
@aneksingh4496 4 years ago
Must say, it must have taken you a lot of time to prepare this content. Kudos!!!
@kirankothandan5529 3 years ago
You are an amazing teacher, bro. I am a frontend person, but I am still interested in system design because of you. The way you explain how the designs are made makes me very curious. Thanks for the big effort. Cheers 👌
@harishkrish14386 4 years ago
Very nice videos, including your perspective on how to get jobs in Germany. Keep going, bro 👌🏻👌🏻
@mattleahy3951 3 years ago
Great video! The only question I had is about the table you showed for the Stream Manager: it tracked the start and stop offsets for the primary and also had fields for the secondary and tertiary replicas, but it didn't track their offsets separately. Those would need to be included as well, right? Thanks.
@icey3080 3 years ago
This is very useful, thank you
@ankita8867 3 years ago
Thanks for posting!!
@ravitandon9351 1 year ago
Very well done!
@kveldgorkon4611 2 years ago
Thank you. Great explanation
@JashanPreetsingh-mi2nl 3 years ago
Nice
@a.yashwanth 4 years ago
The amount of work you put into making these 50-minute-long videos is insane.
@kumarc4853 3 years ago
Phenomenal work. We don't have to read books, they are for dummies :p
@Vendettaaaa666 3 years ago
Mind blown!
@anuragagnihotri5238 2 years ago
Thanks a lot for putting in the effort and providing the design details of distributed cloud storage. I had a few questions, though:
1. I see the Cluster Manager is a SPOF. How do we handle the case where the CM is down?
2. Why do we use the DNS approach to update routing to the available region? DNS resolution is usually cached for a few minutes or so; won't that increase the downtime?
3. How do we handle concurrent updates (not appends) to the same file from different users?
@adithyaks8584 3 years ago
Wow!! Simply wow... Now I can cross-question managers at Amazon during interviews.
@boombasach 2 years ago
Really appreciate you putting up quality content. Very insightful. A couple of suggestions, though: maybe starting with the high-level user flow that you began discussing at 21:00 would be useful. Also, I am not sure that having the API server and the Cluster Manager, two separate components, talk to one DB is a good idea.
@TechAwesomeMemories 3 years ago
Good
@praveenjain183 3 years ago
Great stuff, Narendra. I appreciate the effort you put into gathering all this knowledge from multiple sources and sharing it with us. Thanks a lot.
@vigneshrajarajan6724 4 years ago
Hi Naren, thanks for your work. I have a question on the Uber / food delivery designs. From what I have gathered, most of these applications rely on state machines to proceed to the next step. Could you please explain how a finite state machine is used in food delivery / Uber designs?
@asahikitase5398 3 years ago
Thanks, buddy. I do prefer the way you started with a simple architecture and improved the system as the traffic increased.
@prashant211087 4 years ago
I appreciate your efforts. If possible, could you also share the references you go through for such design questions?
@vijayprajapati8475 3 years ago
444r
@fragrancias972 3 years ago
He seems to read a lot of tech companies' engineering blogs, based on his content.
@metalalive2006 3 years ago
Really appreciate his effort. The engineering blog posts from these tech companies are mostly very long articles.
@amlanch 2 years ago
Excellent explanation. You didn't talk about leader election and manager election in any of the layers, but that's just some more detail.
@Miguel-ym2rr 1 year ago
This is the first time I have seen how S3 works. Thank you so much! I have decided to focus my career as a software engineer on distributed systems. How do you get the base knowledge to design and implement a distributed system?
@Vendettaaaa666 3 years ago
The partition server + linked list of file servers idea seems like "Consistent Hashing on steroids"! Basically, instead of a single server on a ring for a given hash range, it's an array of servers.
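The "array of servers per hash range" idea in the comment above can be sketched roughly as follows. This is a minimal illustration, not the design from the video: the names `RangeRing`, `bucket_of`, `HASH_SPACE`, and the `fs-*` server labels are all hypothetical, and the 0-100 hash space is assumed only to match the range example used elsewhere in this thread.

```python
import bisect
import hashlib

HASH_SPACE = 100  # assumed hash space, chosen to mirror the 0-100 example


def bucket_of(object_id: str) -> int:
    """Map an object ID to a bucket in [0, HASH_SPACE)."""
    digest = hashlib.sha256(object_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % HASH_SPACE


class RangeRing:
    """Each hash range owns an ordered list of file servers (a stream)."""

    def __init__(self, range_starts, streams):
        # range_starts[i] is the inclusive lower bound of range i.
        # It must be sorted and begin at 0 so every bucket has an owner.
        self.range_starts = list(range_starts)
        self.streams = [list(s) for s in streams]

    def stream_for_bucket(self, bucket: int):
        # Find the last range whose lower bound is <= bucket.
        index = bisect.bisect_right(self.range_starts, bucket) - 1
        return self.streams[index]

    def write_target(self, object_id: str) -> str:
        # Appends go to the last server in the stream; earlier ones are sealed.
        return self.stream_for_bucket(bucket_of(object_id))[-1]

    def seal_and_extend(self, bucket: int, new_server: str) -> None:
        # When the current tail fills up, add a fresh server to the stream.
        self.stream_for_bucket(bucket).append(new_server)


ring = RangeRing([0, 50], [["fs-a1", "fs-a2"], ["fs-b1"]])
```

Unlike a plain consistent-hash ring, a lookup here returns a whole list: reads may need any server in it, while writes only touch the tail.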
@SunilKumar-yd8xv 3 years ago
Amazing content! Really appreciate your efforts. One question: do you need a cluster manager in this architecture? Simple, failover, geo, and weighted routing are mostly supported by DNS.
@zianxu2006 3 years ago
Great content. Really appreciate it. I'm wondering, is it a good idea to start with a simple design and then scale up towards the final target design? I tried that at an interview and got the feedback that I didn't address many of the complexities until later in the discussion... Other times I jumped into details upfront and got the feedback that I was focusing on details too much, too soon...
@RajenderReddy12sw 2 years ago
It's always a good idea to ask the interviewer what they are interested in.
@progfan234 3 years ago
Awesome stuff as always! I have a couple of questions:
1. What impact will consistent hashing in real time have on serving requests?
2. What will happen when a particular partition server goes down? Will it be replaced by a standby? How many standbys should you consider maintaining?
3. Is the Partition Map table a single point of failure? Or is it a within-cluster replicated data store?
4. Would there be any benefits to replicating a given file server within a cluster?
@hydtechietalks3607 4 years ago
Great talk, I love this. But to differentiate yourself from others, please announce who the audience is and what depth you will go to in the video. For example, are you going to discuss the algorithms used in the design or just give an overview? Is it scoped for an application developer or for a systems design developer?
@kdakan 1 year ago
How do you do file and disk operations on the remote file server from the partition server and the stream server (like copying, clearing up space from unused blocks, etc.)? Do you mount an NFS share on these servers and issue local shell commands against these remote shares?
@rohanbundelkhandi3202 4 years ago
Very nice video. One doubt: how does the Partition Server communicate with the Stream Manager? We don't have a direct link there.
@groinache 2 years ago
Very nice presentation. Concise, with good pronunciation. However, there is too much echo. I suggest getting a better recording setup or a room with echo damping.
@happyandinformedlife1212 4 years ago
Given a set of processes running on a cluster of hosts, design a system that load-balances the hosts through live migration of the processes. The goal of the load balancer is to minimize or prevent resource starvation, a situation in which processes are not allocated the amount of resources they want to consume. In the case where all hosts in the cluster are overloaded, we want to distribute resources evenly across the demanding processes. Given an imbalanced cluster, we want to bring it to a balanced state as soon as possible at the lowest cost. Can you do Load Balancer next?
@sowjanyav6570 3 years ago
What happens if a user wants to add more content to a file (say the file has lines 1-100 and the user wants to add 10 more lines to it) that is already on a sealed storage server? Will the file be copied to a new server? Or will only the extension go to a different file server?
@DarwinLo 3 years ago
The Cluster Manager is responsible for updating the DNS entries upon a cluster failure. What do you suggest doing about client-side caching of DNS queries?
@KimetsuNoYaiba100 3 years ago
Good follow-up: how does the PUT API work for large files?
@shantanu143 2 years ago
Good content. However, one doubt: if we are replicating from Europe to Asia, isn't it asynchronous replication?
@doydoybb 4 years ago
I have a question. In your first simple design, you have a separate server to store metadata. In your second, scaled storage system, where is the metadata stored? Is it all stored in the stream manager? Or is it stored on each individual partition server? Thanks!
@kinandchowdary6456 3 years ago
That will be handled by the cluster manager.
@rujhanarora7892 2 years ago
@kinandchowdary6456 Stream manager, dude.
@ranjithsudhakar9304 4 years ago
Great work. A small suggestion, if it makes sense to you: videos shorter than 20 minutes are more appealing than longer ones. If a video cannot be condensed, it could be split into parts. Awesome work on all your system design videos. Thanks
@Reji012345 4 years ago
It's better as a single file; otherwise it will break the flow.
@ellakkiankvp6267 4 years ago
Not really, that can be left to the audience. I mean, if you need a break, you can pause, right? Also, since this is a single topic, it's good for it to be a single video. Honestly, I don't see any natural partitions here. Also, psychologically, in my opinion, if you recall the flow and feel something's hazy, it's less cognitive load to look for it within one video than to search across several.
@viewforsourav 3 years ago
How does the Partition Server handle concurrent write requests if the system wants to honor append-mode writing to disk? One solution would be, for a single stream, to have multiple writers, each of which writes to a different file server. However, orchestrating such a model would be excruciatingly complex. Alternatively, Partition Servers can be logical entities with a 1-1 mapping to the stream ID. That will definitely lead to many stream IDs and some housekeeping work for the Stream Manager, but it ensures append-mode writing and a better spread of file servers across stream IDs. Let me know your thoughts, Naren @Tech Dummies. Thanks for your videos.
@willinton06 3 years ago
"Excruciatingly complex" sounds about right. There's a reason why only a handful of companies even try to get something like this working.
@baoleijia3764 3 years ago
Appreciate you sharing this, but:
1. I don't think the replicas are located in different regions; it costs too much to transfer data between them.
2. I don't think the failover switch is done by DNS.
@zakariamaaraki1130 3 years ago
Great video, keep going! I have only one remark: at minute 11 you said that replicas must be in another region in case of a disaster. I think the data must stay in the same region for several reasons (latency, GDPR, ...) but in different Availability Zones instead (this is the default option used by S3). Am I right?
@phildinh852 2 years ago
Yes, data is replicated across AZs of the same region. There is an option to replicate data to another bucket in another region.
@shallimeetyougmail 2 years ago
Time 48:10. Remapping the range from 0-100 to 0-50 and 50-100 is fine. But what happens to the files that were already written in the previous partition? How will reads for UUIDs with hashes 0-50 map to the older partition?
@SudhanshuTamhankar 1 year ago
In that case, the mapping is not updated until the new stream is already "warmed up", which means that the files with 0-50 hashes have already been copied over to the new stream. Once this is done, there is a cut-over transaction in the partition manager DB, which then starts routing the calls for 0-50 to the new stream. In the meantime, there might be files that got written to the old stream while this transaction was still happening. That is handled by a catch-up routine which ensures all files have been copied over. Imagine it as a two-stage commit: when the cut-over begins, there is a soft commit which says "write all new files for 0-50 to the new stream". At the same time, while reading, try reading from both the new and the old stream. Once all files are copied over and there are no stale writes left in the old stream, the commit is finalized. Now all reads and writes for 0-50 go to the new stream, and some garbage collection happens on the old stream to free up space. Hope this helps.
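The two-stage cut-over described in the reply above can be sketched as a small state machine. This is only an illustration of that description under simplifying assumptions: streams are modeled as in-memory dicts keyed by hash value, and the class name `RangeCutover` and its method names are hypothetical, not taken from the video or from S3.

```python
class RangeCutover:
    """Moves hashes in [low, high) from an old stream to a new one.

    States: "warming" -> "soft_commit" -> "final".
    Streams are plain dicts (hash value -> file bytes) for illustration.
    """

    def __init__(self, old_stream, new_stream, low, high):
        self.old_stream = old_stream
        self.new_stream = new_stream
        self.low, self.high = low, high
        self.state = "warming"

    def _in_range(self, h):
        return self.low <= h < self.high

    def write(self, h, data):
        # Until the soft commit, every write still goes to the old stream.
        if self._in_range(h) and self.state != "warming":
            self.new_stream[h] = data
        else:
            self.old_stream[h] = data

    def read(self, h):
        if not self._in_range(h) or self.state == "warming":
            return self.old_stream.get(h)
        if self.state == "soft_commit" and h not in self.new_stream:
            # During the soft commit, fall back to the old stream.
            return self.old_stream.get(h)
        return self.new_stream.get(h)

    def copy_pending(self):
        """Copy in-range files that exist only in the old stream."""
        copied = 0
        for h, data in list(self.old_stream.items()):
            if self._in_range(h) and h not in self.new_stream:
                self.new_stream[h] = data
                copied += 1
        return copied

    def soft_commit(self):
        self.copy_pending()          # warm up the new stream
        self.state = "soft_commit"   # new in-range writes now go to new stream

    def finalize(self):
        self.copy_pending()          # catch-up pass for stale writes
        for h in [h for h in self.old_stream if self._in_range(h)]:
            del self.old_stream[h]   # garbage-collect the moved range
        self.state = "final"
```

A real system would also need the state change itself to be atomic and durable (the reply mentions a transaction in the partition manager DB); this sketch ignores that and only shows the read and write routing in each stage.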
@eugenee3326 1 year ago
Great video, but why can't ZooKeeper just do what the Partition Manager does?
@andybhat5988 2 years ago
The Ceph RADOS layer with remote replication can handle this much better. It also does not need a metadata server for replication. Using CRUSH, proper availability can be guaranteed.
@OnkarSingh-fc8mu 2 years ago
(Time 48:10) When there is more load on the partition servers and the partition manager splits the range across two partition servers, how would the newly created partition server talk to the older file server in the streaming layer (where the file was actually stored)? Does anything change in the streaming layer as well?
@amishsumit 2 years ago
When the partition manager assigns a new partition for a subrange, say 1-50 out of 1-100, it also updates the partition map table entries. For example, say the hash values 14, 36, 42, 58, and 89 were initially all mapped to partition server 2. Once the new partition server is added, the corresponding existing stream servers in the map table (14, 36, and 42) will be mapped to this new partition server. That way, any further read requests for those existing stream servers will be served by the new partition server.
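The remapping in the reply above can be illustrated with a tiny partition map. This is a hedged sketch: the dict-based map, the half-open ranges, and the names `split_range`, `server_for`, `ps-2`, and `ps-3` are assumptions made for the example, with the 0-100 range and midpoint 50 standing in for the comment's 1-100 and 1-50.

```python
def split_range(partition_map, low, high, mid, new_server):
    """Split [low, high) at mid; the lower half moves to new_server.

    partition_map is a dict {(low, high): server} with half-open ranges.
    """
    if not low < mid < high:
        raise ValueError("mid must fall strictly inside the range")
    old_server = partition_map.pop((low, high))
    partition_map[(low, mid)] = new_server
    partition_map[(mid, high)] = old_server
    return partition_map


def server_for(partition_map, hash_value):
    """Return the partition server that owns hash_value."""
    for (low, high), server in partition_map.items():
        if low <= hash_value < high:
            return server
    raise KeyError(hash_value)


partition_map = {(0, 100): "ps-2"}
split_range(partition_map, 0, 100, 50, "ps-3")
owners = [server_for(partition_map, h) for h in (14, 36, 42, 58, 89)]
```

After the split, `owners` holds the new server for 14, 36, and 42 and the original server for 58 and 89, which matches the example in the reply.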
@phildinh852 2 years ago
@amishsumit But isn't a partition server assigned to only 1 stream?
@mopsyched 3 years ago
Something like Raft, Frangipani, or Spanner is always used for file servers.
@himanshuupadhyay6749 3 years ago
Quick question: when a file upload request goes to the server, is the file chunked on the client side? If so, where does the sync service come into the picture?
@Gerald-iz7mv 2 years ago
Good question. Shouldn't there be a chunk service that splits the file into chunks?
@RachnaDiary 3 years ago
How do you store images or videos? What is the mechanism behind that? What you have explained for storing a file is okay, but how does it work for photos and videos?
@pearlssnowboard3793 3 years ago
Do you have any idea how to design a system to load a 5G file onto 5000 servers?
@paraschawla3757 3 years ago
S3 uses object storage, not block storage as mentioned at 43:00. Correct me if I misunderstood.
@metalalive2006 3 years ago
Does anyone know how cloud storage like Amazon S3 handles access control for each uploaded file? For example, Amazon S3 exposes API endpoints for consumers to read and edit the access control list of a file object. How does S3 do this? I'd really appreciate any reply or hints.
@noypi613 3 years ago
What technology do you use to store the file? Is it a database?
@nikhil199029 26 days ago
TCP
@tylerscott6531 3 years ago
Does each AWS region represent a continent? I thought "us-east-1" and "us-west-2" were both in the US.
@akashjain2990 2 years ago
Why do we need the partition layer? Why can't the API layer talk directly to the streaming layer, since there is a 1:1 mapping of partition to streaming layer anyway?
@PoojaMehta271 2 years ago
Isn't the API server at 23 min nothing but a load balancer?
@nagaranimogullapelly2742 3 years ago
Can you please upload an Instacart system design?
@noypi613 3 years ago
How will the API insert data into the data store server?
@rishabhgoel1877 4 years ago
Thanks. It would have been much better if you had related these concepts to S3 keys and buckets.
@viditmathur8437 3 years ago
What happens if the cluster manager goes down?
@prasenjitkundu7904 3 years ago
Do you know Captain America?
@ariellyrycs 4 years ago
Hey, how can I send you a dollar 💵? This is too much work. I have an interview coming up and I'm watching all your videos. Thank you
@TechDummiesNarendraL 4 years ago
Thanks, join the channel. You will find the Join button on the channel page!
@sumonmal009 3 years ago
Solution 20:28
@gijduvon6379 2 years ago
I think no one today uses spinning disks in production, at least in new projects. SSDs are not as costly as they used to be.
@nalamda3682 1 year ago
Why not zip?
@zuowang5185 15 days ago
Is this a mid-level answer?
@MohanRaj-vp1zt 3 years ago
A lot of content, but the language and presentation are quite poor. Because of that, the flow is broken multiple times. This really doesn't help in a 45-minute interview setting. The first major thing an interviewer would want to see is the REST API signatures of the different functionalities offered, for example upload_file.