System Design Interview - Distributed Message Queue

Views: 291,334

System Design Interview

1 day ago

Comments: 355
@stillmattwest 2 years ago
This was the first video I'd seen from this channel. This is some next-level system design content. Way more in-depth than other videos I've seen. Unfortunately, it doesn't look like there have been any recent uploads, which is really too bad.
@random-characters4162 1 year ago
Yeah! But the content is so dense! I mean, it's more than enough.
@debasishnayak5576 5 years ago
Normally I play videos in 1.5x or 2x. Your videos have so much information that I am afraid of losing some fundamentals if I play in 2x. Outstanding quality. Please keep making such videos.
@SystemDesignInterview 5 years ago
Thank you, Debasish. Really appreciate the feedback!
@vasilyvlasov3255 3 years ago
This is absolutely the best content on YouTube on the system design topic! No "scratching the surface" bullshit, but rather a very in-depth and concrete explanation of how to navigate successfully through system design interviews. Thank you, Mikhail, for your great efforts! Thank you very much, Mikhail! One thing bothers me though: there haven't been any recent updates on the channel. I'm pretty sure all the people here will appreciate and support it if Mikhail decides to continue his endeavours and upload new videos! Anyway, thanks a lot!
@rahulnath9655 1 year ago
this is the best video for systems design I've ever watched. I listened to it at 50% speed to write down every word whereas I usually watch YT vids at 1.5x -- each sentence was invaluable.
@suzi3245 1 year ago
Each word in this video is a golden word. Make sure you don't skip or neglect it. Thank you so much
@adityabahuguna6815 5 years ago
Appreciate your efforts on aggregating and delivering such quality content in such a lucid manner. I don't think there is better content than this anywhere on youtube especially for system design topics. Wow ! (Y)
@SystemDesignInterview 5 years ago
Thank you, Aditya. Appreciate your kind words.
@TheMsnitish 3 years ago
@Jordan Apollo SPAM Alert
@kumarchandan9685 3 years ago
Totally agree, top quality content
@jamesyin3220 2 years ago
This is the best content on system design I've ever seen. Please consider resuming the journey! We'd love to ride along!
@SudeeptaSood 1 year ago
can't thank you enough for this video. All of these components are building blocks and the interviewer can dig deep as to how the requests are handled from client to server. Awesome video
@jhnmn 5 years ago
I almost never comment on YouTube but this undoubtedly deserves an exception. Thank you for the superb quality content you’ve put together. I wouldn’t be surprised if this series becomes the de facto video resource for systems design and architecture interviews. Hope you keep uploading!
@SystemDesignInterview 5 years ago
Thank you, Johan, for all the kind words! More videos to come.
@jasonmeyer495 11 months ago
I wish this guy was still making these videos. By far the best of the system design interview content out there (I've watched them all, lol).
@ShivaSomapur 3 years ago
I don't think any other youtube video on System Design goes this deep into explanation. Thanks for your efforts in bringing these videos to all of us.
@jayendrasingh6580 3 years ago
These videos are the best resource among all I have gone through. I am surprised that this channel is not posting any more videos. Good work, and thank you!
@thunrou 4 years ago
Mikhail is the best. His system design videos are systematic and top notch with respect to quality and crispness. Presenting how to finish a complete design in 35 to 40 minutes is a great feat, and it is the only way for us. Prior to watching these videos, my ideas and presentation were messy and disorganized, but these videos gave me a basic sense of what is expected in system design and how to approach it. I take notes while watching the videos and try to apply these principles to other interview questions. This is greatly helping with my preparation. Can't thank you enough, Mikhail; you are one of my best teachers.
@SystemDesignInterview 4 years ago
Glad to hear that, thunrou! Thanks for sharing!
@jackson1012-x8e 4 years ago
This is the best distributed message queue system design I have seen so far. Many good concepts are introduced and summarized. I find it very helpful to use it as a guideline and read other documents, such as the AWS SQS documentation, for more details. Looking forward to more content in the future.
@SystemDesignInterview 4 years ago
Hi Jianan. Thank you a lot for the feedback!
@yooos3 1 year ago
wow! This is so to the point and even the duration of the video is as good as an actual interview discussion! Touched so many "must know" topics!
@deephorse6110 3 years ago
Thank you for summarizing precisely about what can be covered in a 40 minute time limit. Knowledge is one part which is built over learning and experience. Your video really helps to focus on structuring and expressing the knowledge in a coherent manner. Thank you.
@Dragoon77 5 years ago
I've watched the whole series already, thanks for the great quality content! Looking forward to more.
@SystemDesignInterview 5 years ago
Thank you Dragoon77! Working on more videos.
@gnanyreddy3030 20 days ago
This is really a goldmine to learn system design from scratch
@akhtarmnnit 5 years ago
I have seen a bunch of youtubers for system design interview...this one is one of the better ones...good way of using graphics while talking instead of mundane approach of heavy talking and using a whiteboard....Great job buddy...I am gonna explore all your video now
@vcfirefox 2 years ago
I bought your course. Arguably the best investment I made for system design courses so far. Thanks for putting together contents and explaining them so lucidly. I look forward to other two modules.
@sonamsawhney3428 2 years ago
the course isn't available for new users :(
@nehanigam4997 10 months ago
where is the course link?
@mukesh_srivastav 4 years ago
It is full of quality content, It took me around 2-3 hours to completely watch along with preparing notes for this video. And 9 pages of notes it is, initially I thought it would be in 3-4 pages only. It's full of rich content, that I had to note everything down. Thank you. I am looking for a better way to prepare for System Design questions. With this speed, I am not sure how much time It will take for me.
@SystemDesignInterview 4 years ago
Hi Mukesh. You are right, preparation for a system design interview is a lengthy process. And it greatly depends on the level you plan to apply for and your background. Just take one step at a time. First, try to learn about concepts without going too deep into each one of them. Otherwise, you will find yourself traversing a very deep graph of various interconnected concepts. Second, go deeper with every next iteration, if time allows. Third, use your daily work as a constant source of knowledge. Ask yourself how things in your project/application work. And do not stop asking yourself until you get a pretty solid understanding of a particular feature. It is a big and very good question you've raised. I should probably create a video on how to prepare, depending on how much time one has.
@mukesh_srivastav 4 years ago
@@SystemDesignInterview Thank you so much, Mikhail. I will follow these tips. Really informative.
@HellSamael 3 years ago
I have been following some system design channels on youtube and this one is by far the best. Well structured, it discusses tradeoffs, solutions are clear.... In one word, AWESOME! Thank you very much.
@mputcha 5 years ago
Just the right amount of detail. Surprised it has fewer views. Shows view count is not very reliable. Thank you for the great effort
@SystemDesignInterview 5 years ago
Appreciate the feedback, Manju! Thanks.
@niko79542 5 years ago
Hands down best System Design Channel on youtube. I cannot wait for more videos!
@SystemDesignInterview 5 years ago
Thank you, Niko, for the feedback. Much appreciated! Working on a new video right now.
@ron234ghi 2 months ago
I'm surprised I've not found this sooner. One of the best resources for System Design. Sadly, I think Mikhail has stopped making the content. I hope you read this and do make more awesome content! Subscribing in case you do!
@pramodsingh4668 5 years ago
I love the way you dive in every component one by one.
@SystemDesignInterview 5 years ago
Thank you for the feedback, Pramod!
@rishabhjain2404 4 years ago
thank you for working on the subtitles, makes it easier to consume your good content
@SystemDesignInterview 4 years ago
Sure, Rishabh! Thanks for letting me know that subtitles help. Every next video will have subtitles as well.
@rishabhjain2404 4 years ago
Thanks Mikhail. You have excellent english fluency. I am just used to different pronunciations of certain words.
@akshitg 4 years ago
The quality of the videos that you make is really great. Please continue making such videos.
@SystemDesignInterview 4 years ago
Glad you liked it, Akshit. Thanks for sharing!
@mostinho7 1 year ago
5:10 Load balancer availability/single point of failure: we can assign multiple DNS A records (multiple IPs) to our domain when we want multiple load balancers.
8:00 Use cases for the frontend service.
9:15 Request deduplication to achieve exactly-once or at-most-once semantics (the ack response could be lost on its way back to the user, so the user will retry, and we don't want to process duplicate requests that we have already handled).
12:00 Reusable components for other system designs.
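The request deduplication noted at 9:15 can be sketched roughly as below. This is only an illustration under assumptions that are not in the video: the client attaches a unique request ID to every call, the frontend host keeps recently seen IDs in local memory, and the `Deduplicator` class, its `handle` method, and the TTL value are hypothetical names chosen for this sketch.

```python
import time


class Deduplicator:
    """Remembers recently seen request IDs so client retries are not processed twice.

    Hypothetical helper for illustration; a real frontend would likely bound
    the cache size and share it across hosts.
    """

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self.seen = {}  # request_id -> (expiry_time, cached_response)

    def handle(self, request_id, process):
        now = self.clock()
        # Forget entries whose retention window has passed.
        self.seen = {k: v for k, v in self.seen.items() if v[0] > now}
        if request_id in self.seen:
            # A retry of a request we already processed: replay the old response.
            return self.seen[request_id][1]
        response = process()
        self.seen[request_id] = (now + self.ttl, response)
        return response
```

If the acknowledgement is lost and the client retries with the same request ID, `handle` returns the cached response without calling `process` again, so the message is stored once.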
@harjos78 4 years ago
This is pure GEM!.. Amazing crystal clear explanation on Distributed computing concepts. Best tutorial till date on whole of youtube for system design prep material... Great work
@MrThepratik 5 years ago
This is by far the best curated content on system design. Wish I had come here before. Keep up the good work.
@SystemDesignInterview 5 years ago
Hi Pratik. Thank you for the feedback. And welcome to the channel!
@hyunminkim3315 5 years ago
Very thorough! Really appreciate your hard work. I can tell your channel will become huge for engineering resources.
@SystemDesignInterview 5 years ago
Thank you, Hyunmin. Appreciate your feedback and words of encouragement!
@harshdubey9951 4 years ago
Hi Mikhail, excellent video in the system design course series. Very nicely presented and explained. One of the best channels.
@SystemDesignInterview 4 years ago
Thank you, Harsh, for the kind words! And for being active on the channel (actively commenting)! Much appreciated.
@victormartins-software3912 4 years ago
I can’t thank you enough, I was really struggling to grasp these topics and your explanations really helped me put it all together 🙏 excellent work!
@digitalkind 5 years ago
13:17 Strictly speaking, using a database as a backend message store is a valid option (because database = storage engine + high-level type system). The problem is that the FIFO message ordering semantics usually expected from queues are not easily achievable on top of off-the-shelf databases, because the latter are usually designed for different query/update patterns than are common for queues. But some databases can handle this custom case perfectly, and some queues (Kafka) even position themselves as databases.
@junfu8695 3 years ago
A MySQL auto-increment ID could be an option for implementing a FIFO queue. The queue name could be a secondary index, and the auto-increment ID the primary key.
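The auto-increment idea above can be tried out with a small sketch. Note the substitutions: SQLite (from the Python standard library) stands in for MySQL so the example runs without a server, and the table name, index name, and the `open_store`, `send_message`, and `receive_message` functions are all hypothetical. It is single-process only and does no row locking for concurrent consumers, which a real deployment would need.

```python
import sqlite3


def open_store():
    """Create an in-memory message table; the auto-increment id gives insertion order."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " queue_name TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    # Secondary index on the queue name, so the oldest message per queue is cheap to find.
    conn.execute("CREATE INDEX idx_queue ON messages (queue_name, id)")
    return conn


def send_message(conn, queue_name, body):
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO messages (queue_name, body) VALUES (?, ?)",
            (queue_name, body),
        )


def receive_message(conn, queue_name):
    """Return and delete the oldest message of the queue, or None if it is empty."""
    with conn:
        row = conn.execute(
            "SELECT id, body FROM messages WHERE queue_name = ? ORDER BY id LIMIT 1",
            (queue_name,),
        ).fetchone()
        if row is None:
            return None
        conn.execute("DELETE FROM messages WHERE id = ?", (row[0],))
        return row[1]
```

Messages of one queue come back in the order they were inserted, because the lowest remaining id for that queue name is always read first.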
@YashRaithatha1989 5 years ago
Just awesome ! Your approach to problem solving is very generic. Really liked it and keep posting such fantastic system design interview questions. This is the best material i have seen till date on the topic. Thanks a lot.
@SystemDesignInterview 5 years ago
Thank you for the feedback, Yash!
@bhaveshssharma8826 4 years ago
Best course available on the Internet.
@donaldbough3445 2 years ago
Amazing video series that goes beyond high level fluff, thank you so much!
@chiragr1336 4 years ago
Thanks a lot @Mikhail. Your videos are so fun and easy to watch. I feel it's one of the best specifically for system design and you sound like some Russian Pro coder to top it;) I request you to make a video about all the possible components (load balancers, CDN, etc) in a system design interview that will ever be used, because you keep using a few different components for different problems. If we get to know all the components, then we too can arrive at a better solution. Thanks for the great content and keep creating new videos! 👍
@SystemDesignInterview 4 years ago
Thank you for the kind words, Chirag! I have been thinking about the same for a while. And there are ideas how to address this. Just need to find more spare time to realize all these ideas ((
@TyzFix 3 years ago
I am expecting you write a SD book that gives us the same amount of useful information as here. outstanding job!
@culsumu 2 years ago
Your videos are Superb ! Most useful videos on System design. Please start making more videos like this ! More on each component details which helps in System design :) !
@deathbombs 2 years ago
17:08 To clarify option 2: each queue basically belongs to a cluster in this case (each cluster contains a set of queues). Instances are the replicas. Interestingly, replication for in-memory hosts/instances is handled similarly to NoSQL nodes.
@jackieh2195 5 years ago
Please keep uploading! This is great! Thanks
@SystemDesignInterview 5 years ago
Sure, we will. Thank you!
@riteshpatel16 4 years ago
Thank you for all the hard work and such a great explanation of complicated topics. This is way superior to other paid content. I would love to see video on API design that covers how and what. For example, question that say design an API to upload photos from iOS, how do you go about it? What are the good characteristics of an API? What are key components you need to think about while designing an API and so on.
@farslan 3 years ago
This video is great quality. I think sequential writes should have been suggested in the video. This seems to me the best way to achieve high throughput.
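The sequential-write idea raised above can be illustrated with an append-only log, where every message is written at the end of a file and never updated in place. This is a minimal sketch and not something the video describes: the `AppendOnlyLog` class, the length-prefixed record format, and the fsync-per-message policy are assumptions made for the example.

```python
import os
import struct


class AppendOnlyLog:
    """Stores messages as length-prefixed records appended to a single file."""

    def __init__(self, path):
        self.path = path
        self.f = open(path, "ab")  # append mode: every write goes to the end
        self.size = os.path.getsize(path)

    def append(self, payload):
        """Write one record and return the byte offset where it starts."""
        offset = self.size
        record = struct.pack(">I", len(payload)) + payload  # 4-byte big-endian length
        self.f.write(record)
        self.f.flush()
        os.fsync(self.f.fileno())  # durability per message; batching would be faster
        self.size += len(record)
        return offset

    def read_all(self):
        """Yield payloads in the order they were appended."""
        with open(self.path, "rb") as f:
            while True:
                header = f.read(4)
                if len(header) < 4:
                    break
                (length,) = struct.unpack(">I", header)
                yield f.read(length)
```

Because writes only ever extend the file, the disk sees a sequential access pattern, and the returned offset can serve as a message position for consumers.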
@saumittrasaxena2877 4 years ago
This is quality content. Really appreciate your efforts.
@BitsnBytes8 2 years ago
Very good content. Enjoyed going through the video. Thank you. Hope you continue this series
@austinkim7804 1 year ago
Finally finished going through your videos. Thanks so much!
@swapniljain3459 4 years ago
Great work and Explanation . Thanks a lot. This is the best explanation and walk through to prepare for a System design interview.
@SystemDesignInterview 4 years ago
Glad you liked it, Swapnil! Thanks for the feedback!
@rahuljain070817 4 years ago
One of the best system design videos I have watched. Awaiting more videos.
@SystemDesignInterview 4 years ago
More to come! Thank you, Rahul. Appreciate the kind words.
@sonty76 3 years ago
Really great content and well paced delivery. Waiting for more content.
@reyazahmed9320 5 years ago
Great content. Thanks a lot. Just one feedback: Would have been great had there been subtitles as I find a bit difficult to get the words.
@SystemDesignInterview 5 years ago
Hi Reyaz. Thank you for the feedback! Point taken, I will try to add subtitles relatively soon.
@harshdubey9951 4 years ago
Hi Reyaz, you can use the built-in subtitles feature provided by the YouTube player. Click on the icon labelled CC while playing the video. Hope it helps until Mikhail provides the subtitles. Thanks
@ankitphophalia9849 4 years ago
Just awesome...your approach to solving a system design is amazing. Great content. Thank a lot for your efforts.
@SystemDesignInterview 4 years ago
Thank you, Ankit, for the feedback! Glad you liked the video!
@leoxiaoyanqu 3 years ago
Thanks a lot for your videos! Very helpful! I wonder if it's possible for you to have a mock interview video (e.g. you're on the interviewee side), covering things like what tools/apps would you use for real-world SDIs for better productivity, etc.
@wangjiprc 5 years ago
best system design video I had watched.
@SystemDesignInterview 5 years ago
Glad to know that!
@madhukarm1319 3 years ago
Thank you! This covers a lot of background. One thing I feel should have been covered is how strict message ordering is achieved across partitions.
@jamess5330 2 years ago
Very helpful! Another super effective way to prepare system design interviews: Do mock interviews with FAANG engineers at Meetapro.
@omfromam 5 years ago
Great content! Thank you for making this. One thing that seems unclear (at least to me): at around 15:00-17:00 you show the flow of the message from the FE to the backend node and to the receiver. It is unclear how you separate persisted information between the MS (database) and the in-cluster manager (ZooKeeper). Both seem to store the mapping between queue name and leader host. Do you really need to store this information in two places? How are they synchronized? Why would you need to keep this info in the MS in the first place? Isn't ZooKeeper enough for queue-to-node mapping?
@SystemDesignInterview 5 years ago
Both options are fine: 1. When we store mapping in the Metadata service. Zookeeper is used for leader election and leader monitoring. And if leader changes, this information is propagated to the Metadata service. 2. When we store mapping in Zookeeper. Zookeeper is highly optimized for reads. Anyway, this information is stored in one place. So that we avoid any synchronization between configuration storages and have a single source of truth.
@BitsnBytes8 2 years ago
@@SystemDesignInterview Thank you for answering this question. I had the same doubt when going through the material.
@ThePrabhu1990 5 years ago
This is excellent! Looking for more such videos.
@SystemDesignInterview 5 years ago
Thank you. We are working on more videos. Feel free to subscribe to stay tuned.
@ashwinkumar2126 4 years ago
Really comprehensive coverage of the topic! Although, would've loved to see more discussion on Asynchronous out-cluster replication. It's tricky to design, eg. what happens when you receive a get request while the data hasn't fully replicated across hosts in a cluster? can we hit all hosts in a cluster? what happens if we receive a get request when the message deletion replication is in progress etc.
@ryany1111 4 years ago
Thank you so much for your videos. I have watched all videos in your channel. Waiting for your distributed database video. Hopefully you are still running this channel
@jmitesh01 5 years ago
Summary (notes):
1. Problem statement: a producer sends data and exactly one of the consumers gets the data.
2. Resolve ambiguity in the problem statement by asking questions about scale, priority, and so on.
3. Focus on the core set of requirements: sendMessage(messageBody), receiveMessage().
4. SLA numbers for the non-functional requirements.
5. Components: LB, control plane (Metadata Service), data plane 1 (Frontend), data plane 2 (Backend).
6. FE: required cross-cutting concerns such as billing and throttling and, most importantly, routing to the Backend, since the Backend is stateful.
7. Metadata Service: caching layer for routing information and metadata (high consistency required in case of very few writes, R/W ratio).
8. Backend Service: API handling layer, storage, and so on. Since the Backend has to be HA and fault tolerant, it requires a consensus service like ZooKeeper, or an in-cluster and out-cluster management strategy.
----
Extend the above design of queue creation with queue deletion, message deletion, message replication, delivery semantics (exactly-once delivery is not supported because it requires 2PC), pull vs. push messages, security, and monitoring.
---
Scalability bottlenecks, use-case extensibility, and use cases supported/limitations?
@SystemDesignInterview 5 years ago
Hi Mitesh. Concise and still captures the key ideas. Great summary.
@IDONTKNOWWHATTOKEEP 4 years ago
Really great content. Please keep uploading more videos!
@SystemDesignInterview 4 years ago
Thank you, Narain. I surely will.
@yuankailiu614 3 years ago
Really enjoyed the video, I wish I could upvote multiple times. One thing missing is how to persist messages to the file system.
@SchartzRehan 5 years ago
This is crisp and clear. Many thanks.
@SystemDesignInterview 5 years ago
Thank you for the feedback.
@ky5069 4 years ago
Regarding how the front end service finds the leader backend nodes, you mention that this discovery would be done via metadata service. But in the in-cluster method, we actually have that information in the coordinator service (zookeeper). In this case, would the metadata service just be a thin wrapper for the coordinator service (in case of backend node discovery)? Thank you so much for sharing these videos Mikhail. (Also, I love that you mention several times that the interviewer is there to help us, I find it delightful to have that perspective, and definitely helps during the interview)
@SystemDesignInterview 4 years ago
Hi Suharto, Thank you very much for the feedback! Regarding your question, we can use either Metadata service or Zookeeper itself for storing and retrieving information about leaders. Please take a look at my answer here: kzbin.info/www/bejne/n3uvfWCBhdZ1pq8&lc=UgxAE6YfMUj95phbLid4AaABAg.90QChp-3ylO93AokcEr3Bu
@utsavmathur1478 3 years ago
This is amazing, thank you so much! Very detailed explanation and exactly what I was looking for.
@HarpreetKaur-oj5eg 10 months ago
Brilliant content! Please start uploading again!
@ukpauchechi 3 months ago
Great video, very in depth. I have a question concerning option B for the backend service architecture (small cluster of independent hosts). The frontend calls the metadata service to find out the instance responsible for the queue, yet calls a random instance. If it's going to call a random instance, why call the metadata service?
@xiaomengwu3399 5 years ago
Thank you for your great work! Please keep it up with more great content! :)
@SystemDesignInterview 5 years ago
Thank you, Xiaomeng, for the feedback! More content to come.
@siddharthmanumusic 4 years ago
Such a great video with so much information! Many thanks!
@gemtyler8258 4 years ago
keep up the good work! please upload more system design videos!
@babadun36 5 years ago
This is more than just a distributed MQ. The video covers the fundamental approaches in modern data-intensive distributed systems.
@SystemDesignInterview 5 years ago
Hi Ivan. Thank you for the feedback!
@deathbombs 2 years ago
15:20 Why was leader-follower option A? Why not use a load balancer? Why not use sharding to pick a queue, or have all machines be equal nodes and pick a random one? Summary of options A and B: A uses a leader for cleanup and replication. In B, instances are equal, but something is still needed to manage each instance.
@vANvTO 4 years ago
Great video, thanks for doing these. Can you please explain why we need an FE component? Initially I thought the task was to design a distributed message queue, the kind used by backend services, like the one that you had drawn in your other video: System Design Interview: Step by Step Guide. Or is this a question about how to design a chat application? And my last question: can we combine the VIP with the load balancer into one component, the API gateway? Thanks!
@alexbordon8886 3 years ago
I have the same confusion. Don't know why the front-end is needed.
@akankshamahajan9709 9 months ago
Wowww!!! These videos really helped me to prepare for my SD interview. Is there anything similar for ML System Design interviews?
@YeteshChaudhary 5 years ago
It would be helpful if you also gave a brief intro to RabbitMQ, Kafka and Kinesis/SQS. You talked about SQS briefly, and that was really appreciated!
@SystemDesignInterview 5 years ago
Hi Yetesh. Sounds like a topic of its own. Let me add this to the TODO list.
@engineerv3248 3 years ago
Thank you for the great quality content. Excellent explanation; I can't miss a single word. Why has the series stopped? Are there any other resources you would recommend?
@bhavyabansal1143 2 years ago
any reason that we don't have more content uploaded here? is the author busy?
@ruhinapatel6530 2 years ago
Please make more videos..ur videos are gem
@abhishekaggarwal9774 4 years ago
Thank you so much.. I've an upcoming interview in a couple of weeks and I'm totally confused about where to learn System Design. I came to your videos and now I'm thinking of listening to your videos multiple times so that this content fits in my brain. I've one question. Shouldn't Metadata Service and Metadata DB be connected to Backend rather than Frontend? Also, I'd appreciate it if you could upload a few more top design interview videos like Design Whatsapp/Netflix etc. Also, I really like the idea of you doing mock interviews and uploading them so that we can learn from the mistakes.
@SystemDesignInterview 4 years ago
Hi Abhishek. Thank you for the feedback! Much appreciated! First of all, let me start by wishing you luck at your interviews. Second, please read the following thread, it should provide more understanding about Metadata service purpose: kzbin.info/www/bejne/n3uvfWCBhdZ1pq8&lc=UgwNZ5mE3o8fFV_yY214AaABAg.8zdMxmBN3of9-4ktitHhKv Third, I have topics you mentioned in my TODO list. And yes, mock interviews sounds like a great idea. My only problem is to find time for all this ))
@chickentikkasauce1301 5 years ago
For monitoring, it'd be helpful to monitor the size of the queue. Also the number of messages getting queued or dequeued for each host at the queue level.
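The monitoring suggestion above can be sketched as a pair of counters kept per host and per queue. Everything named here (`QueueMetrics`, `on_enqueue`, `on_dequeue`, `depth`) is hypothetical and only meant to show the bookkeeping; a real system would export these values to a metrics service rather than hold them in one process.

```python
from collections import defaultdict


class QueueMetrics:
    """Counts enqueued and dequeued messages per (host, queue) pair."""

    def __init__(self):
        self.enqueued = defaultdict(int)
        self.dequeued = defaultdict(int)

    def on_enqueue(self, host, queue):
        self.enqueued[(host, queue)] += 1

    def on_dequeue(self, host, queue):
        self.dequeued[(host, queue)] += 1

    def depth(self, queue):
        """Approximate queue size: messages enqueued minus dequeued, summed over hosts."""
        enq = sum(n for (_, q), n in self.enqueued.items() if q == queue)
        deq = sum(n for (_, q), n in self.dequeued.items() if q == queue)
        return enq - deq
```

The per-host counters answer the "queued or dequeued for each host" question directly, while `depth` aggregates them into a queue size that can be alarmed on.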
@SystemDesignInterview 5 years ago
Good points!
@jmka1222 5 years ago
A few questions:
- At 10:12 you say "read when mesg arrives, write only when queue created". Why? Wouldn't you write every message arriving to the cache?
- Is the metadata service responsible for persisting data to the DB, other than being used as a cache? If so, why is the backend service also doing the same?
- When you say distributed queue, do you mean that a queue for communicating between a single producer and a single consumer resides on several machines, or that there are several "single producer-consumer" connections on several machines? If the former, wouldn't one machine be sufficient?
@jmitesh01 5 years ago
Hi Jm Ka, 1. The Metadata service is used for storing meta information, such as the mapping of queues to backend service hosts, so only when we create a new queue do we need to add that info to the Metadata service's persistent storage and cache. 2. The Backend service stores the actual messages, and based on the requirements we may cache frequently accessed queues in the Metadata service.
@SystemDesignInterview 5 years ago
Hi Jm Ka, Hopefully Mitesh's answer helped to clarify what is stored where. Metadata service for metadata only and Backend service for messages. As for your last question, can you please clarify what you mean by "several single produce-consumer connections"? Each message lives on several machines. To achieve high availability. Simply speaking, we want to make sure that messages are not lost if a single machine crashes. Every time we store a published message, we replicate it. So, we always have several copies of the same data.
@jmka1222 5 years ago
@@SystemDesignInterview Hi, Mitesh's post doesn't answer anything, it's simply reiterates stuff he heard you say. 1) was how at 10:12 for metadata service you say "read when mesg arrives, write only when queue created". Is metadata service (or cache) storing the whole queue including messages? Or is it storing only which queues go to which consumers? Either way, when a message arrives, then too it'd need to be cached in metadata service, so there has to be a write. In that case, saying ""read when mesg arrives, write only when queue created" would be wrong 2) Is metadata service only a cache or can front-end service persist messages without going through metadata service? backend service persists messages, but it looks like in your description metadata service is doing the same too? 3) by the term "distributed queue", one could mean several things. A) you can have a distributed queue that's conceptually a single queue for only 1 producer that's generating messages for 10 consumers, but the queue is replicated and sharded on several machines for availability concerns. B) you could have a distributed queue that's multiple queues for 5 different producers, each one catering to 10 consumers (total of 50 consumers). this queue can also be called distributed. C) a distributed queue that's multiple queues for 5 different producers, each one catering to only 1 consumer (total of 5 consumers). By "several single produce-consumer connections" I meant case (C). Which one of A or B or C you meant by "distributed queue"?
@SystemDesignInterview 5 years ago
Hi Jm ka, 1. We write to Metadata service only when we create/update queue metadata. We do not store messages in Metadata service. When a message arrives, we make a call to Metadata service to get details about the queue. E.g. we may retrieve the message size limit value and check whether the just-arrived message exceeds the max size or not. Messages are only stored on back-end service machines. Please let me know what part of the video confused you and made you think we store messages in the Metadata store. 2. When a message arrives, front-end service needs to pick a back-end machine for storing the message. Same is true for retrieving a message, front-end needs to forward the receive message request to a machine that stores messages for the requested queue. So, front-end needs to get this information from somewhere. From some persistent storage. Calling the database directly is not great in this case, as there may be too many calls. This may be both slow and expensive. Metadata service helps to avoid direct calls to the database, by storing queue metadata information in memory. 3. Distributed simply means messages for the same queue are replicated and stored across several machines. There may be multiple producers and multiple consumers. Only one consumer gets the message. Let me know if you have other questions.
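The role of the Metadata service described in the reply above (serve queue metadata from memory, fall back to the database only on a miss, and use the metadata to validate and route a message) can be sketched as follows. The class and function names, the `max_message_bytes` and `leader_host` fields, and the loader callback are assumptions for illustration and do not come from the video.

```python
class MessageTooLarge(Exception):
    """Raised when a message exceeds the queue's configured size limit."""


class MetadataCache:
    """Read-through cache: queue metadata is loaded from the database once, then served from memory."""

    def __init__(self, load_from_db):
        self._load = load_from_db  # hypothetical callback: queue name -> metadata dict
        self._cache = {}

    def get(self, queue_name):
        if queue_name not in self._cache:
            self._cache[queue_name] = self._load(queue_name)
        return self._cache[queue_name]

    def invalidate(self, queue_name):
        # Called on the rare writes (queue created or updated).
        self._cache.pop(queue_name, None)


def check_message(cache, queue_name, body):
    """Validate the message size and return the back-end host to forward it to."""
    meta = cache.get(queue_name)
    if len(body) > meta["max_message_bytes"]:
        raise MessageTooLarge(queue_name)
    return meta["leader_host"]
```

Many messages for the same queue trigger only one database read, which matches the read-heavy, rarely written access pattern described in the reply.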
@xipan5344 4 years ago
@SystemDesignInterview Hi, does that mean the in-cluster (Zookeeper) and out-cluster mapping information is retrieved from the metadata service?
@chickentikkasauce1301 5 years ago
Overall great video. Speaker is very knowledgeable.
@SystemDesignInterview 5 years ago
Thank you, Chicken Tikka Sauce, for all your comments on this and other videos. Much appreciated!
@mohd.tahauddin9001 3 days ago
This is a great video; however, I would like to point out a flaw in the design. In the backend service, the only viable topology is single-leader replication. Leaderless replication will not really work. The reason is that in a queue the FIFO order is important, i.e., no matter which server the data is read from, all servers must unanimously agree on the position of each item in the queue. In a leaderless replication protocol you can use vector clocks, but such a system cannot handle concurrent writes. In a database, concurrent writes can be stored as siblings and conflict-resolved later by the client; in a queue, this is unacceptable from the client's point of view: if I queue something, I want it queued without conflict. To achieve this, the design is forced to fall back to single-leader replication, which can enforce total order broadcast and use log replication to guarantee that all follower servers enqueue each item in the same position. After that, no matter which server the information is read from (leader or follower), there is a consistent view of the queue. We can use Zookeeper or another coordination service for leader election when the leader crashes, to improve availability.
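The argument above can be illustrated with a minimal in-process sketch, assuming synchronous replication and no failures. `Replica` and `Leader` are hypothetical names, and this is not the design from the video: the leader is the only node that assigns positions, and each follower rejects any entry that does not match its next expected position, so every log ends up in the same order.

```python
class Replica:
    """Holds an ordered log; entries must be applied at consecutive positions."""

    def __init__(self):
        self.log = []

    def apply(self, seq, item):
        # Refuse gaps and duplicates so every replica keeps the same order.
        if seq != len(self.log):
            raise ValueError(f"expected position {len(self.log)}, got {seq}")
        self.log.append(item)


class Leader(Replica):
    """The single writer: assigns each item its position, then replicates it."""

    def __init__(self, followers):
        super().__init__()
        self.followers = followers

    def enqueue(self, item):
        seq = len(self.log)            # only the leader chooses positions
        self.apply(seq, item)
        for follower in self.followers:
            follower.apply(seq, item)  # synchronous, failure-free replication
        return seq
```

A real system would also need acknowledgements, retries, and leader election on failure (for example through Zookeeper, as the comment suggests); those parts are left out here.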
@atabhatti6010 1 year ago
Thanks for the great content. Can you please talk about which choice of architecture (1, 2 or 3) you would make for the Metadata Service? And why? And please explain your comment that the datastore for MS does not need to be strongly consistent?
@onePunch95 3 years ago
Great content! I have some confusion regarding queue identification.
1. In the API definition we are only sending the message, so when the first-ever message comes, how does that message get mapped to a queue number? For example, the slides say a sendMessage(msg) comes for queue id = 1; how does the sender know the queue id? Similarly, when the receiveMessage() API is called, how does the receiver know which queue to get the message from? Also, there are several messages in the queue, so how do we decide which message it receives? Let's say that when the first message comes, the backend stores the data, takes care of replication, and then writes the mapping to the DB. How is this information propagated to the receiver that wants the message? How does it get to know the queue id?
2. In the table shown for in-cluster management, for qid 1 the leader is A and the followers are C and B. But if the queue is distributed over nodes, how can there be just one leader node, A? Doesn't that mean we are storing the entire queue 1 in A, and the copies in the followers?
@AshishNegi1618 2 years ago
1a. The message should contain the queue id.
1b. The API should be queue.ReceiveMessage(). The queue object knows its queue_id and either sends it in every poll request or ties it to the TCP/gRPC/WebSocket connection.
1c. Messages are received from the queue in roughly FIFO order. So the client sends the last message id or sequence number, and the server sends the message with sequence number + 1.
1d. The client knows the queue name, and that should be enough to get the queue id. It can either be a hash of "queue_name", or the client can ask the Frontend service for the queue id for a given queue name.
2. A distributed queue does not mean partial data on different nodes. It means a full copy of the data on all nodes. Only one node can accept writes at a time; this node is called the primary node. This is done so that even if one machine goes down forever, a full copy of the data is available on the other machines. This gives high availability and durability in case of failures.
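Points 1c and 1d above could look roughly like the sketch below. It assumes the queue id is derived from a SHA-256 hash of the queue name and that consumers track their own position; `queue_id_for`, `MessageLog`, and `read_after` are hypothetical names chosen for illustration, not an API from the video.

```python
import hashlib


def queue_id_for(queue_name, num_ids=2**32):
    """Derive a stable numeric id from the queue name (point 1d)."""
    digest = hashlib.sha256(queue_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_ids


class MessageLog:
    """Messages kept in arrival order; the position doubles as sequence number."""

    def __init__(self):
        self._messages = []

    def append(self, body):
        self._messages.append(body)
        return len(self._messages) - 1     # sequence number of the new message

    def read_after(self, last_seq):
        """Return (seq, body) for the message after last_seq, or None (point 1c)."""
        next_seq = last_seq + 1
        if 0 <= next_seq < len(self._messages):
            return next_seq, self._messages[next_seq]
        return None
```

A consumer that has seen nothing yet would call `read_after(-1)`. Hashing avoids a lookup call but makes renaming a queue change its id, which is one reason a lookup through the Frontend service may be preferred.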
@lch04thu 4 years ago
Great video. I'm not sure why we didn't consider quorum based replication for the backend service?
@SystemDesignInterview 4 years ago
Hi Chenhao. You are right, it is possible to use quorum-based replication. Thanks for bringing this topic up.
@ruslanda7690 5 years ago
This is awesome!! Thank you!
@SystemDesignInterview 5 years ago
Glad to hear that! Thanks.
@sudharshannd3497 1 year ago
Great video, but I think it should cover even more low-level details on how messages are stored in memory and retrieved using offset/invisible flag.
@ErwinDSouza 1 year ago
Thanks for the video! I wish there was more of a deep-dive into where to store the data - you said memory + file system, but how exactly? is it an async process? cheers!
@ChandraSekhar-zu9nw 5 years ago
Thanks for the great video. Just a question below: If we have multiple consumers, say, application deployed in a cluster and assuming consumers poll for new messages, how do we ensure that only one instance of them gets the message? Do we need to have some distributed lock on the message so that only one consumer would get it?
@SystemDesignInterview 5 years ago
Hi Chandra,
In the case of a pull-based consumer (like the one described in the video) we do need a mechanism to "lock" a message, so that it is not available to other consumers. One option is to do something similar to what AWS SQS does: docs.aws.amazon.com/AWSSimpleQueueService/latest/SQSDeveloperGuide/sqs-visibility-timeout.html
We could also implement a push-based mechanism, where the queue service itself is responsible for sending a message to one of the subscribed consumers (please check the video about the Notification Service; a similar idea is described there, but for a pub/sub use case rather than a queue). In this scenario the queue ensures that only one consumer gets the message by sending it to a single consumer from the list (e.g. using a round-robin algorithm).
Please let me know if more details are needed.
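The "lock" described above can be sketched as a visibility timeout: a received message is hidden from other consumers for a fixed period and reappears if it is not deleted in time. This is a minimal single-process illustration, not how SQS is implemented; `VisibilityQueue`, its methods, and the 30-second value are assumptions made for the example, and a real service would also need synchronization between concurrent consumers.

```python
import time


class VisibilityQueue:
    """In-memory queue where a received message is hidden until a timeout."""

    def __init__(self, visibility_timeout=30.0, clock=time.monotonic):
        self._timeout = visibility_timeout
        self._clock = clock
        self._messages = []   # each entry: [message_id, body, visible_at]
        self._next_id = 0

    def send(self, body):
        message_id = self._next_id
        self._next_id += 1
        # A new message is visible immediately.
        self._messages.append([message_id, body, self._clock()])
        return message_id

    def receive(self):
        """Return the oldest visible message and hide it from other callers."""
        now = self._clock()
        for entry in self._messages:
            if entry[2] <= now:
                entry[2] = now + self._timeout   # the "lock"
                return entry[0], entry[1]
        return None

    def delete(self, message_id):
        """Called by the consumer after successful processing."""
        before = len(self._messages)
        self._messages = [e for e in self._messages if e[0] != message_id]
        return len(self._messages) < before
```

If a consumer crashes before calling `delete`, the message becomes visible again after the timeout and another consumer can pick it up, so processing should be idempotent.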
@kavehshirgir6107 3 years ago
Excellent descriptions and problem solving. Please share more. One thing about TLS termination: I don't think it's common practice to do it on the service instance. It's usually done on the load balancer; for example, AWS load balancers can be configured as TLS terminators. Would you be able to elaborate more? Thanks
@suchismitagoswami5609 3 years ago
Really great content. I have one doubt. In the out-cluster management option, let's say we split each queue into multiple partitions across multiple clusters, and each partition is handled by a separate cluster, with the data replicated to all the nodes inside that cluster. What if an entire cluster goes down? How do we ensure the durability of the messages belonging to the partitions managed by that cluster? Please help me resolve this doubt.
@michaelzeltser1581 4 years ago
Awesome video. The only thing that was a bit confusing for me was the part about the in-cluster manager (Zookeeper): it wasn't clear whether Zookeeper is in fact the Metadata service or an additional component (alongside the Metadata service).
@lch04thu 4 years ago
Yeah I was confused too. I posted a comment below, happy to discuss more.
@SystemDesignInterview 4 years ago
Hi Michael. Thanks a lot for the feedback! Here I left my view on why separation makes sense for the out-cluster manager approach (multiple clusters): kzbin.info/www/bejne/n3uvfWCBhdZ1pq8&lc=UgwD2gh8ozvKz1wMqId4AaABAg.99gLjSiA1G79BLjjputJT6
For the in-cluster manager approach, we can indeed use a single component. The way Kafka uses Zookeeper proves this, as Zookeeper is "especially fast in "read-dominant" workloads" (from the official Zookeeper site). That said, I still prefer to separate these components for the in-cluster approach as well. There are many arguments in favor of one approach or the other.
@naveenkothamasu1073 5 years ago
When the frontend needs to decide which backend service host to call, you mentioned that it contacts the Metadata service. But I believe it should contact the in-cluster manager (ZK), as it has the authoritative information about which queue is owned by which backend host. Correct?
@SystemDesignInterview 5 years ago
Hi Naveen. Good question. Please check my answer here and let me know if more details are needed: kzbin.info/www/bejne/n3uvfWCBhdZ1pq8&lc=UgxAE6YfMUj95phbLid4AaABAg.90QChp-3ylO93AokcEr3Bu
@rajrishav 4 years ago
@Naveen I was also confused on this point. Mikhail himself answered it in one of the comments above:
Both options are fine:
1. We store the mapping in the Metadata service. Zookeeper is used for leader election and leader monitoring, and if the leader changes, this information is propagated to the Metadata service.
2. We store the mapping in Zookeeper. Zookeeper is highly optimized for reads.
Either way, this information is stored in one place, so that we avoid any synchronization between configuration storages and have a single source of truth.
Hope it helps.
@himangshuroy3571 2 years ago
Hi Mikhail and others,
Not sure whether you will see this, but I wanted to ask one thing. If I try to design Gmail using an architecture similar to the one outlined above:
1. Will the Metadata server store the time, sender, size, etc., while the backend store holds the actual mail?
2. Can the Metadata server (NoSQL with a cache/façade in front) be organized as a consistent hash ring, using the hashed user name as the primary key?
3. If 2 is correct, how do I display the most recent mails? It seems I need to sort the data stored on a node. When should that happen, and where and how should the result be stored?
4. If I sort by time and store the result in a distributed cache, and then want to sort by size, how can I do it? Will the Frontend service help with this? Does NoSQL allow this kind of query?
5. How will I know which backend storage holds a given mail? Is there a mapping between the Metadata server and the backend cluster?
Many thanks in advance.
@tejvepa8521 5 years ago
In the backend storage design, Option B, one node in the pool of backend hosts is responsible for distributing messages to the other hosts in the pool. The downside of this approach, as I see it, is that the client (the sending front end) has little idea how many hosts the message has actually been distributed to. What happens if the receive call selects a host that does not have the message? Wouldn't it make more sense to have a quorum-based read and write approach at the front-end service layer, where a send request goes to x hosts and waits for x acknowledgments, and a receive request goes to y hosts and waits for y responses? As long as x + y > total hosts in the pool, a consistent read is guaranteed. Of course, this assumes consistent reads and strict ordering are needed, which adds latency. I am also a little unclear on the role of the metadata service here. Specifically, what information does it store if there is a separate out-cluster manager holding the queue-to-cluster mapping?
@tejvepa8521 5 years ago
I guess it really depends on the semantics of the queue, e.g. whether FIFO is guaranteed.
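The quorum condition x + y > n from the comment above can be checked by brute force on a small pool. This is only a sketch of the overlap argument; the function names are hypothetical, and it says nothing about ordering or about how conflicting versions would be resolved.

```python
from itertools import combinations


def quorums_overlap(n, w, r):
    """The rule of thumb: write quorum plus read quorum must exceed the pool."""
    return w + r > n


def every_read_meets_every_write(n, w, r):
    """Brute force: does each possible read set share a host with each write set?"""
    hosts = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(hosts, w)
        for read_set in combinations(hosts, r)
    )
```

For a pool of 3 hosts, writing to 2 and reading from 2 always shares at least one host, while writing to 1 and reading from 2 can miss the only host that holds the message.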
@bryanenglish7841 3 years ago
Wow this is great stuff!
@dharmendrabhojwani 5 years ago
Why isn't this guy making more videos? We should in fact put in some money from our own pockets and pay him to make videos like this, rather than each spending money individually on courses.
@SystemDesignInterview 5 years ago
Hi Dharmendra. I am working on a new video right now. I took a big topic this time. Plus, was on vacation for a couple of weeks. So, a little bit behind. Hopefully, you will like the upcoming video. It covers many concepts required for a successful system design interview and system design in general.