"Have you done any System Design course ? How are you so good with this subject ?" - These were the word of my interviewer. I had a High Level + Low Level system design with a start-up recently. Surprisingly the question was to design a file sharing system such as Google Drive as described in this video with some additional features. I explained the HLD with the diagram as I had learned from the the concepts of this video. After the HLD was over, the interviewer told me that I have created a very robust & elegant system. He further said, he was so satisfied with the HLD, that he no longer wants to go into the LLD. Folks, these videos are the absolutely anything that you will ever require to ace a system design interview. Do remember to learn the fundamentals used in the system. A huge thanks to #Hello Interview for putting out the best content out there.
@JohnVandivier 7 months ago
"he was so satisfied with the HLD, that he no longer wants to go into the LLD. " GOALS! kudos and congrats
@hello_interview 7 months ago
This is epic!
@charan775 3 months ago
which startup bro?
@abhijit-sarkar 1 month ago
These videos are undoubtedly great, but your interviewing experience at some start up doesn't prove that. Interviewing is taught at FAANG companies, and some dude at a company that opened 6 months ago wouldn't even come within 9 miles of a FAANG interviewer.
@YeetYeetYe 5 months ago
Simply amazing. I don't mean to throw shade to other channels, but this is by FAR the best system design interview prep. So many other channels are just people with a couple of months of experience at FAANG and it really shows the difference between junior FAANG engineers and Staff FAANG engineers. Extremely high quality work.
@hello_interview 5 months ago
So glad you like them!
@KiritiSai93 3 months ago
You guys remind me of the "Acquired" podcast hosts. No click-baits or cringe posts, just sheer passion about the subject and high-quality in-depth analysis of things. Kudos and hope you continue the great work!
@hello_interview 3 months ago
That’s the idea. Pure value no BS 🫡
@draugno7 2 months ago
I also loved the jokes and an occasional reassurance in the Uber video, looking forward to more! Ddinngdding (that driver's phone after Taylor Swift concert in a badly designed system). This channel is simply amazing because it ties together all of the concepts I learned and even elaborates on different DSs and DBs. Someone said 'no shade to other youtubers' but I say 'yes shade' because they usually confuse and frustrate people who watch with incomplete diagrams and explanations.
@EamonLinskey 8 months ago
These are the best System Design videos I have found. Great framework for approaching problems, clear explanations, helpful diagrams. And I really appreciate the notes about how different seniority levels might approach specific parts.
@andjelaarsic9217 7 months ago
My mind is absolutely blown by how beautifully everything is explained. I love how you understand what would be possible questions/confusions from people watching and you address them by explaining pros and cons. Thank you so much for the content! Your walkthroughs are by far the most useful and interesting.
@hello_interview 7 months ago
High praise! Appreciate you taking the time to share this 😊
@ghadialhajj17 hours ago
I like how you brought up the concept of reconciliation in addition to the more "real-time" path, which is very similar to what you've explained in the ad click aggregator, because it shows how learning the concepts and becoming able to transfer them across seemingly different problems is more valuable and proper for an SDI than memorizing some architecture and trying to reproduce it during the interview. Thanks for the content.
@Wololowizz 5 months ago
I must say that this is the best system design video I've seen so far. You covered the problem and solution step-by-step, while other videos just throw a bunch of ideas at you right away. Sometimes I feel overwhelmed watching other videos, thinking that it's impossible to know all of that, but watching this video we can see what the expectation is for each level, and the most important thought: you don't need to know everything. And that's gold.
@hello_interview 5 months ago
Glad you liked it! Check out our others if you haven’t already. Same format :)
@GauravGupta-op8ol 9 months ago
With my systems design interview coming up, I was looking forward to your video. It's great as always.
@madhurnsit 5 months ago
This is the best content I have come across on System Design interviews. Wish I had landed here this sooner. Thank you so much!
@lorddel 9 months ago
One more comment on this: compared to the written content on hellointerview, this one seems more rounded and well thought out (mainly regarding using S3 notifications on chunk upload completion, which won't work). Would be cool to see it reflected there on the platform! Good job
@hello_interview 9 months ago
Good feedback! I'll try to get that updated, particularly by adding sync which I just last minute decided to throw into the video.
@md_dm490 9 months ago
This channel has the best system design content on youtube. Keep up the good work.
@parashar150526 days ago
There are many system design courses - both paid and free - and I have bought and seen many. I have rarely seen someone so organised, so methodical, so all-encompassing as you are in creating a flow in the design. This just shows what a great thinker one needs to be to be able to create such a framework and flow. You would make everyone a bit of a better thinker than they are with your videos. Many thanks!
@alexandergordon9286 8 months ago
It's pure gold! Especially the parts where you stop the debates about which DB to choose or whether the calculations are needed. The deep dives are the best part. No one goes that deep, and that's actually what matters in an interview.
@levimatheri7682 7 months ago
Wow, by far the best system design videos anywhere. I love how simple you make it, and the invaluable tips!
@anuragtiwari3032 8 months ago
i dont comment much, but for this kind of explanation i gotta give it u. Hands down the best explanation on youtube . pls continue making these kind of videos . This channel will blow up
@hello_interview 8 months ago
♥️
@prasidmitra6859 7 months ago
These are like a gift from God. The best SD resources I've found in the last 3 years.
@jagrit07 3 months ago
Watched 20 minutes of the video so far and This is the 3rd resource I am watching regarding Dropbox design, I have read Alex's book, read Grokking book and now watching this just for fun and I think Evan King is actually the King lol. Amazing video, Please keep on adding more content. Yesterday, I commented on Tinder's Design video and now here. I think I might have to comment on all the videos once I watch those because this is really good stuff and we viewers should appreciate it and hence I will keep adding comments lol :D
@sumanthperuri657913 days ago
Best video i have come across the system design question for dropbox, and changed my perspective on how to answer the question when asked in interview.
@JShaker 6 months ago
I'm so grateful for all of your videos. I've been practicing using the Hello Interview AI interviews, booked one mock with one of your interviewers, watched all the videos. The quality is so far beyond any other content out there, and I've successfully passed 5 system design interviews. Keep up the good content, your YouTube channel deserves to blow up and your website too #wouldinvest
@batusun717 4 months ago
please upload more stuff like this. This is literally the BEST on YouTube. Very much appreciate all the great efforts!
@cidwiththreeeyes 2 months ago
Thank you for another great video! Honestly, I don’t have any constructive criticism, it’s pretty much a perfect format for these videos-practical, concise, insightful. Other creators’ videos like this are good, but they feel like they’re just going through memorized recipes. Your videos are actually teaching system design theory. Really hope you have more of these as I make my way through your catalog.
@tushargoyal554 4 months ago
This is the best channel for learning system design. I've gone through a lot of explanations but found them talking things in isolation making it very hard to connect to get a full picture. The popular system design interview book also doesn't help much due to very discrete and sometimes inconsistent sharing of knowledge.
@Gamble396 3 months ago
One of the best System Design channels. Please keep uploading.
@noobu 9 months ago
Great stuff again! Not only good for interviews but also for daily work: 1) Clear and concise structure 2) Weighs trade-offs rigorously and explains the final decision clearly. Every single component is well thought out with real-world considerations.
@adeeshacharya7520 8 months ago
This is really good. Irrespective of whether we are interviewing or not, any person looking at this level of explanation and detail would try to picture software differently. Thanks for making such videos, would love to see some more.
@yankomirov4290 3 months ago
You added a systematic (pardon the pun) approach to such an open-ended kind of interview. This was a game changer for me! I really appreciate it. I went ahead and bought the Guided Practice, which is also amazing and is my main practice tool. Thank you so much!
@mehdisaffar 7 months ago
I love the content. It has been frustrating to watch some other system design videos where they just brush off over important details and act like everything is straightforward and easy, and just make 10s of services and never really explain the nitty-gritty details of how those things would work and IF they would actually work/be efficient etc. Thank you!
@mehdisaffar 7 months ago
I wish you had mentioned the challenges of 2-way syncing in this context. Because this is akin to master-master replication, in case of network partition (for example user makes changes to remote, hops on another offline device, makes changes, then comes back online) there is a chance of inconsistencies (user makes different changes on device 1 vs device 2). There would probably need to be a way to offer merging changes together or have the user choose between version 1 or 2.
@mehdisaffar 7 months ago
I think I talked too fast! You did mention reconciliation
@chongxiaocao5737 8 months ago
One of the best system design preparation videos I have seen online.
@EranM22 days ago
49:19 If a chunk has changed, its fingerprint has changed. Isn't this enough to notice the change? What if it was compressed? Did you open the compression in s3 and local?
@aldogutierrezalcala3047 4 months ago
Bro, again me, just had a system design interview using your framework, still don't have the result but definitely this framework is basically pure gold to lead a conversation that i would keep using even in a daily job.
@hello_interview 4 months ago
Hell yes!! So glad it went well 💪
@Ptbcpr 2 months ago
did you end up getting the job?
@anmolgangwal9236 3 months ago
bro we are ready to pay just enable the join icon in your channel, this content is too good to be free
@crackITTechieTalks 7 months ago
This is the best system design video I have watched!! Especially the deep dives, you nailed it!! Looking forward to watching your videos.
@AncientArtist7 3 months ago
Your content is great and really easy to follow through each step of the process. Please continue to make more system design videos. It is extremely helpful !
@pragatimodi950 7 months ago
Hi Evan, this is my first time giving system design interviews. Really glad I found this channel to learn from. Most of my prior feedback from mocks and system design has been framework related, about how I explain my design. This really helps with that, and I think even at work this is a really good approach to follow for most things. Awesome content, thanks a lot!!!
@VahidOnTheMove 7 months ago
Thanks for the videos. 47:45 I would like to know your opinion on push approach? By push approach I meant when the File service knows there is a change in a chunk, Sync service will let the client know. And, then the client will send a request to sync/download the chunk.
@dctc4221 days ago
Very cool... It would be nice to do a followup that covers file versioning. I've been racking my brain on the best ways to do this. Keep up the good work! Minor callout on using the chunk fingerprint as the ID. You could get hash collisions for chunks that end up having the same content.
@ashutoshrana9998 7 months ago
Will be the best system design interview channel for sure. Neat content. Keep up with the quality Man!
@galashrenik3404 3 months ago
One suggestion I have is that when designing APIs, your videos often highlight the importance of handling partial data, which is typically expected of senior or staff engineers. In my view, API versioning carries a similar level of significance.
@DMA-I 2 months ago
I believe there is a slight flaw in the sync-files-from-remote-server feature (24:16). I believe we need to record in the DB, for each client device, the updated time or version it has synced up to; otherwise getChanges will loop endlessly (it will always return files that need to be updated, even though they might just have been updated).
@Jadeish0129 days ago
Thank you for breaking it down so elegantly, this was super helpful
@indreshgahoi7103 9 months ago
Hey Evan , thank you so much for providing the great content. I really live the way you organize and put content across the board. ❤
@allenputich4192 5 months ago
You do an amazing job of explaining the thought process, technical details, and growth opportunities!
@guitarMartial 1 month ago
49:09 - Time is a weird commodity in distributed systems, with clock drift et al. Wouldn't vector clocks be a better solution instead? This way we can detect write conflicts pretty well too.
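A minimal sketch of the version-vector comparison this comment suggests, assuming each device keeps a counter of the edits it has made to a file. The device names and the `compare`/`merge` helpers are hypothetical and not taken from the video.

```python
from typing import Dict

# Hypothetical representation: device id -> number of edits seen from that device.
VersionVector = Dict[str, int]


def compare(a: VersionVector, b: VersionVector) -> str:
    """Classify two versions as 'equal', 'a_newer', 'b_newer', or 'conflict'."""
    devices = set(a) | set(b)
    a_ahead = any(a.get(d, 0) > b.get(d, 0) for d in devices)
    b_ahead = any(b.get(d, 0) > a.get(d, 0) for d in devices)
    if a_ahead and b_ahead:
        return "conflict"  # concurrent edits: neither version saw the other
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Vector to store after a conflict is resolved: element-wise maximum."""
    return {d: max(a.get(d, 0), b.get(d, 0)) for d in set(a) | set(b)}


laptop = {"laptop": 2, "phone": 1}   # laptop edited offline
phone = {"laptop": 1, "phone": 2}    # phone edited offline as well
result = compare(laptop, phone)      # neither dominates the other
```

Unlike a wall-clock `updatedAt`, this comparison does not depend on synchronized clocks; it only tells you that a conflict exists, and the resolution policy (merge, or let the user pick) is still a separate decision.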
@hello_interview 1 month ago
Yes :)
@guitarMartial 1 month ago
@@hello_interview Come to think of it - maybe even a Merkle tree here might be powerful. You are storing all the hashes already just build a local merkle tree and use anti-entropy to figure out delta periodically. Really wild thought - merkle tree + version vectors. One helps quickly figure out anti entropy as we can compare hashes the other helps with write conflict detection. Couple this with Kafka as you showed and you have a pretty amazing scaling solution.
@guitarMartial 1 month ago
55:31 - Merkle trees et al are giving me flashbacks to Torrenting days. Indeed the files were broken up in different chunks whose shas were used to perform comparisons for the sake of completion.
@EngineeringBootCamp 1 month ago
Another great video. Some questions that came up in my mind after watching this video is - 1) How does local chunking work, do I literally break the files into parts and keep that in some other system or temp folder, and upload the files from there? 2) After I have uploaded the file, do I get rid of the chunks? 3) If we had a delta change in a remote file, you talked about comparing the fingerprints on all chunks and comparing locally, to only download ones that changed, implying we still keep these chunks locally somewhere? And even if I downloaded a modified chunk, how do I go ahead and stitch the chunks together to create the unified file in the main folder? [A little more clarity on those questions would be really beneficial.]
@TechieTech-gx2kd 1 month ago
1. Chunking is a virtual concept rather than a physical one. The file is still stored as ordinary bytes on disk, but on the client side Dropbox maintains a database table known as chunks, which records the byte range of the physical file that each chunk represents.
Schema for the chunks table:
Column Name | Data Type | Description
chunk_hash | TEXT (Primary Key) | The unique hash of the chunk (e.g., SHA-256).
ref_count | INTEGER | Number of files referencing this chunk.
file_path | TEXT | File path where this chunk resides.
start_byte | INTEGER | Start byte position of the chunk in the file.
end_byte | INTEGER | End byte position of the chunk in the file.
Similarly, Dropbox has a file table that tracks metadata about files, including their chunk composition:
Column Name | Data Type | Description
file_id | TEXT (Primary Key) | A unique identifier for the file (e.g., UUID).
file_name | TEXT | The name of the file.
file_path | TEXT | Full path to the file on the local disk.
chunk_hashes | TEXT | Comma-separated list of chunk hashes in order.
When you add a new file, the application layer creates the chunks and calculates the hash of each one, then tries to commit those chunks to Dropbox's MetaService. The metadata service reports whether each chunk is already available and, if it is, won't ask you to upload it to the BlobService.
2. Since there are no physical chunks, there is nothing to get rid of. On local storage we always deal with files, not chunks.
3. No, you are not keeping any chunks locally; you deal with chunk hashes instead. As soon as you receive a notification of a remote change, you ask about the chunks and their hashes. To dive a little deeper, the MetaService maintains the Server_file_journal, an append-only log for each namespace, which tells you what changes are available on the server for a particular namespace. You download only those chunks that you don't already have locally, based on their hashes.
Once you have the chunks, you directly replace the bytes of the modified file on disk using the start and end offsets, without needing to re-create the file. Do let me know if you need more detail.
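The "virtual chunk" bookkeeping described in this reply can be sketched roughly as follows. This is an illustration under assumptions, not Dropbox's client code: `ChunkRecord`, `index_chunks`, and `chunks_to_upload` are hypothetical names, chunks are fixed-size, and `end_byte` is treated as exclusive.

```python
import hashlib
from typing import List, NamedTuple, Set


class ChunkRecord(NamedTuple):
    chunk_hash: str   # SHA-256 hex digest of the chunk's bytes
    start_byte: int   # inclusive offset into the file
    end_byte: int     # exclusive offset (an assumption of this sketch)


def index_chunks(data: bytes, chunk_size: int) -> List[ChunkRecord]:
    """Record hash and byte range per chunk; the file itself is never split on disk."""
    records = []
    for start in range(0, len(data), chunk_size):
        end = min(start + chunk_size, len(data))
        digest = hashlib.sha256(data[start:end]).hexdigest()
        records.append(ChunkRecord(digest, start, end))
    return records


def chunks_to_upload(local: List[ChunkRecord], known_remote: Set[str]) -> List[ChunkRecord]:
    """Keep only the chunks whose hashes the server does not already have."""
    return [r for r in local if r.chunk_hash not in known_remote]


original = b"A" * 8 + b"B" * 8 + b"C" * 4
edited = b"A" * 8 + b"X" * 8 + b"C" * 4          # only the middle chunk changed
remote_hashes = {r.chunk_hash for r in index_chunks(original, chunk_size=8)}
pending = chunks_to_upload(index_chunks(edited, chunk_size=8), remote_hashes)
```

Here `pending` holds a single record covering bytes 8 to 16, so only that byte range would need to be read from disk and uploaded.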
@VarunVermaUSC 1 month ago
@@TechieTech-gx2kd Thank you so much, for taking the time out and sharing those details!
@pradeepbhat1363 26 days ago
@@TechieTech-gx2kd Thanks for the details. So, if a new byte is added to the beginning of the file, the fingerprints will change for all the chunks and will it trigger a full file upload ?
@TechieTech-gx2kd 24 days ago
@@pradeepbhat1363 Hey, your interpretation is right for fixed-length chunks! Dropbox actually solved this issue by implementing content-defined chunking instead. With it, adding a byte at the beginning won't trigger a full file upload; that's the beauty of content-defined chunking. I've implemented this in Java to demonstrate: github.com/neerajjain92/DropboxRabinChunker
When you add a byte at the start, only the chunks at the start of the file change because:
1. The 48-byte sliding window quickly moves past the modified area.
2. Once the window contains only unmodified content, it generates the same fingerprints.
3. The same fingerprints create identical chunk boundaries.
So Dropbox would only need to upload the first modified chunk or two, while all other chunks remain unchanged and can be reused from the server. This makes sync super efficient for small changes in large files. Check out the implementation; it shows how the chunks resynchronize after the modified region using Rabin fingerprinting.
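To make the resynchronization argument concrete, here is a small Python sketch of content-defined chunking. It is not the linked Java project and does not use true Rabin fingerprinting; it swaps in a simple polynomial rolling hash, and the `WINDOW`, `BASE`, `MOD`, and `DIVISOR` values are assumptions chosen for illustration. There is also no minimum or maximum chunk size, which a real chunker would add.

```python
import hashlib
import random
from typing import List

WINDOW = 48            # sliding-window width in bytes (assumed, mirrors the comment)
BASE = 257             # polynomial base (assumed)
MOD = 1_000_000_007    # hash modulus (assumed)
DIVISOR = 256          # cut when hash % DIVISOR == 0, so chunks average ~256 bytes
POW = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window


def cdc_chunks(data: bytes) -> List[bytes]:
    """Cut wherever the hash of the last WINDOW bytes hits the target value."""
    chunks = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * POW) % MOD  # drop the oldest byte
        h = (h * BASE + byte) % MOD                 # add the newest byte
        if i >= WINDOW - 1 and h % DIVISOR == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])                 # trailing partial chunk
    return chunks


rng = random.Random(0)
data = bytes(rng.getrandbits(8) for _ in range(20_000))
old_chunks = cdc_chunks(data)
new_chunks = cdc_chunks(b"!" + data)                # one byte inserted at the front

old_hashes = {hashlib.sha256(c).hexdigest() for c in old_chunks}
changed = [c for c in new_chunks if hashlib.sha256(c).hexdigest() not in old_hashes]
```

Because each cut decision depends only on the last 48 bytes, every boundary after the edited region lands on the same content as before, so `changed` contains at most the one or two chunks at the front of the file.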
@OneSanddman 1 month ago
I really love your video series. Just a slight problem to point out here. 50gbs uploaded with 100 mbs should take less than 10 minutes, not an hour 12 minutes.
@KingstonFortune20 days ago
I would agree with you but then we would both be wrong 😉 Evan’s calculation in the video is actually correct: first you convert gigabytes to gigabits (50 GB = 50 x 8 = 400 Gb), then divide by the upload speed (400 Gb / 100 Mbps = 4,000 seconds), then convert the seconds to minutes (4,000 / 60 = 66.67 minutes), and finally to hours (66.67 / 60 = 1.11 hrs) 😇
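The arithmetic in this exchange can be checked with a few lines of Python. The helper name is hypothetical, and decimal units (1 GB = 1000 MB) are assumed.

```python
def upload_time_seconds(size_gb: float, link_mbps: float) -> float:
    """Transfer time for size_gb gigabytes over a link of link_mbps megabits per second."""
    size_megabits = size_gb * 1000 * 8   # GB -> MB -> megabits
    return size_megabits / link_mbps


seconds = upload_time_seconds(50, 100)   # 50 GB at 100 Mbps
hours, remainder = divmod(seconds, 3600)
minutes = remainder / 60

# If "100 mbs" is read as 100 megabytes per second, the link is 800 Mbps:
seconds_at_100_megabytes = upload_time_seconds(50, 800)
```

At 100 megabits per second the result is 4,000 seconds, about 1 hour 7 minutes. At 100 megabytes per second it is 500 seconds, which is where the "less than 10 minutes" figure comes from.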
@adityaagarwal5348 3 months ago
At 50:08, the delta sync approach might work when downloading an updated chunk from S3 using a byte-range query and then updating the file on the local system, but it won't work the other way around, specifically because of S3. S3 objects are immutable, so there will never be a case where a chunk is updated in place. So if this question comes up in the interview, should we just mention that we won't sync files > some GBs, or should we further divide the storage into blob and file-system (S3 and EFS) based on file size and handle the complexity on the server?
@groovymidnight 8 months ago
I really like the 5-step structure, it's the best I've seen and it effectively helps me think through the designs in a methodical way.
@hello_interview 8 months ago
Right on! So glad it’s useful
@3rd_iimpact 9 months ago
I just finished reading the article on this lol. I’ll check out the video as well.
@aslgomes 6 months ago
Hey Stefan, awesome video, congrats! I've got a quick question though. Around the 49:46 mark, you mention adding an "updatedAt" to a chunk at a specific id/fingerprint. If a chunk changes, its fingerprint/hash/checksum would change too, right? So that id wouldn't really match the changed chunk anymore, would it? Doesn't that mean the old chunk gets "invalidated" and a new chunk id appears? Sorry if I'm missing something obvious here.
@hello_interview 6 months ago
No this is spot on, good call out. I was loose here. If the fingerprint is the ID, then an updatedAt does not make sense. If the fingerprint is not the ID, then it of course does. Trade off here of whether you want to keep old chunks around for versioning.
@AlbaraaAlHiyari 8 months ago
I truly appreciate all the effort you've put into making these amazing videos. Please keep them coming. One insignificant (not important) nitpick. 50 GB @ 100Mbps = ~ 1hr 7min. I think you just forgot to convert the decimal to minutes. You have it correct in the write up, as in 1.11 hours (0.11 * 60 = 6.6 minutes).
@hello_interview 8 months ago
Mental math is hard 😛
@AlbaraaAlHiyari 8 months ago
@@hello_interview tell me about it... Also not fun under the pressure of an interview 🤣
@krishnabirla16 3 months ago
You did not talk about version inconsistency? If two clients keep changing their local folders, they will be in a loop of pushing their own sync and pulling the other client's sync. There has to be a timestamp/version based conflict resolution. Maybe a follow up please?
@phavelar 8 months ago
one can argue that "supporting 50gb upload file size" is a functional requirement (you placed it under non-functional requirement) - just a call out. great video!
@vaibhavsharma1653 7 months ago
Amazing. Some Notes: DeepDive: Chunking CDNs Adaptive Polling with only updated chunks Compression.
@faruni829915 days ago
Wow the best design video out there! Just wow.
@satyajeetkumar2588 3 months ago
Awesome, so simple and elegant. It would have been great if you had mentioned a checksum implementation to maintain data integrity, since you listed it in the non-functional requirements; just a mention, not the actual implementation.
@smalladi78 7 months ago
Thanks for posting these! Great interview as always! I am learning a lot from these interviews. I found it interesting that you jumped ahead in order for the non-functional requirements since you knew the large file upload requirement would impact the design enough that doing the other ones first was not beneficial since they would become irrelevant. Obviously, this comes with actual experience of working on the job. May I suggest doing a follow up that uses the final design from this interview and consider how it may change if you piled on a more advanced feature like syncing only a partial set of folders or sharing folders with other people.
@pradeepbhat1363 27 days ago
Great video man ! very useful for preparing for system design interview.
@jimitshah7636 7 months ago
Great video for system design preparation. Methodology, the way he approached the question was good. 5 steps. Pretty good
@JyotiKundani05 2 months ago
This video was really helpful. Amazing work of putting this together and your explanation was on point. Much appreciated!
@hello_interview 2 months ago
Glad you liked it! 🙂
@suri4Musiq 9 months ago
Loved this resource, thank you so much! But I just wanted to point out that in my interview I was asked about sharing files with other users, and I feel like this design concentrated more on just syncing files across multiple devices. For the former, I think we can talk a little more about CDN/other approaches, which were hand-waved here.
@hello_interview 9 months ago
Checkout the write up I linked! I go into sharing there.
@venkatamunnangi1287 9 months ago
Thanks for the effort and videos. Easily one of the best in business for mocks and educational material.
@deathbombs 8 months ago
45:45 I wonder how syncing would change if instead of folder status, it's for database writes with many writers
@evangeloskostopoulos8173 9 months ago
This is really awesome, thank you. Please keep them coming!
@vijaykhurana8766 9 months ago
Great content. Thank you for posting. One of the best system design videos I have come across for this design.
@dashofdope 1 month ago
For the chunking -how many parallel calls would we do? Maybe it doesn't matter?
@god_of_blunder 5 months ago
these are the best Design videos i ever found, Thanks and Kudos.
@hello_interview 5 months ago
❤️
@jherreria 7 months ago
I really appreciate your help in this topic. I'm learning a lot! Keep the videos coming!
@adityaagarwal5348 3 months ago
At 27:24 For determining which files are already available on the local system, can we store a client to files mapping on the server based on client id and then getChanges API uses that data + file metadata to calculate which files needs to be transferred to the client? I know there can be issues when there is a sync gap b/w local and remote like file is deleted on the local but anyway system is eventual consistent. Keeping lots of data on the client will grow the app size.
@TechieTech-gx2kd 1 month ago
What Dropbox implements is pretty amazing: it maintains a server_file_journal, an append-only log per namespace_id that stores every change made to a file. Imagine doing CRUD on a text file: all of those operations are stored in the server_file_journal. The client simply asks, for a given nsId, what is new after a specific checkpoint, a pointer called journalId that each client maintains for its namespace. The server returns the chunk details (probably different hashes) for everything that happened after that journal id, and the client simply downloads them. "Keeping lots of data on the client will grow the app size." It's not the app size, it's the user data: it's what you want to keep on your machine for quick access while also being able to reach it from other machines. What you are referring to is something different, which iCloud offers: optimizing storage by keeping only a minimal thumbnail of each photo/video on the iPhone and fetching the high-definition version when the user requests the file.
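The journal-plus-cursor flow described here can be sketched in a few lines. This is an in-memory illustration under assumptions, not Dropbox's actual schema: `FileJournal`, `JournalEntry`, and `changes_since` are hypothetical names, and journal ids are assumed to be dense and 1-based per namespace.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class JournalEntry:
    journal_id: int            # position in the namespace's log, starting at 1
    file_path: str
    chunk_hashes: List[str]    # chunk composition of the file after this change


@dataclass
class FileJournal:
    """Append-only log per namespace; entries are never edited or removed."""
    _entries: Dict[str, List[JournalEntry]] = field(default_factory=dict)

    def append(self, namespace_id: str, file_path: str, chunk_hashes: List[str]) -> int:
        log = self._entries.setdefault(namespace_id, [])
        entry = JournalEntry(len(log) + 1, file_path, list(chunk_hashes))
        log.append(entry)
        return entry.journal_id

    def changes_since(self, namespace_id: str, cursor: int) -> Tuple[List[JournalEntry], int]:
        """Return entries after `cursor` plus the cursor the client should store next."""
        log = self._entries.get(namespace_id, [])
        new_entries = log[cursor:]   # entry k sits at index k - 1
        new_cursor = new_entries[-1].journal_id if new_entries else cursor
        return new_entries, new_cursor


journal = FileJournal()
journal.append("ns1", "notes.txt", ["h1", "h2"])
journal.append("ns1", "notes.txt", ["h1", "h3"])   # second chunk was edited

first_batch, cursor = journal.changes_since("ns1", 0)        # a fresh client
second_batch, cursor = journal.changes_since("ns1", cursor)  # nothing new since
```

Because each client advances its own cursor, a repeated poll returns nothing once the client is caught up, which also avoids re-fetching changes it has just applied.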
@ahmedkhan25 7 months ago
Excellent sys design interviews - I like the informative tone and clear approach - thanks
@jeremyklein953 8 months ago
Really good approach. I love how you build up to the full solution. It makes a lot of sense to me and helps me reason these complex systems as well
@mindrust203 9 months ago
Hey Evan, this content is fantastic, thank you! I have a question regarding your solution to chunking around the 39 minute mark When we ask S3 to fetch us a pre-signed URL, do we do that for all our chunks as well? Does this happen on initial request to upload the file (metadata)? The way the File Metadata entity schema is described, it looks like we have a top-level S3Link, but also chunk-level S3 links embedded in the file metadata, so the upload flow is a little unclear to me
@hello_interview 9 months ago
Good question, you're right to be a little confused here. As I alluded to, S3 offers an API called multi-part upload. It requires just 1 presigned URL, but multi-part upload re-stitches the chunks back into a single file in S3, so it does not allow us to send over chunk deltas for syncing. As a result, we have to upload the chunks manually without relying on multi-part upload. So, long answer, but yes, you'd actually need to request a presigned URL for each chunk. I should have made that clearer, but tbh I was not sure in the moment if multi-part upload could be configured to not re-stitch the file, so I omitted it :)
@KITTU1623 9 months ago
Thank you very much for the videos. One small nit pick. DynamoDB supports a maximum of 400KB per item and if we are storing all the chunk metadata in the item, for a 50GB file with 5 MB chunk size, assuming we need 100Bytes per chunk metadata, our item size would be around 1MB.
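The estimate in this comment can be reproduced with a short calculation. The 100 bytes of metadata per chunk is the commenter's assumption, decimal units are used for the file and chunk sizes, and the 400 KB limit is taken here as 400 × 1024 bytes.

```python
FILE_SIZE_BYTES = 50 * 10**9            # 50 GB file
CHUNK_SIZE_BYTES = 5 * 10**6            # 5 MB chunks
METADATA_BYTES_PER_CHUNK = 100          # commenter's assumption
ITEM_LIMIT_BYTES = 400 * 1024           # DynamoDB item limit, read as 400 KiB

chunk_count = -(-FILE_SIZE_BYTES // CHUNK_SIZE_BYTES)        # ceiling division
chunk_list_bytes = chunk_count * METADATA_BYTES_PER_CHUNK
fits_in_one_item = chunk_list_bytes <= ITEM_LIMIT_BYTES
max_chunks_per_item = ITEM_LIMIT_BYTES // METADATA_BYTES_PER_CHUNK
```

With these numbers there are 10,000 chunks and roughly 1 MB of chunk metadata, so the list does not fit in one item. One option raised elsewhere in the thread is a separate chunks table keyed by file and chunk index.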
@hello_interview 9 months ago
Good catch! True
@stashitt 1 day ago
Thanks for this amazing video, I bought guided practices which have been incredible. I have one question, For a 50 gigabyte file we are storing an array of 10000 chunks in chunks, is it feasible ?
@Marcus-yc3ib 3 months ago
Please keep upload these kind of videos. Thank you very much.
@Ynno2 9 months ago
Do you suggest a different delivery framework for system design interviews which aren't necessarily "product"?
@hello_interview 9 months ago
Topical! Was chatting about updating the site with that soon. I’d recommend very similar, but core entities and api are what may change as they could be less relevant. Instead I’d frame it as focusing on the inputs and outputs of the system more generally. And then still thinking about the data persisted
@hello_interview 9 months ago
I’ll do a pure infra question next
@rushio8673 1 month ago
I think uploading speed up using chunks was clearly explained, but how do we speed up a download using the chunk wasn't clear, only brought up the point of whether to use the CDN or not, but if not using CDN then what ?
@hameeeed599216 days ago
You first request the metadata from the backend and then download each chunk in the array from S3 sequentially.
@59sharmanalin 3 months ago
We didnt outline file sharing feature, is it because of time constraints?
@hello_interview 3 months ago
Went with syncing in the video instead since people asked for that in the comments
@VyasaVaniGranth 6 months ago
First - please continue making and sharing these videos, this is incredible. Very few high quality sources available out there and this is probably the best one in my eyes. Second - how realistic is it that the download and upload happen directly b/w client and S3? Are there security concerns with this approach that should be considered? For reference, there's a Dropbox engineer's talk where uploads go through an intermediate service - this does mean additional copies of the data meaning more memory / compute but seems more realistic. In general, for any design that has media upload (eg. newsfeed), would you recommend direct upload to S3?
@hello_interview 6 months ago
Yeah, it's a good point; most major systems don't do this for a number of reasons. While it is largely academically correct and optimal, at YouTube/Dropbox/etc. scale they prefer more control, so they roll their own systems here.
@krishnabirla16 3 months ago
Do you not do web socket based design videos intentionally? Can you do some chat apps and video call apps?
@bqrkhn 4 months ago
Very nice video. A question: You added a updatedAt at each chunk. But chunks are identified with their ID which is calculated from a finger print. When the file changes, the finger print changes, how do we update the updatedAt? Possible Answer: From client we send both old and new chunk IDs and then update both id and updatedAt. Is this the correct strategy?
@fragrancias972 4 months ago
Same question here.
@bqrkhn 4 months ago
@@fragrancias972 what do you think about my possible answer ?
@insofcury 2 months ago
@@bqrkhn +1 I think this definitely solves the problem.
@MrSnackysmorez 5 months ago
I love the videos and these are some of the best explanations. I love the flow and how everything builds on each other. It makes it much more manageable to do these problems. However you are driving and dictating this and this is so much harder to do when the interviewer wants to constantly interrupt and ask questions while you are doing these steps without first letting you explain what you are doing. I have this happen pretty often. How can you tell them to just chill and let you proceed? Appreciate these videos!
@haixiongwang4608 2 months ago
Will the version management of files be out of scope for 35-45 mins interview discussion. Just want to get high level understanding the scope of current SD. Thanks
@amitb2921 7 months ago
Thanks for the great content, especially the Deep Dive part, which people generally do not discuss. I have one question about storing the chunks as a list in the DB. For a 50 GB file and 5 MB chunks, there will be 10K chunks, so the chunks list will have 10K entries. Updating that one chunk-list column for every chunk status change could be quite challenging. Would it be better to have a separate table for chunks instead? Also, when you match chunks by fingerprint, you need to check 10K entries from the local DB (with a separate, indexed table) vs. 10K entries in the chunk list (in a single table column), and the former is more efficient. Kindly let me know your thoughts on the above points.
@hello_interview 6 months ago
Sounds reasonable to me! Good call out
@amitb2921 6 months ago
@@hello_interview Thanks a ton for the response. I have modified my comment above to be a bit clearer.
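A rough sketch of the separate-table idea from this thread, using an in-memory SQLite database as a stand-in for whatever store the real system would use. The table and column names are hypothetical, and the sizes assume decimal units (50 GB = 50 * 1000^3 bytes), which gives exactly the 10K chunks mentioned above.

```python
import sqlite3

FILE_SIZE = 50 * 1000**3   # 50 GB, decimal units (assumption)
CHUNK_SIZE = 5 * 1000**2   # 5 MB
num_chunks = -(-FILE_SIZE // CHUNK_SIZE)  # ceiling division

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE chunks (
        file_id     TEXT    NOT NULL,
        idx         INTEGER NOT NULL,  -- position of the chunk in the file
        fingerprint TEXT    NOT NULL,
        status      TEXT    NOT NULL DEFAULT 'not-uploaded',
        PRIMARY KEY (file_id, idx)
    )
    """
)
# Index so fingerprint matching does not scan every row.
conn.execute("CREATE INDEX chunks_by_fingerprint ON chunks (fingerprint)")

conn.executemany(
    "INSERT INTO chunks (file_id, idx, fingerprint) VALUES (?, ?, ?)",
    [("file-1", i, f"fp-{i}") for i in range(num_chunks)],
)

# A status change touches one row instead of rewriting a 10K-entry column.
conn.execute(
    "UPDATE chunks SET status = 'uploaded' WHERE file_id = ? AND idx = ?",
    ("file-1", 42),
)


def find_by_fingerprint(fingerprint: str):
    """Look up a chunk row by fingerprint using the index."""
    return conn.execute(
        "SELECT file_id, idx, status FROM chunks WHERE fingerprint = ?",
        (fingerprint,),
    ).fetchone()
```

The trade-off is more rows and a join (or second query) when loading file metadata, in exchange for cheap per-chunk updates and indexed lookups.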
@Vancez-z2h 1 month ago
As for syncing updated files on other clients, how does the client know the files are updated?
@ndubuezeprecious391 3 months ago
Great stuff. This is the best I’ve seen so far. Which app are you using for the whiteboarding? It looks really sleek.
@hello_interview 3 months ago
Excalidraw
@kkfun1 1 month ago
Does the File service create a signed URL for every file chunk to be uploaded?
@pujamishra1475 8 months ago
I have a product architecture interview coming up. I was really looking for some good product architecture/design examples and then came across this. This is very helpful because you talk about the client, user experience, and malicious users, and relate them to the design decisions made. Thank you! One question: for a product architecture interview, should we go into more detail about the APIs, like explicitly writing out requests, responses, and failure/success codes, or is the amount of discussion you did on APIs enough for senior level? Can you also tell me what topics/points you would add to the discussion in this video if this were asked in a product architecture design round? Thanks again!
@surojitsantra7627 8 months ago
One of the best and most detailed explanations. Thank you so much for this content. Please upload more such videos.
@hello_interview 8 months ago
New one later today!
@IshaZaka 9 months ago
Hi Evan, thank you so much for providing this type of content. Please make a system design video on a payment system.
@BlunderMunchkin 6 months ago
Huh. I would have prioritized consistency over availability. So much so, in fact, that I didn't even think it was a question. Some of the biggest headaches I've experienced as a developer have been caused by having an out-of-date file. I would much rather be temporarily unable to retrieve a file than to be fooled into thinking that the file I retrieved is the correct version.
@GabrielAnyaele 1 month ago
I really love your videos. I have a question though: are the chunk IDs constant (most likely so)? You mentioned that the chunk IDs are a hash of the bytes of the chunks, so what happens when the chunks are updated - do we still keep the initial IDs? You put out amazing content, and I appreciate it once again.
@mahdidi96 1 month ago
Very important question, but I just noticed: there is no persistence of folder structure, i.e. how the files are organized locally. Say you set up Dropbox on a new device. I can see this system syncing the files themselves, but you would lose your folder structure. The new device would essentially contain all the files in the same directory. How would you address this? (Unless you did and I missed it.)
@caesar5555 2 months ago
Thank you! This is awesome! At Meta, is your interviewer going to be on the hiring committee, or will they just send along your notes and theirs?
@hello_interview 2 months ago
Depends on level and situation. Most likely they won't be there.
@caesar5555 2 months ago
@@hello_interview Thank you for the answer. Staff+. So the interviewer is just a medium for taking in information...
@hello_interview 2 months ago
@@caesar5555 In some sense. They provide a judgement, and the loop (the collection of interviewers, together with the recruiter) makes a call on whether to put the package forward to the hiring committee. That group (usually a couple of directors) does not have enough time to review every detail of the notes, so they use some heuristics to see where the major risks are and decide whether to move forward.
@kamalsmusic 6 months ago
For the client to know how to stitch together chunks, doesn't it need to know the starting offset & length for each one?
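One possible answer, sketched under the assumption that chunks are fixed-size and the metadata records each chunk's position in the file: the offset is then implied by `index * chunk_size`, and only the last chunk may be shorter, so explicit offsets and lengths are not strictly needed. The `split` and `stitch` helpers are hypothetical. With variable-size (content-defined) chunking, the metadata would need per-chunk lengths as the comment suggests.

```python
def split(data: bytes, chunk_size: int) -> list[bytes]:
    """Cut data into fixed-size chunks; the last one may be shorter."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def stitch(chunks_by_index: dict[int, bytes], chunk_size: int) -> bytes:
    """Reassemble a file from chunks keyed by their position.

    Chunks may arrive in any order; a gap or a short middle chunk
    shows up as an offset mismatch and raises ValueError.
    """
    out = bytearray()
    for index in sorted(chunks_by_index):
        expected_offset = index * chunk_size
        if len(out) != expected_offset:
            raise ValueError(f"chunk {index} does not start at offset {expected_offset}")
        out += chunks_by_index[index]
    return bytes(out)
```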
@viveksharma-tt5nj 2 months ago
Simply amazing!! Thanks a lot for such a clear and concise explanation!
@nobodyknows228 6 months ago
1. How can we handle write conflicts when we have a folder that is supposed to be consistent across multiple devices? 2. Also, when two devices are disconnected from the internet and users update some files, how does the sync happen when they come back online and both try to write changes to the same file path at the same time? I am not sure if these solutions work, but I think: 1. We can use a Redis lock for writes, with a TTL equal to (or a little more than) the timeout of the pre-signed URL. If the connection fails partway through, we can just resume the upload once reconnected. But this might be a problem when a user is uploading big files with long timeouts, since other users might have to wait until the current upload is done. 2. When the user comes back online, we should probably first fetch all the changes that were made on the device, raise conflicts with the user asking what action to perform (similar to git), and acquire a lock to write if required.
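A small in-memory sketch of the lock-with-TTL idea in point 1, not a real Redis client. It mimics the semantics of Redis `SET key value NX EX ttl`: the first writer gets the lock, others are refused until it is released or expires. The `TtlLock` class, the injectable clock, and the device names are all hypothetical and only there to make the behaviour easy to follow.

```python
import time


class TtlLock:
    """Per-path write lock that expires on its own after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._held = {}  # path -> (owner, expires_at)

    def acquire(self, path: str, owner: str, ttl_seconds: float) -> bool:
        """Take the lock if nobody holds an unexpired one (like SET NX EX)."""
        now = self._clock()
        holder = self._held.get(path)
        if holder is not None and holder[1] > now:
            return False
        self._held[path] = (owner, now + ttl_seconds)
        return True

    def release(self, path: str, owner: str) -> bool:
        """Release only if `owner` is the current holder."""
        holder = self._held.get(path)
        if holder is not None and holder[0] == owner:
            del self._held[path]
            return True
        return False
```

As the comment notes, a long TTL on a large upload makes other writers wait, which is why many designs prefer detecting conflicts after the fact over locking up front.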
@jmms49 8 months ago
Great videos, thanks for uploading these. Easily the best content about system design interviews I've found. I would probably suggest using Merkle trees for the sync functionality; it seems like a natural way to diff and sync large file systems.
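To illustrate the Merkle tree suggestion, here is a minimal sketch, assuming SHA-256 and that both sides have the same number of chunks. The `build_levels` and `diff_leaves` helpers are hypothetical. Matching roots mean nothing changed; otherwise the walk only descends into subtrees whose hashes differ.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """Return hash levels from leaf hashes (index 0) up to the root (last)."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [_h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        # An odd node out is paired with itself.
        padded = level + [level[-1]] if len(level) % 2 else level
        level = [_h(padded[i] + padded[i + 1]) for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def diff_leaves(a: list[list[bytes]], b: list[list[bytes]]) -> list[int]:
    """Indices of leaves that differ between two trees of equal size."""
    if len(a[0]) != len(b[0]):
        raise ValueError("trees must have the same number of leaves")
    if a[-1][0] == b[-1][0]:
        return []  # identical roots: nothing to sync
    suspects = [0]  # node indices at the current level that differ
    for depth in range(len(a) - 1, 0, -1):
        below_a, below_b = a[depth - 1], b[depth - 1]
        suspects = [
            child
            for node in suspects
            for child in (2 * node, 2 * node + 1)
            if child < len(below_a) and below_a[child] != below_b[child]
        ]
    return suspects
```

In a sync setting the leaves could be chunk or file contents, so two devices can compare a single root hash first and exchange only the differing branches.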
@charan775 3 months ago
How do you handle nested folders in your schema? Also, chunks could be kept in a separate table at the user ID level, so that we can reuse chunks across different files.
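One common way to model nested folders, sketched here as an assumption rather than anything shown in the video: give every file and folder a `parent_id` pointing at its containing folder (an adjacency list), and rebuild paths by walking up the parents. The `Node` type and `full_path` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    id: str
    name: str
    parent_id: Optional[str]  # None for items at the top level
    is_folder: bool


def full_path(nodes: dict[str, Node], node_id: str) -> str:
    """Rebuild a path like /photos/2024/beach.jpg from parent pointers."""
    parts = []
    seen = set()
    current: Optional[str] = node_id
    while current is not None:
        if current in seen:
            raise ValueError("cycle in folder structure")
        seen.add(current)
        node = nodes[current]
        parts.append(node.name)
        current = node.parent_id
    return "/" + "/".join(reversed(parts))
```

With this shape, moving a folder is a single `parent_id` update, and a new device can recreate the directory tree from metadata before downloading any chunks.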
@jayshah234 8 months ago
Hi Evan, thanks for the detailed explanation! Very helpful! At 40:50, you mentioned that S3 exposes a multipart upload API. Does that mean that on the client end we don't have to handle chunking and fingerprinting, given that we use S3 multipart? Thanks!
@hello_interview 8 months ago
You’ll still do the chunking but it will handle fingerprint checks
@SunilKumar-jl6dl 5 months ago
Hey there, I have some questions. Would be great to get your thoughts: 1. S3 supports multipart upload, and all the chunks get reconstructed into a single file in S3. Isn't this correct? If yes, then keeping file chunks in the database would be redundant, right? Or would S3 always keep the chunks and give the client access to download them? 2. At the client end, how should we know how to stitch the updated/deleted chunks of a previously uploaded file back together? 3. Would folder sharing with other users be a possible follow-up question? Like what Google Drive offers.