"Have you done any System Design course? How are you so good with this subject?" - These were the words of my interviewer. I had a High Level + Low Level system design round with a start-up recently. Surprisingly, the question was to design a file sharing system such as Google Drive, as described in this video, with some additional features. I explained the HLD with the diagram, using the concepts I had learned from this video. After the HLD was over, the interviewer told me that I had created a very robust & elegant system. He further said, he was so satisfied with the HLD, that he no longer wants to go into the LLD. Folks, these videos are absolutely everything you will ever need to ace a system design interview. Do remember to learn the fundamentals used in the system. A huge thanks to #Hello Interview for putting out the best content out there.
@JohnVandivier6 ай бұрын
"he was so satisfied with the HLD, that he no longer wants to go into the LLD. " GOALS! kudos and congrats
@hello_interview6 ай бұрын
This is epic!
@charan7752 ай бұрын
which startup bro?
@abhijit-sarkar22 күн бұрын
These videos are undoubtedly great, but your interviewing experience at some start up doesn't prove that. Interviewing is taught at FAANG companies, and some dude at a company that opened 6 months ago wouldn't even come within 9 miles of a FAANG interviewer.
@KiritiSai932 ай бұрын
You guys remind me of the "Acquired" podcast hosts. No click-baits or cringe posts, just sheer passion about the subject and high-quality in-depth analysis of things. Kudos and hope you continue the great work!
@hello_interview2 ай бұрын
That’s the idea. Pure value no BS 🫡
@draugno7Ай бұрын
I also loved the jokes and an occasional reassurance in the Uber video, looking forward to more! Ddinngdding (that driver's phone after Taylor Swift concert in a badly designed system). This channel is simply amazing because it ties together all of the concepts I learned and even elaborates on different DSs and DBs. Someone said 'no shade to other youtubers' but I say 'yes shade' because they usually confuse and frustrate people who watch with incomplete diagrams and explanations.
@YeetYeetYe4 ай бұрын
Simply amazing. I don't mean to throw shade to other channels, but this is by FAR the best system design interview prep. So many other channels are just people with a couple of months of experience at FAANG and it really shows the difference between junior FAANG engineers and Staff FAANG engineers. Extremely high quality work.
@hello_interview4 ай бұрын
So glad you like them!
@EamonLinskey7 ай бұрын
These are the best System Design videos I have found. Great framework for approaching problems, clear explanations, helpful diagrams. And I really appreciate the notes about how different seniority levels might approach specific parts
@andjelaarsic92176 ай бұрын
My mind is absolutely blown by how beautifully everything is explained. I love how you understand what would be possible questions/confusions from people watching and you address them by explaining pros and cons. Thank you so much for the content! Your walkthroughs are by far the most useful and interesting.
@hello_interview6 ай бұрын
High praise! Appreciate you taking the time to share this 😊
@Wololowizz4 ай бұрын
I must say that this is the best system design video I've seen so far. You covered the problem and solution step-by-step, while other videos just throw a bunch of ideas at you right away. Sometimes I feel overwhelmed watching other videos, thinking that it's impossible to know all of that, but watching this video we can see what the expectation is for each level and the most important thought: you don't need to know everything. And that's gold
@hello_interview4 ай бұрын
Glad you liked it! Check out our others if you haven’t already. Same format :)
@GauravGupta-op8ol8 ай бұрын
With my systems design interview coming up, I was looking forward to your video. It's great as always.
@madhurnsit4 ай бұрын
This is the best content I have come across on System Design interviews. Wish I had landed here sooner. Thank you so much!
@lorddel8 ай бұрын
One more comment on this: compared to the written content on hellointerview, this one seems more rounded and well thought out (mainly regarding using S3 notifications on chunk upload completion, which won't work). Would be cool to see it reflected there on the platform! Good job
@hello_interview8 ай бұрын
Good feedback! I'll try to get that updated, particularly by adding sync which I just last minute decided to throw into the video.
@md_dm4907 ай бұрын
This channel has the best system design content on youtube. Keep up the good work.
@JShaker5 ай бұрын
I'm so grateful for all of your videos. I've been practicing using the Hello Interview AI interviews, booked one mock with one of your interviewers, watched all the videos. The quality is so far beyond any other content out there, and I've successfully passed 5 system design interviews. Keep up the good content, your YouTube channel deserves to blow up and your website too #wouldinvest
@alexandergordon92867 ай бұрын
It's pure gold! Especially the parts where you stop the debates about which DB to choose or whether the calculations are needed. The deep dives are the best part: no one goes that deep, and that's actually what matters in an interview
@prasidmitra68596 ай бұрын
These are like a gift from God. The best SD resources I've found in the last 3 years.
@cidwiththreeeyesАй бұрын
Thank you for another great video! Honestly, I don’t have any constructive criticism, it’s pretty much a perfect format for these videos-practical, concise, insightful. Other creators’ videos like this are good, but they feel like they’re just going through memorized recipes. Your videos are actually teaching system design theory. Really hope you have more of these as I make my way through your catalog.
@anmolgangwal92362 ай бұрын
bro we are ready to pay just enable the join icon in your channel, this content is too good to be free
@levimatheri76825 ай бұрын
Wow, by far the best system design videos anywhere. I love how simple you make it, and the invaluable tips!
@guitarMartial24 күн бұрын
49:09 - time is a weird commodity in distributed systems, with clock drift et al. Wouldn't version vectors be a better solution instead? This way we can detect write conflicts pretty well too
@hello_interview24 күн бұрын
Yes :)
@guitarMartial24 күн бұрын
@@hello_interview Come to think of it - maybe even a Merkle tree here might be powerful. You are storing all the hashes already just build a local merkle tree and use anti-entropy to figure out delta periodically. Really wild thought - merkle tree + version vectors. One helps quickly figure out anti entropy as we can compare hashes the other helps with write conflict detection. Couple this with Kafka as you showed and you have a pretty amazing scaling solution.
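To make the version vector idea from this thread concrete, here is a minimal sketch. It assumes each device keeps one counter per device ID and bumps its own counter on every local write; the device names and the `compare`/`merge` helpers are hypothetical and not taken from the video.

```python
from typing import Dict

VersionVector = Dict[str, int]  # device_id -> number of writes seen from that device


def compare(a: VersionVector, b: VersionVector) -> str:
    """Describe how version vector `a` relates to `b`."""
    devices = set(a) | set(b)
    a_behind = any(a.get(d, 0) < b.get(d, 0) for d in devices)
    b_behind = any(b.get(d, 0) < a.get(d, 0) for d in devices)
    if a_behind and b_behind:
        return "concurrent"  # neither side has seen all of the other's writes
    if a_behind:
        return "before"      # b strictly dominates a, so b is simply newer
    if b_behind:
        return "after"       # a strictly dominates b
    return "equal"


def merge(a: VersionVector, b: VersionVector) -> VersionVector:
    """Element-wise maximum, used once a conflict has been resolved."""
    return {d: max(a.get(d, 0), b.get(d, 0)) for d in set(a) | set(b)}
```

A "concurrent" result is the write conflict case: no wall-clock comparison is involved, so clock drift between devices does not matter.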
@guitarMartial24 күн бұрын
55:31 - Merkle trees et al are giving me flashbacks to Torrenting days. Indeed the files were broken up in different chunks whose shas were used to perform comparisons for the sake of completion.
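The Merkle tree suggestion can be sketched as follows. This is an illustration only, under the assumptions that both sides hold the same number of chunks and that an odd node is paired with itself; `build_levels` and `diff_leaves` are hypothetical names.

```python
import hashlib
from typing import List


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaf_hashes: List[bytes]) -> List[List[bytes]]:
    """Return the tree as a list of levels: leaves first, root level last."""
    if not leaf_hashes:
        raise ValueError("need at least one chunk hash")
    levels = [list(leaf_hashes)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        parents = []
        for i in range(0, len(prev), 2):
            left = prev[i]
            # Assumption: an unpaired last node is hashed together with itself.
            right = prev[i + 1] if i + 1 < len(prev) else left
            parents.append(sha256(left + right))
        levels.append(parents)
    return levels


def diff_leaves(a: List[List[bytes]], b: List[List[bytes]]) -> List[int]:
    """Indices of differing leaves, descending only into mismatched subtrees."""
    if len(a[0]) != len(b[0]):
        raise ValueError("this sketch assumes equal chunk counts")
    changed = []
    stack = [(len(a) - 1, 0)]  # (level, index), starting at the root
    while stack:
        level, idx = stack.pop()
        if a[level][idx] == b[level][idx]:
            continue  # identical subtree: nothing below needs comparing
        if level == 0:
            changed.append(idx)
            continue
        for child in (2 * idx, 2 * idx + 1):
            if child < len(a[level - 1]):
                stack.append((level - 1, child))
    return sorted(changed)
```

When the roots match, the comparison stops after a single hash check, which is what makes periodic anti-entropy cheap.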
@VahidOnTheMove6 ай бұрын
Thanks for the videos. 47:45 I would like to know your opinion on push approach? By push approach I meant when the File service knows there is a change in a chunk, Sync service will let the client know. And, then the client will send a request to sync/download the chunk.
@jagrit072 ай бұрын
Watched 20 minutes of the video so far and This is the 3rd resource I am watching regarding Dropbox design, I have read Alex's book, read Grokking book and now watching this just for fun and I think Evan King is actually the King lol. Amazing video, Please keep on adding more content. Yesterday, I commented on Tinder's Design video and now here. I think I might have to comment on all the videos once I watch those because this is really good stuff and we viewers should appreciate it and hence I will keep adding comments lol :D
@anuragtiwari30327 ай бұрын
i dont comment much, but for this kind of explanation i gotta give it u. Hands down the best explanation on youtube . pls continue making these kind of videos . This channel will blow up
@hello_interview7 ай бұрын
♥️
@yourssachin7 ай бұрын
Love the content and explanation. I watched hundreds of videos on system design from last 4-5 years and also have paid subscription from few. I don't have any doubt that, your channel can become premier system design platform in no time if you keep the content quality high ( just like last 3 videos). Next video, I'd recommend to talk about messaging platform like WhatsApp or FB messenger. There are so many videos on this topic but didn't find any which explain the details and really help in the interview.
@EngineeringBootCamp4 күн бұрын
Another great video. Some questions that came up in my mind after watching this video are - 1) How does local chunking work, do I literally break the files into parts and keep that in some other system or temp folder, and upload the files from there? 2) After I have uploaded the file, do I get rid of the chunks? 3) If we had a delta change in a remote file, you talked about comparing the fingerprints on all chunks and comparing locally, to only download ones that changed, implying we still keep these chunks locally somewhere? And even if I downloaded a modified chunk, how do I go ahead and stitch the chunks together to create the unified file in the main folder? [A little more clarity on those questions would be really beneficial.]
@TechieTech-gx2kd2 күн бұрын
1. Chunking is a virtual concept rather than a physical one. The file is still stored as ordinary bytes on disk, but on the client side Dropbox maintains a database table of chunks that records the byte range of the physical file each chunk represents. Here is the schema for the chunks table:

Column Name | Data Type | Description
chunk_hash | TEXT (Primary Key) | The unique hash of the chunk (e.g., SHA-256).
ref_count | INTEGER | Number of files referencing this chunk.
file_path | TEXT | File path where this chunk resides.
start_byte | INTEGER | Start byte position of the chunk in the file.
end_byte | INTEGER | End byte position of the chunk in the file.

Similarly, Dropbox has a file table that tracks metadata about files, including their chunk composition:

Column Name | Data Type | Description
file_id | TEXT (Primary Key) | A unique identifier for the file (e.g., UUID).
file_name | TEXT | The name of the file.
file_path | TEXT | Full path to the file on the local disk.
chunk_hashes | TEXT | Comma-separated list of chunk hashes in order.

Now when you add a new file, the application layer creates the chunks and calculates the hash of each one, then tries to commit those chunks to the Dropbox MetaService. The metadata service will tell you if a chunk is already available, in which case it won't ask you to upload it to the BlobService.

2. As there are no physical chunks, there is no need to get rid of them. On local storage we always deal with files, not chunks.

3. No, you are not keeping any chunks; instead you deal with hashes (chunk hashes, to be precise). As soon as you receive a notification that there is a remote change, you ask about the chunks and their hashes. To dive a little deeper, the MetaService maintains the server_file_journal, which keeps an append-only log for each namespace and lets you know which changes are available on the server for a particular namespace. You download only those chunks that you don't have locally, based on their hashes.
Once you have the chunks available, you directly replace the bytes of the modified file on disk, without needing to re-create the file, so you are dealing with bytes via start and end offsets. Do let me know if you need more detail.
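A rough sketch of the "virtual chunks" idea described in this reply: index a file as (hash, start_byte, end_byte) rows and patch a changed chunk in place by offset. It assumes fixed-size chunks and a replacement chunk of the same length; the tiny `CHUNK_SIZE` and the function names are hypothetical, chosen only to keep the example small.

```python
import hashlib
from typing import List, Tuple

CHUNK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow

ChunkRow = Tuple[str, int, int]  # (chunk_hash, start_byte, end_byte), end exclusive


def chunk_index(path: str, chunk_size: int = CHUNK_SIZE) -> List[ChunkRow]:
    """Hash the file in fixed-size ranges without writing any chunk files."""
    rows = []
    offset = 0
    with open(path, "rb") as f:
        while True:
            data = f.read(chunk_size)
            if not data:
                break
            rows.append((hashlib.sha256(data).hexdigest(), offset, offset + len(data)))
            offset += len(data)
    return rows


def changed_chunks(old: List[ChunkRow], new: List[ChunkRow]) -> List[int]:
    """Positions whose hash differs, or that exist on only one side."""
    longest = max(len(old), len(new))
    return [
        i for i in range(longest)
        if i >= len(old) or i >= len(new) or old[i][0] != new[i][0]
    ]


def patch_chunk(path: str, start_byte: int, new_bytes: bytes) -> None:
    """Overwrite one byte range in place; the rest of the file is untouched."""
    with open(path, "r+b") as f:
        f.seek(start_byte)
        f.write(new_bytes)
```

Fixed-size ranges only stay aligned when an edit keeps the file length the same; an insertion in the middle shifts every later chunk, which is why real sync clients often look at content-defined chunking instead.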
@VarunVermaUSC2 күн бұрын
@@TechieTech-gx2kd Thank you so much, for taking the time out and sharing those details!
@tushargoyal5543 ай бұрын
This is the best channel for learning system design. I've gone through a lot of explanations but found them talking things in isolation making it very hard to connect to get a full picture. The popular system design interview book also doesn't help much due to very discrete and sometimes inconsistent sharing of knowledge.
@aldogutierrezalcala30473 ай бұрын
Bro, again me, just had a system design interview using your framework, still don't have the result but definitely this framework is basically pure gold to lead a conversation that i would keep using even in a daily job.
@hello_interview3 ай бұрын
Hell yes!! So glad it went well 💪
@PtbcprАй бұрын
did you end up getting the job?
@Gamble3962 ай бұрын
One of the best System Design channels. Please keep uploading.
@DMA-IАй бұрын
I believe there is a slight flaw in the "sync files from remote server" feature (24:16). I believe we need to keep records in the DB of which device/client has synced up to what updated time or version; otherwise getChanges will loop endlessly (it will always return the files that need to be updated, even though they might just have been updated)
@adeeshacharya75206 ай бұрын
This is really good. Irrespective of whether we are interviewing or not, any person looking at this level of explanation and detail would try to picture software differently. Thanks for making such videos, would love to see some more
@chongxiaocao57376 ай бұрын
one of the best system design preparation videos I have seen online.
@yankomirov4290Ай бұрын
You added a systematic (pardon the pun) approach to such an open-ended kind of interview. This was a game changer for me! I really appreciate it. I went ahead and bought the Guided Practice, which is also amazing and is my main practice tool. Thank you so much!
@batusun7173 ай бұрын
please upload more stuff like this. This is literally the BEST on YouTube. Very much appreciate all the great efforts!
@mehdisaffar6 ай бұрын
I love the content. It has been frustrating to watch some other system design videos where they just brush off over important details and act like everything is straightforward and easy, and just make 10s of services and never really explain the nitty-gritty details of how those things would work and IF they would actually work/be efficient etc. Thank you!
@mehdisaffar6 ай бұрын
I wish you had mentioned the challenges of 2-way syncing in this context. Because this is akin to master-master replication, in case of network partition (for example user makes changes to remote, hops on another offline device, makes changes, then comes back online) there is a chance of inconsistencies (user makes different changes on device 1 vs device 2). There would probably need to be a way to offer merging changes together or have the user choose between version 1 or 2.
@mehdisaffar6 ай бұрын
I think I talked too fast! You did mention reconciliation
@galashrenik34042 ай бұрын
One suggestion I have is that when designing APIs, your videos often highlight the importance of handling partial data, which is typically expected of senior or staff engineers. In my view, API versioning carries a similar level of significance.
@59sharmanalin2 ай бұрын
We didnt outline file sharing feature, is it because of time constraints?
@hello_interview2 ай бұрын
Went with syncing in the video instead since people asked for that in the comments
@noobu8 ай бұрын
Great stuff again! Not only good for interviews but also for daily work: 1) Clear and concise structure 2) Weighs trade-offs rigorously and explains the final decision clearly. Every single component is well thought out with real-world considerations
@AncientArtist72 ай бұрын
Your content is great and really easy to follow through each step of the process. Please continue to make more system design videos. It is extremely helpful !
@amitb29216 ай бұрын
Thanks for the great content, especially the Deep Dive part, which people generally do not discuss. I have one question around storing the chunks as a list in the DB. For a 50 GB file and 5 MB chunks, there will be 10K chunks created, so the chunk list will have 10K entries. Updating one chunk-list column for every chunk status change could be quite challenging. Would it be better to have a separate table for chunks instead? Also, when you match chunks against fingerprints, you need to check 10K entries from the local DB (with a separate, indexed table) vs 10K entries in the chunk list (in a single table column), and the former is more efficient. Kindly let me know your thoughts on the above points.
@hello_interview5 ай бұрын
Sounds reasonable to me! Good call out
@amitb29215 ай бұрын
@@hello_interview Thanks a ton for the response. I have modified my comment above to be bit more clear.
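As a sketch of the separate chunks table proposed in this thread, the following uses an in-memory SQLite database. The table and column names are hypothetical and only stand in for whatever store the real system would use; the point is that a status change touches one row instead of rewriting a 10K-entry list.

```python
import sqlite3
from typing import List


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE files (file_id TEXT PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE chunks (
            file_id     TEXT    NOT NULL REFERENCES files(file_id),
            seq         INTEGER NOT NULL,          -- position of the chunk in the file
            fingerprint TEXT    NOT NULL,
            status      TEXT    NOT NULL DEFAULT 'pending',
            PRIMARY KEY (file_id, seq)
        );
        CREATE INDEX idx_chunks_fingerprint ON chunks(fingerprint);
        """
    )
    return db


def register_file(db: sqlite3.Connection, file_id: str, name: str,
                  fingerprints: List[str]) -> None:
    db.execute("INSERT INTO files (file_id, name) VALUES (?, ?)", (file_id, name))
    db.executemany(
        "INSERT INTO chunks (file_id, seq, fingerprint) VALUES (?, ?, ?)",
        [(file_id, seq, fp) for seq, fp in enumerate(fingerprints)],
    )


def mark_uploaded(db: sqlite3.Connection, file_id: str, seq: int) -> None:
    # A single-row update, independent of how many chunks the file has.
    db.execute(
        "UPDATE chunks SET status = 'uploaded' WHERE file_id = ? AND seq = ?",
        (file_id, seq),
    )


def pending_chunks(db: sqlite3.Connection, file_id: str) -> List[int]:
    rows = db.execute(
        "SELECT seq FROM chunks WHERE file_id = ? AND status = 'pending' ORDER BY seq",
        (file_id,),
    )
    return [seq for (seq,) in rows]
```

The `pending_chunks` query is also what a resumed upload would use to find out where it left off.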
@adityaagarwal53482 ай бұрын
At 50:08, the delta sync approach might work for downloading an updated chunk from S3 using a byte-range query and then updating the file on the local system, but it won't work the other way around, specifically because of S3. S3 objects are immutable, so there will never be a case where a chunk is updated in place. If this question comes up in the interview, should we just mention that we won't sync files larger than some number of GBs, or should we further divide the storage into blob and file-system (S3 and EFS) based on file size and handle the complexity on the server?
@OneSanddman10 күн бұрын
I really love your video series. Just a slight problem to point out here. 50gbs uploaded with 100 mbs should take less than 10 minutes, not an hour 12 minutes.
@bqrkhn3 ай бұрын
Very nice video. A question: You added a updatedAt at each chunk. But chunks are identified with their ID which is calculated from a finger print. When the file changes, the finger print changes, how do we update the updatedAt? Possible Answer: From client we send both old and new chunk IDs and then update both id and updatedAt. Is this the correct strategy?
@fragrancias9723 ай бұрын
Same question here.
@bqrkhn3 ай бұрын
@@fragrancias972 what do you think about my possible answer ?
@insofcuryАй бұрын
@@bqrkhn +1 I think this definitely solves the problem.
@aslgomes5 ай бұрын
Hey Stefan, awesome video, congrats! I've got a quick question though. Around the 49:46 mark, you mention adding an "updatedAt" to a chunk at a specific id/fingerprint. If a chunk changes, its fingerprint/hash/checksum would change too, right? So that id wouldn't really match the changed chunk anymore, would it? Doesn't that mean the old chunk gets "invalidated" and a new chunk id appears? Sorry if I'm missing something obvious here.
@hello_interview5 ай бұрын
No this is spot on, good call out. I was loose here. If the fingerprint is the ID, then an updatedAt does not make sense. If the fingerprint is not the ID, then it of course does. Trade off here of whether you want to keep old chunks around for versioning.
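One way to picture the trade-off discussed in this thread: if the fingerprint is the chunk ID, chunks are immutable, and a file edit produces a new manifest (an ordered list of fingerprints) instead of an `updatedAt` on an existing chunk. The `ChunkStore` class below is a hypothetical in-memory sketch of that option, not the design from the video.

```python
import hashlib
from typing import Dict, List, Optional


class ChunkStore:
    """Content-addressed chunks plus per-file version manifests."""

    def __init__(self) -> None:
        self.blobs: Dict[str, bytes] = {}              # fingerprint -> immutable bytes
        self.versions: Dict[str, List[List[str]]] = {}  # file_id -> list of manifests

    def put_chunk(self, data: bytes) -> str:
        fingerprint = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(fingerprint, data)  # identical content is stored once
        return fingerprint

    def commit(self, file_id: str, chunks: List[bytes]) -> int:
        """Store a new version of the file and return its 1-based version number."""
        manifest = [self.put_chunk(c) for c in chunks]
        self.versions.setdefault(file_id, []).append(manifest)
        return len(self.versions[file_id])

    def read(self, file_id: str, version: Optional[int] = None) -> bytes:
        """Reassemble a version; defaults to the latest one."""
        manifests = self.versions[file_id]
        manifest = manifests[-1] if version is None else manifests[version - 1]
        return b"".join(self.blobs[fp] for fp in manifest)
```

Old chunks stay around until something garbage-collects them, which is what makes versioning possible; any timestamp then belongs on the manifest (the file version), not on the chunk.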
@pragatimodi9505 ай бұрын
Hi Evan, this is my first time giving system design interviews. Really glad I found this channel to learn from. Most of my prior feedback from mocks and system design interviews has been framework related, about how I explain my design. This really helps with that, and I think even at work this is a really good approach to follow for most things. Awesome content, thanks a lot!!!
@krishnabirla162 ай бұрын
You did not talk about version inconsistency? If two clients keep changing their local folders, they will be in a loop of pushing their own sync and pulling the other client's sync. There has to be a timestamp/version based conflict resolution. Maybe a follow up please?
@vaibhavsharma16536 ай бұрын
Amazing. Some Notes: DeepDive: Chunking CDNs Adaptive Polling with only updated chunks Compression.
@crackITTechieTalks6 ай бұрын
This is the best system design video, I have watched!! Specially the deep dives, You nailed it !! Looking forward to watch your videos.
@KITTU16238 ай бұрын
Thank you very much for the videos. One small nit pick. DynamoDB supports a maximum of 400KB per item and if we are storing all the chunk metadata in the item, for a 50GB file with 5 MB chunk size, assuming we need 100Bytes per chunk metadata, our item size would be around 1MB.
@hello_interview8 ай бұрын
Good catch! True
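The arithmetic behind this nitpick, written out. The 100 bytes of metadata per chunk is the commenter's assumption, decimal units are assumed for the file and chunk sizes, and the 400 KB limit is taken here as 400 * 1024 bytes.

```python
FILE_SIZE_BYTES = 50 * 10**9             # 50 GB file
CHUNK_SIZE_BYTES = 5 * 10**6             # 5 MB chunks
METADATA_PER_CHUNK_BYTES = 100           # commenter's estimate per chunk entry
DYNAMODB_ITEM_LIMIT_BYTES = 400 * 1024   # DynamoDB's 400 KB item size limit

# Ceiling division so a trailing partial chunk would still be counted.
chunk_count = -(-FILE_SIZE_BYTES // CHUNK_SIZE_BYTES)
item_size_bytes = chunk_count * METADATA_PER_CHUNK_BYTES
fits_in_one_item = item_size_bytes <= DYNAMODB_ITEM_LIMIT_BYTES
max_chunks_per_item = DYNAMODB_ITEM_LIMIT_BYTES // METADATA_PER_CHUNK_BYTES

print(chunk_count, item_size_bytes, fits_in_one_item, max_chunks_per_item)
```

With these numbers the chunk list alone is about 1 MB, so it would have to be split across several items or moved to one item per chunk.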
@3rd_iimpact8 ай бұрын
I just finished reading the article on this lol. I’ll check out the video as well.
@nobodyknows2285 ай бұрын
1. How can we handle write conflicts when we have a folder which is supposed to be consistent across multiple devices? 2. Also, when two devices are disconnected from the internet and users update some files, how does the sync happen when they come back online and both try to write changes to the same file path at the same time? I am not sure if these solutions work, but I think: 1. We can use a Redis lock for writes, with a TTL equal to or slightly longer than the timeout of the pre-signed URL. If the connection fails midway, we can just resume the upload when connected again. But this might be a problem when a user is uploading big files with long timeouts, since other users might have to wait until the current upload is done. 2. When the user comes back online, we should probably first fetch all the changes that were made on the device, raise conflicts with the user asking what action to perform (similar to git), and acquire the lock to write if required.
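As an alternative to holding a lock for the whole upload, conflicts can be detected at commit time with a version check (optimistic concurrency). This is a different technique from the Redis lock proposed above, shown only as a sketch; `FileHead` and `ConflictError` are hypothetical names.

```python
from typing import List


class ConflictError(Exception):
    """Raised when a commit was based on a version that is no longer current."""


class FileHead:
    """Tracks the current version of one file path on the server."""

    def __init__(self) -> None:
        self.version = 0
        self.manifest: List[str] = []

    def commit(self, base_version: int, manifest: List[str]) -> int:
        # The client states which version its edit was based on. If another
        # device committed in the meantime, the versions no longer match.
        if base_version != self.version:
            raise ConflictError(
                "based on v%d but server is at v%d" % (base_version, self.version)
            )
        self.version += 1
        self.manifest = list(manifest)
        return self.version
```

Uploads of chunk bytes never block each other under this scheme; only the small metadata commit is serialized, and the losing device is the one asked to merge or choose a version, similar to the git-style prompt suggested in point 2.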
@BlunderMunchkin5 ай бұрын
Huh. I would have prioritized consistency over availability. So much so, in fact, that I didn't even think it was a question. Some of the biggest headaches I've experienced as a developer have been caused by having an out-of-date file. I would much rather be temporarily unable to retrieve a file than to be fooled into thinking that the file I retrieved is the correct version.
@satyajeetkumar25882 ай бұрын
Awesome , so simple and elegant . It would have been great if you would have mentioned about checksum implementation to maintain data integrity as you have mentioned in the non functional requirements just to mention not the actual implementation.
@phavelar7 ай бұрын
one can argue that "supporting 50gb upload file size" is a functional requirement (you placed it under non-functional requirement) - just a call out. great video!
@suri4Musiq7 ай бұрын
Loved this resource, thank you so much! But I just wanted to point out that in my interview I was asked about sharing files with other users, and I feel like this design concentrated more on just syncing files across multiple devices. For the former, I think we can talk a little more about the CDN and other approaches which were hand-waved here.
@hello_interview7 ай бұрын
Checkout the write up I linked! I go into sharing there.
@allenputich41923 ай бұрын
You do an amazing job of explaining the thought process, technical details, and growth opportunities!
@adityaagarwal53482 ай бұрын
At 27:24 For determining which files are already available on the local system, can we store a client to files mapping on the server based on client id and then getChanges API uses that data + file metadata to calculate which files needs to be transferred to the client? I know there can be issues when there is a sync gap b/w local and remote like file is deleted on the local but anyway system is eventual consistent. Keeping lots of data on the client will grow the app size.
@TechieTech-gx2kd2 күн бұрын
What Dropbox implements is something amazing: it maintains a server_file_journal, an append-only log per namespace_id that keeps storing any changes made to a particular file. Imagine a text file on which you do CRUD; all of these operations are stored in the server_file_journal. The client simply asks, for this nsId, what is new after a specific checkpoint, which is a pointer named journalId (each client maintains one for its namespace). When it asks what happened after this journal id, the server returns the chunk details (probably a different hash) and the client simply downloads them. "Keeping lots of data on the client will grow the app size." It's not the app size, it's the user data: it's what you want to keep on your machine for quick access while also being able to reach it from remote machines. What you are referring to is something different, which iCloud offers: optimizing storage by keeping only a bare-minimum photo/video thumbnail on the iPhone and fetching the high-definition file when the user requests it.
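The journal-plus-cursor mechanism described in this reply can be sketched in a few lines. This is a toy in-memory model under the assumption of one monotonically increasing journal ID per namespace; the class and method names are hypothetical and are not Dropbox's actual API.

```python
from collections import defaultdict
from typing import Dict, List


class ServerFileJournal:
    """Append-only change log, kept separately for each namespace."""

    def __init__(self) -> None:
        self._log: Dict[str, List[dict]] = defaultdict(list)

    def append(self, namespace_id: str, path: str, chunk_hashes: List[str]) -> int:
        entries = self._log[namespace_id]
        journal_id = len(entries) + 1  # IDs start at 1 and only grow
        entries.append(
            {"journal_id": journal_id, "path": path, "chunk_hashes": list(chunk_hashes)}
        )
        return journal_id

    def changes_since(self, namespace_id: str, cursor: int) -> List[dict]:
        """Entries with journal_id greater than the client's cursor."""
        # Entry i (0-based) has journal_id i + 1, so slicing at `cursor` works.
        return self._log.get(namespace_id, [])[cursor:]
```

A client stores the highest `journal_id` it has applied and sends it as the cursor on the next call, so a change that was already synced is never returned again.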
@Ynno28 ай бұрын
Do you suggest a different delivery framework for system design interviews which aren't necessarily "product"?
@hello_interview8 ай бұрын
Topical! Was chatting about updating the site with that soon. I’d recommend very similar, but core entities and api are what may change as they could be less relevant. Instead I’d frame it as focusing on the inputs and outputs of the system more generally. And then still thinking about the data persisted
@hello_interview8 ай бұрын
I’ll do a pure infra question next
@JyotiKundani05Ай бұрын
This video was really helpful. Amazing work of putting this together and your explanation was on point. Much appreciated!
@hello_interviewАй бұрын
Glad you liked it! 🙂
@krishnabirla162 ай бұрын
Do you not do web socket based design videos intentionally? Can you do some chat apps and video call apps?
@smalladi786 ай бұрын
Thanks for posting these! Great interview as always! I am learning a lot from these interviews. I found it interesting that you jumped ahead in order for the non-functional requirements since you knew the large file upload requirement would impact the design enough that doing the other ones first was not beneficial since they would become irrelevant. Obviously, this comes with actual experience of working on the job. May I suggest doing a follow up that uses the final design from this interview and consider how it may change if you piled on a more advanced feature like syncing only a partial set of folders or sharing folders with other people.
@Marcus-yc3ib2 ай бұрын
Please keep uploading these kinds of videos. Thank you very much.
@ashutoshrana99986 ай бұрын
Will be the best system design interview channel for sure. Neat content. Keep up with the quality Man!
@TatianaRacheva4 ай бұрын
IIRC, low latency was specifically low priority for Dropbox because they (like email) rely on the client syncing the data and user accessing the local copy when it is ready. Also, I question whether consistency is less important than availability. I don’t know, but I’m curious how the answer would be different if latency could be high and consistency had to be strong.
@mindrust2038 ай бұрын
Hey Evan, this content is fantastic, thank you! I have a question regarding your solution to chunking around the 39 minute mark When we ask S3 to fetch us a pre-signed URL, do we do that for all our chunks as well? Does this happen on initial request to upload the file (metadata)? The way the File Metadata entity schema is described, it looks like we have a top-level S3Link, but also chunk-level S3 links embedded in the file metadata, so the upload flow is a little unclear to me
@hello_interview8 ай бұрын
Good question, you're right to be a little confused here. As I alluded to, S3 offers an API called multi-part upload. It requires just 1 presigned URL, but multi-part upload re-stitches the chunks back into a single file in S3, so it does not allow us to send over chunk deltas for syncing. As a result, we have to upload the chunks manually without relying on multi-part upload. So, long answer, but yes, you'd actually need to request a presigned URL for each chunk. I should have made that clearer, but tbh I was not sure in the moment if multi-part upload could be configured to not re-stitch the file, so I omitted it :)
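To illustrate "one presigned URL per chunk", here is a toy sketch. It does not use the real S3 signing scheme or any AWS SDK call: the HMAC signer, `SECRET`, `BASE_URL`, and all function names are hypothetical stand-ins that only show the shape of the flow, including skipping chunks the server already has.

```python
import hashlib
import hmac
import time
from typing import Dict, Iterable, Optional, Set
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"                 # hypothetical signing key
BASE_URL = "https://blob.example.com"   # hypothetical blob endpoint


def _sign(fingerprint: str, expires_at: int) -> str:
    payload = ("PUT\n%s\n%d" % (fingerprint, expires_at)).encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def presign_put(fingerprint: str, expires_at: int) -> str:
    query = urlencode({"expires": expires_at, "signature": _sign(fingerprint, expires_at)})
    return "%s/chunks/%s?%s" % (BASE_URL, fingerprint, query)


def plan_upload(fingerprints: Iterable[str], already_stored: Set[str],
                now: Optional[int] = None, ttl_seconds: int = 900) -> Dict[str, str]:
    """Return one upload URL per chunk that the server does not have yet."""
    now = int(time.time()) if now is None else now
    expires_at = now + ttl_seconds
    return {fp: presign_put(fp, expires_at) for fp in fingerprints if fp not in already_stored}


def is_valid(url: str, now: int) -> bool:
    """Check expiry and signature the way the blob endpoint would."""
    parsed = urlparse(url)
    fingerprint = parsed.path.rsplit("/", 1)[-1]
    params = parse_qs(parsed.query)
    expires_at = int(params["expires"][0])
    expected = _sign(fingerprint, expires_at)
    return now < expires_at and hmac.compare_digest(expected, params["signature"][0])
```

Because each URL is bound to one fingerprint, a URL issued for one chunk cannot be reused to overwrite another.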
@jimitshah76366 ай бұрын
Great video for system design preparation. Methodology, the way he approached the question was good. 5 steps. Pretty good
@xparkyoloАй бұрын
You don't need paid courses when the quality of content for system design on youtube is so high. Amazing explanation clear to the point and tackles
@deathbombs7 ай бұрын
45:45 I wonder how syncing would change if instead of folder status, it's for database writes with many writers
@groovymidnight7 ай бұрын
I really like the 5-step structure, it's the best I've seen and it effectively helps me think through the designs in a methodical way.
@hello_interview7 ай бұрын
Right on! So glad it’s useful
@GabrielAnyaele10 күн бұрын
I really love your videos. I have a question though: are the chunk IDs constant (most likely so)? You mentioned that the chunk IDs are a hash of the bytes of the chunks, so what happens when the chunks are updated? Do we still keep the initial IDs? You put out amazing content, I appreciate it once again
@ramannanda6 ай бұрын
For the delta sync bit, probably should go a bit deeper into rechunking for an existing file, to perform the delta sync.
@dashofdope27 күн бұрын
For the chunking -how many parallel calls would we do? Maybe it doesn't matter?
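On the question of how many parallel calls to make: a common pattern is to cap concurrency with a small worker pool and tune the cap by measurement. The sketch below assumes an `upload_one` callable supplied by the caller (hypothetical) and an arbitrary default of 4 workers, which is not a recommendation from the video.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def upload_all(chunks: Sequence[bytes],
               upload_one: Callable[[bytes], T],
               max_parallel: int = 4) -> List[T]:
    """Upload chunks with at most `max_parallel` requests in flight.

    Results come back in the same order as `chunks`, and the first
    exception raised by `upload_one` propagates to the caller.
    """
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        return list(pool.map(upload_one, chunks))
```

Past the point where the client's bandwidth is saturated, more parallel requests add overhead without finishing the file sooner, so the exact number tends to matter less than having a cap at all.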
@AlbaraaAlHiyari7 ай бұрын
I truly appreciate all the effort you've put into making these amazing videos. Please keep them coming. One insignificant (not important) nitpick. 50 GB @ 100Mbps = ~ 1hr 7min. I think you just forgot to convert the decimal to minutes. You have it correct in the write up, as in 1.11 hours (0.11 * 60 = 6.6 minutes).
@hello_interview7 ай бұрын
Mental math is hard 😛
@AlbaraaAlHiyari7 ай бұрын
@@hello_interview tell me about it... Also not fun under the pressure of an interview 🤣
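The upload-time estimate from this thread, written out so the units are explicit. It assumes decimal units (1 GB = 10^9 bytes, 1 Mbps = 10^6 bits per second) and ignores protocol overhead; the helper names are made up for the example.

```python
def transfer_seconds(size_bytes: int, bits_per_second: int) -> float:
    """Time to move `size_bytes` over a link of the given bit rate."""
    return size_bytes * 8 / bits_per_second


def fmt(seconds: float) -> str:
    hours, rem = divmod(round(seconds), 3600)
    minutes, secs = divmod(rem, 60)
    return "%dh %dm %ds" % (hours, minutes, secs)


FILE_SIZE_BYTES = 50 * 10**9

# 100 megabits per second, the figure used in the video.
at_100_mbps = transfer_seconds(FILE_SIZE_BYTES, 100 * 10**6)

# 100 megabytes per second is 800 megabits per second, a different assumption.
at_100_mbytes = transfer_seconds(FILE_SIZE_BYTES, 800 * 10**6)

print(fmt(at_100_mbps), "|", fmt(at_100_mbytes))
```

The factor of eight between megabits and megabytes is the whole difference between "about 1 hour 7 minutes" and "under 10 minutes".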
@ezwalduzumaki316125 күн бұрын
Begging you to answer... much love: One question regarding non functional requirements, how do you decide which one to pick? You started with uploading large files and not working your non functional from top to bottom, why? What's the intuition behind that?
@charan7752 ай бұрын
how do you handle nested folders in your schema? also chunks could kept as separate table at user id level, so that we can reuse chunks of different files..
@mahdidi9620 күн бұрын
Very important question, but I just noticed: there is no persistence of folder structure, i.e. how the files are organized locally. Say you setup dropbox on a new device. I can see this system syncing the files themselves, but you would lose your folder structure. The new device would contain all the files in the same directory essentially. How would you address this? (unless you did and I missed it)
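One common way to persist folder structure is to give every metadata entry a `parent_id`, so a new device can rebuild full paths from the metadata alone. The entry shape and the `full_path` helper below are hypothetical and are not part of the schema shown in the video.

```python
from typing import Dict, Optional


def full_path(entries: Dict[str, dict], entry_id: str) -> str:
    """Walk parent links up to the root and join the names into a path."""
    parts = []
    seen = set()
    current: Optional[str] = entry_id
    while current is not None:
        if current in seen:
            raise ValueError("cycle in parent links at %r" % current)
        seen.add(current)
        entry = entries[current]
        parts.append(entry["name"])
        current = entry["parent_id"]  # None marks the root folder
    return "/".join(reversed(parts))
```

Storing the full path string on each file row is the simpler alternative, at the cost of rewriting every descendant row when a folder is renamed.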
@fragrancias9723 ай бұрын
Excellent content. Please tell me if I’m mistaken, but I believe GET /files/:fileid would return a list of chunk s3 links, not the file itself. Also, I don’t think merely filtering chunks by update time would work for syncing. You would need a tombstone for when chunks are removed. You didn’t quite specify how “polling the DB”/ update time filtering works with delta sync. Merkle trees could be used to optimize the reconciliation you mentioned, right?
@jayshah2347 ай бұрын
Hi Evan, Thanks for the detailed explanation! Very helpeful! At 40:50, you mentioned that S3 exposes multi-part upload API. Does that mean on client end we don't have to handle chunking and fingerprinting given that we use S3 multi-part? Thanks!
@hello_interview7 ай бұрын
You’ll still do the chunking but it will handle fingerprint checks
@fmagarik3 ай бұрын
I didn't get the event bus part. What's actually stored there?
@pujamishra14757 ай бұрын
I have a product architecture interview coming up. I was really looking for some good product architecture/design examples and then came across this. This is very helpful because you talk about the client, user experience, and malicious users, and relate them to the design decisions made. Thank you! One question: for a product architecture interview, should we go into more detail about the APIs, like explicitly writing out requests, responses, and failure/success codes, or is the amount of discussion you did on APIs enough for senior level? Can you also tell me what topics/points you would add to the discussion in this video if this were asked in a product architecture design round? Thanks again!
@indreshgahoi71038 ай бұрын
Hey Evan, thank you so much for providing the great content. I really love the way you organize and put content across the board. ❤
@pankajk90737 months ago
One question: how do we merge chunks in order after downloading to the local device? Is it a good idea to keep some kind of sequence number for each chunk of a file?
@hello_interview7 months ago
Yah!
@Sandeepg2556 months ago
@@hello_interview Won't this merging logic be too heavy on the client side?
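To give a sense of how much work the merge is, here is a minimal sketch that reassembles downloaded chunks by sequence number. It assumes chunks are keyed by a zero-based index; `reassemble` is a hypothetical helper, and a real client would likely stream to disk instead of holding everything in memory.

```python
def reassemble(chunks: dict[int, bytes]) -> bytes:
    """Concatenate chunks in sequence order, failing if any index is missing."""
    expected = range(len(chunks))
    missing = [i for i in expected if i not in chunks]
    if missing:
        raise ValueError(f"missing chunk indexes: {missing}")
    return b"".join(chunks[i] for i in expected)


# Chunks may arrive in any order when downloaded in parallel.
downloaded = {2: b"ij", 0: b"abcd", 1: b"efgh"}
print(reassemble(downloaded))
```

Under these assumptions the merge is a single ordered concatenation, so its cost grows linearly with file size.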
@venkatamunnangi12878 months ago
Thanks for the effort and videos. Easily one of the best in the business for mocks and educational material.
@kkfun18 days ago
Does the File service create a signed URL for every file chunk to be uploaded?
@haixiongwang4608 a month ago
Will version management of files be out of scope for a 35-45 minute interview discussion? I just want a high-level understanding of the scope of the current system design. Thanks
@god_of_blunder4 months ago
These are the best design videos I have ever found. Thanks and kudos.
@hello_interview4 months ago
❤️
@yuuhameaw15105 months ago
Thanks for the great content! One question though: if we use the chunk fingerprint as an ID, then when the chunk changes, the fingerprint changes too. How are we going to sync them?
@hello_interview5 months ago
Add a new chunk. It's good to keep the old one around for versioning (not a requirement here, so either way works).
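The "add a new chunk" answer can be sketched as a comparison of two manifests, where a manifest is the ordered list of chunk fingerprints for one version of a file. The names `chunks_to_upload`, `old_manifest`, and `new_manifest` are hypothetical and only illustrate the idea.

```python
def chunks_to_upload(old_manifest: list[str], new_manifest: list[str]) -> list[str]:
    """Return fingerprints in the new version that the server has not seen yet."""
    known = set(old_manifest)
    seen: set[str] = set()
    to_upload = []
    for fingerprint in new_manifest:
        # Skip chunks already stored and avoid uploading duplicates twice.
        if fingerprint not in known and fingerprint not in seen:
            seen.add(fingerprint)
            to_upload.append(fingerprint)
    return to_upload


old = ["fp-a", "fp-b", "fp-c"]
new = ["fp-a", "fp-x", "fp-c"]  # the middle chunk was edited
print(chunks_to_upload(old, new))
```

Because the old manifest is left untouched, the previous version stays reconstructible if versioning is wanted later.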
@ahmedkhan256 months ago
Excellent sys design interviews - I like the informative tone and clear approach - thanks
@RS7-12328 days ago
So all in all, are you suggesting we maintain chunks so that we can upload only the changed chunks rather than the whole file? If so, it means we possibly can't use S3 multipart upload, since that always assembles the chunks back together. Did I understand it right?
@jeremyklein9537 months ago
Really good approach. I love how you build up to the full solution. It makes a lot of sense to me and helps me reason about these complex systems as well.
@TomasV24710 days ago
Hi! Thanks for these videos. They are great! The depth and how trade-offs are presented is really helpful. Out of curiosity, how would you adjust this to support users worldwide? Imagine a user uploading everything in North America and then moving to Europe and retrieving their files. Would you use something like DDB Global Tables and a CDN? Or do you think there is a better approach? You could maybe get away with the latency by uploading new files to the new region and just accepting that downloads of old files are going to be slow (which may be OK, as users will probably already have the files locally?), but the DB records needed to get the files for each user would still need some kind of global replication, or maybe tracking of where the client was previously connecting? And then migrating records and files to the new region?
@viveksharma-tt5nj a month ago
Simply amazing!! Thanks a lot for such a clear and concise explanation!
@HiraMalik-r3i4 months ago
Thanks for such a detailed video. Query: if the File service is pushing the change events to the Event Bus and also updating the DB in some way, wouldn't this lead to the dual-write problem? I do not see the same event being consumed by both (there are no message queues in that flow in the diagram). Shouldn't we instead use CDC or some other solution for this problem? What are your thoughts on that?
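One common alternative to CDC for the dual-write concern is a transactional outbox: the metadata update and the event are written in one database transaction, and a separate relay publishes the event afterwards. The sketch below uses SQLite purely for illustration; the table names, `mark_uploaded`, and `relay_once` are assumptions and not part of the design in the video.

```python
import json
import sqlite3


def init(conn: sqlite3.Connection) -> None:
    """Create a hypothetical metadata table and an outbox table."""
    conn.execute("CREATE TABLE files (file_id TEXT PRIMARY KEY, status TEXT NOT NULL)")
    conn.execute(
        "CREATE TABLE outbox ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )


def mark_uploaded(conn: sqlite3.Connection, file_id: str) -> None:
    """Write the state change and its event in one transaction."""
    with conn:  # commits both rows together, or rolls both back on error
        conn.execute(
            "INSERT OR REPLACE INTO files (file_id, status) VALUES (?, ?)",
            (file_id, "uploaded"),
        )
        event = {"type": "file_changed", "file_id": file_id}
        conn.execute("INSERT INTO outbox (payload) VALUES (?)", (json.dumps(event),))


def relay_once(conn: sqlite3.Connection, publish) -> int:
    """Publish unpublished events in order, marking each one after it is sent."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # stand-in for the event bus client
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


conn = sqlite3.connect(":memory:")
init(conn)
mark_uploaded(conn, "file-1")
sent = []
print(relay_once(conn, sent.append), sent)
```

If the relay crashes after publishing but before marking the row, the event is sent again on the next pass, so consumers under this scheme would need to tolerate duplicates.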
@MrSnackysmorez4 months ago
I love the videos, and these are some of the best explanations. I love the flow and how everything builds on each other. It makes these problems much more manageable. However, you are driving and dictating this, and it is so much harder to do when the interviewer wants to constantly interrupt and ask questions during these steps without first letting you explain what you are doing. This happens to me pretty often. How can you tell them to just chill and let you proceed? Appreciate these videos!
@Vancez-z2h15 days ago
As for syncing updated files on other clients, how does the client know the files are updated?
@Resocram7 months ago
If you split the files into chunks are you able to upload all chunks using the same pre-signed URL? Or do you need to generate a new URL for each chunk? How would you piece together the file from S3 when you download it through chunks?
@hello_interview7 months ago
S3 has a multipart upload API that only requires 1 pre-signed URL. Depending on whether you need the chunks as chunks in S3 or not, you can use that (it stitches them back together automatically). If you want to save the chunks as chunks, you need N pre-signed URLs.
@eforeyerman3 months ago
Are there any nitty-gritty details we need to know about auth for when the client talks to S3 directly on behalf of the file workflow? Or is that all handled by the pre-signed URL?
@dannyryngler64255 months ago
Question - what should the file id be? It can't be based on the file name, as names can change. It also couldn't be a hash of the whole file, as the file itself can obviously change. Amazing content, thank you!!
@hello_interview5 months ago
It depends on whether you want versioning or not. It can be the fingerprint or a random UUID, depending on the requirements.
@vijaykhurana87668 months ago
Great content. Thank you for posting. One of the best system design videos I have come across for this design.
@jherreria6 months ago
I really appreciate your help on this topic. I'm learning a lot! Keep the videos coming!
@SunilKumar-jl6dl4 months ago
Hey there, I have some questions. It would be great to get your thoughts: 1. S3 supports multipart upload, and all the chunks would get reconstructed into a single file in S3. Isn't this correct? If yes, then having file chunks in the database would be redundant, right? Or would S3 always keep the chunks and give the client access to download them? 2. At the client end, how should we know how the updated/deleted chunks of a previously uploaded file are stitched back together? 3. Would folder sharing with other users be a possible follow-up question? Like what Google Drive offers.
@danielkling46473 months ago
First I would like to say that this content is excellent. Why though would you implement chunking yourself instead of using S3's multipart upload?
@evangeloskostopoulos81737 months ago
This is really awesome, thank you. Please keep them coming!