System Design Interview: Design Top-K Youtube Videos w/ a Ex-Meta Senior Manager

54,895 views

Hello Interview - SWE Interview Preparation

A day ago

Comments: 218
@shashankreddy1186
@shashankreddy1186 2 ай бұрын
Great video! I found a small correction: @31:54 the last video should be coming in at 5:04, not 6:04
@kirankjoseph
@kirankjoseph 2 ай бұрын
Have been trying to understand that piece by replaying many times. 5:04 makes sense, not 6:04
@hello_interview
@hello_interview 2 ай бұрын
@@kirankjoseph Thanks! Sorry about that. Pinning this so others don't run into the same issue.
@2sourcerer
@2sourcerer 20 күн бұрын
Thanks! That confused me quite a bit: why would I need to decrement it? I had to consult ChatGPT to realize it is a sliding window.
@travel-hacker-yo
@travel-hacker-yo 6 ай бұрын
I made it to Meta E5 thanks to your channel and the site. Without you it wouldn't have been possible. I am always grateful for your work 🙏
@hello_interview
@hello_interview 6 ай бұрын
Amazing. Congrats!
@noextrasugar
@noextrasugar 6 ай бұрын
Congrats 🎊 What did you get to design? It’d be nice to know how you approached it
@uhsarpa517
@uhsarpa517 4 ай бұрын
@prabharans5258 I have the same interview coming up on Friday. Can you share your LinkedIn?
@abhijit-sarkar
@abhijit-sarkar Ай бұрын
Unfortunately, this video left me with a lot of unanswered questions, as some portions seemed intentionally or unintentionally hand wavy. This is one of the most confusing videos on the channel.
@Rocky-xb3vc
@Rocky-xb3vc 3 күн бұрын
I watched the first part and was confused by your comment, everything seemed straightforward. But at 29:00 - time window, your comment started to click just as the content became confusing. In any case, thank you Hello Interview very much, I'm preparing for a systems design interview and your videos are invaluable.
@falgunijhaveri6623
@falgunijhaveri6623 6 ай бұрын
The windowing part is too hard to follow. I would instead lean on a data streaming solution and store into a persistent DB at frequent intervals.
@jubinantony1264
@jubinantony1264 4 ай бұрын
This is what I thought too: use a sharded database which can be updated at regular intervals by buffering the data in Kinesis Firehose
@AnthM33
@AnthM33 4 ай бұрын
This is what I'm wondering now. Why not just use a stream processor capable of doing these aggregations? It's basically the suggested route anyway, right? I'm curious why not do it that way.
@alexs591
@alexs591 3 ай бұрын
The windowing IS data streaming. That's the abstraction that most streaming systems provide: you get a batch of records from a window of time. You can also do this continuously record-by-record, but that radically increases the counter updates you need, proportional to views rather than to time windows. Internally you have a sharded log, and it gives you the last N seconds of log as your window.
@avenged7ex
@avenged7ex 2 ай бұрын
What you describe is the earlier approach he mentions with storing in a time-series DB. His final solution still utilizes data streaming via Kafka. However, the optimization he makes is having several consumers with different jobs: one that increases counts across each interval heap (second, minute, hour, day), and then a consumer for each interval that decreases the counts for each view it encounters, delayed by that interval.
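To make the increment/decrement idea above concrete, here is a minimal Python sketch (illustrative only; the class and method names are made up, not from the video). One consumer at the head of the stream calls rising_edge, and a lagging consumer replays the same events window_s seconds later and calls falling_edge:

```python
import heapq
from collections import defaultdict

class WindowedCounter:
    """Counts views in a sliding window: +1 on the rising edge, and -1
    when the lagging consumer replays the same event window_s later."""

    def __init__(self, window_s: int):
        self.window_s = window_s
        self.counts = defaultdict(int)

    def rising_edge(self, video_id: str) -> None:
        # Called by the consumer reading the head of the stream.
        self.counts[video_id] += 1

    def falling_edge(self, video_id: str) -> None:
        # Called by the lagging consumer once the event is window_s old.
        self.counts[video_id] -= 1
        if self.counts[video_id] == 0:
            del self.counts[video_id]

    def top_k(self, k: int):
        # A full scan is enough for a sketch; the video keeps a heap
        # up to date instead of re-scanning on every read.
        return heapq.nlargest(k, self.counts.items(), key=lambda kv: kv[1])

# One counter per window granularity, all fed from the same view stream.
windows = {"minute": WindowedCounter(60),
           "hour": WindowedCounter(3600),
           "day": WindowedCounter(86400)}
```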
@kedikeba
@kedikeba Ай бұрын
@falgunijhaveri6623 - I have encountered this question in my SDI. I provided a Flink solution, but I was questioned on how the aggregation works. I think these solutions are good, but knowing how they work proves to the interviewer that you didn't just cram a solution and throw it at them. I paused the video several times to work out how the windows work, and that worked for me. You may want to do the same.
@VY-zt3ph
@VY-zt3ph Ай бұрын
Please create videos for these topics. I love the way you approach System Design problems and it really helps to build intuition.
1. Payment gateway like Stripe
2. Payment system like PayPal
3. Stock Exchange System (could not understand the Alex Xu book chapter)
4. Wallet Service (could not understand the Alex Xu book chapter)
@draconyster
@draconyster 4 ай бұрын
The only good one of the proposed approaches is the aggregation one, because it is actually practical and handles failures. The one with decrementing the counters is pretty bad, because it literally doubles processing and also requires retaining events after they have been processed so they can be processed a second time. In particular, it doesn't scale to larger time windows.

The best way to start this question is to negotiate with the interviewer how fresh the count should be. For large services it makes absolutely no sense to do all this on a per-minute basis. If the acceptable lag is e.g. 30 minutes, so that the top-k remains the same for 30 minutes, then the aggregation buckets can also be much wider.

Also, the usual follow-up to this question is to support top-k based on search, so it is not just global top-k but also filtered by something. With that follow-up the heap approach becomes unusable, but aggregations and indexes still work.
@hello_interview
@hello_interview 4 ай бұрын
From a practical perspective, you're right - which is why real-time analytics are pretty rare. They're expensive and the marginal cost doesn't always yield additional value. From an interview perspective, you aren't always able to relax requirements to suit a particular solution. Generally speaking it's easiest to have a reasonable "toolbox" of approaches that you can apply in a wide variety of situations. Appreciate the thoughts!
@andrehil
@andrehil 13 күн бұрын
Isn't the heap approach useless even before that? When you add the date filter it makes no sense to have it because the date parameter can completely change the result.
6 ай бұрын
I've been waiting for this video! Never been so excited for a youtube video!
@anandnarasimha1089
@anandnarasimha1089 6 ай бұрын
omg same here... I have been eagerly waiting for this video
@firezdog
@firezdog 5 ай бұрын
the initial explanation of a sliding window isn't that clear to me. my thought was that if you're asking for the top k most popular videos *today*, it could be relevant, but i have trouble seeing how it would be relevant if the time period has a fixed start and end.
@kanesweet6585
@kanesweet6585 4 ай бұрын
Also confused by this lol. Wasn’t really mentioned later either
@BillieLXU
@BillieLXU 21 күн бұрын
I think that's what he meant at the beginning when he said that there shouldn't be an arbitrary period (you can't use 15 April - 15 May if the current date is 28 June), but rather everything has to be anchored to today. I don't think his explanations, or rather clarifications, were that clear. I had to watch it quite a few times to really understand what he meant... if he was my interviewer I think I'd be doomed at the requirement clarification stage.
@KENTOSI
@KENTOSI 6 ай бұрын
This was extremely helpful. I learnt so much, especially about using sliding windows to update time/lag offsets. Thank you!
@bytebrews
@bytebrews 4 ай бұрын
How about using a distributed OLAP database like Apache Druid/Pinot? We can process the incoming streams using any stream processor like Flink/Kafka Streams to do windowed aggregation (1-min windows) and write the results to our distributed OLAP store. On the read side we just have to query the results ordered by max views. Since we are using an OLAP DB, that query will be very fast.
@hello_interview
@hello_interview 4 ай бұрын
You need to constrain the amount of processing required at query time. In a worst case scenario doing 1 min aggregations across 1B videos, you're aggregating over terabytes of data. There are ways to reduce this but they introduce some other headaches you need to solve!
@vishalgautham
@vishalgautham 4 ай бұрын
@@hello_interview Hi, I'm just trying to follow. Why would there be terabytes of data if we do it across 1B videos? We are only looking at the video Id and their counts for windowed aggregation, correct? If this was the case, won't ad click aggregator with a very similar design have an issue with terabytes of data? how is this solution different here then? And why does it work fine for ad click aggregator? The same could happen with ads as well, when we are trying to go over 1B ads in 1 min aggregations resulting in terabytes of data "Some other headaches" - can you please clarify? What are they? how to solve, if any? Seemed a little vague.
@hello_interview
@hello_interview 4 ай бұрын
​@@vishalgautham These are a lot of questions. Why terabytes of data? Because you need to aggregate over thousands of minutes for e.g. 1 day. You then need to sort this aggregated dataset, in near-realtime. How do you solve them? The answer described in the video :). We want to minimize the number of operations required at read time. The ad click aggregator is a different problem. It has lesser read-time latency requirements and more general queries possible - hence, less optimizations available.
@vishalgautham
@vishalgautham 4 ай бұрын
@@hello_interview thanks for the clarification 🙏 appreciate it! :) It just seems like the ad click aggregator and this problem are very closely related - so if we have similar requirements, maybe we can follow a similar design? I just googled and found out that the volume of ad views and the number of YouTube videos watched in a day are of the same order. If users are watching a video, they potentially see an ad as well. But yeah, not everyone will click on the ad, so the volume goes down. Also, since we end up sorting or heapifying in top-k, that makes it slightly more complex.
@hello_interview
@hello_interview 4 ай бұрын
@@vishalgautham They are somewhat similar, but this problem in particular has more constrained query patterns. The OLAP-style solution for the ad click aggregator works for *a lot* of query patterns but doesn't give you as much room for optimization. The interviews themselves may seem similar on the surface but the interviewers are often looking for something different - best to clarify before you start. Hope that helps!
@rushio8673
@rushio8673 23 күн бұрын
Will you please create a video explaining the deep dive part in more detail? It's a bit hard to visualize everything from the client perspective and the stream perspective simultaneously; it's definitely easier for you with the knowledge you have.
@arneishprateek6444
@arneishprateek6444 6 күн бұрын
The time window aggregation part went completely over my head
@insofcury
@insofcury 2 ай бұрын
This is awesome. Gives me the POV of a senior manager
@dp8jl
@dp8jl 2 ай бұрын
Love these videos, best content I have seen. Feel bad for the people who are missing out on this channel
@VY-zt3ph
@VY-zt3ph Ай бұрын
The best of all system design blogs and vlogs
@ankurbalar
@ankurbalar Ай бұрын
At 25:45 you took the Top K Service out. Given this is just 1 instance of the Top K service, do we still need a load balancer before it? Don't we need a load balancer after it, before the sharded counter service instances?
@lagneslagnes
@lagneslagnes 5 ай бұрын
About the comment at 13:00 on how some candidates design for scale before solving the actual problem: for some problems that require very specific knowledge (like this one), a fallback is to literally design for scale and show your general infrastructure knowledge as an attempt to pass the interview. So it's a tactic that some people might consider after trying to think of a real solution and not getting any insights or hints on how to start.
@hello_interview
@hello_interview 5 ай бұрын
Yeah, can definitely empathize with that, but for most FAANG it's not a good strategy for the interview. Better to take a step back and think some more - a lot of candidates feel pressure to be constantly moving when, if they took a couple extra minutes, they might think of a better solution.
@lagneslagnes
@lagneslagnes 5 ай бұрын
​@@hello_interview The topK problem is not something one can think up a good solution to without prior experience with the topK problem or a lot of hints. Look at a sample system design video (same topK problem) released by Google: kzbin.info/www/bejne/iWKnp3iah5Wci9E The interviewee showed a lot of knowledge and skills even though she didn't truly explain how the topK would work -- a sometimes winning strategy for problems like this that require specialty knowledge.
@guohongqiang552
@guohongqiang552 6 ай бұрын
This is really helpful. Thank you so much. So far, all the videos are about general system design. Could you please do one for ML system design, maybe design a recommendation system?
@hello_interview
@hello_interview 6 ай бұрын
Coming in the next couple months!
@ellenq-68
@ellenq-68 27 күн бұрын
@@hello_interview Looking forward to this one!
@hello_interview
@hello_interview 27 күн бұрын
@@ellenq-68 Haven't forgotten about this! Just had some other stuff come up.
@azb1011
@azb1011 2 ай бұрын
Thanks for the video. A few maybe slightly stupid questions: 1. How would you store the heap? Is it simply stored in memory? 2. With several shards and aggregation, how do you guarantee 10-100ms latency?
@hello_interview
@hello_interview 2 ай бұрын
In memory! These are very simple operations that can take place in close geographic proximity so 100ms is doable. The cache makes this easy though!
@azb1011
@azb1011 Ай бұрын
​@@hello_interview Thanks. >> In memory! With the in-memory approach it is easy to lose the heap, isn't it? How do we solve durability issues then? >> These are very simple operations that can take place in close geographic proximity so 100ms is doable. The cache makes this easy though! Is this true even for the design with the front service that calls all shards for sub-results?
@SaurinShah1
@SaurinShah1 6 ай бұрын
One question I have with this problem and your solution: the top-k heap will constantly need to be updated with views coming in for the videos inside it. I.e. let's say you have 3 videos with views as [v1: 100, v2: 150, v3: 200]; if we get a chunk of views for v1 (say 75), the new heap will look like [v2: 150, v1: 175, v3: 200]. How would the heap handle modifying values of videos that already exist within the heap? Another problem with a heap is that reading the entire heap is usually on the order of n log n and would require you to poll the entire heap (unless there is some other method I am not aware of).
@hello_interview
@hello_interview 6 ай бұрын
This is a very common DSA pattern: you maintain a hash table of the nodes in the heap. This allows you to look up nodes by key in O(1). The vast majority of updates are not going to change the ordering and are constant time. The updates aren't a problem. Iterating over a heap (which is backed by a binary tree or array) can be done in linear time.
@SaurinShah1
@SaurinShah1 6 ай бұрын
​@@hello_interview Firstly, thanks so much for the comment :) I love your videos and am learning a lot from them. Not sure I follow (and it may just be my lack of grasping, tbh). "You maintain a hash table of the nodes in the heap" - ok, but how does the ordering of nodes in the heap update with a change in the hash table? In real terms, let's say a video goes viral. How does it make its way into the heap and move toward the back of the heap (assuming a min-heap) as it becomes more popular?

My basic question is: a heap, from what I understand, is a data structure that stores data in a manner such that you always have the min/max on top. As you poll elements, you get them in sorted order. If an element within the heap needs to be updated, then you need to remove that specific element (which wouldn't be easy, as it would require restructuring the tree underneath). So, for our use case, when more views of a video come in, we need to update elements within the heap, which would entail removing all elements in the heap and re-adding them after adding the new views we've received.

An alternate solution algorithmically (or maybe this is what you are alluding to) would be a TreeMap of (view count -> Set) as the "heap", along with a Map of (video -> view count) [backed by SQL or an in-memory DB]:
1. When new view counts come in for a video, store the existing view count of the video and then update it based on the new view counts.
2. If the new view count > min(view count of treeMap), treeMap.get(newViewCount).add(video)
3. If the original view count > min(view count of treemap), treeMap.get(originalViewCount).remove(video)
4. Prune the treemap to only contain K videos.
Also, this process needs to be batched to avoid concurrency issues.

"Iterating over a heap (which is backed by a binary tree or array) can be done in linear time." Understood. We could maybe cache the results with a TTL of, let's say, 30s so that we are not doing it again and again.
@hello_interview
@hello_interview 6 ай бұрын
So as a nudge: this isn't a distributed system problem and you don't need to invoke SQL, you can do this in a few dozen lines of your favorite language. I'd highly recommend you try implementing a heap. Once you do, it'll be clearer how you might implement an increment(key) and decrement(key) that run in constant time iff the ordering of elements doesn't change.
@adrienbourgeois108
@adrienbourgeois108 3 ай бұрын
Sorry, but @SaurinShah1 is absolutely right. What you are doing with the heap is not mathematically correct. @SaurinShah1 is right to point out that when the value of a node that is already in the heap changes, your design essentially falls apart. Even if you have a hashmap to access the nodes in the tree in constant time, how does that help? You still need to recompute your heap tree to keep it correct, as the order may have changed. So although in your video you suggest that processing a view has a complexity of O(log(k)) (constant time to update the count hash map, plus a potential push/poll in the heap), in reality, if you want to keep correctness, the complexity is O(n), as every time you update a count you need to rebuild the entire heap tree.
@TheSanSanch
@TheSanSanch 2 ай бұрын
@@adrienbourgeois108 @hello_interview gives very good advice to try implementing a heap yourself. You can maintain the heap structure after changing a specific node. Insertion and deletion are easily done in O(log(n)), in some implementations even faster (see the Strict Fibonacci heap with O(1), for instance). You don't need to rebuild the whole tree; you just need to heapify one element. What's more, in some implementations decreasing a node's value in a min-heap can be done in O(1).
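To make the thread above concrete, here is a minimal sketch (Python, illustrative only, not the video's code) of a fixed-size top-k min-heap with a position index; updating a key only sifts the touched node, so most count changes cost O(log k) or less:

```python
class TopKHeap:
    """Min-heap of [count, video_id] holding at most k items, plus a
    position index so a node can be re-heapified in O(log k) after its
    count changes. Most count changes don't alter the ordering at all."""

    def __init__(self, k: int):
        self.k = k
        self.heap = []   # list of [count, video_id]
        self.pos = {}    # video_id -> index in self.heap

    def _swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
        self.pos[self.heap[i][1]] = i
        self.pos[self.heap[j][1]] = j

    def _sift_up(self, i):
        while i > 0 and self.heap[i][0] < self.heap[(i - 1) // 2][0]:
            self._swap(i, (i - 1) // 2)
            i = (i - 1) // 2

    def _sift_down(self, i):
        n = len(self.heap)
        while True:
            smallest = i
            for c in (2 * i + 1, 2 * i + 2):
                if c < n and self.heap[c][0] < self.heap[smallest][0]:
                    smallest = c
            if smallest == i:
                return
            self._swap(i, smallest)
            i = smallest

    def update(self, video_id: str, count: int):
        """Set the current count for video_id, keeping only the top k."""
        if video_id in self.pos:
            i = self.pos[video_id]
            self.heap[i][0] = count
            self._sift_up(i)
            self._sift_down(i)
        elif len(self.heap) < self.k:
            self.heap.append([count, video_id])
            self.pos[video_id] = len(self.heap) - 1
            self._sift_up(len(self.heap) - 1)
        elif count > self.heap[0][0]:
            evicted = self.heap[0][1]
            del self.pos[evicted]
            self.heap[0] = [count, video_id]
            self.pos[video_id] = 0
            self._sift_down(0)
```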
@michaelkaspiarovich9244
@michaelkaspiarovich9244 6 ай бұрын
What if the size of the request (window) can be arbitrary?
@hello_interview
@hello_interview 4 ай бұрын
You're back to approaches used in time series databases (like the indexes and aggregations described earlier in the video)!
@JohnVandivier
@JohnVandivier 6 ай бұрын
Interesting note about implementing this Kafka solution in a brownfield: you can have a light wrapper in the Kafka value that adds an arbitrary value determined from offline historical analysis with no notable performance or availability or consistency issues!
@pldoto1993
@pldoto1993 Ай бұрын
27:24 How do we ensure that the view stream is partitioned in the same way your counters are? Are we going to have multiple Kafka clusters, with each cluster pertaining to one counter?
@jiananstrackjourney370
@jiananstrackjourney370 4 ай бұрын
Can I ask what the reason is that you chose a blob store for the checkpoint instead of a database? Is it the data size limit?
@hello_interview
@hello_interview 4 ай бұрын
These are going to be many-gigabyte payloads of binary data that need to be extra durable but we can tolerate a few hundred ms of latency. This is a *perfect* use-case for blob storage!
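As a rough illustration of such a checkpoint (Python with boto3; the bucket and key names are hypothetical, and the video's real checkpoints would be larger binary snapshots rather than JSON):

```python
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "topk-checkpoints"  # hypothetical bucket name

def save_checkpoint(shard_id: str, kafka_offset: int, counts: dict, top_k: list) -> None:
    """Periodically snapshot the counter state plus the stream offset it reflects."""
    body = json.dumps({"offset": kafka_offset, "counts": counts, "top_k": top_k})
    s3.put_object(Bucket=BUCKET, Key=f"checkpoints/{shard_id}/latest.json", Body=body)

def load_checkpoint(shard_id: str) -> dict:
    """On restart, restore state and resume consuming from the saved offset."""
    obj = s3.get_object(Bucket=BUCKET, Key=f"checkpoints/{shard_id}/latest.json")
    return json.loads(obj["Body"].read())
```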
@Ryan-g7h
@Ryan-g7h 5 ай бұрын
I’m lost at the part where we need to know which click counts as a view
@AbraKadabra-lr9nq
@AbraKadabra-lr9nq 4 ай бұрын
Great work there. Much appreciated 💯
@Drorbrook
@Drorbrook 6 ай бұрын
Hey Stefan, great video, well explained and articulated. 2 questions for you: 1. Can you expand on the cache a bit more? What would the cache data structure look like in practice? 2. For the falling edge, when we decrement our counter from all heaps, technically speaking, is that done by consuming another Kafka event which we publish 1 hour or 1 day later?
@sanskarmani9094
@sanskarmani9094 6 ай бұрын
I'm extremely glad this video is out. The written explanation honestly had me lost on how we're handling the edge case (when we say the solution is to increase the size of the heap when potentially everything can be falling off; this wasn't really called out in the original text).

I have been loving the entire content of Hello Interview, and given my constant reading on system design, this is definitely the best place anyone can really study. You guys are so concise and clear, and you present even complex solutions in such a digestible manner. As much as I want to gate-keep this content for myself, I want to see you guys explode (honestly the rarest feeling I ever get).

Tl;dr: you guys rock, and thank you so much for the amazing content. Honestly, I would love to contribute to your vision if that's a possibility at all. If you end up reading this and want a few extra hands on deck, let me know and I would be super happy to help.
@andrii_popov
@andrii_popov 6 күн бұрын
It would be interesting to know: does this problem have any good solution for arbitrary top-K time intervals, and for top-K under a category/sub-category, while keeping the response latency within the given limits of tens of ms? Please give some hints!
@k.alipardhan6957
@k.alipardhan6957 6 ай бұрын
I got asked this question in the social media context last week, wish I had this video haha
@TatianaRacheva
@TatianaRacheva 5 ай бұрын
Me too, I got asked this question (similar - top 100 Spotify songs) last Tuesday
@zeningc
@zeningc 3 ай бұрын
Fantastic video! I really enjoyed it and found it very informative. I have a couple of questions, if you don't mind: 1. Does the Count-Min Sketch support decrement operations? 2. Regarding the 'top k heap', is it implemented as a traditional heap (like a max/min heap in data structures), or is it more of a 'sorted list'? My understanding is that heaps only maintain the maximum/minimum element, and the second max/min is only accessible when the top element is removed. Thanks so much for your time!
@hello_interview
@hello_interview 3 ай бұрын
1. Partly. Some of the guarantees won't hold in the general sense (e.g. you could conceivably have negative counts) but in this case where we're only removing what we add it does. 2. Think of heaps like trees with special operations to maintain the heap property. All elements are accessible.
@zeningc
@zeningc 3 ай бұрын
@@hello_interview thank you! that's really helpful
@antonsiardziuk
@antonsiardziuk 4 ай бұрын
UPD: I was incorrect =) Wouldn't a count-min sketch be incompatible with the last design where we subscribe to the lagging topic? You can't decrement a count-min sketch, just as you can't remove an item from a Bloom filter, I think? (At 46:50 you suggest decrementing a value in the count-min sketch.)
@hello_interview
@hello_interview 4 ай бұрын
You should be able to decrement for CMS in this instance. Given we're removing exactly the elements we added, the errors should be unbiased. You can't do this with a bloom filter because you're making a stronger statement about elements (e.g. I've *never* seen this element) which might be false if you happen to remove elements associated with each of its positive bits. Appreciate corrections if I'm wrong about this :)
@antonsiardziuk
@antonsiardziuk 4 ай бұрын
@@hello_interview You are correct and my intuition was wrong :). Unlike setting a bit which might be already set, the plus one can of course be undone by minus one.
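For readers following along, a tiny count-min sketch in Python (illustrative only). Decrementing simply undoes an earlier increment on the same cells, which is why the error stays unbiased when we only remove what we previously added:

```python
import hashlib

class CountMinSketch:
    def __init__(self, width: int = 2048, depth: int = 4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, key: str):
        # One hashed column per row; md5 keeps the sketch dependency-free.
        for row in range(self.depth):
            h = hashlib.md5(f"{row}:{key}".encode()).hexdigest()
            yield row, int(h, 16) % self.width

    def add(self, key: str, delta: int = 1) -> None:
        # delta=+1 on the rising edge, delta=-1 on the falling edge.
        for row, col in self._cells(key):
            self.table[row][col] += delta

    def estimate(self, key: str) -> int:
        # Upper-bound estimate: the minimum over all rows.
        return min(self.table[row][col] for row, col in self._cells(key))
```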
@nelsonn5123
@nelsonn5123 3 ай бұрын
I’m not sure if this has been discussed in relation to Kafka streaming (KSQL DB), but what happens when some consumers of a topic are slower at consuming? This could result in inaccurate counters. I’m also unclear if the suggestion about lagging offsets is meant to address the issue of delayed consumers. 🤔
@AbhimanyuSeth
@AbhimanyuSeth Ай бұрын
Great video and great explanation! Loving your videos. Had a doubt for this one though... Generally, I understand the Using two-pointer solution. We start the falling edge consumer exactly 1-minute/1-hr/24-hours behind the rising edge consumer. But how do we make sure the falling edge consumer is processing at a rate such that it stays behind the rising edge consumer as per the time-window? The rising edge consumer can only go as fast as new events are coming on the stream. But the falling edge consumer has at least 1-minute or more of data to process and might process that faster than the rising edge consumer is able to and close the gap. I see the write-up mentioned about pausing the consumer, but it's not clear how the falling edge consumer is going to keep track of where rising edge consumer is, and how the consumer is going to pause and resume at precisely the right times.
@hello_interview
@hello_interview Ай бұрын
There's no coordination needed. The falling edge has to be consuming events that are at least {time window} old. It can't catch up because it's going to be waiting until events are at least that old.
@AbhimanyuSeth
@AbhimanyuSeth Ай бұрын
The falling edge knows it's 1 minute behind the rising edge. When it starts it is 1 minute behind the rising edge consumer, so it can process the messages for that minute only. But after that, I'm not clear how it processes minute by minute without knowing which minute the rising edge is processing?
@hello_interview
@hello_interview Ай бұрын
@@AbhimanyuSeth Either (a) you can just assume it's always the current time and handle the cases where it falls behind, or (b) you can use a local variable for the "last processed time" from the rising edge and simply subtract the time window.
@hello_interview
@hello_interview Ай бұрын
The reality is if the counters fall behind, you're going to have bigger consistency issues. There are some non-trivial issues maintaining coherence for any solution when your processors aren't in sync.
@AbhimanyuSeth
@AbhimanyuSeth Ай бұрын
Thank you! Makes sense
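A small sketch of the pacing logic discussed in this thread (Python; the consumer and counter objects are hypothetical stand-ins). The falling edge never coordinates with the rising edge; it simply refuses to process an event until that event is at least one window old:

```python
import time

WINDOW_S = 3600  # e.g. the 1-hour window

def run_falling_edge(consumer, counter):
    """Decrement each view roughly WINDOW_S seconds after it was counted."""
    for event in consumer:                    # each event carries its view timestamp (seconds)
        age = time.time() - event["ts"]
        if age < WINDOW_S:
            time.sleep(WINDOW_S - age)        # wait until the event is old enough
        counter.decrement(event["video_id"])  # undo the rising-edge increment
```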
@FanGuo-b7f
@FanGuo-b7f 6 ай бұрын
Really really awesome videos, thanks for sharing.🎉
@PedanticAnswerSeeker
@PedanticAnswerSeeker 26 күн бұрын
Can someone explain the last bit? I have no idea what he is trying to say. I don't understand the deep dive solution at all.
@ItsMeIshir
@ItsMeIshir 6 ай бұрын
Thanks for the video, Stefan.
@hello_interview
@hello_interview 6 ай бұрын
Glad you liked it!
@satishmhetre5542
@satishmhetre5542 27 күн бұрын
So, in the case of the aggregation approach, are we going to keep multiple heaps (for the last minute, hour, day, ...)?
@Harry-p3d
@Harry-p3d 3 ай бұрын
Can the aggregation method 32:06 meet the latency requirement for read latency (10 - 100 ms)? As for a 60-min aggregation, we need to process 60 * 3.6 billion data points. Can this be done within 100 ms?
@davidoh0905
@davidoh0905 6 ай бұрын
When discussing staleness and consistency, are you considering them as corresponding topic? i.e. does consistency mean consistency between view event happening and the corresponding count being available in topK structure? and that the staleness requirement constraints our design to go for immediate vs eventual consistency between the event happening and topK structure? thus the choice of DB and cache will be determined? I love your logical thought process connecting staleness requirement to consistency requirement! Please let me know if this is the correct understanding!!
@hello_interview
@hello_interview 4 ай бұрын
Yes, these are connected!
@19pend
@19pend 7 күн бұрын
Is there a reason we don't use Redis to keep track of the top k and counts? Or was that just assumed?
@hello_interview
@hello_interview 6 күн бұрын
Extra roundtrip here, but it's a valid alternative.
@aruvanshn
@aruvanshn 6 ай бұрын
can we use FLINK for the aggregation window?
@hello_interview
@hello_interview 6 ай бұрын
Sure, although be prepared to talk about how you’re using it. Just invoking flink or spark streaming is unlikely to get you a pass from your interviewer.
@davidoh0905
@davidoh0905 6 ай бұрын
@hello_interview The whole video felt like recreating stream processing framework like Flink! What would you say the difference is?
@twinklejaswani3803
@twinklejaswani3803 6 ай бұрын
@@hello_interview In the Ad click aggregator video, we're using Kinesis with Flink and it was mentioned the aggregation window there also would be 1 min, with fields like AdId, minute, count. Can't we use the exact same thing here with Flink or is there something I'm missing
@firezdog
@firezdog 5 ай бұрын
This is what occurred to me after taking a night to think about the problem. It's a MapReduce where you distribute processing of the stream from a given point over workers. Since the windows overlap, you can reuse results from your previous window (I think that's also the basis for the checkpointing).
@lagneslagnes
@lagneslagnes 5 ай бұрын
​@@twinklejaswani3803 Every interviewer has a different style they expect the interviewee to follow. That is, many fall into the trap of wanting to hear what they would say themselves. You can find literally 180-degree, opposite contradictions between authors of system design tutorial videos on many aspects of system design. Some say they hate it when someone does capacity planning; some say they absolutely expect it. Some say concentrate on API design and schema design (in fact, in one applypass video, the guy said he flunks people who don't discuss exact API responses); some say don't go in depth there. I've seen some tutorial authors actually switch their expectations/recommendations as they themselves got experience making system design videos. The biggest "game" with system design interviews is to build rapport with the interviewer and try to figure out what he/she wants. Perfect this and you will improve your chances of succeeding at these interviews immensely.
@ayuve9146
@ayuve9146 2 ай бұрын
Not able to understand how you are updating the heap and counter while reading the events from Kafka. Could you elaborate more using an example?
@davidoh0905
@davidoh0905 6 ай бұрын
@35:00 The deep-dive portion starting from here is quite difficult to understand, but I guess the gist is that we want to re-read from the Kafka topics, just the portion that is now "outdated", and use it to decrement the counter and also update the heap accordingly? When updating the heap, how do you envision doing that? Do you search for the video id and then update the corresponding view count? Or do you just rebuild the entire heap every minute, hour, and day?
@hello_interview
@hello_interview 6 ай бұрын
Updating the corresponding entry in the heap and then balancing is the most efficient. Most updates won’t actually change ordering.
@davidoh0905
@davidoh0905 6 ай бұрын
@@hello_interviewoh that’s a great insight that it’s not going to change ordering each time! Is this based on your experience or relatively common knowledge??
@sayantangangopadhyay669
@sayantangangopadhyay669 6 ай бұрын
@@hello_interview So we are using lagged Kafka topics to go back in time using Kafka's retention, find the video ids which were played in the last hour (for example), and decrement the hour counter for those video ids after the one hour has passed. The same process will be followed for all the other minute/day/week/month counters, so that we can increment and decrement counters and get the precise top-k count. Please correct me if my understanding is wrong, as this part is very complex to understand. Also one more question: how can we fudge the heap to consider the runner-up entries for ranking purposes after decrementing the counter for some video id in the heap? I am not aware of heap fudging, so a brief explanation and a doc link would be a great help.
@guitarMartial
@guitarMartial Ай бұрын
How would this scale, though, to millions of counters? How would you dynamically allocate counters to counting processors, particularly for videos that have not been seen before?
@zuowang5185
@zuowang5185 6 ай бұрын
32:40 if you removed the minutes for the last day, how can I query from day 0.5 to day 1.5?
@fdddd2023
@fdddd2023 6 ай бұрын
From a practical/hands-on point of view, how do we implement those pointers with Kafka? Does it have built-in functionality like that, or do we need to write some specific code and use the Kafka APIs?
@hello_interview
@hello_interview 6 ай бұрын
You’ll need to write code but it’s very simple, something like kafkajs seek and some client side waits are enough.
@richi2889
@richi2889 6 ай бұрын
I have two questions: 1. How are you reading from, say, the offset from 1 hour ago? 2. When you start reading from 1 hour ago up to the data that is not yet 1 hour old, wouldn't you be reading a lot of data, and can that not make the system slow?
@shiweist
@shiweist 6 ай бұрын
You set up 2 streams (aka topics): current events and events from 1 hour ago. You’ll have a consumer for each stream. The first consumer increases the count and the second decreases the count.
@laurah9500
@laurah9500 5 ай бұрын
@@shiweist I struggle with this too. How do you get events from the past 1 hour / 1 day etc. in a separate Kafka topic and have that consumed by the exact time it needs to be removed from the count?
@shiweist
@shiweist 5 ай бұрын
@@laurah9500 RabbitMQ has a plugin called Delayed Message Exchange that will delay the delivery of messages by a certain amount of time. From the video, I assumed that Kafka has a similar feature. But I just did a quick search and it seems that Kafka doesn't have this feature. So, yeah. I don't know how the presenter intended to implement this in Kafka.
@satvikpandey3404
@satvikpandey3404 4 ай бұрын
You can ingest the view timestamp as part of the record event in Kafka. While consuming, you check the consumed record's timestamp; if the view is not yet an hour old, you pause for some time.
@AnthM33
@AnthM33 4 ай бұрын
@@shiweist Wouldn't you just do 2 consumer groups and not 2 topics?
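One way to start a lagging consumer at the offset from one hour ago, sketched with the kafka-python client (the topic, partition, and group names here are made up; this is an assumption about how one might wire it up, not the video's implementation):

```python
import time
from kafka import KafkaConsumer, TopicPartition

consumer = KafkaConsumer(bootstrap_servers="localhost:9092",
                         group_id="falling-edge-1h",   # hypothetical consumer group
                         enable_auto_commit=False)
tp = TopicPartition("video-views", 0)                  # hypothetical topic/partition
consumer.assign([tp])

# Look up the earliest offset whose record timestamp is >= one hour ago ...
one_hour_ago_ms = int((time.time() - 3600) * 1000)
offsets = consumer.offsets_for_times({tp: one_hour_ago_ms})
if offsets[tp] is not None:
    consumer.seek(tp, offsets[tp].offset)

# ... then consume from there, decrementing the hour counter as events "expire".
for record in consumer:
    pass  # decrement the hour counter for the video id in record.value
```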
@loadingSEA
@loadingSEA 2 ай бұрын
Suggestion: Use Arial/Times new roman font for what you write. A little hard to tell what you are typing
@davidoh0905
@davidoh0905 6 ай бұрын
@28:40 Elastically scaling stateful servers seems very complicated. Just like Kafka partitions, they are quite delicate to handle, as brokers and storage are tightly coupled!
@hello_interview
@hello_interview 6 ай бұрын
Yes! Although it's a bit easier here since you can always recover the state by reading checkpoints/stream as opposed to shuffling a bunch of state around between servers.
@lagneslagnes
@lagneslagnes 5 ай бұрын
It wasn't even explained.
@ButtarPerMinder
@ButtarPerMinder 2 ай бұрын
Hey! Really great content, very helpful. Love the way you break down the problem and solution for different levels of candidates. I have an idea about this solution: for the checkpointing block, instead of having blob storage, what if we use a time-series database, for example InfluxDB or OpenTSDB? We could optimise our queries and memory requirements.
@hello_interview
@hello_interview 2 ай бұрын
Appreciate that! The checkpoints are actually pretty simple in this case, and the only query is to retrieve the entire checkpoint. TSDBs can be helpful in other places, but I probably wouldn't use one here.
@so7am96
@so7am96 5 ай бұрын
Might be a stupid question, but how would you keep the video views in your heap up-to-date when you want to insert a new item without having to destroy it and rebuild it again?
@hello_interview
@hello_interview 5 ай бұрын
en.wikipedia.org/wiki/Binary_heap#Insert
@bansalankur
@bansalankur Ай бұрын
Can we use a Redis sorted set to store the counts for each interval instead of an in-memory heap, etc.?
@hello_interview
@hello_interview Ай бұрын
You can, you'll just have additional latency and you'll just need to solve the related problems of consistency, recovering from crashes, etc. Different solution.
@kedikeba
@kedikeba Ай бұрын
Thanks Stefan, great video. I am trying to understand the rising and falling edge pointers (for 1-min aggregations, the falling edge pointer will start processing 1 min later). That sounds to me like a tumbling window and not a sliding window. Is this correct?
@hello_interview
@hello_interview Ай бұрын
No, these are sliding! If we were making buckets and snapping to minute intervals, those would be tumbling. At 01:01:30pm the rising edge is incrementing new events that happened at 01:01:30pm and the falling edge is decrementing events that happened at 01:00:30pm.
@DMA-I
@DMA-I 2 ай бұрын
I am totally lost at 21:46 when you say you would like to replicate a service. The service has a heap and counts it is maintaining from Kafka; how can you replicate it with all this state (a stateful service)? If you have multiple replicas, how do you sync the state (count, heap) between the replicas?
@hello_interview
@hello_interview 2 ай бұрын
Good question. The states in this service are completely derived from the stream - as long as each replica has consumed to the same point in the kafka stream, the "state" they contain is identical!
@chuckwang4437
@chuckwang4437 Ай бұрын
What's the design look like for arbitrary windows?
@ammarmirza4106
@ammarmirza4106 3 ай бұрын
The design proposed doesn't seem that great. Why not... 1. Implement counts as a standalone db (RDS or DDB) that can scale independently. Since we are only worried about 76gb of data it shouldn't be an issue. Use read-replicas and multi-az for fault tolerance and high availability. This approach will have low consistency but it shouldn't matter anyway for top k. 2. Implement the Heap as it's own distributed scalable service with replicas. That way if it goes down we can just promote a child or read replica to master. 3. Have snapShot service takes snapShots of both the DB and HeapService as one object and store in S3. This way there is a relationship in time between the heap and the counts db if we need to restore to a point in time. 4. Turn "Views" into a simple standalone scalable service which will just read from the stream and update count and update the heap if applicable. 5. Turn the shard stream (I am using Kinesis) to shard on videoId. For hot shards use additional partitioning of videoId#timeStamp which can be tuned to be day/hours etc depending on how hot the shard is. From the consumer pov we only care about videoId to update the count and the heap if needed. I uploaded what it looks like here: freeimage.host/i/dDEHhdl Reason why I don't think the proposed design is good: 1. By sharding the Top K service you are effectively just doing horizontal scaling again (kind of like 3D horizontal scaling). This seems like a big issue and you are effectively scaling an inefficient design. Similar to an O(n^2) algorithm which can be simplified into O(n) or better with the correct data struct. 2. The step of aggregation also doesn't make much sense as by implementing your service this way you are creating a de-facto sharded DB which already exists as an out-of the box solution so why re-invent the wheel? Additionally, you may not get the correct results by doing this querying as all top K results may live in one heap. Additionally additionally, since we are setting the constraint of the Heap to 1000 elements we can use an in memory solution with one service and read replicas. For example the Java PriorityQueue implementation for 1000 objects which contain (string videoId, int count) will be approximately 40kb which is literally nothing in distributed computing. 3. In the case of the sliding window when snapshotting the heap every 30 sec this amounts to about 360 snaps/day * 40kb of data = 16mb/day of snaps saved in PQs. Then if we wanted to iterate through these snaps on different time windows. We can do it in 2 steps. 3a) The first is to evaluate the timestamp, then get the associated PQs for each range.If ranges aren't exact round up/down to the nearest whole 30 sec. 3b) At maximum we are looking at 360k elements that need to be parsed through per day of sliding window. Doing a quick experiment in Java (M3 16gb ram) I get an average of about 320ms to loop and aggregate 360k elements. I.e. per day of sliding window = 320ms. Therefore we can extrapolate that 1 week of data = 2.14sec to iterate through, 4 weeks = 12sec, 6months = 1min12sec . Network overhead would be maybe about 0.5-1sec so these times are okay. (Java code here pastes.dev/fbsT9U7w6n ). Now this can seem like a lot as our non func reqs say we want 10-100ms latency. 
However, if you actually go to youtube trending (kzbin.infotrending) page right now and see the front page loaded for you, you will see around 100 videos loaded initially which means we should really reduce our K down from 1000 to 100 and this would change our query times to the following 1day = 32ms --- 1 week = 280ms --- 4 week = 1.2sec --- 6month = 7.2sec This would reduce all of the time 10x and would allow us to hit our non-func reqs of 10-100ms easily. Additional loading can be handled using lazy loading as the user scrolls down and we can use caching to reduce latency as well. And this would be totally within non func reqs. Only downside with this is we can't have smaller than 30 sec time windows which is probably fine (non functional reqs are 1min). Giving final design : freeimage.host/i/dDGEKLN Thoughts? Another final note, in other sys design videos you guys always do the scaling and optimizations in the "deep dive" section but you instead did it in the HLD section can you clarify this?
@baetz2
@baetz2 2 ай бұрын
This looks very interesting! Could you explain why snapshotting every 30 sec gives 360 snaps/day? It looks like 24*60*2=2880 to me. Your populateDataObjects(int days) method creates a PQ with 2880k items per day, not 360k, doesn't it? Could you also explain the algorithm of calculating the result from the snapshots for dummies? Thank you!
@ammarmirza4106
@ammarmirza4106 2 ай бұрын
@@baetz2 yes 360 is a typo, It was a bit late when I was writing this 😅2880 is the correct number of snaps /day. I don't fully follow the question, you want me to explain how I calculated the time for how long it takes to make a snapshot?
@lokesh2608
@lokesh2608 4 күн бұрын
Introducing checkpointing and bootstrapping before actually explaining what exactly is getting checkpointed and why is confusing. Also, handwaving away "consistency of counters across jobs and consistency of the top-k heap" seems undesirable, especially since we have gone out of our way to say we want to be "precise" in the functional requirements.
@070906214
@070906214 3 ай бұрын
Does this question fall under infrastructure design or product design?
@susantaghosh504
@susantaghosh504 12 күн бұрын
I think this can be easily done using stream processing
@hazemabdelalim5432
@hazemabdelalim5432 6 ай бұрын
Can we use Apache Flink here? It also supports sliding windows; I am not sure if this would be acceptable in an interview.
@hello_interview
@hello_interview 6 ай бұрын
Acceptable! But you're going to have to explain how it works to most interviewers.
@firezdog
@firezdog 5 ай бұрын
@@hello_interview Why not just make the question "design MapReduce" then?
@hello_interview
@hello_interview 5 ай бұрын
@@firezdog Because there are plenty of solutions that don't require flink/map-reduce.
@firezdog
@firezdog 5 ай бұрын
@@hello_interview I guess I would be very interested in learning about the tradeoffs that might be involved in using MapReduce vs something more bespoke.
@flyingpiggy741
@flyingpiggy741 4 ай бұрын
27:44 What does "round robin the DNS for each replica" mean?
@dp2120
@dp2120 Ай бұрын
Round robin is an algorithm/approach for when you have multiple agents that can perform a service. You go around one by one asking each one to do a task; that way no one gets overloaded and they all get an even share of the total tasks you have. DNS is just the Domain Name System, so each server in the round robin will have an entry that can be queried and used to send it a request.
@rushio8673
@rushio8673 29 күн бұрын
I didn't understand the idea of the hour count: how does decreasing the count help? I guess you mean that when the video is clicked, the video id gets inserted into a specific Kafka topic based on hashing the video id; it gets consumed by the count service, which keeps an hourCount map with the video id as the key and increments the count for that hour. Now how does it determine that the hour has ended, and how does decrementing using the Kafka topic help here? The Kafka topic would persist all the entries that entered the topic at any point in time for 7 days, but I'm still not getting how that would help.
@hello_interview
@hello_interview 29 күн бұрын
If you want to keep track of the count of views in a window K: for every view, increment a counter, and then, after K time has elapsed, decrement that counter. Your counter represents the number of views in the last K.
@rushio8673
@rushio8673 28 күн бұрын
@@hello_interview So we define K as 1 hr or 30 minutes or so, and the approach we follow is a sliding window. Let's say we start calculating video counts at 12:00 pm; until 1 pm we keep incrementing the hourCount map with the video id as the key and the count as the value. As soon as we pass the 1 pm mark, say we reach 1:01, we deduct all the counts recorded from 12:00 to 12:01, and we retrieve those counts using Kafka's retention of the clicks in the topics?
@hello_interview
@hello_interview 28 күн бұрын
@@rushio8673 Yeah basically, you got it.
@rushio8673
@rushio8673 24 күн бұрын
@@hello_interview Thank you so much for the clarification, really appreciate it. I have asked for some clarifications on two other system design videos, but I didn't get any responses there unfortunately. If you could please clarify some doubts on those designs too, I would really appreciate it, as I love your content and I try to follow what you post very closely.
@davidoh0905
@davidoh0905 6 ай бұрын
We might also be able to propose that we should do this regionally and per category!!
@hello_interview
@hello_interview 6 ай бұрын
Yeah these are reasonable extensions of this problem!
@pankajk9073
@pankajk9073 5 ай бұрын
Wow, love this solution. Just wondering how we can keep track of keys to expire; will a delay queue do the job?
@skibidi-k2i6w
@skibidi-k2i6w 25 күн бұрын
It would be better to put selfie video box upper left to not cover the contents.
@МаксимШульдінер
@МаксимШульдінер 2 ай бұрын
why not OLAP along with Kinesis/Flink ?
@DMA-I
@DMA-I 2 ай бұрын
Honestly, I am still not quite convinced about relying on a service that directly serves all the query requests to clients (minute/day/month/all time), which means everything stays in memory for a long, long while. I feel it is not stable/feasible. I am more convinced by the approach you used for ad click aggregation; in that example you used an OLAP DB. I am thinking in this example I could use an OLTP DB (Postgres) to save all results at 1-minute granularity, then use a cron job to compute the day/month/all-time top-k results in Postgres, then add a service in front of Postgres to serve the clients' queries. Do you think there are any drawbacks to my approach? Thanks!
@DilipKumar-ij3cf
@DilipKumar-ij3cf 2 ай бұрын
I am confused by the 1-min time windows. Aren't these fixed time windows and not sliding windows? Wasn't the requirement to do a sliding window? How are we deciding the 1-min window boundary? How do we get the 1-min top K between two 1-min windows? Say the new window has been filling for 30 seconds; we are not taking stats for those 30 seconds, right?
@hello_interview
@hello_interview 2 ай бұрын
This is a downside of that approach, but one could argue about exactly how fine-grained the sliding window needs to be. Most interviewers aren't going to split hairs here.
@aforty1
@aforty1 6 ай бұрын
Thank you for this. For what candidate target level would you ask this question in a Meta interview?
@davidoh0905
@davidoh0905 6 ай бұрын
@21:30 When you mention replication of the service, it's not about distributed servers, right? Is it more about having backup instances in standby mode ready to take over? Is that a way to make a single instance more fault tolerant, with a quick takeover? Or did you mean that the replicated services are doing 100% identical things, counting and sorting, in order to make the top-K available from multiple instances?
@hello_interview
@hello_interview 6 ай бұрын
In this case the latter. These aren’t just cold standbys, we can actually use them!
@leizhao2106
@leizhao2106 4 ай бұрын
Quick question: what exactly are those "counter" boxes? Are they instances hosting services with the heap and counts in memory, or what storage do they use?
@hello_interview
@hello_interview 4 ай бұрын
Yeah, you got it. They're hosts. Theoretically you could have multiple counter instances on a single host if you wanted to but it's probably best to think of them as servers.
@ellenq-68
@ellenq-68 4 ай бұрын
I'm a little bit confused on the aggregator part for global top k, do we need to recalculate the global top k from scratch based on the top k heaps from all partitions each time? I guess it's not much data if the number of partitions are not extremely large though.
@hello_interview
@hello_interview 4 ай бұрын
You'll probably want to cache this given we have a bit of time in the non-functional requirements. But you do need to aggregate/reduce across all the shards to get the correct value for the global top-k with this design.
@ellenq-68
@ellenq-68 4 ай бұрын
@@hello_interview Yeah, that makes sense, thank you for the clarification!
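A tiny sketch of that reduce step (Python, illustrative only): each counter shard returns its local top-k with counts, and the aggregator merges them and takes the top k again. Because each video lives on exactly one shard, the simple merge is correct:

```python
import heapq

def merge_global_top_k(shard_results, k):
    """shard_results: list of per-shard lists of (video_id, count)."""
    merged = [item for shard in shard_results for item in shard]
    return heapq.nlargest(k, merged, key=lambda item: item[1])

# Example: three shards, global top 2.
shards = [[("v1", 900), ("v7", 120)],
          [("v3", 450)],
          [("v9", 700), ("v2", 300)]]
print(merge_global_top_k(shards, k=2))  # [('v1', 900), ('v9', 700)]
```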
@B-Billy
@B-Billy 3 ай бұрын
What is this tool used in the video for drawing and text?
@dreadpirateroberts919
@dreadpirateroberts919 3 ай бұрын
excalidraw
@АнтонВласов-з1я
@АнтонВласов-з1я 6 ай бұрын
How is the delayed Kafka solution different from the aggregation windows? If we keep the Kafka queue alive for 7 days to enable delayed reads, internally it is the same thing, only divided not by minute but by individual Kafka message. Moreover, there is per-message Kafka overhead. Large memory usage is stated as the main problem of the first approach, but how is it solved with Kafka?
@hello_interview
@hello_interview 6 ай бұрын
Great question. If we take for granted that the infrastructure already has view events in Kafka, the space costs there can be spread across other subsystems in the infra. Beyond this, the aggregation approach has worse performance since you need to sum across many windows to create aggregations (somewhat mitigated by exploiting their hierarchy) and more complexity. But if you happened to bring it up in the interview and solve those issues I think it's viable.
@HandyEngineering
@HandyEngineering 6 ай бұрын
Great video! I was following all the logic as closely as I could, and one thing remained not very clear to me: how do you update a heap with a decrementing value (the case of hourly heaps with negative counts coming from delayed Kafka reading)? Just a pure algo question, I guess. Great content!!!
@zy3394
@zy3394 5 ай бұрын
The hour/day counts are cached in memory, right? So when the data is too large, like the trillions/billions you mentioned at the end of the video, can we use a database with sharding to store the counts in order to resolve that large-data problem? Is it better than a count-min sketch?
@hello_interview
@hello_interview 5 ай бұрын
Different tradeoffs! For many production use-cases, an approximation is fine and count-min-sketch is a good solution.
@Nick-lw7rj
@Nick-lw7rj 6 ай бұрын
I may've missed it, but what datastore solution options would you consider for the heaps? Oops my mistake, I guess this is just an in-memory heap since k would probably only be 1000 or less.
@hello_interview
@hello_interview 6 ай бұрын
Yep!
@Nick-lw7rj
@Nick-lw7rj 6 ай бұрын
@@hello_interview how would we rebuild the heaps when the service is initially deployed or redeployed?
@hello_interview
@hello_interview 6 ай бұрын
@@Nick-lw7rj From the checkpoint.
@Nick-lw7rj
@Nick-lw7rj 6 ай бұрын
@@hello_interview right, makes sense, thanks!!
@firezdog
@firezdog 5 ай бұрын
i'm not sure how the heap solution would work without having to rebuild the heap each time you update frequencies. you would then need to copy the heap in memory and pop off of it k times to get the top k? Or am I missing something?
@hello_interview
@hello_interview 5 ай бұрын
You're missing something :). Most increments/decrements will require no balancing.
@firezdog
@firezdog 5 ай бұрын
@@hello_interview where can I learn more about that
@dp2120
@dp2120 Ай бұрын
This confuses me too. If you’re updating the frequency, then you need to consider each time that it may require re-ordering of the heap. I don’t see how it’s possible to get around that.
@MuhammadUmarHayat-b2d
@MuhammadUmarHayat-b2d 6 ай бұрын
Thanks for sharing this, it is extremely helpful. Quick question: we did not keep top-k videos for the minute and all-time windows, which are part of the functional requirements. Is it implicit here that we will have such minute top-k and all-time top-k memory stores as well?
@shiweist
@shiweist 6 ай бұрын
Yeah. I think Stefan left them out to avoid cluttering the diagram.
@schan263
@schan263 6 ай бұрын
What is this "Premium AI System Design Mocks" on your website?
@hello_interview
@hello_interview 6 ай бұрын
Still a work in progress, but we're building AI-powered mock interviews that (a) actually give you good feedback, and (b) are reflective of what you'd expect to see in FAANG interviews. Launching once we feel confident they're good enough - feel free to join the waitlist if you want to hear from us when that happens or be part of alpha testing.
@hwang1607
@hwang1607 4 күн бұрын
thanks but I found this relatively confusing to understand
@illiyaz
@illiyaz 6 ай бұрын
Great video, keep up the good work. A question, however: how about using something like Kafka Streams feeding into ksqlDB and then into Kafka topics for subsequent processing? The top 1000 videos for the past 5 mins or so can be answered from ksqlDB, and for larger counts we can aggregate the data from the Kafka topics into a DB with 1-hour aggregates. The 1-hour aggregates can be further aggregated into 1-day aggregates in a MySQL DB. We could potentially use Redis sorted sets fed off the Kafka topics and use a ZRANGE query too. Let me know what you think.
@LawZist
@LawZist 6 ай бұрын
i also think about this solution. i would also like to hear what they think about it :)
@karantirthani2628
@karantirthani2628 5 ай бұрын
I gave this solution when I was asked about this in an e-commerce website context.
@nikhilmugganawar
@nikhilmugganawar 6 ай бұрын
Request to add a video for system design of a Talent Assessment platform like Hirevue and a resume screening/tracking platform
@hello_interview
@hello_interview 6 ай бұрын
Make requests here: www.hellointerview.com/learn/system-design/answer-keys/vote
@zuesoel3150
@zuesoel3150 5 ай бұрын
One stupid question: at 19:00, what's 100K sec/day?
@hello_interview
@hello_interview 5 ай бұрын
An estimate! The actual number of seconds in a day is 86,400 but that makes for some gnarly mental math. 100k is close enough!
@zuesoel3150
@zuesoel3150 5 ай бұрын
@@hello_interview thanks for clarifying. Your content is GEM 💎
@FrequencyModulator
@FrequencyModulator 4 ай бұрын
Let's say a user decides to query the Top-K views from Wednesday 16:59 to Friday 13:39. Do I understand it correctly that the Top-K Service will then:
1. Get the minute data for Wednesday from 16:59 to 17:00
2. Get the hourly data from 17:00 to 24:00
3. Get the daily data for Thursday
4. Get the hourly data for Friday from 00:00 to 13:00
5. Get the minute data from 13:00 to 13:39
6. Combine it and calculate the Top-K for the period Wednesday 16:59 to Friday 13:39
Is that correct? If yes, then where do we keep all this data?
@hello_interview
@hello_interview 4 ай бұрын
No, not in the full solution. You're referring to the "Good Solution: Heap Expirations" here: www.hellointerview.com/learn/system-design/answer-keys/top-k#1-handling-time-windows
@nelsonn5123
@nelsonn5123 3 ай бұрын
Yes, your approach sounds reasonable My thought process if I get asked these types of questions would be: 1. Understanding the user: If the user has an analytics background, the frequency of these types of queries is likely lower, as they tend to be more exploratory or report-based. This can affect how we optimize for data retrieval. 2. Data storage: For recent data (within the last month), day-level and possibly hour-level aggregates can be stored in an in-memory cache (e.g., Redis) for quick access. This data is frequently queried and can be cached to reduce database load. 3. Minute-level data: Minute-level data typically generates more entries, so it is more efficient to store it in a specialized time-series database (e.g : Cassandra) and fetch it on-demand as needed. 4. Data freshness and caching policies: The cache would likely implement expiration policies to keep the data fresh and ensure that memory isn’t overwhelmed by older data. We could also shard the cache based on time ranges to further improve query efficiency. 5. Query optimization: Time-range indexes would be used to quickly retrieve the relevant portions of data across minute, hour, and day levels, combining them to calculate the Top-K views efficiently.
@MadHolms
@MadHolms 5 ай бұрын
That's a good explanation, but in a real interview there is no way you could come up with the last solution within the interview if you had never thought about this problem before. Now that I've watched it, I might be able to present the solution; before watching, there is no way I'd have seen it, and that's sad :(
@hello_interview
@hello_interview 5 ай бұрын
The good news is you don't need an optimal solution to pass most interviews. The important thing is you demonstrate a sharp, methodical process, solid insights on the problem, and you're able to make progress in a way that your interviewer knows "if I gave them another few hours, they'd have this nailed."
@maxvettel7337
@maxvettel7337 6 ай бұрын
I think top k question goes in pair with search engine (like twitter search) because their requirements are pretty similar
@hello_interview
@hello_interview 6 ай бұрын
Yeah these share a lot of similar requirements. Tweet search will have some aspects of a reverse index which are different, and the ranking aspects tend to be more complex, but definitely share some common patterns.
@albertli7044
@albertli7044 17 күн бұрын
I love your videos, but this one is hard to understand and follow. I don't think it explains the issue well.
@techleap2271
@techleap2271 4 ай бұрын
100 GB is a reasonable amount to keep in memory on any server?? Really? Isn't this a pretty beefy server you're talking about, wouldn't it be better to think of horizontal scaling instead
@hello_interview
@hello_interview 4 ай бұрын
Very reasonable! Horizontal scaling gives us some benefits (e.g. redundancy) but once we've done that it's far more performant to avoid distributing. In this particular problem we've done both!
@RolopIsHere
@RolopIsHere 2 ай бұрын
My 12-year-old server has 256GB of RAM; a datacenter or cloud server will have 1TB or 2TB of RAM. I don't see how 100GB is a considerable amount.
@inno6123
@inno6123 5 ай бұрын
for some reason i thought it was ed sheeran
@hello_interview
@hello_interview 5 ай бұрын
Huge compliment. Most people just think I'm Evan.
@gaurravprakash
@gaurravprakash 7 күн бұрын
Sorry, this is extremely difficult to follow.
@ranganathg
@ranganathg 4 ай бұрын
I was asked the top-K searches question at Google about ~13 years back when I had a couple of years of experience :P
@hello_interview
@hello_interview 4 ай бұрын
Some questions never die! (Except ShortURL, people stopped asking that, right?)