Designing a Realtime Gaming Leaderboard - Horizontally Scalable and Highly Available

Views: 15,970

A day ago

Comments: 52

@LeoLeo-nx5gi 5 months ago

A suggestion to everyone: also read the technical questions and comments here, and check Arpit's answers. Thanks, guys, and thanks a ton, Arpit!!

@AsliEngineering 5 months ago

Thank you Leo for mentioning this ✨

@anujpanchal3244 5 months ago

Hi Arpit, I had one doubt after viewing the final architecture. How frequently does Rockset tail DynamoDB to fetch recently written records? I am asking because the title of the video says we are designing a real-time leaderboard, yet we are moving data from the transactional DB to an analytics DB like Rockset. There may be a lag between the moment a record is written to DynamoDB and the moment Rockset picks it up, and with millions of concurrent users that replication lag will affect some of them. If a user opens the leaderboard immediately after winning a quiz, they expect to see their name in the top segment(s); if they don't find it there, there is a high chance the gaming company loses a genuine user. Just wanted to know your view on this 😃

@deepshikhazutshi6637 a month ago

I think that, alongside persisting to the transactional database, we can ingest user score data into Redis sorted sets, which are optimised for high load: they can increment scores and return the top players' ranks efficiently. All reads for the top K players can be served from Redis itself. To avoid overloading the service and the transactional DB, we can write in batches with a flush interval of 500 ms, which still meets the NFR of being near real-time. To reduce read latency, we can fetch the top K users periodically, every 200 ms, and keep the result in memory so that everyone is served the same snapshot of the top K players for that 200 ms window. The snapshot can live in an in-memory data structure or in Redis itself, and a background worker thread can invalidate it and refresh it with the new set of top K players.
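
A minimal sketch of the snapshot idea in this comment, under stated assumptions: a plain dictionary stands in for the Redis sorted set, `SnapshotLeaderboard` and its parameters are hypothetical names, and the snapshot is refreshed lazily on read rather than by a background worker thread.

```python
import heapq
import time
from collections import defaultdict


class SnapshotLeaderboard:
    """In-memory stand-in for a sorted set plus a cached top-K snapshot."""

    def __init__(self, k, refresh_interval_s=0.2, clock=time.monotonic):
        self.k = k
        self.refresh_interval_s = refresh_interval_s
        self.clock = clock                # injectable so the behaviour is testable
        self.scores = defaultdict(int)    # player -> total score
        self.snapshot = []                # cached top-K list served to readers
        self.snapshot_at = None           # time of the last rebuild

    def add_score(self, player, delta):
        # Plays the role of an increment on a sorted-set member.
        self.scores[player] += delta

    def top_k(self):
        now = self.clock()
        stale = (
            self.snapshot_at is None
            or now - self.snapshot_at >= self.refresh_interval_s
        )
        if stale:
            # Highest score first; ties broken by player name.
            self.snapshot = heapq.nsmallest(
                self.k, self.scores.items(), key=lambda kv: (-kv[1], kv[0])
            )
            self.snapshot_at = now
        # Every reader inside the window gets the same (possibly stale) list.
        return list(self.snapshot)
```

The trade-off is explicit here: a read inside the refresh window never touches the score store, but a player who just scored may not appear until the next rebuild.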

@abhaysoni8631 5 months ago

So, Arpit, we are implementing the CQRS pattern and segregating read and write operations to different databases with eventual consistency. We are accepting the trade-off that we may sometimes have to deal with stale data.
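
To make the trade-off in this comment concrete, here is a small sketch of read/write segregation with eventual consistency. It is an illustration only: `WriteStore`, `ReadStore` and `sync` are hypothetical names, and a Python list stands in for the change stream between the two databases.

```python
from collections import defaultdict


class WriteStore:
    """Command side: the source of truth, modelled as an append-only log."""

    def __init__(self):
        self.log = []

    def record_score(self, player, points):
        self.log.append((player, points))


class ReadStore:
    """Query side: a derived view that catches up with the log later."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.applied = 0  # number of log entries already applied

    def sync(self, write_store):
        # Apply only the entries that arrived since the last sync.
        for player, points in write_store.log[self.applied:]:
            self.totals[player] += points
        self.applied = len(write_store.log)

    def leaderboard(self):
        return sorted(self.totals.items(), key=lambda kv: (-kv[1], kv[0]))


writes, reads = WriteStore(), ReadStore()
writes.record_score("asha", 10)
reads.sync(writes)
writes.record_score("bilal", 30)   # written, but not yet visible to readers
stale_view = reads.leaderboard()   # still only contains "asha"
reads.sync(writes)
fresh_view = reads.leaderboard()   # now contains both players
```

The gap between `stale_view` and `fresh_view` is the stale-data window the comment describes; how wide it is depends on how often the read side syncs.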

@abhaysoni8631 5 months ago

Am I right? Can you please highlight more such patterns and name them? I want to read more about them.

@MarathiNationOne 4 months ago

Your content is built on real examples and experience, unlike others who just prepare a topic and make a video.

@dippatel1739 5 months ago

Rockset is shutting down after its acquisition by OpenAI. Customers had to move out. 😅

@AsliEngineering 5 months ago

Still, their underlying tech is pretty neat. I learnt some pretty interesting stuff going through it.

@Aditya_Vyas 5 months ago

Thank you, Arpit, for always posting without fail. I know we have fewer subscribers and views today, but you have come a long way, and one day people will understand the value of your content. They will love your passion for posting videos about System Design, and they'll understand it's more than just drawing boxes. Art shines in the darkest days. I'm always with you, and I owe you a lot for everything I learn from your content. Thank you for being there ❤

@AsliEngineering 5 months ago

Thank you so much, Aditya :) This means a ton. You made my day with this comment! All the very best :) #BeCurious

@ankitraj1796 5 months ago

I have one question. You mentioned that performing writes and reads at scale on a single database is not going to work. But in this design we are ingesting data from DynamoDB into Rockset, which means Rockset also has a high write workload while simultaneously processing read queries. Wouldn't Rockset face the same issue?

@AsliEngineering 5 months ago

Watch previous videos about Rockset on how it handles high write throughput. I have it covered in the same playlist.

@ankitraj1796 5 months ago

@AsliEngineering Gotcha, yep. I just checked again: it's because writes go to an in-memory buffer first and are then flushed to disk periodically.
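
A toy sketch of the buffer-then-flush write path mentioned in this reply. It is not Rockset's implementation: `BufferedStore` is a hypothetical name, a list of dictionaries stands in for on-disk files, and the flush is triggered by buffer size rather than by a timer.

```python
class BufferedStore:
    """Writes land in memory first and are flushed in batches."""

    def __init__(self, flush_threshold):
        self.flush_threshold = flush_threshold
        self.buffer = {}      # recent writes, cheap to update
        self.segments = []    # flushed immutable batches, oldest first

    def put(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # One batched, sequential write instead of many small random ones.
        if self.buffer:
            self.segments.append(dict(self.buffer))
            self.buffer = {}

    def get(self, key):
        # Newest data wins: check the buffer, then segments newest to oldest.
        if key in self.buffer:
            return self.buffer[key]
        for segment in reversed(self.segments):
            if key in segment:
                return segment[key]
        return None
```

Because each write only touches the in-memory buffer, write throughput is decoupled from the cost of the flush, which is the point being made in the thread.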

@sachinjain3833 5 months ago

It's good to know about Rockset. I have a few questions. Does Rockset store all the data, including data from 5 years ago, or only the most recent data, such as the last 5 days? If the latter, what is the eviction policy? Is it advisable to store precomputed records for 1 month, 2 months, 6 months, 1 year, and 5 years in an SQL database? For example, when a request is made to view the leaderboard for the past 45 days, the request would query both MySQL and Rockset. The data for the first 30 days would be fetched from MySQL, and the data for the remaining 15 days would be calculated from Rockset. Combining the two would give a fast calculation of the 45-day leaderboard. Please shed some light on whether this fits in the design.
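
The hybrid lookup proposed in this question could be sketched roughly as follows. This only illustrates the merge step: `combined_leaderboard` is a hypothetical helper, the precomputed totals stand in for the rows read from the SQL database, and the recent events stand in for what would be aggregated on the fly.

```python
from collections import Counter


def combined_leaderboard(precomputed_totals, recent_events, top_n):
    """Merge stored per-player totals with events from the uncovered window."""
    totals = Counter(precomputed_totals)      # e.g. the first 30 days
    for player, points in recent_events:      # e.g. the remaining 15 days
        totals[player] += points
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


first_30_days = {"asha": 100, "bilal": 80}
last_15_days = [("bilal", 30), ("chen", 50)]
top_two = combined_leaderboard(first_30_days, last_15_days, top_n=2)
```

One caveat with this approach: the precomputed side has to hold totals for every player, not just each window's top K, because a player outside the top K of both windows can still land in the combined top K.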

@AsliEngineering 5 months ago

1. Yes, they store all the data.
2. You can configure eviction.
3. They don't want you to store precomputed values; instead they compute on the fly.

They provide connectors for various systems that can ingest data into Rockset. Then you simply query the data with regular SQL.

@shvmsxn41 5 months ago

The problem with this kind of architecture is making sure that both databases stay in sync while avoiding the dual-write problem. A lag or inconsistency between these databases simply means that the system is not real-time any more. The overhead of syncing the databases will itself become a challenge.

@ManishKumar-qx1kh a month ago

Isn't that a generic concern with DB replication? It applies to every DB that you are going to use, right?

@namantyagi6294 5 months ago

Can we use Elasticsearch instead of Rockset and use something like CDC to get updates from the transactional system? Elasticsearch, too, is optimized for aggregation queries, just like Rockset.

@sitanshushrestha2629 5 months ago

The same question was asked to me at one of the retail giants. I used Elasticsearch instead of Rockset (I didn't know about it then), but the interviewer was not satisfied with the answer.

@namantyagi6294 5 months ago

@sitanshushrestha2629 Can you tell me the exact question you were asked and the answer you gave? It will help me understand this more deeply.

@AbhishekPrakash6262 5 months ago

Hi Arpit, can you please explain why we used DynamoDB in the middle instead of ingesting the data directly into Rockset? Is Rockset lacking something that DynamoDB does better?

@AsliEngineering 5 months ago

DDB is the transactional database powering other features of the game, like game state, player state, etc. Rockset will be a global secondary index of all transactional data, powering realtime analytics. Go through my replies on other comments for more details.

@pratikkulkarni891 5 months ago

Hello, thanks for your videos. I recently found them and have been enjoying them. A slightly off-topic question: I see you have listed the technical books you have read in the video description. Have you read them cover to cover, or do you follow some other practice? How do you recommend reading, and possibly remembering, technical books?

@rark-o7f 5 months ago

Great series!!! Completely fascinating. It looks like a clean, compact, modern architecture that solves a lot of problems at every level. Thanks for sharing this knowledge with the world!

@d093w1z 5 months ago

I always wondered how games scale their infrastructure. Thanks for this great video!♥️ Would love if you continued a bit about how MMORPGs work.

@hariikrishnan 5 months ago

Could you explain more about why a read replica was a bad choice?

@AsliEngineering 5 months ago

Because a read replica of DDB would still be inefficient for aggregation, as DDB is not meant for aggregations and analytics.

@shardulsilswal1140 5 months ago

As always, great video, Arpit! In order to be consistent, do we implement something like write-through to ensure the leaderboard queries show the latest data? Wouldn't this increase the write latency and possibly affect the availability of the system?

@haha7836hahah 5 months ago

Hi. I had never heard of Rockset before, so this might not be a good question. You talked about integrating DynamoDB (or any DB that is good for transactions) with Rockset. My question is: why not just use Rockset alone, if it offers low-latency aggregations and can also ingest large volumes of data efficiently?

@AsliEngineering 5 months ago

DDB is the source of truth for other transactional use cases like score management, managing game state, managing player state, etc. Rockset, on the other hand, is an auxiliary database that is hyper-optimized for realtime aggregations, and hence leveraged for driving leaderboards.

@roonywalsh8183 5 months ago

@AsliEngineering So is Rockset an alternative to ksqlDB?

@divjyotsethi2695 5 months ago

Nice video! Why use DynamoDB on the write path and not just have Rockset? Are there additional requirements that DynamoDB supports?

@AsliEngineering 5 months ago

It is the source of truth for everything game-related, like game state, player state, etc. So DDB serves all transactional use cases, and Rockset becomes an auxiliary database catering to the realtime analytics use case.

@gauravmadan5217 5 months ago

Great video! This does solve the problem with a quick and reliable solution. What would be the approach if a company doesn't want to go for a managed solution? Can this be implemented with normal data pipelines, using Spark, assuming the SLAs for the metrics/leaderboards are a bit relaxed, say 5 minutes?

@AsliEngineering 5 months ago

Yes. Just replace Rockset with your integration and use Redis as a DB to drive the leaderboard.

@someshu9665 5 months ago

Hi Arpit, thanks for an informative video. I will surely learn more about DynamoDB + Rockset. I work at a gaming company, and we use Redis for leaderboards.

@AsliEngineering 5 months ago

Think of Rockset as a global secondary index of all your transactional databases, providing sub-second latency for any query. Redis does the job well; I am not saying no to it. But there are some complex cases that cannot easily be solved with Redis. For example, a leaderboard with slicing and dicing of data across different criteria.
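
To illustrate what "slicing and dicing" means here, the sketch below ranks players over raw score events filtered by arbitrary attributes chosen at query time. The event fields (`region`, `mode`) and the `leaderboard` helper are assumptions made up for the example; in an analytics database this would be a filtered, grouped SQL query rather than a Python loop.

```python
from collections import defaultdict


def leaderboard(events, top_n, **filters):
    """Rank players by total score over events matching every filter."""
    totals = defaultdict(int)
    for event in events:
        if all(event.get(field) == value for field, value in filters.items()):
            totals[event["player"]] += event["score"]
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return ranked[:top_n]


events = [
    {"player": "asha", "region": "IN", "mode": "quiz", "score": 40},
    {"player": "bilal", "region": "IN", "mode": "duel", "score": 70},
    {"player": "chen", "region": "SG", "mode": "quiz", "score": 90},
    {"player": "asha", "region": "IN", "mode": "quiz", "score": 25},
]

overall = leaderboard(events, top_n=3)
india_quiz = leaderboard(events, top_n=3, region="IN", mode="quiz")
```

A single precomputed ranking answers only one of these questions; supporting every filter combination that way would typically mean maintaining a separate ranking per combination, which is the kind of case where aggregating over indexed raw events becomes attractive.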

@anshulsrivastava6132 5 months ago

Can we not use CDC with an OLAP DB to achieve this?

@AsliEngineering 5 months ago

OLAP databases are not known for low-latency reads, so they are not a great choice.

@amansingh-os9gd 5 months ago

Analytics with low latency or sub real time

@anshulsrivastava6132 5 months ago

@AsliEngineering Elasticsearch then?

@techlifewithmohsin6142 5 months ago

@anshulsrivastava6132 ES is also not a good option for complex aggregations. There are low-latency OLAP databases designed specifically to solve such latency issues. Rockset and Apache Druid can both solve this problem at scale with millisecond latency.