A suggestion to everyone: also read the technical queries and comments here, and check Arpit's answers. Thanks, guys, and thanks a ton, Arpit!!
@AsliEngineering 5 months ago
Thank you Leo for mentioning this ✨
@anujpanchal3244 5 months ago
Hi Arpit, one doubt I had after viewing the final architecture: how frequently does Rockset tail DynamoDB to fetch recently written records? I am asking because the title of the video says we are designing a real-time leaderboard, yet we are moving data from the transactional DB to an analytics DB like Rockset. Tailing can introduce replication lag between the moment a record is written to DynamoDB and the moment it becomes visible in Rockset, and with millions of concurrent users that lag can affect some of them. If a user opens the leaderboard immediately after winning a quiz, they would expect to see their name in the top segment(s); if they don't find it there, there is a high chance that the gaming company loses a genuine user. Just wanted to know your view on this 😃
@deepshikhazutshi6637 a month ago
I think that, alongside persisting to the transactional database, we can ingest user score data into Redis sorted sets, which are optimised for high load: they can increment scores and return the top players' ranks efficiently. All reads for the top K players can be served from Redis itself. To avoid overloading the service and the transactional DB, we can batch writes with a flush interval of 500 ms, adhering to the NFR of being near real-time. To reduce the latency of read requests, we can fetch the top K users periodically, every 200 ms, and keep the result in memory (in some data structure, or in Redis itself) so that everyone is served the same snapshot of the top K players from the past 200 ms. A background worker thread can invalidate this snapshot and refresh it with the new set of top K players.
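A minimal in-process sketch of the snapshot idea in the comment above, assuming a plain dict as a stand-in for the Redis sorted set. The class and method names are hypothetical; against a real Redis, `add_score` would correspond to `ZINCRBY` and the refresh to a `ZREVRANGE ... WITHSCORES` read. For simplicity the snapshot is refreshed lazily on read rather than by a background worker, and the 200 ms interval is the figure from the comment.

```python
import heapq
import time


class SnapshotLeaderboard:
    """In-process stand-in for a Redis sorted set with a cached top-K snapshot."""

    def __init__(self, k, refresh_interval_s=0.2, clock=time.monotonic):
        self.k = k
        self.refresh_interval_s = refresh_interval_s
        self.clock = clock        # injectable so the sketch can be tested without sleeping
        self.scores = {}          # player -> total score
        self.snapshot = []        # cached [(player, score), ...]
        self.snapshot_at = None   # clock value at the last refresh

    def add_score(self, player, delta):
        # Analogous to ZINCRBY on a sorted set.
        self.scores[player] = self.scores.get(player, 0) + delta

    def top_k(self):
        now = self.clock()
        stale = (
            self.snapshot_at is None
            or now - self.snapshot_at >= self.refresh_interval_s
        )
        if stale:
            # Analogous to reading the top K members with their scores.
            self.snapshot = heapq.nlargest(
                self.k, self.scores.items(), key=lambda kv: kv[1]
            )
            self.snapshot_at = now
        # Every reader inside the interval gets the same snapshot.
        return list(self.snapshot)
```

The trade-off is the one the comment accepts: a score written just after a refresh stays invisible to readers until the next refresh.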
@abhaysoni8631 5 months ago
So, Arpit, we are implementing the CQRS pattern and segregating read and write operations to different databases with eventual consistency. We are accepting the trade-off that we may sometimes have to deal with stale data.
@abhaysoni8631 5 months ago
Am I right? Can you please highlight more such patterns and name them? I want to read more about them.
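A small sketch of the CQRS split with eventual consistency described in this thread, assuming an in-memory change log between the two sides. `WriteStore`, `ReadModel`, and `sync` are hypothetical names for illustration, not the API of DynamoDB or Rockset.

```python
from collections import deque


class WriteStore:
    """Command side: accepts score writes and logs each change."""

    def __init__(self):
        self.scores = {}
        self.change_log = deque()  # changes not yet applied to the read side

    def record_score(self, player, points):
        self.scores[player] = self.scores.get(player, 0) + points
        self.change_log.append((player, points))


class ReadModel:
    """Query side: a leaderboard view that is only as fresh as its last sync."""

    def __init__(self):
        self.totals = {}

    def sync(self, store):
        # Drain pending changes; until this runs, reads are stale.
        applied = 0
        while store.change_log:
            player, points = store.change_log.popleft()
            self.totals[player] = self.totals.get(player, 0) + points
            applied += 1
        return applied

    def top(self, k):
        ranked = sorted(self.totals.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:k]
```

Between a write and the next `sync`, `top` returns the old ranking. That gap is the stale-data trade-off mentioned above.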
@MarathiNationOne 4 months ago
Your content is built on real examples and experience, unlike others who just prepare a topic and make videos.
@dippatel1739 5 months ago
Rockset is shutting down after its acquisition by OpenAI. Customers had to move out. 😅
@AsliEngineering 5 months ago
Still, their underlying tech is pretty neat; I learnt some pretty interesting stuff going through it.
@Aditya_Vyas 5 months ago
Thank you, Arpit, for always posting without fail. I know we have fewer subscribers and views today, but I know you have come a long way, and one day people will understand the value of your content. They will love your passion for posting videos and content about system design, and they'll understand it's more than just drawing boxes. Art shines in the darkest days. I'm always with you, and I owe you a lot whenever I learn from a piece of your content. Thank you for being there ❤
@AsliEngineering 5 months ago
Thank you so much, Aditya :) This means a ton. You made my day with this comment! All the very best :) #BeCurious
@ankitraj1796 5 months ago
I have one question. You mentioned that performing writes and reads at scale on a single database is not going to work. But in this design we are ingesting data from DynamoDB into Rockset, which means Rockset also has a high write workload while simultaneously processing read queries. Wouldn't Rockset face issues in this case?
@AsliEngineering 5 months ago
Watch previous videos about Rockset on how it handles high write throughput. I have it covered in the same playlist.
@ankitraj1796 5 months ago
@AsliEngineering Gotcha, yep. I just checked again: it's because writes go to an in-memory buffer first and are then flushed to disk periodically.
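A rough sketch of the general buffer-then-flush pattern mentioned in the reply above. This illustrates the idea only and is not Rockset's actual implementation: the class name is hypothetical, a list of dicts stands in for on-disk segments, and the flush is triggered by buffer size rather than a timer so the behaviour is deterministic.

```python
class BufferedStore:
    """Writes land in memory first and are flushed to segments in batches."""

    def __init__(self, flush_threshold=3):
        self.flush_threshold = flush_threshold
        self.buffer = {}    # in-memory buffer that absorbs incoming writes
        self.flushed = []   # stand-in for immutable on-disk segments

    def put(self, key, value):
        self.buffer[key] = value
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if self.buffer:
            # One batched, sorted write instead of one disk write per record.
            self.flushed.append(dict(sorted(self.buffer.items())))
            self.buffer = {}

    def get(self, key):
        # Newest data wins: check the buffer, then segments from newest to oldest.
        if key in self.buffer:
            return self.buffer[key]
        for segment in reversed(self.flushed):
            if key in segment:
                return segment[key]
        return None
```

Reads consult the buffer before the flushed segments, so a record is queryable as soon as it is written even though it reaches the segment list later.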
@sachinjain3833 5 months ago
It's good to know about Rockset. I have a few questions. Does Rockset store all the data, including data from 5 years ago, or only the most recent data, such as the last 5 days? If the latter, what is the eviction policy? Is it advisable to store precomputed records for 1 month, 2 months, 6 months, 1 year, and 5 years in an SQL database? For example, when a request is made to view the leaderboard for the past 45 days, the request would query both MySQL and Rockset: the data for the first 30 days would be fetched from MySQL, and the remaining 15 days would be calculated from Rockset. The combined data would then give a fast calculation of the 45-day leaderboard. Please shed some light on whether this fits in the design.
@AsliEngineering 5 months ago
1. Yes, they store all the data.
2. You can configure eviction.
3. They don't want you to store precomputed values; instead, they compute on the fly.

They provide connectors to various systems that can ingest data into Rockset. Then you simply query the data with regular SQL.
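To illustrate computing a windowed leaderboard on the fly with regular SQL, as the reply above describes, here is a sketch that uses Python's built-in `sqlite3` as a stand-in for the analytics database. The `scores` table, its columns, the integer `day` field, and the sample rows are all assumptions made for the example.

```python
import sqlite3


def leaderboard_since(conn, since_day, k):
    """Aggregate raw score rows at query time; nothing is precomputed."""
    rows = conn.execute(
        """
        SELECT player, SUM(points) AS total
        FROM scores
        WHERE day >= ?
        GROUP BY player
        ORDER BY total DESC, player ASC
        LIMIT ?
        """,
        (since_day, k),
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (player TEXT, day INTEGER, points INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("asha", 1, 50),
        ("ravi", 1, 80),
        ("asha", 40, 60),
        ("ravi", 44, 10),
        ("meera", 45, 55),
    ],
)
```

Changing the window from 45 days to any other range only changes the `since_day` argument, which is why separate precomputed buckets are unnecessary when the engine can aggregate fast enough.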
@shvmsxn41 5 months ago
The problem with this kind of architecture is making sure that both databases stay in sync while avoiding the dual-write problem. A lag or inconsistency between these databases simply means that the system is not real-time any more. The overhead of syncing the databases will itself become a challenge.
@ManishKumar-qx1kh a month ago
Isn't that a generic use case in DB replication? That applies to every DB you are going to use, right?
@namantyagi6294 5 months ago
Can we use Elasticsearch instead of Rockset, with something like CDC to get updates from the transactional system? Elasticsearch, too, is optimized for aggregation queries, just like Rockset.
@sitanshushrestha2629 5 months ago
I was asked the same question at one of the retail giants. I used Elasticsearch instead of Rockset (I didn't know about it). But the interviewer was not satisfied with the answer.
@namantyagi6294 5 months ago
@sitanshushrestha2629 Can you tell me the exact question you were asked and the answer you gave? It will help me understand this more deeply.
@AbhishekPrakash6262 5 months ago
Hi Arpit, can you please explain why we used DynamoDB in the middle instead of ingesting the data directly into Rockset? Does Rockset lack something that DynamoDB does better?
@AsliEngineering 5 months ago
DDB is the transactional database powering other features of the game, like game state, player state, etc. Rockset will be a global secondary index of all transactional data, powering realtime analytics. Go through my replies on other comments for more details.
@pratikkulkarni891 5 months ago
Hello, thanks for your videos. I recently found them and have been enjoying them. A slightly off-topic question: I see you have listed the technical books you have read in the video description. Have you read them cover to cover, or do you follow some other practice? How do you recommend reading, and possibly remembering, technical books?
@rark-o7f 5 months ago
Great series!!! Complete fascination... It looks like a clean, compact, modern architecture that solves a lot of problems at every level!! Thanks for sharing this knowledge with the world.
@d093w1z 5 months ago
I always wondered how games scale their infrastructure. Thanks for this great video!♥️ Would love if you continued a bit about how MMORPGs work.
@hariikrishnan 5 months ago
Could you explain more about why a read replica was a bad choice?
@AsliEngineering 5 months ago
Because a read replica of DDB would still be inefficient for aggregation, as it is not meant for aggregations and analytics.
@shardulsilswal1140 5 months ago
As always, great video, Arpit! In order to be consistent, do we implement something like write-through to ensure the leaderboard queries show the latest data? Wouldn't this increase the write latency and possibly affect the availability of the system?
@haha7836hahah 5 months ago
Hi. I had never heard of Rockset before, so this might not be a good question, but you talked about integrating DynamoDB (or any DB good for transactions) with Rockset. My question is: why not just use Rockset alone, if it offers low-latency aggregations and can also ingest large volumes of data efficiently?
@AsliEngineering 5 months ago
DDB is the source of truth for other transactional use cases, like score management, managing game state, managing player state, etc. Rockset, on the other hand, is an auxiliary database that is hyper-optimized for realtime aggregations, and hence leveraged for driving leaderboards.
@roonywalsh8183 5 months ago
@AsliEngineering So is Rockset an alternative to ksqlDB?
@divjyotsethi2695 5 months ago
Nice video! Why use DynamoDB on the write path and not just have Rockset? Are there additional requirements that DynamoDB supports?
@AsliEngineering 5 months ago
It is the source of truth for everything game-related, like game state, player state, etc. So DDB serves all transactional use cases, and Rockset becomes an auxiliary database catering to the realtime analytics use case.
@gauravmadan5217 5 months ago
Great video! This does solve the problem with a quick and reliable solution. What would be the approach if a company doesn't want to go for a managed solution? Can this be implemented via normal data pipelines, using Spark, assuming the SLAs for the metrics/leaderboards are a bit relaxed, like 5 min?
@AsliEngineering 5 months ago
Yes. Just replace Rockset with your integration and use Redis as a DB to drive leaderboard.
@someshu9665 5 months ago
Hi Arpit, thanks for a knowledgeable video. I will surely learn more about DynamoDB + Rockset. I work at a gaming company and we use Redis for leaderboards.
@AsliEngineering 5 months ago
Think of Rockset as a global secondary index of all your transactional databases, providing sub-second latency to any query. Redis does the job well; I'm not saying no to it. But there are some complex cases that cannot easily be solved with Redis, for example, a leaderboard with slicing and dicing of data across different criteria.
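A sketch of what slicing and dicing means for a leaderboard, assuming raw score events that carry a few dimensions. The event fields (`region`, `mode`) and the `leaderboard` helper are hypothetical. A Redis sorted set ranks members by a single score, so each slice would typically need its own key maintained at write time, whereas a query over raw events can apply any filter at read time.

```python
import heapq
from collections import defaultdict

# Hypothetical raw score events with two filterable dimensions.
EVENTS = [
    {"player": "asha", "region": "IN", "mode": "solo", "points": 40},
    {"player": "ravi", "region": "IN", "mode": "team", "points": 70},
    {"player": "asha", "region": "IN", "mode": "team", "points": 50},
    {"player": "lena", "region": "EU", "mode": "solo", "points": 65},
    {"player": "ravi", "region": "IN", "mode": "solo", "points": 10},
]


def leaderboard(events, k, **criteria):
    """Top-k players over only the events matching every given criterion."""
    totals = defaultdict(int)
    for event in events:
        if all(event[name] == value for name, value in criteria.items()):
            totals[event["player"]] += event["points"]
    return heapq.nlargest(k, totals.items(), key=lambda kv: kv[1])
```

For example, `leaderboard(EVENTS, 2, region="IN", mode="solo")` ranks only solo games played in one region, without any per-slice structure having been prepared in advance.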
@anshulsrivastava6132 5 months ago
Can we not use CDC with an OLAP DB to achieve this?
@AsliEngineering 5 months ago
OLAP databases are not known for their low-latency reads, so not a great choice.
@amansingh-os9gd 5 months ago
Analytics with low latency or sub real time
@anshulsrivastava6132 5 months ago
@AsliEngineering Elasticsearch then?
@techlifewithmohsin6142 5 months ago
@anshulsrivastava6132 ES is also not a good option for complex aggregations. There are certain low-latency OLAP databases designed specifically to solve such latency issues. Rockset and Apache Druid can both solve this problem at scale with millisecond latency.
@sachinmukherjee29 5 months ago
Thanks for this awesome video. Can you please make a video explaining the query execution plan of a database?
@subhamsingh8143 5 months ago
Hi Arpit, you mentioned that DynamoDB is not so good with aggregation. Can you tell me how it compensates for that?
@AsliEngineering 5 months ago
It does not compensate. If you want aggregations with DDB, then you need to pull the data out, aggregate it the way you like, and then serve it.
@freecourseplatformenglish2829 5 months ago
Hey Arpit, thanks for the detailed video. Now I can answer a lot of system design questions.
@AsliEngineering 6 months ago
Here's the link to Rockset, give it a shot - rockset.com/