Design Scalable News Feed System Similar to Instagram, Facebook & Twitter | System Design

21,798 views

Code with Irtiza

Design a scalable news feed system similar to feeds on Instagram, Facebook, and Twitter! We start with a simple working version and then build up to an optimized decoupled architecture while talking about the different tradeoffs that we are making.
System Design Playlist: • System Design Beginner...
00:00 Final Architecture Teaser
01:27 High-Level Requirements
02:20 Data Models
04:55 Creating A Post
06:50 Kafka CDC + Stream Processors
09:25 CDC Streams + Kafka
12:25 Getting User’s Feed
15:30 Problems with Computing Feed at Every Request
17:00 Pre-computing Feed with Redis Cache
20:45 Populating Feed Cache in Realtime
24:25 Populating Feed Cache Offline
25:55 Final Architecture Summary
29:52 Outro & Future Videos
#systemDesign #softwareArchitecture #interview
Visit me at: irtizahafiz.com
Reach me at: irtizahafiz9@gmail.com

Comments: 57
@rajeshkishore7119 6 months ago
So crisp and clear.
@veerendrashukla 1 year ago
Super explanation!
@sshian328 1 year ago
good video, love it
@bbi-edu 8 months ago
Great. Thank you.
@sun-ship 3 months ago
What a great channel!
@irtizahafiz 3 months ago
Thanks so much!
@melk48111 6 months ago
Why do we need a scheduled job to update the cache for every user?
@vidhooakash7440 2 years ago
Thanks
@dileeshaabilash5562 6 months ago
🔥🔥
@amertia 1 year ago
One thing missing from the design is how to handle influencers and celebrities, for whom the push model would not make sense.
@vinaymiriyala4522 1 year ago
Do you mind sharing how to account for this op
@HindiCartoonKahaaniyaTv 1 year ago
@vinaymiriyala4522 Use the fanout-on-read model for celebrities. Just before rendering the feed, fetch the saved (precomputed) feed along with the posts of the celebrities you follow.
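A minimal sketch of the hybrid approach described in this reply, assuming in-memory dictionaries stand in for the feed cache and the post store. The names `precomputed_feed`, `celebrity_posts`, `followed_celebrities`, and `render_feed` are hypothetical and do not come from the video; posts are modeled as `(timestamp, post_id)` tuples kept newest first.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the feed cache and the post store.
# Each post is a (timestamp, post_id) tuple; every list is sorted newest first.
precomputed_feed = {"alice": [(105, "p5"), (101, "p1")]}    # filled by fanout on write
celebrity_posts = {"celeb_1": [(110, "c2"), (100, "c1")]}   # never pushed to followers
followed_celebrities = {"alice": ["celeb_1"]}


def render_feed(user_id, limit=10):
    """Merge the pushed (precomputed) feed with celebrity posts pulled at read time."""
    sources = [precomputed_feed.get(user_id, [])]
    for celeb_id in followed_celebrities.get(user_id, []):
        sources.append(celebrity_posts.get(celeb_id, []))
    # heapq.merge needs every input sorted the same way; reverse=True means descending.
    merged = heapq.merge(*sources, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]
```

The design choice here is that posts from accounts with very large follower counts are never written into each follower's cached feed; they are merged in only when a follower actually opens the feed.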
@kishanprajapati6170 10 months ago
I have one doubt about the data model design. What would happen if we did not create a separate POST_USER table and instead included User_id in the Post table?
@LuluHou 1 year ago
Great design, and clearly articulated! Thanks a lot! I just wonder: why does the stream processor need to talk to the feed service? I thought the feed service now just reads results from the Redis cache. Could you help clarify?
@irtizahafiz 9 months ago
I should have been clearer. You are right, the feed service reads directly from the Redis cache.
@dhiliph98 1 year ago
Thanks for the amazing content! Rather than using a CDC, can we simply write a "post_created" event directly to Kafka from the post service? So the post service does 2 jobs. One, write to the database and two, write an event to Kafka.
@irtizahafiz 1 year ago
Yup! That works too. Totally depends on what kind of architecture you have.
@juheelee568 1 year ago
For the Scheduled Job, you said that you will iterate through all the users in your database and update the Feed Cache. If it updates the feed for every single user in our system (let's say 5M), would you be adding 5M rows to the Feed Cache? My thought was that the Feed Cache would only store a percentage (let's say 20%) of daily users.
@irtizahafiz 1 year ago
Hi! You can do it both ways depending on what kind of infra you have for database and cache.
@mickeyp1291 4 months ago
3:05 Post_User does not need its own ID; as a weak entity, its primary key is the combination of its foreign keys.
@JoaoKunha 1 year ago
For pagination: let's assume you have 100 posts cached for each user. Would you consider another service that adds more posts to the user's cache once they reach the last available post?
@irtizahafiz 9 months ago
You can store all the IDs in your cache, and paginate there. Given you are only storing IDs, and not post details, you can add a ton there.
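A minimal sketch of paginating over cached post IDs, as suggested in this reply. It assumes the cache holds only IDs, newest first, and uses a plain dictionary as a stand-in for the cache; `feed_cache` and `get_feed_page` are hypothetical names. With a Redis list, a slice of this kind is what `LRANGE` returns.

```python
# Hypothetical cached feed: post IDs only, newest first (post_250 ... post_1).
feed_cache = {"alice": [f"post_{n}" for n in range(250, 0, -1)]}


def get_feed_page(user_id, offset=0, page_size=20):
    """Return one page of post IDs plus the offset of the next page (None when exhausted)."""
    ids = feed_cache.get(user_id, [])
    page = ids[offset:offset + page_size]
    end = offset + page_size
    next_offset = end if end < len(ids) else None
    return page, next_offset
```

Post details would then be looked up by ID for just the returned page, which keeps the cached entry small even when it holds many IDs.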
@jingjingcoming 1 month ago
There is a potential bottleneck on the path from the post API to the user-post table, before the CDC Kafka stream. Maybe this part could be partitioned or sharded.
@jingjingcoming 1 month ago
Or swap this part out for a NoSQL server.
@sergiim5601 2 years ago
Hi, great content! Why do we need a Post_User table? We could have a UserId column in the Post table that records the owner's ID.
@irtizahafiz 1 year ago
Yeah you could do that too. But having a post_user table will let you store more fields about the relationship if needed.
@zymasethecatalyst 2 years ago
👍🚀
@koviroli 3 months ago
You made me start thinking about a lot of things in my project. Thank you very much! A question for Irtiza or anyone: Step 1) I fill the feed cache with the IDs of new posts that should be displayed for a user. Step 2) Presumably I should remove the cached posts at some point... But when? When the user has seen the post? Or should there be an expiration on each cached post?
@ujwalkumar2374 1 month ago
1. Feels like it. 2. Both seem valid, but the second makes more sense. Let's say there is a list of posts for some users, but those users have not used the app for a while. It makes sense to remove the posts after some time.
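A minimal sketch of the expiration option discussed in this exchange, assuming a toy in-process cache rather than Redis. `ExpiringFeedCache` and its parameters are hypothetical; in Redis the same effect is usually obtained by setting a time-to-live on the key, for example with `EXPIRE`.

```python
import time


class ExpiringFeedCache:
    """Toy feed cache whose entries expire ttl_seconds after they were last written."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock          # injectable so the expiry can be tested without sleeping
        self._entries = {}          # user_id -> (expires_at, [post_ids])

    def put(self, user_id, post_ids):
        """Store the feed and restart its time-to-live."""
        self._entries[user_id] = (self.clock() + self.ttl_seconds, list(post_ids))

    def get(self, user_id):
        """Return the cached post IDs, or None if the entry is missing or expired."""
        entry = self._entries.get(user_id)
        if entry is None:
            return None
        expires_at, post_ids = entry
        if self.clock() >= expires_at:
            del self._entries[user_id]   # lazily evict the stale feed
            return None
        return post_ids
```

With this shape, feeds of users who stop opening the app simply age out, and the next visit is handled as an ordinary cache miss.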
@user-eq4oy6bk5p 2 years ago
Why do you need a separate ID column for the Friends and PostUser tables when you can just use a composite key (for PostUser: postID, userID), which uniquely determines a row?
@irtizahafiz 2 years ago
I always prefer having an auto incrementing ID column for all my tables. It helps with JOINs in the future, if you are not considering all your use cases right now. And it's worth the performance tradeoff given the simplicity of that column.
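The two options in this exchange can be sketched side by side. This is an illustration under assumptions, using SQLite from the Python standard library; the table names `post_user_composite` and `post_user_surrogate` and the column names are hypothetical, not taken from the video.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Option A: composite primary key, no surrogate ID column.
    CREATE TABLE post_user_composite (
        post_id INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        PRIMARY KEY (post_id, user_id)
    );
    -- Option B: auto-incrementing surrogate ID plus a uniqueness constraint.
    CREATE TABLE post_user_surrogate (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        post_id INTEGER NOT NULL,
        user_id INTEGER NOT NULL,
        UNIQUE (post_id, user_id)
    );
""")


def insert_twice(table):
    """Insert the same (post_id, user_id) pair twice; return True if the duplicate is rejected."""
    conn.execute(f"INSERT INTO {table} (post_id, user_id) VALUES (1, 42)")
    try:
        conn.execute(f"INSERT INTO {table} (post_id, user_id) VALUES (1, 42)")
    except sqlite3.IntegrityError:
        return True
    return False
```

Both layouts reject duplicate pairs. The surrogate version costs one extra column and index, and in exchange gives every row a single-column handle that other tables can reference.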
@nochecku6834 4 months ago
What if the Redis cache does not have an entry for the user whose feed is being loaded? Does the feed service then need to talk to the post service? Or will you return no feed for them, which is a poor experience?
@irtizahafiz 3 months ago
Yes. If you run into cache miss, you should always consult the DB with the "same logic".
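A minimal sketch of the cache-miss fallback described in this reply, assuming a dictionary in place of Redis and a placeholder for the slow path. `feed_cache`, `compute_feed_from_db`, and `get_feed` are hypothetical names; the real slow path would involve the friend, post, and ranking services.

```python
feed_cache = {}  # hypothetical stand-in for the Redis feed cache


def compute_feed_from_db(user_id):
    """Placeholder for the slow path: look up friends, fetch their posts, rank them."""
    fake_db = {"alice": ["p3", "p2", "p1"]}
    return fake_db.get(user_id, [])


def get_feed(user_id):
    """Serve from the cache when possible; on a miss, compute with the same logic and backfill."""
    cached = feed_cache.get(user_id)
    if cached is not None:
        return cached
    feed = compute_feed_from_db(user_id)
    feed_cache[user_id] = feed  # the next request for this user is a cache hit
    return feed
```

Backfilling on a miss means a user without a precomputed feed pays the slow path once rather than seeing an empty feed.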
@narisetiuday2906 5 months ago
Awesome videos. What is the name of the tool that you used for the diagrams?
@irtizahafiz 5 months ago
Miro :)
@fanhong20010 12 days ago
Can Post_User and Post be in one table?
@nehapurohit684 10 months ago
Does the Moderation service update the Post User table if a post is found to be malicious?
@irtizahafiz 9 months ago
Yes, that would be the idea. But this design is at a very high level, so I might not have mentioned that explicitly.
@JH-zd6en 4 months ago
I have a question. Let's say A and B are friends. When A creates a post, it writes to the Redis cache on server1 to build the feed for friend B. However, friend B gets routed to server2, which means B won't have access to this cache. In other words, if A has 100 friends and A creates a post, how do we update the feed cache for these 100 friends? They are on different servers, and their cache will not be on server1.
@firezdog 4 months ago
What happens to the posted data when it fails moderation but is still being processed by other workers or has already been written to storage?
@irtizahafiz 3 months ago
Depending on your tolerance level, you can start processing after moderation, or go ahead and delete records / evict caches after something is flagged as inappropriate by the moderation system.
@d33bo67 7 months ago
If the posts get stored in the CDC stream before they hit the Moderation Stream Processor and then the Feed Stream Processor, how does the design prevent offending messages from being posted?
@irtizahafiz 4 months ago
That's a great point!
@nagarajutammineni6736 7 months ago
One piece of context is missing: why does the feed stream processor interact with the feed service? You were saying "the feed of users". May I know what that is?
@irtizahafiz 4 months ago
The feed is a precomputed set of posts that the user sees on their home/feed page.
@JardaniJovonovich192 2 years ago
1.) Why does feedStreamProcessor need to talk to the Post service? 2.) How does the Feed Service fetch the information for a user whose entry isn't present in the cache at all? It should talk to the Friend service and the Ranking service, fetch the relevant details, push them to the cache, and return the response, right?
@irtizahafiz 2 years ago
1. The stream processor will need to pull details of the post. It usually deals with IDs only. 2. Yes, that's correct.
@meditationdanny701 2 years ago
I don't think storing age as just an integer makes sense; storing the date of birth and deriving the age from it at run time is a better approach.
@irtizahafiz 2 years ago
Yup! I agree. The purpose of the video was to design the whole system, not dive deeper into individual data models. So I decided to keep things simple : )
@meditationdanny701 2 years ago
@irtizahafiz gotcha
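The suggestion in this thread, storing the date of birth and deriving the age when needed, can be sketched as follows. The function name `age_on` is hypothetical, and the reference date is passed in explicitly so the result does not depend on when the code runs.

```python
from datetime import date


def age_on(dob, today):
    """Whole years between dob and today, accounting for whether the birthday has passed."""
    had_birthday = (today.month, today.day) >= (dob.month, dob.day)
    return today.year - dob.year - (0 if had_birthday else 1)
```

In application code the second argument would typically be `date.today()`. A stored integer age, by contrast, goes stale on every birthday.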
@flashliqu 2 years ago
Would you upload the lecture slides?
@irtizahafiz 2 years ago
Hi! Unfortunately, I don't have the slides for this one. For most of the other ones I started uploading PDFs or slides. Hope that helps!
@mayankmaheshwari2544 9 months ago
The API gateway knows which service to hit, not the load balancer.
@stevehyuga9216 11 months ago
This is great, but I don't think it is efficient to create feeds for users when you don't know whether they will use the service at all. On Twitter or other social networks there must be millions of inactive users who may be following Elon Musk, so every time Elon tweets you are doing a lot of unnecessary work for those millions of inactive users. Besides that, I'd like more details about the Ranking service. In the first example, fetching all the posts in order to send them to the Ranking service does not seem efficient to me.
@sarang5906 8 months ago
Agreed. It is a trade-off to be made in terms of the freshness of the feed. So one solution could be to refresh the feed only if the user visits and refreshes their Newsfeed Page.
@davidmataviejo3313 6 months ago
It is the price you pay for having users' feeds already computed. Users will not use the service if they need to wait 1 minute for the feed to be ready. And this approach only works for regular users. For users with a huge number of followers, don't follow the same approach.
@dontdoit6986 3 months ago
Twitter actually does create feeds for every user with every new post. It’s counter-intuitive, but they do own their servers for performing the compute.