These are incredibly well-made and well-articulated videos. Thank you!
@interviewpen · 8 months ago
Thanks!
@beautifulveneer · 3 months ago
In the beginning, the model output is the probability that a given post will be liked by a given user. One thing missing is how the worker jobs processing the new-post queue scale to millions of users (or followers, if it only applies to posts by users who are followed) and what is cached. Another is resiliency around posting to the queue and writing to the database; many databases can stream changes, so that is one potential solution. Good job though.
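A minimal sketch of the fan-out and retry concern raised above, assuming a queue with at-least-once delivery sits between post creation and the workers. The helper names (fetch_followers, score, write_feed_entry) and the in-memory stand-ins are hypothetical, not the video's actual design; the point is that a worker only acknowledges a post after the feed writes succeed, so a failed attempt gets retried instead of silently dropped.

```python
import queue
import random

post_queue = queue.Queue()   # stand-in for Kafka/SQS/etc.
feed_store = {}              # stand-in for the per-user feed database/cache


def fetch_followers(author_id: str) -> list[str]:
    # Hypothetical: would call the social-graph service; faked here.
    return [f"user_{i}" for i in range(5)]


def score(post_id: str, user_id: str) -> float:
    # Hypothetical: would call the inference service for P(like | user, post).
    return random.random()


def write_feed_entry(user_id: str, post_id: str, p_like: float) -> None:
    feed_store.setdefault(user_id, []).append((post_id, p_like))


def process_one_post() -> None:
    post = post_queue.get()  # at-least-once delivery from the new-post queue
    try:
        for follower in fetch_followers(post["author_id"]):
            write_feed_entry(follower, post["post_id"], score(post["post_id"], follower))
    except Exception:
        post_queue.put(post)     # crude retry: put the post back for another attempt
    finally:
        post_queue.task_done()


post_queue.put({"post_id": "p1", "author_id": "author_42"})
process_one_post()
print(feed_store)
```

In production the queue and feed store would be real services, and the database streaming the comment mentions (change data capture) is the other route to resiliency: treat the database write as the source of truth and derive the queue events from its change stream.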
@estebanmurcia8451 · 3 months ago
7:49 I have a question here: if the inference server is distributed, would the cache still be a single "instance"? If not, how would the API know which server to get the corresponding data for a specific user from?
@beautifulveneer · 3 months ago
The inference service would be stateless and use a scalable distributed cache such as Redis.
@interviewpen · 2 months ago
A caching service such as Redis is shared, so data can be distributed across instances. We have a whole video on sharding at interviewpen.com :)
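As a rough sketch of that, assuming the redis-py client and a per-user feature key (both illustrative, not from the video): every stateless inference replica talks to the same Redis deployment, so the API can send a request to any replica and still see the same cached data. load_features_from_db is a hypothetical stand-in for the underlying feature store.

```python
import json

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)


def load_features_from_db(user_id: str) -> dict:
    # Hypothetical fallback when the cache misses.
    return {"user_id": user_id, "avg_session_mins": 12.5, "topics": ["ml", "db"]}


def get_user_features(user_id: str) -> dict:
    key = f"user_features:{user_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)              # hit: any replica sees the same data
    features = load_features_from_db(user_id)
    r.set(key, json.dumps(features), ex=3600)  # cache for an hour
    return features
```

With Redis Cluster, keys are sharded across nodes by hash slot, so the cache itself scales horizontally while still looking like one logical cache to each replica.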
@alirezakhorami · 8 months ago
Amazing job, I love your videos about data. Keep rocking!
@interviewpen · 8 months ago
Thank you!
@chrisogonas · 3 months ago
That was incredibly useful. Thanks
@interviewpen · 3 months ago
Glad it helped :)
@chrisogonas · 3 months ago
@interviewpen 👍
@RiwenX · 8 months ago
So you have to retrain the model for each new user?
@JohnSmith-op7ls · 8 months ago
You can train a model to simply correlate various demographic attributes with liking certain things; then it can guess based on a new user's demographics. You can get more elaborate and first check whether there are strong correlations between cohorts of certain baskets of demographics, then use that alone or in combination with individual demographic attributes. Basically you have to run the numbers to see what produces the best recommendations. You wouldn't have nearly enough data on new users alone to be statistically relevant within a data set large enough to itself be statistically relevant. Over time you'd want to retrain the model as enough new data on new and existing users is added; otherwise the model's accuracy can drift.
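A toy sketch of that cold-start idea, using scikit-learn purely for illustration (the features, data, and model choice are assumptions, not from the video): fit on demographic attributes of existing users, then score a brand-new user from demographics alone.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression

# Historical (demographics, liked?) pairs from existing users (toy data).
train_rows = [
    ({"age_band": "18-24", "country": "US", "topic": "gaming"}, 1),
    ({"age_band": "18-24", "country": "US", "topic": "finance"}, 0),
    ({"age_band": "35-44", "country": "DE", "topic": "finance"}, 1),
    ({"age_band": "35-44", "country": "DE", "topic": "gaming"}, 0),
]

vec = DictVectorizer()
X = vec.fit_transform([row for row, _ in train_rows])  # one-hot encode the attributes
y = [label for _, label in train_rows]

model = LogisticRegression().fit(X, y)

# A brand-new user: no behavioral history, only demographics plus the post's topic.
new_user = {"age_band": "18-24", "country": "US", "topic": "gaming"}
p_like = model.predict_proba(vec.transform([new_user]))[0, 1]
print(f"P(like) for the new user: {p_like:.2f}")
```

Retraining would then rerun this fit on a schedule as new behavioral data accumulates, which is how you counter the drift mentioned above.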
@interviewpen · 8 months ago
Not necessarily--the model can still make guesses based on the behaviors of other users, along with whatever data you feed it during inference. However, updating the model with a user's past behaviors will increase accuracy. Hope that helps!
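To make that concrete, here is a small illustrative sketch (the feature names and default values are assumptions): at inference time, missing per-user history falls back to population-level defaults, so the same model can still produce a rough guess for a brand-new user.

```python
POPULATION_DEFAULTS = {
    "likes_last_30d": 4.2,   # average over all users
    "follows_author": 0.0,
    "topic_affinity": 0.5,
}


def build_features(user_history: dict | None, post: dict) -> dict:
    history = user_history or {}  # new user -> empty history
    features = {name: history.get(name, default)
                for name, default in POPULATION_DEFAULTS.items()}
    features["post_age_hours"] = post["age_hours"]
    return features


print(build_features(None, {"age_hours": 2}))                      # cold start
print(build_features({"likes_last_30d": 30.0}, {"age_hours": 2}))  # known user
```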
@motbus3 · 8 months ago
I see you used Airflow for a generic approach, but Argo has easier integration with MLflow.
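For reference, a minimal sketch of what the Airflow side could look like, assuming Airflow 2.4+ and an MLflow tracking server are available; the DAG id, schedule, and logged metric are placeholders, not the video's pipeline.

```python
from datetime import datetime

import mlflow
from airflow import DAG
from airflow.operators.python import PythonOperator


def retrain_and_log():
    with mlflow.start_run(run_name="weekly-retrain"):
        # Placeholder for the real training job (load data, fit, evaluate).
        auc = 0.91
        mlflow.log_metric("validation_auc", auc)


with DAG(
    dag_id="recommender_retrain",
    start_date=datetime(2024, 1, 1),
    schedule="@weekly",
    catchup=False,
) as dag:
    PythonOperator(task_id="retrain_and_log", python_callable=retrain_and_log)
```

Argo Workflows would express the equivalent pipeline as Kubernetes manifests with one container per step, which is the context for the comment's point about MLflow integration.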
@interviewpen · 8 months ago
Good to know, thanks!
@tanishkmahakalkar761 · 8 months ago
First!💯🔥
@semyaza555 · 8 months ago
SECOND 😈
@pieter5466 · 8 months ago
Last
@ALWALEEDALWABEL · 8 months ago
Why do you hide their names? Why isn't the teacher's name shown on the video? Is there something you are ashamed of?