These are incredibly well-made and well-articulated videos. Thank you!
@interviewpen 8 months ago
Thanks!
@beautifulveneer 3 months ago
At the beginning, the model output is the probability that a given post will be liked by a given user. One thing missing is how the worker jobs that process the new-post queue scale to millions of users (or followers, if this only applies to posts by followed users), and what is cached. Another is resiliency around posting to the queue and writing to the database. Many databases can stream their changes, so that is one potential solution. Good job though.
@alirezakhorami 8 months ago
Amazing job, I love your videos about data, keep rocking
@interviewpen 8 months ago
Thank you!
@estebanmurcia8451 3 months ago
7:49 I have a question here: if the inference server is distributed, would the cache still be a single "instance"? If not, how would the API know which server holds the data for a specific user?
@beautifulveneer 3 months ago
The inference service would be stateless and use a scalable distributed cache such as Redis.
@interviewpen 3 months ago
A caching service such as Redis is shared, so data can be distributed across instances. We have a whole video on sharding at interviewpen.com :)
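To make the answer to the 7:49 question concrete, here is a minimal sketch of how a stateless API instance could work out which cache shard holds a user's data. It is not the setup from the video: the shard names and the `shard_for_user` helper are hypothetical, and a managed cache such as Redis Cluster does this key-to-node mapping itself through hash slots, so application code normally would not need it.

```python
import hashlib

# Hypothetical shard names; a real deployment would list its actual cache nodes.
CACHE_SHARDS = ["cache-0", "cache-1", "cache-2"]


def shard_for_user(user_id: str, shards=CACHE_SHARDS) -> str:
    """Pick a shard deterministically from the user ID.

    Every API instance computes the same hash for the same user,
    so any instance can find that user's cached data without
    keeping state of its own.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the shard count.
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


if __name__ == "__main__":
    print(shard_for_user("user-42"))
```

Plain hash-mod-N remaps most keys whenever the number of shards changes; consistent hashing or fixed hash slots are the usual ways to limit that.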
@chrisogonas 4 months ago
That was incredibly useful. Thanks
@interviewpen 4 months ago
Glad it helped :)
@chrisogonas 4 months ago
@interviewpen 👍
@motbus3 9 months ago
I see you used Airflow for a generic approach, but Argo has easier integration with MLflow.
@interviewpen 8 months ago
Good to know, thanks!
@RiwenX 9 months ago
So you have to retrain the model for each new user?
@JohnSmith-op7ls 9 months ago
You can train a model to simply correlate various demographic attributes to liking certain things, then it can guess based on new user demographics. You can get more elaborate and first check whether there are strong correlations between cohorts of certain baskets of demographics, then use that, alone or in combination with individual demographic attributes. Basically you have to run the numbers to see what produces the best recommendations. You wouldn’t have nearly enough data on new users to be statistically relevant in a data set large enough to itself be statistically relevant. Over time you’d want to retrain the model as enough new data on new and existing users is added. Otherwise the model’s accuracy can drift.
@interviewpen 8 months ago
Not necessarily. The model can still make guesses based on the behaviors of other users along with whatever data you feed it during inference. However, updating the model with a user's past behaviors will increase accuracy. Hope that helps!
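As a rough illustration of that reply, the sketch below shows one simple way a prediction can fall back on other users' behavior when a user has no history, then lean on the user's own behavior as it accumulates. It is a smoothing stand-in, not the model from the video: `like_probability`, `prior_weight`, and the per-topic dictionaries are all assumptions made for the example.

```python
def like_probability(user_history, post_topic, global_rates, prior_weight=5.0):
    """Blend a user's own like rate for a topic with the global like rate.

    user_history: {topic: (likes, views)} for this user (empty for a new user).
    global_rates: {topic: like rate across all users}.
    prior_weight: how many "virtual views" the global rate counts for.
    """
    # Fall back to 0.5 when nothing is known about the topic at all.
    global_rate = global_rates.get(post_topic, 0.5)
    likes, views = user_history.get(post_topic, (0, 0))
    # With no views this returns the global rate; with many views it
    # approaches the user's own likes / views ratio.
    return (likes + prior_weight * global_rate) / (views + prior_weight)


if __name__ == "__main__":
    rates = {"cats": 0.2}
    print(like_probability({}, "cats", rates))                 # new user
    print(like_probability({"cats": (9, 10)}, "cats", rates))  # returning user
```

A new user gets the global rate without any retraining, and the estimate shifts toward their own behavior as views come in, which is the same trade-off the reply describes.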
@ALWALEEDALWABEL 9 months ago
Why do you hide their names? Why is the teacher's name not put on the video? Is there something you are ashamed of?
@drhxa 4 days ago
What are you even talking about? It was a great video