Evaluation Measures for Search and Recommender Systems

  11,968 views

James Briggs

1 day ago

Comments: 25
@aminghaderi1902 9 months ago
Probably best explanation out there.
@parsakhavarinejad 10 months ago
Clearly explained. Thank you
@anujlahoty8022 11 months ago
What a video, hats off!
@goelnikhils 1 year ago
Amazing Explanation. So clear. Very helpful
@sriks4003 6 months ago
Very helpful, thank you!
@shrar837 2 years ago
Your videos are impressive and very informative mate. 👌
@jamesbriggs 2 years ago
thanks!
@sumantjha8392 2 years ago
Super informative and great, thanks!
@miguelfsousa 11 months ago
This video is great.
@morannechushtan2101 1 year ago
21:23 Statistically there is probably a cat in the box on image 3
@goelnikhils 1 year ago
Hi James, I have a question on NDCG and other rank-aware metrics. How do these metrics work when you have millions of products/items? If we have millions of items, it means we first have to manually label all of them for relevance/rank, and only then can we compute NDCG on the model's predictions. Isn't this a big drawback of NDCG? Can you suggest a better approach to ranking when we don't have relevance-labeled data? Thanks in advance.
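(For reference, a minimal NDCG@K sketch in Python; the graded 0-4 relevance labels here are made up for illustration and are not from the video:)

```python
import math

def dcg_at_k(relevances, k):
    # Discounted cumulative gain: each graded relevance label is
    # discounted by log2 of its rank (rank 1 -> log2(2), rank 2 -> log2(3), ...)
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))

def ndcg_at_k(relevances, k):
    # Normalize by the ideal DCG: the same labels sorted best-first
    idcg = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / idcg if idcg > 0 else 0.0

# Hypothetical relevance labels (0-4) for the top 5 results of one query,
# in the order the system returned them
print(ndcg_at_k([3, 2, 3, 0, 1], k=5))  # just under 1.0: ranking is close to ideal
```

A perfectly ordered result list scores exactly 1.0, which is what makes NDCG comparable across queries.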
@Han-ve8uh 1 year ago
1. I got confused at 18:29 when the predicted list is a nicely increasing sequence, which made me wonder whether those are ranks or item IDs. I also wondered whether the length of the intersection act_set & pred_set could simply be len(act_set), then realized this example is a special case where act_set is a subset of pred_set. If act_set contained the value 9, we couldn't use len(act_set) alone and the formula in the video would be required.
2. Similar to the question nikhil goel asked in the comments two weeks before this: where does the actual_relevant data at 13:46 come from? It looks manually labelled, and this labelling happens per query, which makes it very hard to scale.
3. Assuming we accept manual labelling, how is the 0-4 range determined? Drift seems like a problem: when today's 4 becomes tomorrow's 3 as value judgements change, does that mean relabelling all results again?
4. I noticed some metrics aggregate across queries and k, while others work within one query across k. In what scenarios do we use each?
5. I didn't expect a rel_k term in the AP@K formula. Why do we ignore precision at certain values of k? It feels like artificially inflating the metric, which becomes ineffective if every query does it.
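(To make points 1 and 5 above concrete, here is a minimal recall@K and AP@K sketch in Python; the item IDs are made up, and the AP@K denominator shown, min(|relevant|, K), is one common convention that may differ from the video's:)

```python
def recall_at_k(actual, predicted, k):
    # recall@K = |relevant items in the top-K predictions| / |relevant items|
    act_set = set(actual)
    return len(act_set & set(predicted[:k])) / len(act_set)

def ap_at_k(actual, predicted, k):
    # AP@K averages precision@i only at ranks i holding a relevant item (rel_k).
    # Skipping irrelevant ranks rewards placing relevant items early; it does
    # not inflate scores, since every system is scored the same way.
    act_set = set(actual)
    hits, score = 0, 0.0
    for i, item in enumerate(predicted[:k], start=1):
        if item in act_set:
            hits += 1
            score += hits / i  # precision@i, counted only at relevant ranks
    return score / min(len(act_set), k) if act_set else 0.0

# Item 9 is relevant but missing from the top-K, so len(act_set) alone
# would overcount: the intersection gives {2, 5}, i.e. recall = 2/3
print(recall_at_k(actual=[2, 5, 9], predicted=[1, 2, 3, 4, 5], k=5))
print(ap_at_k(actual=[2, 5, 9], predicted=[1, 2, 3, 4, 5], k=5))
```

Recall@K answers point 1: once a relevant item falls outside the top-K predictions, the intersection and len(act_set) diverge, and the full formula is needed.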
@HazemAzim 1 year ago
Super nice .. Thanks
@preetimehta1247 9 months ago
Hi, I have a query. I am working on a song recommendation project using a Spotify API dataset, and I have used approaches like cosine similarity, matrix factorization, kNN, Latent Semantic Analysis (LSA), and a correlation-distance method. Now I am confused about how to choose an evaluation metric for this system.
@Data_scientist_t3rmi 2 years ago
Good video !
@vishalwaghmare3130 2 years ago
Very helpful ❣️
@tarikkarakas587 2 years ago
The biggest problem is labeling whether each product is relevant or not. It is not possible to label every search, and the metrics are meaningless if you can't handle that.
@jamesbriggs 2 years ago
Yeah, data prep, as usual with ML, is the hard part. If you're interested in evaluation methods for IR *without* labeled data, look into online metrics for eval (and training)
@joyeetamallik5063 2 years ago
Hi James! Can you make some videos on updating models when we keep getting new data (e.g. biweekly)?
@jamesbriggs 2 years ago
cool idea! I'll add to the list :)
@Data_scientist_t3rmi 2 years ago
In MRR, when our search results don't include the result we want (for your example, if we search for cats and find only dogs), how do we calculate MRR? Can we assign a large rank, e.g. rank 20, to all missing results, i.e. 1/20?
@jamesbriggs 2 years ago
yes, as you said, or use another metric that better fits your scenario
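(A minimal MRR sketch in Python for the case discussed above; the queries and items are made up. Assigning a reciprocal rank of 0 when no relevant result is returned is one common convention, and a fixed penalty rank such as 1/20, as suggested in the question, is another:)

```python
def mrr(ranked_results, relevant_items, missing_score=0.0):
    # Mean reciprocal rank: for each query, take 1/rank of the FIRST
    # relevant result; use missing_score when nothing relevant is returned.
    total = 0.0
    for results, relevant in zip(ranked_results, relevant_items):
        for rank, item in enumerate(results, start=1):
            if item in relevant:
                total += 1.0 / rank
                break
        else:  # no relevant item anywhere in this query's results
            total += missing_score
    return total / len(ranked_results)

results = [["dog", "cat"], ["dog", "dog"]]  # second query never returns a cat
print(mrr(results, [{"cat"}, {"cat"}]))            # (1/2 + 0) / 2 = 0.25
print(mrr(results, [{"cat"}, {"cat"}], 1 / 20))    # (1/2 + 1/20) / 2 = 0.275
```

Whichever convention is chosen, it has to be applied consistently across all queries for the averaged scores to be comparable.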
@Data_scientist_t3rmi 2 years ago
@@jamesbriggs Thank you for your answer
@mattygrows7667 2 years ago
love your videos but why do you always seem so sad
@jamesbriggs 2 years ago
thanks! idk I'm happy I promise lol
Metadata Filtering for Vector Search + Latest Filter Tech
34:14
James Briggs
8K views
Better Chatbots with Semantic Routes
35:28
James Briggs
990 views
Trends in Recommendation & Personalization at Netflix
32:00
Scale AI
28K views
Choosing Indexes for Similarity Search (Faiss in Python)
31:33
James Briggs
22K views
IR Course Lecture 14: Evaluation
35:34
Venkatesh Vinayakarao
4.9K views
Mean Reciprocal Rank (MRR): Evaluating a Retrieval System
8:23
Computing For All
406 views
Building a MovieLens Recommender System
1:29:20
Toronto Machine Learning Series (TMLS)
20K views
Maciej Kula - Hybrid Recommender Systems in Python
34:41
PyData
35K views
Faiss - Introduction to Similarity Search
31:37
James Briggs
61K views
RAG But Better: Rerankers with Cohere AI
23:43
James Briggs
63K views