The best explanation of AP and mAP I have found on the whole internet. Thank you!
@simoncrase53603 жыл бұрын
Aladdin, thank you. The first few minutes cleared up the thing that was stopping me from understanding mAP: I hadn't realized that precision and recall were being calculated progressively. It's totally clear now.
@AladdinPersson3 жыл бұрын
Appreciate the comment!
@linesbymartin5148 ай бұрын
what do you mean by progressively ?
@gholamrezadar5 күн бұрын
@@linesbymartin514 Watch 5:43. It means we don't just calculate one recall and one precision; they are computed cumulatively over the ranked predictions.
@mailoisback Жыл бұрын
The best explanation of mAP I have ever seen, and you even implemented it in Python. Pure gold! Thanks so much!!!
@1chimaruGin0_04 жыл бұрын
The explanation is as good as usual. Congratulations on the error-free day at the first attempt. :3
@ibadurrahman59543 жыл бұрын
This CHANNEL is GOLD. Literally, thanks a lot man. Yesterday I thought that I would never understand YOLO; your channel is proving me wrong.
@rg1957 Жыл бұрын
I was struggling to understand the mAP concept for the past 5 months. Your video is great; clearly understood. Thanks a lot!!!
@ChizkiyahuOhayon Жыл бұрын
Aladdin, I have to sincerely thank you because you make those complicated concepts comprehensible to me. You made me realize that I just need proper guidance and teaching, and then I can gradually comprehend anything! God bless you! Shalom!
@akhilezai3 жыл бұрын
wow this channel is a gold mine
@priyankapaudel89853 жыл бұрын
I was struggling to understand the concepts of OD. Can we have a series on Object Tracking as well? Thank you, these videos are beyond great.
@grewupthescrewup58814 жыл бұрын
A clear theoretical introduction with an error-free implementation. That's what makes a great tutorial video. Thanks bro!!
@AladdinPersson3 жыл бұрын
Appreciate you saying that!
@ElfTRAVELTOUR9 ай бұрын
Best working explanation of AP and mAP!!! All clear now!!
@hongkyulee9724 Жыл бұрын
This video was very intuitive for me to understand what mAP is. Thank you so much!
@andst43 жыл бұрын
This video is the best one on YouTube regarding mAP. I've checked out some more videos on this channel; for sure I will watch more of them! Subscribed!
@Mesenqe4 жыл бұрын
Good job, nice explanation, and thank you indeed!
@universe34064 жыл бұрын
This channel is growing 🤩!
@namnguyen71533 жыл бұрын
What an excellent explanation of mAP! Thank you!
@hyeonjinlee36923 жыл бұрын
Your explanation is really clear. Thank you so much!
@dengzhonghan51252 жыл бұрын
Love your videos, they are super helpful. Very friendly for beginners!!!!!!!!!!!
@zukofire6424 Жыл бұрын
beginners friendly? really? :(
@marymarriam78492 жыл бұрын
Thank you for the great contribution. A humble suggestion would be to remove the FP calculation and instead define it once the TP calculation is completed. We know that TP + FP = [1, 1, 1, ...], so either use FP = 1 - TP, or the logical not of TP. As an example:
torch.logical_not(torch.tensor([0, 1, 0])).to(torch.int64)  # tensor([1, 0, 1])
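A tiny plain-Python sketch of the same identity (the TP values here are hypothetical, just to illustrate):

```python
# Every detection is either a TP or an FP, so FP is the complement of TP.
TP = [1, 0, 1, 0, 0]          # hypothetical per-detection true-positive flags
FP = [1 - t for t in TP]      # the FP = 1 - TP trick from the comment above
print(FP)                     # [0, 1, 0, 1, 1]
```

The torch.logical_not version gives the same result as a tensor.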
@waterspray5743 Жыл бұрын
I love you soo much! Thank you for making these videos!!
@rakeshkumarkuwar60533 жыл бұрын
I'm not able to understand how the recall is 1/4 at the beginning of the precision-recall table, as recall is TP/(TP+FN). Here TP=1. How did we come to know that FN=3? Can anyone please explain?
@riccardovolpi29383 жыл бұрын
It's the total number of ground-truth boxes in your dataset; ideally you would like to detect all of them.
@ksjhang3 жыл бұрын
The denominator of the recall equation is the number of ground-truth boxes, which is 4.
@irvingha61382 жыл бұрын
We ignore 2 TP bboxes, and there's 1 ground truth we did not detect. So in total, at this confidence threshold, we only predict 1 bbox, neglecting 3 ground truths. Since we don't detect those 3 ground truths, they are defined as false negatives; hence FN=3. Btw, recall = true positives divided by all positive samples.
@swedenontwowheels Жыл бұрын
Fantastic video! I watched a few about mAP but this one explains it the best way! Great job! One question though: I understand there are two main ways to get AP for one IoU threshold for one class: 1) AUC as you explained; 2) simply averaging precisions. Is it correct that PyTorch adopted AUC instead of the other?
@nasimthander91373 жыл бұрын
How have you calculated the precision and recall for each of the bounding boxes? Time frame: 5:48
@michaelfrancissy48043 жыл бұрын
Same question
@SanKum72 жыл бұрын
I looked through the comments looking for an answer to this:
@harshitsamani3933Ай бұрын
The way the PR curve is obtained is not correct. Ideally, P and R values should be calculated for each threshold value, and then the PR curve should be plotted. Please correct me if there is some gap in my understanding.
@OnlineGreg2 жыл бұрын
awesome, thanks a lot. finally i completely understood it
@lovekesh883 жыл бұрын
you earned a subscriber. Kudos.
@CaterinaCaccavella Жыл бұрын
Hey, first of all thank you very much for this video and the object detection videos in general :) I still have a doubt on mAP though. From looking at other tutorials and code repos, I understood that one precision-recall plot is for 1 object class and different IoU threshold values (so one point in the precision-recall plot is the precision and recall for all the bbox predictions in the test set for 1 class). However, in this video each precision and recall point in the plot corresponds to the precision and recall value of one specific prediction (as a cumulative sum). Intuitively for me, the first approach is the right one, since we want to estimate the precision and recall over all the predictions in the entire dataset. Moreover, different methods (e.g. F1 score) are used to estimate the best IoU threshold value to use from the precision-recall plot. Could you clarify this for me, please? Thank you very much :)
@hansalas3 жыл бұрын
Finally understood mAP... Excellent... :thumb:
@tomasmartin65173 жыл бұрын
wwwoooowww, easy and great explanation bro :)
@付裕-b7y3 жыл бұрын
thx for your explanation, really nice video!
@leisana40973 жыл бұрын
How come recall is 1/4 at 5:56? It should be 1/3. Recall = TP/(TP+FN), which means there is one TP and the total number of prediction boxes for image 3 is 3. Where is the 4 coming from?
@erickcardozogalvez62292 жыл бұрын
It's the total number of green boxes across all the images. Image 1 has 2 real boxes (2 green boxes), image 2 has 1 real box (1 green box) and image 3 has 1 real box (1 green box), making a total of 4 real boxes (4 green boxes).
@surajacharya25009 ай бұрын
Great video. Could you also make the video on calculating the validation loss for a faster RCNN model? Thank you
@mosahosseini6311 ай бұрын
Hi, thank you for the video. I don't understand why you have the else statement at 20:50.
@salmanzaidi81043 жыл бұрын
Absolutely brilliant!
@ignaciomarin70293 жыл бұрын
A really good explanation
@ocbhu8666 Жыл бұрын
Wonderful explanation, but what about the confidence with which a box is predicted? Does it need to be considered when deciding it's a proper bounding box? Also, I didn't get the point regarding step size.
@walidbrini63583 ай бұрын
All predictions are sorted by their confidence scores in descending order.
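For instance, with predictions in the [train_idx, class, confidence, x1, y1, x2, y2] format used in the video (the values below are made up):

```python
# Hypothetical predictions: [image_idx, class, confidence, x1, y1, x2, y2]
preds = [
    [0, 0, 0.3, 10, 10, 50, 50],
    [0, 0, 0.9, 12, 12, 48, 48],
    [1, 0, 0.6, 20, 20, 60, 60],
]
# Sort descending by confidence before walking the running precision/recall.
preds.sort(key=lambda p: p[2], reverse=True)
print([p[2] for p in preds])  # [0.9, 0.6, 0.3]
```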
@חננאלחדד-ט4ל3 жыл бұрын
Great video! Can you explain the numerical stability issue you solve with the epsilon?
@eadhundi4138 Жыл бұрын
Thank you for the clear explanation. How can we implement mean average recall, please?
@JammerMate5 ай бұрын
mAP explanation is great, but the implementation is absolute mindF.
@JammerMate5 ай бұрын
Now I have watched it 3 times and I can say confidently that I am starting to get this... For an audio/visual learner, this guy is a boon for the ML field.
@nikitabarinov506810 ай бұрын
Hello! Thanks for your implementation, but I have a question about averaging over the IoU threshold: you don't do it, so when is it needed?
@sahil-74734 жыл бұрын
Sir, two doubts. Q1: Can you comment/share which code line number performs which step from your presentation, so that I can connect them? It seems confusing even after watching again and again; once you tell me which line number performs which step, I can study it well and connect it. Q2: Here train_idx means the image number. Say I have one image which contains two bboxes; how would pred_boxes look initially? Is it like [[0,1,0.7,x1,y1,x2,y2], [0,0.65,x1,y1,x2,y2]]? Sorry sir, I have a hearing impairment, so I can only understand partially (the sound seems low). Thanks!
@chennamollavinay33962 жыл бұрын
Thanks for the wonderful explanation. I wanted to know whether the input bounding boxes need to be normalized or can be left in the original coordinates, or won't that affect the result?
@ЕрасылОразбек-ч3ъ3 жыл бұрын
sorry, what is train_idx, the first element in each prediction box (pred_boxes[i][0])?
@sukhbir243 жыл бұрын
This implementation uses greedy assignment of detections to ground truths; you could also use Hungarian assignment, which is optimal.
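A toy illustration of why the optimal assignment can beat greedy (the IoU values are made up; in practice scipy.optimize.linear_sum_assignment does this efficiently, here it's brute-forced since the matrix is tiny):

```python
from itertools import permutations

# Hypothetical IoU matrix: rows = detections, cols = ground truths.
iou = [[0.50, 0.40],
       [0.45, 0.00]]

# Greedy: det0 grabs its best GT (gt0, 0.50), leaving det1 with gt1 (0.00).
greedy_total = 0.50 + 0.00

# Optimal: pick the GT permutation maximizing the total IoU.
best = max(permutations(range(2)),
           key=lambda perm: sum(iou[d][g] for d, g in enumerate(perm)))
optimal_total = sum(iou[d][g] for d, g in enumerate(best))
print(best, round(optimal_total, 2))  # (1, 0) 0.85
```

Here the optimal matching (det0→gt1, det1→gt0) recovers both ground truths, while greedy matching only recovers one.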
@КириллКлимушин Жыл бұрын
Could someone explain how we calculate recall at 6:17?
@ShikuExploringLife Жыл бұрын
Hi Aladdin, can you please explain why we considered the prediction on image 3 with confidence 0.8 as an FP, since the IoU is lower there?
@emanuelhuber43123 жыл бұрын
Really nice video and explanation, and a plus for using vim haha. Why weren't FNs (when the model predicts more bounding boxes than actually exist) taken into account in the recall formula?
@vaibhavsingh10493 жыл бұрын
Because we know beforehand the total number of ground truths.
@mosahosseini6311 ай бұрын
You declare a variable called "num_gts" that you never use. Because the IoU function can take 2D tensors, I suggest that instead of looping over ground_truth_img you try:

iou_values = iou(torch.tensor(ground_truth_img), torch.tensor([detection] * num_gts))
if len(iou_values) > 0:
    best_idx = torch.argmax(iou_values)
    best_iou = iou_values[best_idx]

This makes the code more efficient.
@jamiescotcher15993 жыл бұрын
Could you please clear this up, as I'm unsure of the order of operations: do we calculate the mAP (the average of the area under the precision-recall curve across all classes) for ONE threshold, and then calculate the same thing for each IoU threshold (in the case of COCO: 0.5:0.05:0.95) to get the final mAP?
@CodeWithZeyad2 ай бұрын
It's just saying anything under 50% overlap shouldn't be considered detected; pushing this number to 0 would mean that even 0% overlap counts as detected.
@Alex-gm6lj3 жыл бұрын
Thank you for the Tutorial! Is there a reason as to why targets have scores that are not 1 in the Unit Tests?
@Saitomar3 жыл бұрын
Why is recall 1/4? Is it dependent on the number of bounding boxes in each image ?
@AladdinPersson3 жыл бұрын
The 4 signifies the total number of ground truth bounding boxes from all images in our dataset, in the small case I showed we had 4 in total. The 1/4 is the fraction of what we have "covered" so far from our predictions (with the necessary IOU)
@Saitomar3 жыл бұрын
@@AladdinPersson Thanks. I got confused because I was focusing too much on FN and forgot the definition of recall, which is how many TPs you have out of all positive samples.
@soumyadbanik3 жыл бұрын
@@AladdinPersson But in image 3, you showed only one ground-truth bounding box. That was a bit confusing. As you said, recall is (# correct predictions / # total target bboxes), so shouldn't it be 1/1? I might be wrong!
@ereh34233 жыл бұрын
@@soumyadbanik I was confused at first too, but I understood it afterwards. Are you still confused?
@erickcardozogalvez62292 жыл бұрын
@@AladdinPersson Right, it's the total number of green boxes across all the images. Image 1 has 2 real boxes (2 green boxes), image 2 has 1 real box (1 green box) and image 3 has 1 real box (1 green box), making a total of 4 real boxes (4 green boxes).
@anhhiephoang79703 жыл бұрын
Your explanation was very easy to understand :)) Btw, can you implement some PyTorch code related to NLP? :))
@dmitryzarubin88753 жыл бұрын
Hello, thanks for the clear explanation and implementation. But how can we calculate mean average recall? By torch.trapz with the order of the precision and recall tensors switched, or how? And if we calculate mean average recall, can we calculate something like a mean average F1, since we have the precision and recall values? Thanks once again!
@gurushreyaass94583 жыл бұрын
Try making a series of RL videos; also explain more papers if you can.
@sriramnarayanan36193 жыл бұрын
Is there any library that can do this? I want to use your code, but unfortunately the RAM overflows every time I try to populate the arguments to pass into your function. Would love to know if there's an alternative available. P.S. Awesome work with the explanation. Thanks.
@youngandy6161 Жыл бұрын
Sorry, I have a question. In code line 62, why is the seen bounding box counted in the FP list? Its IoU is still larger than the iou_threshold.
@agfd5659 Жыл бұрын
Because another bounding box has already been matched to the ground truth box at that point
@YasirKhanBahadar-n8b4 ай бұрын
can you tell me how to calculate mAP for coco format json files?
@FindMultiBagger3 жыл бұрын
Subscribed ♥️
@KCDRofficialprojects2 жыл бұрын
Hello sir, the explanation is good. The code executes but doesn't give any output. Sir, how do I calculate the accuracy of YOLOv4 on a social distancing video, or any video, without using CUDA, cuDNN, PyTorch, TensorFlow, etc.? Is it possible to calculate mAP? If possible please ping me about how to write the code, run like: python xyz.py
@manu1983manoj3 жыл бұрын
I really love your videos... but sometimes I wonder whether I will be able to do all that stuff so efficiently.
@cristianlazoquispe44722 жыл бұрын
How does it work for multiclass object detection?
@saurrav38014 жыл бұрын
Bro, is it possible to use PyTorch in Django?
@1chimaruGin0_04 жыл бұрын
Yes, I'm also using PyTorch along with Django, but for video streaming Flask is better.
@zukofire6424 Жыл бұрын
Great video and thanks. Yet, can anyone explain what's going on at line 58? What is amount_bboxes[detection[0]][best_gt_idx] in particular? I don't get the second index.
@mosahosseini6311 ай бұрын
Example: amount_bboxes = {0: torch.tensor([0, 0, 0]), 1: torch.tensor([0, 0, 0, 0, 0, 0])}. amount_bboxes is a dictionary where the keys are the image ids and the values are tensors with one element per ground-truth bounding box in that image. For image id 0 we have 3 bounding boxes, because the value for key 0 has 3 elements (one for each bounding box). detection[0] is simply the image id of the prediction box, so amount_bboxes[detection[0]] means: get the value from amount_bboxes whose key matches detection[0]. For example, if detection[0] = 0, then amount_bboxes[detection[0]] gives you torch.tensor([0, 0, 0]). best_gt_idx is the index of the best bounding box for a particular image; if we assume it is 1, then amount_bboxes[detection[0]][best_gt_idx] is the second element of torch.tensor([0, 0, 0]).
3 жыл бұрын
Why would you penalize multiple predictions of the same object in the output of the NN? Wouldn't that increase the reliability of the detection?
@bijayshakya33711 ай бұрын
Hello, can you please provide the Python code file or notebook for the calculation?
@hafizhzufar91603 жыл бұрын
I think I still don't get what false negative means at 3:40. Could anyone help? :(
@AladdinPersson3 жыл бұрын
There's a bounding box missing that we should have predicted
@NishantSachdeva94643399973 жыл бұрын
what editor is that??
@carlosibarcena3 жыл бұрын
I have a question: why go from 0.5 to 1? Why not from 0 to 1?
@ilikeBrothers3 жыл бұрын
Is this implementation only for all objects in one image, not for a batch of images?
@AladdinPersson3 жыл бұрын
No, you calculate mAP on the entire training/test set.
@ilikeBrothers3 жыл бұрын
@@AladdinPersson Thank you
@ShanzaTariq-h1h8 ай бұрын
What is IoU?
@kevinding120410 ай бұрын
I love you so much
@Wanderlust13423 жыл бұрын
What is this numerical stability thing, epsilon? Why do we need it?
@AladdinPersson3 жыл бұрын
I usually put these in just to avoid division by zero and things like that. It might not be necessary in many situations but it is a "just in case" kind of thing
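A minimal sketch of the failure mode the epsilon guards against (variable names just mimic the style of the video's code, not its exact lines):

```python
# With zero detections, TP + FP = 0 and precision would divide by zero.
epsilon = 1e-6
tp_cumsum, fp_cumsum = 0.0, 0.0   # e.g. no detections at all
precision = tp_cumsum / (tp_cumsum + fp_cumsum + epsilon)
print(precision)  # 0.0 (instead of a ZeroDivisionError)
```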
@ElfTRAVELTOUR9 ай бұрын
I think I got it now.
Precision (a TP increments both numbers; an FP only increments the denominator):
TP 1/1 (the 1 green box on image 3)
FP 1/2
TP 2/3
FP 2/4
TP 3/5
FP 3/6
FP 3/7
Recall is the same, except the denominator is fixed at 4, since there are 4 green boxes in total:
TP 1/4
FP 1/4
TP 2/4
FP 2/4
TP 3/4
FP 3/4
FP 3/4
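The same running calculation in a few lines of Python (TP flags taken from the sequence above, 4 ground-truth boxes in total):

```python
# TP flags for the 7 ranked detections from the walkthrough above.
tp_flags = [1, 0, 1, 0, 1, 0, 0]
total_gt = 4                             # ground-truth (green) boxes overall

tp_cum = 0
precisions, recalls = [], []
for i, tp in enumerate(tp_flags, start=1):
    tp_cum += tp
    precisions.append(tp_cum / i)        # TPs so far / detections so far
    recalls.append(tp_cum / total_gt)    # TPs so far / all ground truths

print([round(p, 2) for p in precisions])  # [1.0, 0.5, 0.67, 0.5, 0.6, 0.5, 0.43]
print(recalls)                            # [0.25, 0.25, 0.5, 0.5, 0.75, 0.75, 0.75]
```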
@MohamedUsman-xb3hx6 ай бұрын
Why is the recall value 1/4?
@johnalvinm Жыл бұрын
💥
@Arjun147gtk2 жыл бұрын
Couldn't understand the calculation; I have replayed it multiple times.
@Arjun147gtk2 жыл бұрын
like how you got 1/4.
@solo_driven Жыл бұрын
I don't think this is a correct implementation of mAP, at least not the way many other libraries have implemented it. What does a "procedural" recall/precision even mean? Makes no sense to me. Recall/precision for given true boxes, predicted boxes, and a threshold should be a constant (a single number), not a bunch of values like you get when drawing your precision(recall) graph. I would be glad if anyone can prove me wrong.
@wolfisraging3 жыл бұрын
Honestly, the name of this metric is just terrible... lol. 1. Mean is average, so what on earth is "mean average"? 2. It's called mean average precision, yet it's not calculated by simply taking the average of precision values. So everyone, watch this video and clear your doubts :)
@machinelearning4611Ай бұрын
Holy shit, this is totally incorrect. Pathetic that the comments are all positive; that's not how you compute the PR curve.
@AladdinPerssonАй бұрын
Would you mind sharing some feedback as to what's wrong?
@MG-gt7pf2 жыл бұрын
In image 1, the predicted bb with a 0.3 confidence score should be a true negative if our threshold is 0.5, shouldn't it? @aladdin persson
@Multihuntr03 жыл бұрын
This is the best resource I have found for both how to implement mean average precision and for simply having a function that calculates mAP for detection. Does anyone know of any library functions that already calculate this? sklearn.metrics.average_precision_score() doesn't allow you to specify missed detections and thus assumes that all ground truths will be accounted for, miscalculating the recall. (Also it uses the left-hand rule instead of the trapezoidal rule to calculate the AUC, but this is a contentious point.) ml_metrics.mapk doesn't seem to use confidence scores at all; it just calculates how many of the top-k are "correct". cocoapi.COCOeval isn't really a library function so much as an evaluation suite. Also it requires running `make`, I think. I've attempted to re-implement your algorithm with more vectorisation; I got rid of a lot of loops with fancy indexing. You didn't release your test code, so I can't test whether it is exactly the same. It's not that much smaller or easier to read, but anyway, here's the gist: gist.github.com/Multihuntr/5a898e1794808ff7c6d30efca2ff52b7
@KCDRofficialprojects2 жыл бұрын
Did you work out how to calculate accuracy? If you did, could you help me with how to evaluate the mAP for YOLOv4 for social distancing, or for any video?