I feel that there are a lot of intricacies that are not explained. Great lecture hands down, but I'm starting to feel that I need concrete examples or implementation to understand many of these subtleties.
@dhidhi1000 4 years ago
This is a video series; there are about 9 other videos explaining it.
@davidvultur8704 2 years ago
kzbin.info/aero/PL_IHmaMAvkVxdDOBRg2CbcJBq9SY7ZUvs This seems to be the one
@ykpoff 5 months ago
@@dhidhi1000 so?
@nischalkhadgi8128 5 years ago
Great one. Was really helpful. Hope you put some demonstration as well.
@nidhihada1122 5 years ago
One doubt. If we are specifying bx, by, bh, bw, then we can describe a box of any shape. So for an image where two objects are present in the same grid cell and share the same anchor box shape, this could still be solved by using their respective bx, by, bh, bw in the output, where each anchor box has its own bx, by, bh, bw. I could not understand why Andrew says it cannot be solved.
@sanjivgautam9063 4 years ago
The anchor box concept is not clear, though this is hands down the best explanation to date. I want to share a few ideas here. During training, the label containing bx, by, bh, bw is normalized: bx and by fall between 0 and 1, while bh and bw can be greater than 1. For each of those "normalized" bounding boxes, we determine which of the predefined anchor boxes is the most suitable. How do we define "suitability"? By IoU. So if we choose 5 anchor boxes, we check the normalized bounding box against all 5 and pick the one with the highest IoU, as Andrew Ng explained above. The loss function is a bit of a headache to explain here in a comment, and it would have been great if Andrew had explained it himself. Never mind though, we are getting videos from the AI god himself!
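A minimal sketch of that IoU check, assuming the labelled box and each anchor are compared by shape only (as if they shared the same centre). The function names and the two anchor shapes are made up for illustration, not taken from the lecture:

```python
def shape_iou(box_wh, anchor_wh):
    """IoU of two boxes placed on the same centre, so only width/height matter."""
    bw, bh = box_wh
    aw, ah = anchor_wh
    inter = min(bw, aw) * min(bh, ah)
    union = bw * bh + aw * ah - inter
    return inter / union


def best_anchor(box_wh, anchors):
    """Index of the anchor whose shape overlaps the labelled box the most."""
    return max(range(len(anchors)), key=lambda i: shape_iou(box_wh, anchors[i]))


# Hypothetical (width, height) anchors in grid-cell units:
# index 0 is tall (pedestrian-like), index 1 is wide (car-like).
anchors = [(0.5, 1.5), (1.5, 0.5)]

print(best_anchor((0.4, 1.8), anchors))  # tall label picks anchor 0
print(best_anchor((1.6, 0.6), anchors))  # wide label picks anchor 1
```

With more anchors the same `max` over IoU applies; only the list gets longer.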
@heejuneAhn 4 years ago
Thank you, Prof. Ng, I learned a lot. One question or request for clarification: should the anchor box shapes be taken into account in the network's receptive field? If so, do we need to use non-square convolutional filters? Thanks.
@koeficientas 5 years ago
If I have only 2 classes, can I give a hardwired anchor to each class per grid cell and drop c1, c2, c3? So the output vector would be y = [pc1 bx by bh bw, pc2 bx by bh bw], where pc1 is the probability of a pedestrian and pc2 is the probability of a car?
@rijulsingh9803 3 years ago
So the minimum bound on number of anchor boxes is the number of classes present? Also, is there a way to optimize the size of anchor boxes? I'm a little confused here. Everything else here is crystal clear, thank you so much for this tutorial!
@polimetakrylanmetylu2483 2 years ago
If I understand it correctly, as for 1, no: you can specify any number of anchor boxes, and each one will output its own class predictions. You can specify just one, or an arbitrarily low or high number of them; there is no relation between the number of classes and the number of anchor boxes. As for 2, your NN will not output the entire bounding box; instead it outputs a correction to an anchor box. The anchor boxes have to be defined when you create the model. What you can do is collect every bounding box from your dataset as a width-height pair, and either plot them and look at them, or run a clustering algorithm to find good sizes.
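A rough sketch of that clustering idea, assuming plain k-means with Euclidean distance on (width, height) pairs. The function name, the naive initialisation and the six sample boxes are all made up for illustration; a real pipeline might prefer an IoU-based distance and a library implementation:

```python
def kmeans_wh(boxes, k, iters=20):
    """Cluster (width, height) pairs; the k centroids are candidate anchor shapes."""
    # Naive init (an assumption, chosen for determinism): the first k boxes.
    centroids = list(boxes[:k])
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            # Assign each box to the nearest centroid (squared Euclidean distance).
            nearest = min(
                range(k),
                key=lambda i: (w - centroids[i][0]) ** 2 + (h - centroids[i][1]) ** 2,
            )
            clusters[nearest].append((w, h))
        # Move each centroid to the mean of its cluster; keep it if the cluster is empty.
        centroids = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids


# Hypothetical dataset: three tall boxes and three wide boxes, in grid-cell units.
boxes = [(0.4, 1.6), (1.7, 0.6), (0.5, 1.4), (1.5, 0.5), (0.6, 1.5), (1.6, 0.7)]
anchors = kmeans_wh(boxes, k=2)
print(anchors)  # one tall and one wide shape, roughly (0.5, 1.5) and (1.6, 0.6)
```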
@sandipansarkar9211 3 years ago
nice explanation
@TheKovosh 4 years ago
If I have a fixed-size anchor box, then what is the point of bw and bh?
@thomasqiao916 4 years ago
bw and bh define the anchor box
@marcoburkhardt6496 3 years ago
just good. thanks a lot :)
@adityarajora7219 4 years ago
How can it predict a bounding box larger than the grid cell? Please explain, if anyone knows YOLO.
@sanjivgautam9063 4 years ago
Here is the thing. bx and by fall between 0 and 1. However, bw and bh (width and height) can have values greater than 1, so any object that extends beyond the grid cell is still covered by that bh and bw. Did you get the point? In one of his videos, he explains how bx and by fall between 0 and 1 while bh and bw can go higher than 1.
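A small sketch of that encoding, assuming a square image split into a grid x grid layout, with the centre measured relative to its cell and the size measured in units of one cell. The helper name `encode_box` and the pixel numbers are made up for illustration:

```python
def encode_box(cx, cy, w, h, img_size, grid):
    """Convert a pixel-space box (centre cx, cy; size w, h) to grid-cell coordinates."""
    cell = img_size / grid              # side length of one grid cell in pixels
    col, row = int(cx // cell), int(cy // cell)
    bx = cx / cell - col                # centre offset inside the cell, in [0, 1)
    by = cy / cell - row
    bw = w / cell                       # size in cell units, may exceed 1
    bh = h / cell
    return (row, col), (bx, by, bh, bw)


# Hypothetical 300 px image with a 3x3 grid, so each cell is 100 px wide.
# The box is 220 px wide: wider than two cells, yet it belongs to a single cell.
cell_index, (bx, by, bh, bw) = encode_box(150, 250, 220, 90, img_size=300, grid=3)
print(cell_index, bx, by, bh, bw)
```

The object is assigned to the cell that contains its midpoint, here row 2, column 1, and bw comes out larger than 1 because the box spills into the neighbouring cells.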
@adityarajora7219 4 years ago
@@sanjivgautam9063 Thanks, but I still didn't get the intuition. Could you give a reference to that video?
@sanjivgautam9063 4 years ago
@@adityarajora7219 kzbin.info/www/bejne/nXzVlo2Fis5ghZI. I think you are following a playlist that doesn't have one video in it. The video in this link explains the bounding box rules.
@TheKovosh 4 years ago
One video is missing; that's why I have a problem understanding the rest.
I think anchor box algorithm is for those problems lying somewhere between image classification and pixel classification. Recognizing an object that is either the entire image or a pixel is really tricky.
@anujk.9893 5 years ago
If we define the shape and size of the anchor boxes, won't we need only 2 outputs to identify a box? Bx and By would be enough, and we should not need Bh and Bw. Please explain if someone knows.
@tomvandewiele7031 5 years ago
We predict an arbitrary height and width, so we still have to output Bh and Bw. With anchor boxes, the IoU is used to pick the anchor box shape that best matches the labeled box. The target shape (together with Bx, By and the class) is set as a target only for that best-matching anchor box.
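A hedged sketch of how such a target could be laid out for one grid cell, assuming one [pc, bx, by, bh, bw, c1..cn] slot per anchor as in the lecture. The names `cell_target` and `shape_iou`, the anchor shapes and the two objects are hypothetical:

```python
def shape_iou(a, b):
    """IoU of two (width, height) shapes placed on the same centre."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def cell_target(objects, anchors, num_classes):
    """Label vector for one grid cell: one [pc, bx, by, bh, bw, c1..cn] slot per anchor."""
    slot = 5 + num_classes
    y = [0.0] * (slot * len(anchors))   # pc = 0 everywhere means "no object"
    for bx, by, bh, bw, cls in objects:
        # Only the best-matching anchor receives this object's target.
        i = max(range(len(anchors)), key=lambda k: shape_iou((bw, bh), anchors[k]))
        one_hot = [1.0 if c == cls else 0.0 for c in range(num_classes)]
        # If two objects pick the same anchor, the later one overwrites the earlier.
        y[i * slot:(i + 1) * slot] = [1.0, bx, by, bh, bw] + one_hot
    return y


anchors = [(0.5, 1.5), (1.5, 0.5)]      # hypothetical (width, height): tall, wide
objects = [
    (0.5, 0.5, 1.6, 0.4, 0),            # pedestrian: bx, by, bh, bw, class 0
    (0.5, 0.6, 0.6, 1.6, 1),            # car: class 1
]
y = cell_target(objects, anchors, num_classes=3)
print(y)
```

The overwrite noted in the comment is exactly the case the video says anchor boxes do not handle: two objects in one cell that both match the same anchor shape.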
@nithinmesingerme6976 2 years ago
Since the size of the anchor boxes is fixed, how does this work for two objects of the same kind, one very close and one very far away?
@EranM 5 years ago
0:25 right in the nuts
@HabibRK 5 years ago
it's a she
@lovemormus 5 years ago
@@HabibRK how do you know it's a she
@ganonlight 3 years ago
These anchor boxes seem more like a workaround than an actual solution tbh
@vishaljain4915 3 years ago
Agreed, do you have a better idea
@ganonlight 3 years ago
@@vishaljain4915 No not really
@vishaljain4915 3 years ago
@@ganonlight 😂😂😂 me neither aha
@ganonlight 3 years ago
@@vishaljain4915 😅
@akashkewar 3 years ago
Anchor boxes are one of many approaches you can use for object detection. Algorithms like "CornerNet" locate objects with keypoints instead of anchor boxes. Some algorithms, such as Pose2Seg, also use pose estimation and/or semantic segmentation to give you pretty accurate bounding box predictions. Just google "anchorless object detection". Also, tbh most of the stuff you see in machine learning is a "workaround", but it's magic to see it work so well. There is no silver bullet that solves every problem; machine learning is all about choosing the right tools and being creative with the problem at hand.
@ShubhamKumar-me7xy 2 years ago
Mid point of pedestrian :xd
@sandipansarkar9211 3 years ago
nice explanation
@guardrepresenter5099 5 years ago
What is pc, and how does pc know whether it is 0 or 1 while c1, c2, c3 are still unknown?
@adityarajora7219 4 years ago
pc is the probability that there is "something" in the box, and c1, c2, c3 describe what this "something" actually is.
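One common way to combine the two, sketched here as an assumption rather than something stated in this video, is to multiply pc by each class probability and take the highest score. The function name and the numbers are made up for illustration:

```python
def class_scores(pc, class_probs):
    """Score per class = P(object present) * P(class | object present)."""
    return [pc * c for c in class_probs]


# Hypothetical prediction: 90% sure something is there, most likely class c2.
scores = class_scores(0.9, [0.1, 0.7, 0.2])
best = max(range(len(scores)), key=scores.__getitem__)
print(scores, best)  # index 1 (c2) has the highest score
```

So pc does not need to know the class first: it is predicted alongside c1, c2, c3, and in the training labels it is simply 1 where an object exists and 0 otherwise.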
@dota2islife262 5 years ago
What is the name of the course on Coursera?
@maxbaugh9372 3 years ago
Deep Learning Specialization - Course 4: Convolutional Neural Networks