Just did this yesterday in my organisation. Kudos to you guys for posting videos like this.
@saivarma7495 (2 years ago)
For smart picking: we can apply any of the clustering algorithms on Du, then pick a small number of points from each cluster and give them to experts for labelling.
@AppliedAICourse (2 years ago)
But, picking points randomly from each cluster in Du may not always guarantee improvement with M1 as most of the points so picked could be very similar to points in DL.
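For readers who want to try the cluster-based picking described above, here is a minimal sketch: a toy k-means in plain NumPy with made-up blob data. Function names and data are illustrative, and the sketch does not address the caveat in the reply about picked points resembling DL.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Toy k-means for illustration; in practice a library implementation would be used."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest centroid
        labels = np.argmin(((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

def pick_per_cluster(X_unlabelled, k=3, per_cluster=2, seed=0):
    """Pick a few points from each cluster of Du to send to human experts."""
    rng = np.random.default_rng(seed)
    labels, _ = kmeans(X_unlabelled, k, seed=seed)
    picked = []
    for j in range(k):
        idx = np.where(labels == j)[0]
        picked.extend(rng.choice(idx, size=min(per_cluster, len(idx)), replace=False))
    return sorted(int(i) for i in picked)

# Example: three well-separated blobs standing in for Du
rng = np.random.default_rng(42)
Du = np.vstack([rng.normal(c, 0.1, size=(20, 2)) for c in (0.0, 5.0, 10.0)])
chosen = pick_per_cluster(Du, k=3, per_cluster=2)
print(chosen)  # indices of the points sent for human labelling
```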
@abhinayabhi91 (2 years ago)
I picked this video smartly so that I can learn it and write it easily in exams. Active learning, in my view 😊
@chathere2359 (2 years ago)
I am really trying hard to arrange money for your course. Of course you are a great teacher, sir; the reason I say so is that many institutions charge lakhs 😅
@GPravinshankar (2 years ago)
For multiclass classification (for example, 3 classes), we have to choose data points for which the model gives roughly equal probability (~0.33) to all 3 classes.
@AppliedAICourse (2 years ago)
This thinking is in the right direction. But would you not also pick points where the class probabilities in a three-class classification are 0.4, 0.5 and 0.1? In this case, the model is confused between two of the three classes.
@GPravinshankar (2 years ago)
@AppliedAICourse Yes, we should consider this case as well.
@shyamshankarmathurjayashan4910 (2 years ago)
For multiclass classification (say we have k classes), can we use a softmax and choose those points which give out probabilities around 0.5 for at least k/2 classes? Since the model is confused about half of the classes, such points could be a good pick for our subset.
@priyushkumar9689 (2 years ago)
For multiclass classification, we can set a threshold on probability: if the max probability among all the classes is less than a certain threshold value (e.g. < 0.7), we choose that data point for human labelling.
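The thresholding idea above (often called least-confidence sampling) takes only a few lines of NumPy; the probabilities and the 0.7 threshold below are illustrative:

```python
import numpy as np

def least_confident(probs, threshold=0.7):
    """Return indices of points whose max class probability falls below threshold,
    i.e. points the model is unsure about and that should go to a human labeller."""
    probs = np.asarray(probs)
    return np.where(probs.max(axis=1) < threshold)[0]

# Rows are hypothetical P(class | x_i) from model M0 for three classes
P = np.array([
    [0.95, 0.03, 0.02],   # confident -> skip
    [0.40, 0.50, 0.10],   # unsure    -> pick
    [0.34, 0.33, 0.33],   # unsure    -> pick
])
print(least_confident(P, threshold=0.7))  # [1 2]
```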
@RajeshSharma-bd5zo (2 years ago)
For multi-class classification, we can go ahead with human labelling for the classes on which the model is not up to the mark, in other words, the classes where it makes a higher percentage of errors. By introducing human-labelled data for such classes, we can bring some confidence to our new model M1.
@AppliedAICourse (2 years ago)
But we don’t have class labels for the points in Du. So how can we determine whether a point is erroneously classified or not?
@shreymishra646 (2 years ago)
Hi, please correct me if I am wrong: for the multiclass setting we can calculate entropy. Entropy works on a similar principle: when the probability mass is spread almost equally across classes, entropy is high, compared to when one class clearly dominates. So we can filter out the low-entropy points and keep the high-entropy ones for labelling.
@AppliedAICourse (2 years ago)
Perfect, Entropy is a very popular metric that can be used to numerically quantify the uncertainty in class labels. That's why we use cross-entropy as the loss function in multi-class classification.
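A small NumPy sketch of entropy-based picking, with made-up class probabilities (higher entropy means more uncertainty, so we sort in descending order):

```python
import numpy as np

def entropy(probs, eps=1e-12):
    """Shannon entropy of each row of class probabilities (higher = more uncertain)."""
    probs = np.asarray(probs)
    return -(probs * np.log(probs + eps)).sum(axis=1)

def pick_most_uncertain(probs, n):
    """Indices of the n highest-entropy points: candidates for expert labelling."""
    return np.argsort(entropy(probs))[::-1][:n]

P = np.array([
    [0.90, 0.05, 0.05],   # low entropy: model is sure
    [0.34, 0.33, 0.33],   # near-uniform: highest entropy
    [0.40, 0.50, 0.10],   # in between
])
print(pick_most_uncertain(P, n=2))  # [1 2]
```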
@devanshverma5395 (2 years ago)
Excellent approach
@ehshankhan7003 (2 years ago)
Suppose we have a three-class classification; our threshold could be 0.33. But if we get probabilities like 0.1, 0.4 and 0.5, we can choose both the 0.4 and 0.5 data as samples for labelling.
@AppliedAICourse (2 years ago)
Note that there is only one xi that has these three probabilities for the three classes; it’s just a single point, not two or three. But you are right that we would pick such xi’s, since the model is less certain here: no single class has a high probability like 0.9 or so.
@dwarikaprasadteli1030 (2 years ago)
Could you please make an explanation for the "extreme multi-label classification" problem? It is also covered in the course (the StackOverflow tag prediction problem), but in that case we limited ourselves to a few labels. What kind of solution can we apply for this type of problem?
@anujsali3171 (2 years ago)
My opinion on how to extend it to the multi-class setting: in ML, if we're given a k-class classification problem, we can solve it using k binary classifiers. So I think we can employ One-vs-Rest. Just my thinking.
@AppliedAICourse (2 years ago)
While this is a possible solution, can you think of alternative and simpler methods where you don’t have to build k binary classifiers?
@CRTagadiya (2 years ago)
The method you mentioned might not work if we don't have a probabilistic model (a model which only gives the class, not the probability). What should we do then?
@AppliedAICourse (2 years ago)
Most machine learning and deep learning models can be slightly modified to obtain class probabilities. Hence, this is not a major issue in the real world.
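One common such modification, shown here as an illustrative sketch rather than the course's own method, is to push raw class scores (e.g. SVM margins or network logits) through a softmax to get probability-like values that are good enough for uncertainty ranking:

```python
import numpy as np

def softmax(scores):
    """Turn raw, unnormalised class scores into probability-like values.
    The resulting rows sum to 1 and can be fed to any uncertainty metric."""
    scores = np.asarray(scores, dtype=float)
    z = scores - scores.max(axis=1, keepdims=True)   # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

raw = np.array([[2.0, 0.1, -1.0],    # hypothetical decision scores: one class dominates
                [0.2, 0.1,  0.0]])   # nearly tied scores -> near-uniform probabilities
P = softmax(raw)
print(P.sum(axis=1))  # each row sums to 1
```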
@True_Feelingsss... (2 years ago)
Go with the One-vs-Rest approach.
@AppliedAICourse (2 years ago)
While this is a possible solution, can you think of alternative and simpler methods where you don’t have to build k binary classifiers?
@True_Feelingsss... (2 years ago)
@AppliedAICourse Method 1: sample points from Du wherever the model fails to achieve a 0.9 probability for any particular class. Method 2: for a given point, pick the two highest class probabilities; if their difference is less than some alpha, e.g. 0.3, then that point is an "unsure" point.
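Method 2 above is commonly known as margin sampling; a minimal NumPy sketch with made-up probabilities:

```python
import numpy as np

def margin_unsure(probs, alpha=0.3):
    """Indices of points where the gap between the two highest class
    probabilities is below alpha -- the model is torn between two classes."""
    probs = np.sort(np.asarray(probs), axis=1)   # ascending within each row
    margin = probs[:, -1] - probs[:, -2]         # top-1 minus top-2 probability
    return np.where(margin < alpha)[0]

P = np.array([
    [0.90, 0.05, 0.05],   # margin 0.85 -> confident, skip
    [0.40, 0.50, 0.10],   # margin 0.10 -> unsure, pick
    [0.10, 0.45, 0.45],   # margin 0.00 -> unsure, pick
])
print(margin_unsure(P, alpha=0.3))  # [1 2]
```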
@aakashverma1622 (2 years ago)
What I think is: we can do one-vs-all. Let's say we have 10 classes; we can get the probability of whether a point belongs to each class or not. If the probability of a point belonging to a class is very high, we leave that point out, and we do the same for all classes. In the end we are left with points that have no high probability of belonging to any class, and we choose those points.
@anujsali3171 (2 years ago)
A doubt: why can't we use unsupervised techniques like clustering to label the large unlabelled dataset? Just curious 🤔
@AppliedAICourse (2 years ago)
But, picking points randomly from each cluster in Du may not always guarantee improvement with M1 as most of the points so picked could be very similar to points in DL.