Here is the question from the quiz (7:10): Suzie is about to go on her very first blind date. Her mom is a little worried and wants to give her a way to tell good guys from bad ones. She trains a Naïve Bayes classifier to predict whether people are good or evil. As training data she collects six people who she knows are definitely good (Batman, Superman, Spiderman) or bad (Riddler, Penguin, Joker). She defines the following binary features: C = "Guy wears a cape." M = "Guy wears a mask." U = "Guy wears his underwear outside his pants." The label is Y = "Guy is good."
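For reference, the Naive Bayes assumption the lecture applies factorizes the likelihood over the three binary features, so the posterior Suzie's mom needs is:

```latex
P(Y \mid C, M, U)
  = \frac{P(C \mid Y)\, P(M \mid Y)\, P(U \mid Y)\, P(Y)}
         {\sum_{y \in \{\text{good},\,\text{bad}\}} P(C \mid y)\, P(M \mid y)\, P(U \mid y)\, P(y)}
```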
@phamngoclinh1373 · 4 years ago
Thank you. That helps a lot!!!!!
@TrentTube · 4 years ago
I don't think Suzie's mom gets out much.
@cuysaurus · 4 years ago
Hello! What book did you guys use?
@hdang1997 · 4 years ago
"Not all heroes wear capes and masks". Thank you!
@abunapha · 5 years ago
If you have the page he is holding, you can start from 7:10. If not, it starts at 20:16.
@minerva646 · 4 years ago
Please share that page.
@yunshunzhong4491 · 2 years ago
Best ML course I've ever seen on the whole internet!
@KW-fb4kv · 3 months ago
He's such a good lecturer that 2/3rds of students show up in person.
@hdang1997 · 4 years ago
Excellent quiz to demonstrate the use of Naive Bayes. Thank you, professor!
@tarangsuri1611 · a year ago
Where can I find the quiz?
@gregmakov2680 · 2 years ago
Yeah, exactly, it could be an over- or underestimate, because it is actually interpolation.
@ugurkap · 5 years ago
Hi, is it possible to upload the mentioned quiz?
@SeverinoCatule · 4 years ago
Carl Sagan and Steve Nash, that's a nice duo.
@prwi87 · a year ago
The sum the professor wrote in the denominator is called the law of total probability.
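Written out, the law of total probability expands that denominator over the label values:

```latex
P(x) \;=\; \sum_{y} P(x \mid Y = y)\, P(Y = y)
```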
@manastripathi6746 · 2 years ago
I did not understand the last part of the video, about the multinomial distribution. What was Theta representing, when x is a categorical variable that can take values from 0 to M?
@mohammaddindoost2634 · a year ago
Does anyone know what the projects were, or where we can find them?
@georgestu7216 · 4 months ago
Hi all, can someone give some more information about the notation at 31:40? What does the indicator "I" mean? Thanks!
@tarunkaushik4769 · 7 months ago
Why is the professor mentioning the binomial distribution for features with categorical values? Isn't the binomial distribution used where we have binary outcomes? It's at 33:32.
@TylerRainey-of3us · a year ago
The quiz can be found here: classes.cec.wustl.edu/~SEAS-SVC-CSE517A/lecturenotes/05_lecturequiz_NB.pdf
@lima073 · a year ago
Thank you !!
@rishidixit7939 · 2 months ago
thanks
@doyourealise · 2 years ago
The villains in this question (Riddler and Penguin) now appear in the Batman movie. 7:10
@rodas4yt137 · 4 years ago
I tried to reconstruct the table of the exercise at 7:10:

            Batman  Superman  Spiderman  Riddler  Penguin  Joker
Cape          1        1         0         0        1       0
Mask          1        0         1         1        0       0
Underwear     0        1         1         0        0       0
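Plugging this commenter-reconstructed table (which may not match the actual quiz sheet) into a minimal Naive Bayes sketch in Python reproduces the lecture's answer of 1/5 for P(good | no cape, no mask):

```python
from fractions import Fraction

# Training data as reconstructed by a commenter (may differ from the
# actual quiz sheet): (cape, mask, underwear, label)
data = [
    (1, 1, 0, "good"),  # Batman
    (1, 0, 1, "good"),  # Superman
    (0, 1, 1, "good"),  # Spiderman
    (0, 1, 0, "bad"),   # Riddler
    (1, 0, 0, "bad"),   # Penguin
    (0, 0, 0, "bad"),   # Joker
]

def cond_prob(idx, value, label):
    """Empirical estimate of P(feature[idx] = value | Y = label)."""
    rows = [r for r in data if r[3] == label]
    return Fraction(sum(r[idx] == value for r in rows), len(rows))

def posterior(label, query):
    """P(Y = label | observed features) under the Naive Bayes factorization.

    `query` maps feature index -> observed value; unobserved features
    (underwear, in the date question) are simply left out.
    """
    def joint(y):
        p = Fraction(sum(r[3] == y for r in data), len(data))  # prior P(Y=y)
        for idx, val in query.items():
            p *= cond_prob(idx, val, y)  # multiply in each P(x_i | y)
        return p
    return joint(label) / (joint("good") + joint("bad"))

# The date wears no cape (feature 0 = 0) and no mask (feature 1 = 0):
p_good = posterior("good", {0: 0, 1: 0})
print(p_good)  # 1/5
```

Using `Fraction` keeps the arithmetic exact, so the result matches the hand-computed 1/5 with no floating-point noise.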
@punktdotcom · 3 years ago
But if we look at 13:36, wouldn't the answer to the question be "Joker", or in terms of the probability that it's a good guy, 0%, if we used this table?
@semihgulum9438 · 3 years ago
@punktdotcom I thought the same when I first heard the question, but we are not solving it like the other questions. We simply decompose the "-C" and "-M" terms with Naive Bayes: we assume -C and -M are independent events conditioned on Y. Hence P(-C, -M|Y) ≠ 0. He actually explains this right after the question. I hope this helps.
@gaconc1 · 3 years ago
Spiderman doesn't wear underwear outside his pants; Batman does.
@rakinbaten7305 · 12 days ago
Geez, does anyone ever wonder what beast of a middle school Kilian went to? 37:29
@KOSem-ke9jn · 4 years ago
Can someone please explain why the Naive Bayes Classifier is the perfect classifier (i.e. recovers the Bayes optimal classifier) if the naive Bayes assumption of conditional independence holds per the professor's comment from kzbin.info/www/bejne/jHWuYaGhn6uba7c#t=21m13s onwards ? I thought the available data can only lead to an approximation of the true underlying probability distribution. Therefore, the empirically derived probability distribution can only ever be an estimate of the true distribution. The only way to get a perfect classifier should be (i) if the naive Bayes assumption holds and (ii) we have data that spans the entire support of the true distribution i.e. we see all possible feature values.
@deepfakevasmoy3477 · 3 years ago
Let's say we already have an empirically derived probability distribution P'(X,Y), which is an approximation of the true distribution P(X,Y). After we have this approximate distribution P'(X,Y), the best way to predict the outcome Y is the Bayes optimal classifier, because we use Bayes' rule (universally accepted as correct) to estimate P'(Y|X). So, to sum up: the Bayes optimal classifier (BOC) is the best method to estimate Y|X once we have the approximate distribution P'(X,Y). It does not say that the BOC gives the best estimate of P'(X,Y) itself. It just says that after obtaining P'(X,Y) (by MLE or MAP), the best way to estimate Y|X is the universally accepted rule, Bayes' rule, which is exactly what the Bayes optimal classifier applies: P'(Y|X) = P'(X,Y) / P'(X). Does it make sense now? About your first question: if the Naive Bayes assumption holds (we prove that the feature values X are conditionally independent given Y), then Naive Bayes simply executes the Bayes optimal classifier via Bayes' rule, and since Bayes' rule is correct, Naive Bayes is also considered a perfect classifier.
@KOSem-ke9jn · 3 years ago
@deepfakevasmoy3477 Yes, perhaps I misunderstood the statement: the Naive Bayes classifier is the best classifier given a fixed empirical distribution, if the conditional independence assumption holds. The error due to approximating the true distribution with the empirical one will always remain. Thanks.
@deepfakevasmoy3477 · 3 years ago
@KOSem-ke9jn Exactly. You're welcome!
@WipinCumar · 2 years ago
In the case of independence, the probability density multiplies out (factorizes).
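A tiny sketch of that factorization, with made-up marginals for two independent binary features (the numbers are purely illustrative):

```python
import itertools

# Hypothetical marginals for two independent binary features
p_a = {0: 0.3, 1: 0.7}
p_b = {0: 0.6, 1: 0.4}

# Under independence, the joint is just the product of the marginals
joint = {(a, b): p_a[a] * p_b[b] for a, b in itertools.product(p_a, p_b)}

assert abs(sum(joint.values()) - 1.0) < 1e-9  # still a valid distribution
print(joint[(1, 0)])  # ≈ 0.42  (= 0.7 * 0.6)
```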
@bharasiva96 · 4 years ago
I was able to follow the method used to solve the final question in the "dating" quiz, but there is one thing I don't follow. I used the following method and got a completely different answer. The question asks us to find the probability of the person being a good guy, given that the person has no mask and no cape. Among the good guys we have Superman, Batman, and Spiderman, and among the bad guys we have the Joker, the Penguin, and the Riddler. Now, if we keep only the people who have no mask and no cape, we have no good guys left from the six people in our sample space. So shouldn't the probability of the person being a good guy, given that the person has no mask and no cape, be 0?
@kilianweinberger698 · 4 years ago
If you use this approach, it fails because you have an insufficient amount of data. That's where the Naive Bayes assumption kicks in: instead of estimating P(G|-C,-M), you only need to estimate P(-C|G), P(-M|G), and P(G) independently, for which you have enough data. Hope this helps.
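Using the six training examples (as reconstructed by a commenter in this thread, so the numbers are illustrative), the factored estimates work out to:

```latex
P(G \mid \neg C, \neg M)
  = \frac{P(\neg C \mid G)\, P(\neg M \mid G)\, P(G)}
         {P(\neg C \mid G)\, P(\neg M \mid G)\, P(G)
          + P(\neg C \mid B)\, P(\neg M \mid B)\, P(B)}
  = \frac{\frac{1}{3} \cdot \frac{1}{3} \cdot \frac{1}{2}}
         {\frac{1}{3} \cdot \frac{1}{3} \cdot \frac{1}{2}
          + \frac{2}{3} \cdot \frac{2}{3} \cdot \frac{1}{2}}
  = \frac{1/18}{5/18} = \frac{1}{5}
```

Each factor rests on three or six examples, whereas the direct empirical estimate of P(G | -C, -M) would rest on the single person (Joker) matching the observation.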
@Neo-kx3fe · 3 years ago
@kilianweinberger698 Hi Kilian, I have a similar question. 18:28 We have P(-C,-M) = P(-C,-M|good)P(good) + P(-C,-M|bad)P(bad). For P(-C,-M|bad), shouldn't it be 1/3, since only the Joker in our data is a bad guy who wears neither a mask nor a cape? Is the reason you expand it using Naive Bayes again the limited data? Can we conclude that when the data size is small, we should always use Naive Bayes to estimate the conditional probability rather than trust our observation? Much appreciated.