P-values Broke Scientific Statistics-Can We Fix Them?

408,503 views

SciShow

Days ago

Comments: 1,200

@SciShow 5 years ago

There is a typo at 7:37! The P-value for 6 tea cups is 0.05, not 0.5. Thanks to everyone who pointed it out!

@RT-oy7mu 5 years ago

@@SirShades23 Nice try, @daniquasstudio was the one who corrected it.

@daniquasstudio 5 years ago

It is an honor, thank you

@VariantAEC 5 years ago

That last option should be the way journals proceed. Results shouldn't matter, if they do the science takes a backseat.

@davidalearmonth 5 years ago

I feel like the stats approach on the tea with milk is wrong at 1 in 70. I would have figured each cup was a 50/50 chance, so picking 8 correctly would be 1 in 256?
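
A quick sketch of where the two numbers come from, assuming the classic design the video describes: 8 cups, exactly 4 of them milk-first, and the taster knows about that split. Under that assumption she is really choosing which 4 of the 8 cups are milk-first, so there are 70 equally likely answers rather than 256 independent coin flips. The helper name below is made up for illustration.

```python
from math import comb

def p_all_correct_fixed_split(n_cups):
    """Chance of a perfect score by pure guessing when exactly half the cups
    are milk-first and the taster knows that (illustrative helper)."""
    return 1 / comb(n_cups, n_cups // 2)

# 8 cups, 4 of each kind: 70 possible ways to pick the 4 milk-first cups.
p_eight_fixed = p_all_correct_fixed_split(8)    # 1/70, about 0.014

# If every cup were an independent 50/50 call, 8 cups would give 2**8 outcomes.
p_eight_independent = 1 / 2 ** 8                # 1/256, about 0.004

# 6 cups, 3 of each kind: 20 ways, which matches the 0.05 in the pinned correction.
p_six_fixed = p_all_correct_fixed_split(6)      # 1/20

print(p_eight_fixed, p_eight_independent, p_six_fixed)
```

So 1 in 256 would be right if she did not know there were four of each; knowing the split rules out most of the 256 patterns.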

@mhaeric 5 years ago

There's something both meta and ironic about a dead fish being used to poke holes in a methodology by a Fisher.

@HaloInverse 5 years ago

You could _also_ say that he was fishing for data that supported his hypothesis.

@reallyWyrd 5 years ago

It reminds me of the famous robot and dead herring experiments carried out at the Maximegalon Institute For Slowly And Painfully Working Out The Surprisingly Obvious. Except that this result wasn't obvious. Except that, if we were better at actually doing stats and science, it would have been.

@KnakuanaRka 4 years ago

*ba dum tss* xD

@lc9245 3 years ago

No, it didn’t. The methodology by Fisher is just a set of theories; the practice of those theories was what’s troublesome. His method is fine, but the considerations when it comes to statistics, the metadata, weren’t taken into account. Because the p-value is easy to calculate, researchers abuse it. It’s not Fisher’s fault, it’s society’s fault.

@leonorf2730 3 years ago

Looks like 😎 the Fisher became the fished.

@WeatherManToBe 5 years ago

Just a heads up for everyone; you can tell the difference between milk first vs tea first. If you do milk first, you temper the milk as you pour the tea in, stopping the proteins in the milk from denaturing and clumping together on the top as a skin or foam. (Only concerning freshly brewed tea held in a decent pot staying near boiling point) If milk is added to a near full cup of tea, the first bit of milk gets 'burnt' before the tea is eventually cooled down with additional milk added. If tea is below 82 degrees, there is no difference. This is the same problem with gluten/eggs/other dairy in sauces. Always add hot stuffs to cold stuff, the slower the better.

@raxleberne4562 4 years ago

It's amazing, the subtleties there are to be overlooked when studying things. I feel as if I will think of this every time I encounter something with no apparent explanation.

@Achill101 3 years ago

@@raxleberne4562 - the point of statistical tests is to see if there's an effect at all, not yet to understand the causation. If it is nearly certain that there's an effect, people are more likely to look into the mechanism of how it works. We shouldn't criticize a statistical test for not doing what it's not supposed to do.

@sophierobinson2738 3 years ago

Works with coffee, too.

@laurelgardner 3 years ago

Yeah, I found it pretty GD annoying that they just assumed it was nonsense when making this video.

@ruairidhmcmillan2484 3 years ago

@@laurelgardner exactly, it's not scientific to dismiss any potential effects at the level of two experimental media interacting (milk and coffee) just because these effects are not immediately apparent. Science wouldn't be all that useful if everything which was apparent made for an accurate representation of everything which is not apparent.

@Paul-A01 5 years ago

DM: You encounter a feral null hypothesis. Researcher: I run a study on it! *rolls* Critical significant results!

@calamusgladiofortior2814 5 years ago

I find this joke... (rolls d20, checks table) amusing.

@MrUtak 5 years ago

*rolls a 20* Did the DM see it? *rolls again*

@mal2ksc 5 years ago

I cast Hellish Rebuke as a reaction to discredit the researcher!

@ValeriePallaoro 5 years ago

f*ckin excellent!!

@dmarsub 5 years ago

This is why in some pen and paper system critical rolls only happen with 2 rolls now. (And why study reproduction is so important)

@argentpuck 5 years ago

1 in 20 has always bothered me when I studied statistics in a scientific setting. Any D&D player can tell you just how often a 1 or 20 actually comes up and it's rather more often than 5% sounds like. Edit: This blew up a lot more than I expected and people are focusing on the wrong thing. I used D&D because I figure most people who watch these videos are familiar with rolling icosahedrons. The point, though, has nothing to do with dice probability or the cognitive biases around particular results (although, thinking about it, that does speak to p-hacking). The point I intended is that 5%, especially in a large sample, is quite a lot. If I flood the market with a placebo cure for the common cold and 5% of the 10,000,000 who used it report that it worked, that's half-a-million voices confirming pure nonsense. Cognitive biases being what they are, basically any confirmation can get people to draw the wrong conclusion (e.g., anti-vaxxers), certainly, but a 1-in-20 probability that something is pure chance is rather high odds and this video confirms that it is basically arbitrary.
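
For anyone who wants to check the arithmetic in the comment above, here is a rough sketch. The last function adds one assumption that is not in the comment: that the tests are independent and every null hypothesis is actually true. The function name is invented for illustration.

```python
# Chance that a d20 shows a natural 1 or a natural 20: two faces out of twenty.
p_nat_1_or_20 = 2 / 20                    # twice the 5% of any single face

# The placebo example: 5% of 10,000,000 users.
users = 10_000_000
false_reports = users * 5 // 100          # half a million people

def p_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance that at least one of n independent tests of a true null
    comes out 'significant' at level alpha (illustrative helper)."""
    return 1 - (1 - alpha) ** n_tests

print(p_nat_1_or_20, false_reports)
for n in (1, 5, 20):
    print(n, round(p_at_least_one_false_positive(n), 3))
```

With 20 such tests the chance of at least one spurious "hit" is already well over half, which is one way to see why 1 in 20 feels more frequent than it sounds.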

@richardoteri356 5 years ago

Yes.

@joegillian314 5 years ago

The reason it's 5% is because of the empirical rule. In a normal distribution we have the following properties: approximately 68% of all data lie within 1 standard deviation of the mean; approximately 95% of all data lie within 2 standard deviations of the mean; approximately 99.7% of all data lie within 3 standard deviations of the mean. The second property is where the 5% comes from.
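
The empirical-rule percentages can be checked with the error function from Python's standard library. This is only a sketch for an exactly normal distribution; the function name is made up here.

```python
from math import erf, sqrt

def share_within(k):
    """Share of a normal distribution lying within k standard deviations
    of the mean (illustrative helper)."""
    return erf(k / sqrt(2))

for k in (1, 2, 3, 1.96):
    print(k, round(share_within(k), 4))
```

Two standard deviations actually cover a little more than 95% (about 95.45%); the cutoff that leaves almost exactly 5% in the two tails is close to 1.96 standard deviations.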

@jackielinde7568 5 years ago

I was thinking about this very thing... with my dice bag a foot away from me on the desk.

@crovax1375 5 years ago

There is a bias towards recalling a roll of a Nat 1 or 20 over any other failed or successful roll, because players get more excited about a critical failure or success

@interstellarsurfer 5 years ago

@@joegillian314 So, did the empirical rule lend itself to D&D, or does D&D adapt to the empirical rule? Further research is needed. 😅

@codysmit 5 years ago

So you could say that the p-value... was born from a tea-value.

@mdunkman 5 years ago

Cody, it was a result of a Student’s Tea-test.

@microbe_guru 5 years ago

@Dornatum 5 years ago

Oh my God that makes so much sense

@markdodd1152 5 years ago

They kind of tea-bagged the P value

@jonathankool1997 4 years ago

Is it worse that there is such a thing as a T value?

@brentrawlins6490 5 years ago

As a statistician, it is sad to see such a potentially powerful tool be misused so much.

@jackielinde7568 5 years ago

As a statistician, do you have polyhedral dice and how often do you abuse statistics when playing D&D? ;)

@interstellarsurfer 5 years ago

Cooking the books is a problem as old as... books! 😋

@brentrawlins6490 5 years ago

@@jackielinde7568 Yes, and I roll in the open with witnesses. Also, what is the point of playing a game if you're going to cheat? In my experience failing at something can be just as entertaining as succeeding.

@jackielinde7568 5 years ago

@@brentrawlins6490 Oh, I wasn't saying you fudge your rolls. I was "suggesting" that you run the numbers for probabilities of success. I've seen players do that. Not saying Min Maxing is wrong when playing D&D, but it's just not my cup of tea. :D

@brentrawlins6490 5 years ago

@@user-jp1qt8ut3s Is it possible to switch the wording from "significantly different" to "fundamentally different"? It might get you out of having to find the P-value.

@film9491 5 years ago

I love how petty the origin of p value is. I never heard that story before

@sohopedeco 5 years ago

I still wonder how the woman sensed the order of pouring of her cup.

@marin0the0magus 5 years ago

@@sohopedeco Eh, perhaps there is something in the way the different beverages mix, or how the sugar in the milk reacts with the tea, maybe? Some people can be very sensitive about their tea, from the type of leaves to the water type and temperature and to the time the leaves were infused before serving... So it wouldn't surprise me if that was the case.

@marin0the0magus 5 years ago

@@sohopedeco "Milk should be added before the tea, because denaturation (degradation) of milk proteins is liable to occur if milk encounters temperatures above 75°C." Huh. Would you look at that o:

@limiv5272 5 years ago

@@marin0the0magus I was thinking it could be related to the cup's temperature. If the milk is added first the cup is still cold, but if the tea is added first the cup is very hot when the milk is added so it's surrounded by heat from all sides. This is obviously not a well formulated explanation. My dad loved to do these kinds of experiments with me when I was little because I'm a very picky eater and he didn't believe me that things were different and thought I was just being stubborn. Then, of course, I proved to him I can tell the difference between 3% and 5% white cheese and water from the tap and water that went through a filter (-:

@eagle3676 5 years ago

@@marin0the0magus yes if you're a tea addict, you can notice small differences

@Greg5MC 5 years ago

This video should be mandatory viewing for every science class.

@ErroneousTheory 5 years ago

Every science class? Every human

@delphinidin 4 years ago

Every scientific journal... and science graduate program... and university science department...

@8cordas381 5 years ago

I am a medical doctor and I will show this video forever to so many colleagues who do not have that insight when using studies to make decisions. Loved it, thank you.

@frankschneider6156 5 years ago

MDs are no scientists (unless they do this full-time and then they know anyhow), so that's carrying owls to Athens.

@8cordas381 5 years ago

@@frankschneider6156 No, but MDs get thrown at a lot of studies to guide our decisions, and yes, we do read them, being outdated is not allowed in our job. One current awful consequence of statistics misuse misguiding MDs is the opioid crisis, in plain sight.

@frankschneider6156 5 years ago

8cordas Yes I agree, but a single study isn't worth the paper it's printed on. It's rather the ratio of the cumulative number of papers in favor of something vs those negating it that's important. A single paper (even if absolutely thoroughly executed) is rarely sufficient to base decision making upon. And that's of course far more true if the authors are biased and hell-bent on getting a certain result.

@8cordas381 5 years ago

@@frankschneider6156 That is the right way, but that is exactly where the danger and manipulation lie. The methodology of how meta-analysis choose which studies to use, to tweak and to search details so certain studies that do not have the result you want do not have the characteristics to be included in the meta-analysis. I see your point, and in an honest world things should work in the way you describe, but some people would do anything for extra cash, and those few people are enough to mess a whole system.

@frankschneider6156 5 years ago

8cordas I meant the cumulative amount of papers, not meta studies. In theory meta studies should be a great thing significantly increasing the data set and thus the accuracy of the result, but in practice every study has undocumented properties and boundaries that often the researcher himself isn't even aware of. So mixing data (possibly gathered for different purposes with different technologies, different levels of detail, different environments or populations) from lots of different studies typically just mixes apples and pumpkins and out comes ... well .. garbage (GIGO, garbage in, garbage out) and that's still assuming the team conducting the meta-analysis to be well meaning, honest and skilled. So I perfectly share your critical view of meta studies. I haven't seen a single one (at least as far as I can remember) that I would trust farther than I could throw a truck.

@SingularityasSublimity 5 years ago

A very important topic that not enough people (including scientists) consider. The limitation of p-values focused on in this video is Type I errors (wrongly rejecting the null hypothesis). However, Type II errors (wrongly accepting the null hypothesis) are very problematic as well. Let's say you get a p-value of .25, which is well above the threshold set by Fisher. It still indicates that there is a 75 percent probability that your results are not an artifact of chance. Usually this outcome is the result of small sample sizes, but not necessarily, and it can lead researchers to stop considering a legitimate finding that just happened not to meet this p-value criterion, which would also be a shame if we are talking about a potential treatment that can help or save lives. Beyond Bayesian stats, effect size stats are also very helpful here.
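
To put rough numbers on the point about small samples and Type II errors, here is a minimal sketch. It assumes a very simple setting that is not from the video: a one-sided, one-sample z-test with a known standard deviation and a true effect of half a standard deviation. The function name and the numbers are illustrative only.

```python
from statistics import NormalDist

def power_one_sided_z(effect_size, n, alpha=0.05):
    """Power of a one-sided one-sample z-test with known standard deviation.
    effect_size is the true mean shift in standard-deviation units."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # cutoff for calling a result significant
    return 1 - std_normal.cdf(z_crit - effect_size * n ** 0.5)

for n in (10, 20, 50):
    power = power_one_sided_z(0.5, n)
    # The Type II error rate is the chance of missing a real effect.
    print(n, round(power, 2), "Type II rate:", round(1 - power, 2))
```

Under these assumptions a study with 10 subjects misses the real effect more often than it finds it, while 50 subjects catch it almost every time, so a non-significant result from a small study says little on its own.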

@jeffreym68 5 years ago

I am always surprised to see how few fields are calculating and publishing effect sizes. I used to think that was the default, rather than the outlier.

@SingularityasSublimity 5 years ago

It is completely shocking

@entropiCCycles 5 years ago

I'm reminded of some summary research in Psychology as a field (it may have been the big replication attempt or some other bit of meta-research), where they found that, for studies that used the often cited alpha of .05, the power of such tests was about .06. I'm also reminded of a professor's talk from back in graduate school where they showed that, with sample sizes common in Psychological research, Ordinary Least Squares regression was outperformed, not only by equal weights (i.e. every predictor had the same slope term), but by *random* weights.

@randylai-yt 5 years ago

The real difficulty is that when multiple tests are involved, the interpretation of effect sizes is no longer calibrated. On the other hand, p-values at least can still be adjusted to account for the inflation of type I error.
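
As a sketch of what adjusting p-values for multiple tests can look like, here are two standard corrections, Bonferroni and Holm, written out by hand. The raw p-values are made up for illustration; they are not from any study mentioned in this thread.

```python
def bonferroni(p_values):
    """Multiply every p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

def holm(p_values):
    """Holm step-down adjustment: the smallest p-value is multiplied by m,
    the next by m - 1, and so on, keeping the adjusted values non-decreasing."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted

raw = [0.01, 0.04, 0.03, 0.20]   # hypothetical p-values from four tests
print(bonferroni(raw))
print(holm(raw))
```

In this made-up example only the first test stays below 0.05 after either correction, even though three of the four raw p-values were below it.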

@piguyalamode164 5 years ago

@@entropiCCycles Wow, your line of best fit being worse than random. Impressive!

@TechnoL33T 5 years ago

9:20 is such an AMAZING idea! Kill the drive for success in publishing! Incentivizing skewing results for attention is so bad, and this is definitely the fix for it!

@drdca8263 5 years ago

Just confirming that you aren’t being sarcastic

@TechnoL33T 5 years ago

@@drdca8263 Not at all! I suppose this could look like exaggerated enthusiasm, but I find the idea to be legitimately exciting!

@drdca8263 5 years ago

MangoTek Thank you for confirming! I largely agree. Well, I definitely agree that it is promising, less sure that it is the “One True Solution” in practice? Definitely agree that it is a theoretically really nice solution, by entirely bypassing the incentives there, and it would be really cool if it works out well in practice, and there is a good chance that it will.

@TechnoL33T 5 years ago

@@drdca8263 it may not be perfect, but it's a whole world ahead of what we're doing now. I don't see any downsides that aren't already dramatically worse right now.

@drdca8263 5 years ago

MangoTek I think it is likely to work, but let me spitball some potential (potential in the sense of “I can’t rule them out”, not “others can’t rule them out”) issues. This setup would result in a larger number of studies published with null results (and not just interesting null results). Therefore, in order to have the same number of studies with interesting results, this requires a greater total number of studies published. Reviewing the proposals takes time and effort. If we for some reason cannot afford to increase the amount of effort spent on reviewing papers before publication, and so can’t increase the rate of papers being published (this sounds unlikely? Like, probably not actually a problem), then this would result in a lower rate of papers with interesting and accurate results? Which, could very well be worth it in order to eliminate many of the false results, but exactly where the trade-off between “higher proportion of published results are correct” vs “higher number of correct published results” balances out, idk. But yes, I agree it sounds like very good idea, should be tried, hopes it works out.

@jeffreym68 5 years ago

I agree with the two-step process. I hate the idea of killing statistical significance just because some people use it incorrectly because they either misunderstand it or, much worse, but hopefully much more rarely, because they are purposely misusing them. I'm boggled by the number of times I have to explain, even to scientists, that you have to set your p-value FIRST, typically using similar studies as a guide, THEN analyze the data and interpret the results. Perhaps one solution is more and better teaching of the topic. Amazingly, some fields of graduate study do not require expertise in psychometrics.

@NeoAemaeth 5 years ago

I guess you mean α not p?

@jeffreym68 5 years ago

@@NeoAemaeth Actually, I used an abbreviation for the phrase "setting the probability that the results will be due to chance with which we are comfortable in this experiment" because I thought it was more understandable to the general reader. My apologies if it had the opposite effect.

@benderrodriguez142 5 years ago

The real issue is not setting the p value ahead of time but manipulation or elimination of data to make the value be 0.05. As a scientist who reports to a p hacker at work, it is a major issue.

@jeffreym68 5 years ago

@@benderrodriguez142 I definitely agree that it's a huge problem, and have been employed by a person who did this (briefly, obviously). But I have more often been hired by people who honestly didn't know how the process SHOULD work. In my experience, making people commit to whether they will use .01, .05, etc. ahead of time fixes the problem with people reporting a mix of p values because they don't know better. Short of professional shunning, reviewers asking pointed questions, or changes in ethics, I'm not sure what to do about p hackers.

@benderrodriguez142 5 years ago

@@jeffreym68 that makes sense. Guess I haven't run into too many people that didn't understand the process. Although, I know a few that act like they didn't understand what they were doing was wrong, full well knowing it was being misused. Can't wait to get a new job as I feel dirty every time I leave work. My boss also tried to put ** and then label that as 0.1 to trick people into thinking it is really 0.01 and whatnot. Some people lack ethics.

@MyBiPolarBearMax 3 years ago

Science: “double blind studies are the gold standard because it eliminates the bias of the researchers’ preferred outcome!” Also science: “we don’t need two step publishing!”

@agnosticdeity4687 5 years ago

I would like to point out (in my most pretentious British accent) that adding the milk to a hot or near boiling cup of tea "shocks" the milk because of the sudden change in temperature, whereas adding the milk first and then the tea raises the temperature slowly and this (according to my old boss) has an effect on the taste. Also I have to admire the intelligence of this scientist. That is a very smart way to get a free whole salmon ;-)

@johncarlton7289 5 years ago

This is probably the best video you guys have done in more than a year.

@insertfunnyhandlehere 5 years ago

Heat changes the flavor of dairy products at relatively low temperatures, so just the act of the tea being cooled by the cup before mixing can make a subtle change in your tea.

@MrTheWaterbear 5 years ago

But it's by mere degrees difference. It's not impossible, but it's very strange if that were the reason... I mean, unless the cups are super cold.

@dejayrezme8617 5 years ago

Answering the real questions about this video haha. It makes sense, pouring hot tea into milk will lead to a different temperature difference. The milk will get into contact with far more hot water molecules when tea is poured last, not just because the cup isn't cooling it but because you mix the milk and hot tea constantly while pouring. It might also be that you end up with smaller suspended fatty milk droplets.

@MrDrakkus 5 years ago

I was about to say something similar! If you start with the tea first, the heat of the tea will "cook" the dairy as you pour it faster than the dairy cools the tea. If you start with the dairy first, then it will cool the tea faster than the tea will cook the dairy. At least, up until you stop pouring and the temperature averages out. Starting temperature and ending temperature would probably be the same either way. The important bit though is that when starting with the dairy, that initial bit of cooling faster than heating will mean less cooked dairy overall, which will have a slightly different flavor and texture. I wouldn't be surprised at all if it was enough to be noticeable.

@insertfunnyhandlehere 5 years ago

It's actually not so unusual, as it's the protein breakdown caused by the heat, and proteins in dairy products don't break down the same under 200 °F as they do over 200 °F, and the difference between tea in the pot and tea in a room-temperature ceramic cup can be as much as 10 °F in the 195 °F to 205 °F range. I think Good Eats goes over this in more detail in its milk episode.

@interstellarsurfer 5 years ago

I believe it's the temperature sensitive chemical reactions between the tea and milk, that are responsible. They're more pronounced when adding milk to hot tea, than when adding tea to a chilled cup of milk. In the same way that adding acid to water is 👌, but adding water to acid can be ☠

@Overonator 5 years ago

Bayesian analysis is the best alternative, along with effect sizes. This is why we have a replication crisis and why we have so many false positives and why we have (edit: "statistically") significant results with tiny effect sizes.

@gardenhead92 5 years ago

If we started using Bayesian analysis we'd just have "prior hacking" :D

@SolarShado 5 years ago

"significant results with tiny effect sizes" This has to be one of the worst cases of jargon being misunderstood by those unfamiliar with it that I've seen. To be fair, it's also one of the wider gulfs between the common meaning and the technical meaning. It really drives home the importance of actually understanding the terminology you're reading, or being sure you're getting your information from someone who does and can 'translate' for the layperson.

@jeffreym68 5 years ago

@@SolarShado So common that people misunderstand these terms and come away with the wrong picture of the research. Short of earlier or more widespread teaching of research methods & statistics, I'm not sure how to bridge that gap.

@SolarShado 5 years ago

@@jeffreym68 My first reaction is "more people should be taught research methods and statistics", but I know, practically, that even if we tried, it probably wouldn't stick. There's very little reason for the average person to need that knowledge in their daily lives. I think the solution is more/better science reporting, like what scishow does. Though I don't have much hope that they'll ever manage to drown out the more sensationalist voices...

@Overonator 5 years ago

@@SolarShado Am I not understanding something?

@SuperCookieGaming_ 5 years ago

I wish you could have made this years ago when I was taking statistics. You explained the concept so well. It took me a week to wrap my head around why we used it.

@jamesmnguyen 5 years ago

P-Values have basically become an example of reward-hacking.

@ValeriePallaoro 5 years ago

that's what she said ...

@jamesmnguyen 5 years ago

@@ValeriePallaoro That literally does not apply to this comment.

@tonyrandall3146 3 years ago

@@jamesmnguyen *teleports behind you*

@vice.nor.virtue 1 year ago

That experiment with the cups of tea is literally the most British piece of science I’ve seen in my whole life

@AlexComments 3 years ago

I took Business Statistics in college last semester, and it's wild how much more sense this makes than the intro lecture on hypothesis testing that I sat through months back.

@corlisscrabtree3647 5 years ago

Awesome video. Truly appreciate it. An excellent review of all the things my committee told me when I was doing my dissertation research! I hope you can find a sponsor to discuss sample size and power next.

@ThinkLikeaPhysicist 5 years ago

This is why, in particle physics, we use the 5-sigma criterion (a p-value of 3x10^-7) for discovery. A p-value is one of the most useful tools in reporting scientific results, as long as you use it correctly! If you want to know more, we've got some good statistics videos over at our channel Think Like a Physicist.

@cantkeepitin 28 days ago

This value for 5 sigma also fully depends on ASSUMPTIONS! There is no guarantee for having a normal distribution in all physics experiments
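
For reference, the 3x10^-7 figure quoted for 5 sigma can be reproduced as the one-sided tail of a normal distribution. As the reply above notes, this rests on the assumption that the test statistic really is normally distributed; the function name is invented for this sketch.

```python
from math import erfc, sqrt

def one_sided_tail(n_sigma):
    """Probability that a standard normal value exceeds n_sigma (illustrative helper)."""
    return 0.5 * erfc(n_sigma / sqrt(2))

p_five_sigma = one_sided_tail(5)              # roughly 3 in 10 million
p_conventional = 2 * one_sided_tail(1.96)     # two-sided, roughly the familiar 0.05

print(p_five_sigma, p_conventional)
```

The gap between the two thresholds is about five orders of magnitude, which is why a 5-sigma claim and a p < 0.05 claim are not remotely comparable in strength.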

@coolsebastian 5 years ago

This was a very interesting episode, great job everyone.

@frankschneider6156 5 years ago

The first video in a long time, that honors the name SciShow. Keep this level up.

@jp4431 5 years ago

I had an epidemiology prof keep telling us not to focus on p-values, but on confidence intervals and effect sizes (clinical significance).

@DharmaJannyter 5 years ago

As a first test I would've just given her 8 cups of one type but told her it was 4 cups each. :P That should lower the chances of her not messing up by merely guessing, no?

@Narokkurai 5 years ago

Good god, that's a satisfying milk pour at 3:49

@kirjakulov 5 years ago

As my supervisor says: statistical significance does not mean biological significance. You always have to be very very careful interpreting the data and stats. 👍

@contrarianduude3463 5 years ago

The fish was making "eyes" at me the whole time during the MRI. How do you tell a dead fish I'm just not that in to you?

@benedictifye 3 years ago

I believe the point of pouring tea milk first is that the change in temperature of the cup is more sudden when you pour boiling water in it, so the cup is more likely to shatter if it’s not resistant to the temperature change. Putting the milk first and then warming it with tea protects the cup from such a drastic swing in temperature

@TesserId 3 years ago

This is great. I was actually wanting to see a double blind test of tea/milk order. I also want to see one about microwaving tea, and another on squeezing tea bags.

@trisstock9047 5 years ago

The statistical probability that Earl Grey tea should be drunk with milk at all is vanishingly small.

@jeffreym68 5 years ago

I'm British. That probability is, in fact, quite high, even for those of us who like Picard.

@xplinux22 5 years ago

Also ask anyone in southeast Asia or in the Indian peninsula, and you'll find all sorts of milk teas to be exceedingly popular.

@frankschneider6156 5 years ago

True, as we all know, the only way to properly drink tea is cold, mixed with red bull and ice cubes.

@jmonteschio 5 years ago

This video is easily the best recommendation YouTube has made to me for watching in a long time. Great video, and I really hope that all scientific journals completely switch over to the "decide whether or not to publish first" method.

@snowyh2o 5 years ago

Why couldn’t this come out when I was actually taking statistics? This is literally the last half of the second midterm XD

@inthso362 3 years ago

Hey, here's an idea: Fisher makes 3/5, 1/7, or 8/0 milk first/last, doesn't tell Bristol how many there are of each, and sees what happens. There, fixed it.

@rollinwithunclepete824 5 years ago

A very good video. Thanks to Olivia and the SciShow Gang!

@chadchucks6942 5 years ago

Man I clicked this hoping to learn about a fish

@m0n0x 5 years ago

I had a hard time understanding why p-hacking is such a big deal, but now it's all crystal clear. Thank you!

@MarvelX42 5 years ago

"There are three kinds of lies: lies, damned lies, and statistics."

@nathanwestfall6950 3 years ago

Great video! "Publish or Perish " is a mantra I have heard chanted in quite a few institutions. I have never heard "discover the truth" or "do something useful" said though. Maybe all that's needed is a catchy phrase to encourage more academic honesty/integrity.

@ancbi 5 years ago

After 1:48 "I guess all pictures of tea and tea cups are relevant now." --- The video editor, probably.

@joegillian314 5 years ago

That's not a correct definition of a p-value. The meaning of a p-value is the probability of getting a result at least as extreme as your data, under the assumption that the null hypothesis is true. To say that it is the probability of the data occurring at random is not exactly right because you cannot forget the assumption of the null hypothesis being true. Additionally, the evaluation of a p-value is based on the level of significance (alpha) which is entirely determined by the experimenter(s). [There are some conventions when it comes to choosing a level of significance, but ultimately a person can choose whatever value for alpha they want].
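
A small worked example of the definition in the comment above ("at least as extreme as your data, assuming the null is true"), using a coin-flip setup that is not from the video: 9 heads in 10 flips, with a fair coin as the null hypothesis. The function name is made up for illustration.

```python
from math import comb

def binomial_p_value(successes, trials, p_null=0.5):
    """One-sided exact p-value: the probability of seeing `successes` or more
    out of `trials`, assuming the null success probability p_null is true."""
    return sum(
        comb(trials, k) * p_null ** k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# 9 or 10 heads out of 10 flips of a fair coin: (10 + 1) / 1024.
p = binomial_p_value(9, 10)
alpha = 0.05                      # chosen by the experimenter before looking
print(p, p < alpha)
```

Note that the calculation never uses the probability that the coin is biased; it only asks how surprising the data would be if it were fair.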

@imranrashid8615 5 years ago

Joe Gillian .. we get it you took high school stats. They gave a good and concise summary in everyday language

@gardenhead92 5 years ago

Moreover, since this is probability we're talking about, *ALL* data occur at "random", by definition.

@fujihita2500 5 years ago

Keep using that word, I don't think the significance level means what you think it means

@npip99 5 years ago

Adding complex phrasing doesn't add content. She fully explained that they were calculating the odds "Assuming she couldn't tell the difference between the two types of tea". Just because you decided to call that sentence a "null hypothesis" doesn't mean the original explanation was wrong, nor does it mean you're learning anything by memorizing more terminology as opposed to trying to learn the actual concept instead. This is the epitome of why the school system manages to supposedly teach "something", but in fact teaches nothing of real content at all. It's just memorization. 3:52 is the definition; again, "even if the effect they're testing for doesn't exist" is the logical, reasonable, and easily understood way to say "assuming the null hypothesis".

@Lucky10279 5 years ago

They did say "in a nutshell."

@Lucky10279 5 years ago

"P-value, the probability that you'd get that result if chance is the _only_ factor." This is the clearest, most straightforward definition of the term I've ever come across. I tutor basic statistics and I'm definitely borrowing this definition to tell students what the P-value means and why it's not quite the same thing as the probability that your hypothesis is true. That one phrase has made it far more clear to me why this is the case, which will help me explain it. The textbook the school uses emphasises that the P-value is NOT the probability that the hypothesis is correct, but it doesn't clearly explain why.

@SECONDQUEST 5 years ago

Of course you can tell the difference right away. It's about mixing properly

@QuantumPolagnus 5 years ago

Thank you, SR! I always get excited when I hear them gearing up for announcing the President of Space. You've done a lot for the show, and I think all of us long-time viewers appreciate it.

@SrFoxley 5 years ago

Aaw, thanks! I'm glad you enjoy the show so much, eh! And, again, I just want to point out that the hard-working Sci-show crew are the real heroes here, eh-- without them, there'd be none of this excellent content for us to enjoy!

@rdreese84 5 years ago

Earl Grey, you say? Hot, I presume...

@persinitrix 5 years ago

Coming from an "aspiring" industrial and systems engineer: a few dots were connected that had been left distant by the few statistics and probability classes I have taken at university. Hypothesis testing and Bayes' Theorem make a bit more sense to me now. I praise you.

@daviddavis4885 5 years ago

This would have been helpful two hours ago before my Stats quiz...

@justintime970 5 жыл бұрын

100% of surveys show that everybody takes surveys...

@Jcewazhere 5 жыл бұрын

@SR Foxley Thanks buddy, you're supporting about half the channels I enjoy :)

@SrFoxley 5 жыл бұрын

Yay! You have good taste in channels, then, eh!

@blazeinhotwings 5 жыл бұрын

One thing to keep in mind is that the “gold standard threshold” of .05 depends a lot on your field of study (social sciences use higher p values like .05 and things like cutting edge physics use much smaller p values (

@willdbeast1523 3 жыл бұрын

I can understand the drive to only publish """interesting""" results back when everything was published traditionally, when you have the higher cost to publishing via physical media it makes sense to not publish results that go "wow we had a crazy idea that would revolutionise everything, but it was wrong lol", but now there isn't really the same fixed cost

@jzero4813 3 жыл бұрын

Maybe in the fuzzy sciences 1 in 20 convinces someone, but in Physics the standard is five standard deviations, or a p-value of 0.0000003
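For anyone who wants to check where the 0.0000003 figure comes from, here is a minimal sketch. It assumes the test statistic follows a standard normal distribution and that "five standard deviations" means a one-sided tail; the function names are made up for illustration.

```python
import math

def one_sided_tail(sigma):
    """P(Z > sigma) for a standard normal Z (one-sided p-value)."""
    return 0.5 * math.erfc(sigma / math.sqrt(2))

def two_sided_tail(sigma):
    """P(|Z| > sigma) for a standard normal Z (two-sided p-value)."""
    return math.erfc(sigma / math.sqrt(2))

if __name__ == "__main__":
    # Five sigma, one-sided: about 2.9e-7, i.e. roughly 0.0000003
    print(f"5 sigma, one-sided: {one_sided_tail(5):.2e}")
    # Two sigma, two-sided: about 0.0455, close to the familiar 0.05
    print(f"2 sigma, two-sided: {two_sided_tail(2):.4f}")
```

Under those assumptions the five-sigma convention is hundreds of thousands of times stricter than a 0.05 cutoff.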

@bcddd214 5 жыл бұрын

BEAUTIFUL! I've been yelling the same thing at scholars for years.

@ShubhamBhushanCC 5 жыл бұрын

You don't put milk in Earl Grey.

@metamorphicorder 5 жыл бұрын

Of course not. Only a barbarian would do that. You always put the earl grey into the milk.

@molchmolchmolchmolch 5 жыл бұрын

Maybe you don't but I do

@Lilliaace 3 жыл бұрын

This episode is giving me nightmare flashbacks to statistics, healthcare informatics, biostatistics, and epidemiology

@duckgoesquack4514 5 жыл бұрын

It's hard to paint the world in black and white, with shades of grey.

@SeanPat1001 3 жыл бұрын

Yes! I have found virtually all stat texts emphasize P-value. One thing to bear in mind is that a P-value is a random variable. Every random variable has a confidence interval and they never report that part. Bayesian statistics can help, as long as there is a way to measure the probability of the alternate hypothesis. This is not always possible. In industry, the usual method is to select alpha and beta values, based on the consequences of making a Type I or Type II error. It’s assumed you will not always be right, but things should work in the long run. In all fairness, the same happens in research. People try duplicating experiments and if they get similar results, they are more sure. We know nothing. Everything we think we know is an educated guess. Until 1962, every chemist knew xenon was an inert gas. But Neil Bartlett proved to the world that xenon was not inert by conducting a novel experiment. This led to a realization that we didn’t understand chemical bonding as well as we thought.

@sjzara 5 жыл бұрын

When I was taught statistics years ago we were taught that 0.05 was not suitable for publication - it was only an indication that a situation was worthy of more investigation. Even 0.01 was only borderline for publication. We were also taught the exact meaning of the probability in terms of what was and was not true. There is nothing wrong with using p-values, as they can be calculated with many fewer assumptions than Bayesian statistics. What’s seriously lacking is statistical expertise, with statistics being used by many who have little idea how to interpret them.

@MsZeeZed 5 жыл бұрын

Another point of the dead fish in the MRI is to understand your experimental environment. Muriel Bristol’s leaf tea was drawn from an urn (no tea bags in the UK until after WWII). Tea in a 1920s UK academic common room would be poured into china cups that have a low thermal capacity. It was traditional to put the milk in first for boiling tea, as the cool milk prevents the china cup from cracking. With an urn the tea is already boiled and steeping at around 80C, so the order of mixing with milk has no real effect, but if you put the milk in first the exterior of a china cup will still be *initially* cooler. So if freshly mixed behind Bristol’s back & handed out 1-by-1 the temperature of the cup would be noticeable. Fisher focused on rejecting the null hypothesis, but that only proved Muriel could sense how the tea was made; it does not prove she could taste the difference, even if she thought it was her sense of taste that was determining that.

@eljanrimsa5843 5 жыл бұрын

Fanciful explanation! But the significance of your data shouldn't depend on whether you can come up with an explanation you like.

@MsZeeZed 5 жыл бұрын

Eljan Rimsa yes, it’s as impossible to say whether this explanation is true as it is to say that it can be judged by taste using the p-value alone. It’s more likely than the hypothesis that mixing these 2 liquids in a different order creates a different taste using sense organs that don’t work optimally in *hot* or *cold* ranges. If the water was boiling the mixing order may make a difference & it could be a different recipe that formed the conviction that there is a difference in taste. Design your experiment to standardise the tea mixing & think of how to evaluate the human factor too; that is the real science.

@MsZeeZed 5 жыл бұрын

Eljan Rimsa also, I find the milk-first argument strange, as it’s a tradition formed for C19th practical reasons that no longer exist for 99% of C21st tea making.

@nothingtoseehere5678 5 жыл бұрын

Two step submissions sound amazing

@xKuukkelix 5 жыл бұрын

The video's name and thumbnail were so weird that I had to click

@ryank1273 5 жыл бұрын

Welcome to my world!

@CarstenGermer 5 жыл бұрын

Woohoo! I finally understood this is very important information that's relevant to my interests! Completely switch to the two-step method and would suggest that, when scientists submit the first part of their study, they must submit an abstract that explains what it's all about to a generally interested audience. Now _that_ would make science more accessible.

@Shazzkid 5 жыл бұрын

Dead fish in fMRI. Me: fMRI...fish MRI

@thinkabout602 5 жыл бұрын

Liars figure and figures lie - I always get questioning when I hear " there's a story " junk in - junk out or I heard ........... she should not have been told there were 4 & 4

@klutterkicker 5 жыл бұрын

I remember complaining about this in psychology research when I was in school. "So you have a 1 in 20 chance of seeing a result... and you're comparing subjects on 20 metrics... and one of them is significant?" One thing I would add in defense of p values though is that lower thresholds of .01 or .001 are often used especially in the "hard sciences." Also, you can't blame a technique for when people abuse it.
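The "1 in 20 chance, 20 metrics" complaint can be put in numbers. This is only a sketch: it assumes the 20 tests are independent and that no real effect exists in any of them, and the function names are invented for illustration. The Bonferroni correction shown is one common remedy, not something the video or the comment above prescribes.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests
    when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests

def bonferroni_threshold(alpha, n_tests):
    """Per-test cutoff that keeps the overall false-positive rate at or below alpha."""
    return alpha / n_tests

if __name__ == "__main__":
    # 20 metrics at p < 0.05 each: about a 64% chance of a spurious "hit"
    print(f"{familywise_error_rate(0.05, 20):.3f}")
    # Stricter per-metric cutoff under Bonferroni: 0.0025
    print(f"{bonferroni_threshold(0.05, 20):.4f}")
```

So with 20 metrics, finding one "significant" result is closer to the expected outcome than to a discovery.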

@cyanidejunkie 5 жыл бұрын

666th like... you mad bro? Btw, who puts that much milk in their tea anyway?

@scriptorpaulina 5 жыл бұрын

Proposed solution I learned as math major: Multivariate p-values

@hammadsheikh6032 4 жыл бұрын

This is such a difficult topic to teach, and you did a marvelous job. I will use this video in my classes.

@Roll587 5 жыл бұрын

Researcher here - the pressure to publish is no joke.

@aDifferentJT 3 жыл бұрын

You seem to imply that Bayes factors simply require a lot of processing power to calculate, in many fields they are impossible to calculate though.

@austinmckee2117 3 жыл бұрын

I took a medical statistics class in college, and know what a p value is… but this gave me such a better understanding. So thankful for scishow

@swimmingnwinning 5 жыл бұрын

Excellent video! Thanks for making this

@wackohacko24 5 жыл бұрын

I forgot to mention, this is one of the most amazing videos I've seen on You Tube. Thank you for covering this subject.

@AugustusOakstar 5 жыл бұрын

In the 1970s I was the biostatistician for a PhD doing excellent research in the field of chemotherapy for neoplasms. He would design the study, have someone else conduct it and collect the data, and have me work the stats; only then would he examine the outcome and possibly publish. His methodology was very compartmented and fair.

@mw728youtube 3 жыл бұрын

Separating a study's results from its publishability is a fantastic idea

@paulblaquiere2275 3 жыл бұрын

I used to do fMRI research - this is one of the (many) reasons I left. I stopped believing I was doing good science. I was encouraged to look at the data in many different ways, i.e., until I eventually got a significant P-value. I remain fundamentally convinced my hypothesis was not correct, but that was never published or even publicised, so for all I know, other poor students have repeated the same study with the same result (but perhaps more of a willingness to play with the data). If there are rigorous fMRI researchers here, I wish you the best of luck, and I hope the culture has changed since I was doing research. I love the 'decide to publish before results' idea (I think my hypothesis was interesting! It just wasn't true). One element missing here on the fMRI side is that, to aid comparability across multiple subjects, all the data is 'fitted' to a standardised brain. The issue is that brains are very much not standardised, but most fMRI researchers are neurologists rather than mathematicians or technologists, so they don't really understand the process by which this is done or the implications of this warping of the data. If you put a fish in there, you'll get a picture of activations on a human brain if that's the programme you run it on...

@EagleAngelo 5 жыл бұрын

Years ago I failed my statistics class multiple times and really never understood the null hypothesis. Thanks for clearing that up for me :D

@erikziak1249 3 жыл бұрын

My father claimed that beer of the same brand tastes better from glass bottles than from cans. I challenged him to prove it. Both the bottle and the can came from the same brewery batch and were served at the same temperature. I took a die and started to pour into 10 identical glasses, which had all been cleaned with water and left to dry. Whenever I rolled an odd number, I would pour the bottled beer; with an even number, I would take the can. I was lucky to get 5 odd and 5 even numbers, so the distribution was 50:50. I made a note of which glass held which beer. Then I put them all in a single line. The glasses were not numbered, just placed in a line, and I knew the bottle/can order. I made sure that the appearance of the beer was identical, with equal amounts of foam, etc. Then I left the room and let my father in. I gave him the instruction to move each glass of beer above or below the "line" without altering the order of the glasses. Above the line was glass bottle, below was aluminium can. I told him he could take all the time he wanted and that I would leave the room so as not to disturb him. I left the room and waited for him to give me a signal that he was finished. Then I compared the results. He got 6 right and 4 wrong. I must give credit to my father that after this experience, he no longer claimed that he could taste the difference between beer from a glass bottle and an aluminium can. Most people would start to make up excuses for why they missed some samples. Btw. I claimed that I cannot tell the difference in taste between the beer in a glass or can, but I was willing to admit that he might have better taste buds. Hehe, buds... But it was the original from the town of Budweis (German name, Czech name is České Budějovice), not the US American watery imitation of a beer.
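As a rough check on that 6-out-of-10 result, here is a small sketch of the p-value. It assumes each glass is an independent 50/50 guess under the null hypothesis (the taster was not told there were five of each), which is a simplification chosen for illustration; the function name is hypothetical.

```python
from math import comb

def binomial_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of at least k
    correct answers out of n by guessing alone."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

if __name__ == "__main__":
    # At least 6 of 10 right by pure guessing: 386/1024, about 0.377
    print(f"{binomial_tail(10, 6):.3f}")
    # All 10 right by pure guessing: 1/1024, about 0.001
    print(f"{binomial_tail(10, 10):.4f}")
```

A p-value near 0.38 is nowhere close to the usual 0.05 cutoff, so 6 out of 10 is entirely consistent with guessing.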

@alan58163 3 жыл бұрын

Wonderful video! For a humorous and in-depth exploration of this and more, I recommend Jon Oliver's bit called "Scientific Studies"

@jablair51 5 жыл бұрын

Another issue is that journals don't like to publish negative results. Negative results are interesting. So researchers have to keep hunting until they get significant positive results.

@Wallach_a 5 жыл бұрын

Keep making these statistic vids, love em.

@twigwick 5 жыл бұрын

SR Foxley, the only President of Space I recognize :D

@SrFoxley 5 жыл бұрын

Aaaw! But I recognize the others! I've been particularly happy that Matthew Brandt has kept up his patronage so long!

@gexxys5039 5 жыл бұрын

Often enough I had to use p-values in university, but nobody ever bothered or could explain what they actually are, just "below 0.05 is significant". So thanks for giving an understandable explanation

@fernandoaleman607 5 жыл бұрын

Love it. One of the best videos in a while SciShow!

@slolerner7349 3 жыл бұрын

That's the best quick description of Bayesian analysis I've ever heard

@CatboyChemicalSociety 5 жыл бұрын

the last time I remember dead animals talking to me was them mocking me for not knowing who John Muir is.

@WeMayBeFarApart 5 жыл бұрын

5% is not a random subjective value, it's the 2-sigma boundary for a normal distribution, often used as the definition of outlier.

@nolin132 5 жыл бұрын

The person who invented the idea literally called it a random subjective value.

@alejandronasifsalum8201 5 жыл бұрын

Lack of computational power was a big issue for Bayesian statistics in those times, but it is also true that Fisher abhorred Bayesian statistics altogether, not just for computational reasons. In fact, he was a huge detractor of Bayesian methods, although they became popular through the 20th century nevertheless, when they solved complicated problems which couldn't be attacked with classical methods. A very interesting book on the subject is "The Theory That Would Not Die", by Sharon Bertsch McGrayne.

@pedrobernardo5887 5 жыл бұрын

I just spent 2 days devouring statistics content on the Internet and then you guys upload this. Amazing timing

@sarahwbs 5 жыл бұрын

People who put the milk in their tea first are MONSTERS.

@Zeldaschampion 5 жыл бұрын

SR Foxley U rock. Keep up da good work!

@SrFoxley 5 жыл бұрын

Thanks Link!

@WithBestRegards 5 жыл бұрын

One does not drink Earl Grey with milk. If one insists on adding something to the tea, one may add a squeeze of lemon juice.

@FinnJenkins 5 жыл бұрын

Brilliant explanation of a massively powerful principle. As someone who sells products based on science, p values are something that are discussed every day... and no one is particularly thrilled about it because highly significant work done in America might not translate into successful solutions in Africa. So science becomes reduced to faith and belief rather than fact. Thanks once again.

@aSpyIntheHaus 3 жыл бұрын

I love the idea of 2-Step manuscript submission

@SmellyKatPants 5 жыл бұрын

I really like that you made this video and mentioned the pressures of publishing and the fact that the .05 p value is an arbitrary cut-off. Science is wonderful but because lots of people know that, "studies suggest..." can be a quick way to bamboozle someone without really giving them all the information.

@mgevirtz 5 жыл бұрын

Damn, FINALLY!!!! I have waited for this paper for nearly 10 years! I love it. Salmon bless the researchers and SciShow.

@tjendenys5028 3 жыл бұрын

Where was this during my statistics class... At uni my professor explained it so incredibly cryptically, holy hell. Thanks, the Internet is better than uni.