6 Ways Scientists Fake Their Data

23,641 views

Pete Judo

5 months ago

You can protect your privacy and support the channel by getting 20% off DeleteMe at joindeleteme.com/JUDO20
Statistics by Jim: statisticsbyjim.com/hypothesi...
My Website: petejudo.com
Follow me:
Behavioral Science Instagram: @petejudoofficial
Instagram: @petejudo
Twitter: @petejudo
LinkedIn: Peter Judodihardjo
Good tools I actually use:
Shortform: www.Shortform.com/pete
Ground News: ground.news/Pete

Comments: 162
@PeteJudo1 · 5 months ago
You can protect your privacy and support the channel by getting 20% off DeleteMe at joindeleteme.com/JUDO20
@stephenmcinerney9457 · 2 months ago
You forgot to _sign that at the top..._
@themartdog · 5 months ago
I really, really like your initial point. There is no reason why big journals shouldn't publish when something doesn't work. The fact they don't makes no sense at all
@MimouFirst · 5 months ago
Agreed. I find it quite against the scientific method to ignore 'no positive result' results.
@sri5086 · 5 months ago
@@MimouFirst but then the problem is deciding which 'no positive result' to publish.
@MimouFirst · 5 months ago
@@sri5086 I think we should publish everything in a database, so that researchers know what's been done, what the result was, and what the method of testing was. That would be best, imo. As it stands, there may be studies sitting unpublished 'in a drawer' that weren't statistically significant due to a mistake, and without the mistake the result could be significant and useful. We won't know, since it's not published. Financial incentives won't let this become a reality, though. Quite sad.
@ThePowerMoves · 5 months ago
There would be too many studies published otherwise
@ahmednematallah · 5 months ago
They don't, because it significantly increases the volume of papers, which people are very unlikely to read or cite, leading to a lower h-index. People are interested in positive results. I'm not saying it's fair, but a better solution would be starting negative-results journals with a different measurement metric. Another problem is that publishing negative results with the same weight as positive results incentivises authors in the wrong way. The main problem is journals rejecting perfectly good papers to brag about a low acceptance rate, or just because the reviewers are biased or picky.
@9adam4 · 5 months ago
Why did the urologist accept only certain patient specimens for his data set? He was pee hacking!!! 😅
@hansbleuer3346 · 5 months ago
Beer Review
@sunway1374 · 5 months ago
Don't you all laugh. This literally happens.
@theondono · 5 months ago
I left academia altogether because, after spending 500 h in the lab collecting data, I was constantly pushed to re-do all my measurements with the sole intent of statistically faking them. Prior to that I had spent a month running tests to validate how many samples I should take for each experiment. My result was that I needed at least 250 samples to have a good estimator of the average. I proposed moving forward with that number, but was told to do 500 samples per experiment instead (doubling my time in the lab unnecessarily). My measurements looked good to my supervisor (they matched his simulations). Then one day he left for a conference where a competing research group showed their results, and suddenly my numbers did not match anymore. I asked if there was something wrong with the previous simulations; the answer was no. He just told me "These numbers must be wrong" and wanted me to "run the experiments again", only this time with 100 samples. I plainly refused. He ran the experiments using 80 samples, and then presented a culled set of fewer than 50 as his "results".
@johndor7793 · 4 months ago
Out of curiosity, what was the subject? Maybe provide a little more detail; it would be interesting to know.
@theondono · 4 months ago
@@johndor7793 My supervisor designed a chip (an ASIC) that performed an electronics task (a programmable delay cell). I was supposed to measure a particular parameter of its behavior. I did my job, and for completeness presented a comparison of the same value measured with different numbers of samples (different runs). Then I measured each of the prototypes individually and sent all the graphs his way, including the stdev results (which were the primary figure of merit). Then he left for a conference where another team presented a very similar device (built with a different technology) with much better performance (their stdev was orders of magnitude smaller). On his return, he wanted to repeat all measurements with a lower number of samples. His justification was that the high number of samples was artificially inflating the stdev. I complained that this made no sense; if anything we should use *more* samples to capture rare events, even if their contribution was small. The "procedure" he ended up using for data processing involved not only reducing the sample count to 80, but also removing the lowest 15 and highest 15 values to "remove outliers". Unsurprisingly, his measurements had a very low stdev.
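For readers wondering how much that kind of culling can move the numbers, here is a minimal simulation sketch (Python with NumPy assumed; the data is hypothetical and none of the figures come from the actual experiment):
```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical delay measurements: a tight core plus rare large-jitter events.
samples = np.concatenate([rng.normal(100, 5, 490), rng.normal(100, 60, 10)])
rng.shuffle(samples)

full = samples[:250]                    # the validated sample size
culled = np.sort(samples[:80])[15:-15]  # 80 samples, lowest/highest 15 removed

print(f"stdev with n=250:             {full.std(ddof=1):.2f}")
print(f"stdev with n=80 culled to 50: {culled.std(ddof=1):.2f}")
# Trimming the tails all but guarantees a flattering, much smaller stdev.
```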
@griof · 5 months ago
As a mathematician/statistician, I have helped a few friends run the statistics for their research (mostly "low-profile" medical research). They did a good job collecting data, and they actually got significant p-values (technical details below). The problem was the not-interesting topic they chose... For example: a very simple technique to reduce pain during some simple but common nose interventions. These friends all got refused by journals and expositions in favour of other researchers with less evidence but more "engagement". Technical details: no exotic stats, just a standard test for homoscedasticity, normality (Shapiro-Wilk), and then a t-test/chi-squared test of the null hypothesis.
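For the curious, that pipeline is only a few lines with SciPy; a minimal sketch with made-up pain scores (all numbers and group sizes are hypothetical, and a recent SciPy is assumed):
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical pain scores (0-10) for a control and an intervention group.
control = rng.normal(6.0, 1.5, 40)
treated = rng.normal(5.2, 1.5, 40)

# 1) Homoscedasticity: Levene's test (robust to mild non-normality).
print("Levene          p =", stats.levene(control, treated).pvalue)
# 2) Normality per group: Shapiro-Wilk.
print("Shapiro control p =", stats.shapiro(control).pvalue)
print("Shapiro treated p =", stats.shapiro(treated).pvalue)
# 3) If both assumptions hold, a two-sample t-test of the null hypothesis.
print("t-test          p =", stats.ttest_ind(control, treated).pvalue)
```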
@hungrymusicwolf · 5 months ago
Yep, and that's why we get bogus science today. Science is a common good, it should not be measured by its popularity as it currently is in journals.
@kayakMike1000 · 5 months ago
There's lies, damn lies, and statistics. (I have a math degree I don't really use)
@sunway1374 · 5 months ago
"No. 3 Variable manipulation" is what I see happens the most in my field of research. Most of my eminent colleagues don't consider it a problem. It's called discovery. In fact, often there is not even a hypothesis, just run the analysis and see which variables work. Sometimes, a student or a postdoc reports a particular set of variables give the highest correlation. The supervisor says great, here is the explanation and here is the story, it is reassuring this fits the theory. However, in a subsequent meeting the student comes back and says unfortunately he made a mistake in the analysis, here is the correct set of new variables. Well, don't you worry. We find another theory and story, it fits something else! I don't even work in social science, psychology or behaviours science. I am in physical science. It is still possible to fish for any relationship in the data and find a physical explanation backed up by mathematics to explain it. These studies are published in top journals in my field, as well as those in the Nature family.
@tech_priestess_channel · 5 months ago
Can you please explain why "variable manipulation" is bad at all? That was unclear from the video, and it's unclear from this comment as well.
@PeteJudo1 · 5 months ago
@@tech_priestess_channel Variable manipulation significantly increases the chance of false positives. This is because the world has some randomness to it, so by engaging in variable manipulation you are essentially relying on randomness to produce your effect, rather than intentionally testing what works in a systematic way. Sometimes variable manipulation is justified, if the authors clearly state that this is "exploratory" research that needs to be reproduced. For example, they might say, "Ah, this wasn't what we intended to study, but we observed this other interesting effect that warrants further research." The problem is that often, as I said in the video, scientists will write up the experiment as if they were intending this the whole time, which leads to lots of legitimate-looking studies that are actually just the result of luck.
@sunway1374 · 5 months ago
@@tech_priestess_channel No. Sorry. I can't explain it well; I am not sure it's actually bad myself, which is why I didn't explicitly say it one way or the other. But let me try... Say you have a dataset of recorded observations of 100 independent variables (x's) and 1 variable (y) you want to predict. How many of these x variables would you expect to show a statistically significant correlation with y? That is itself another, "higher" level of statistical significance you can specify and test. You would expect that, by some fluke, some x variables would be highly correlated with y even though there is no actual physical connection. This is called false discovery; you can look it up. So if you do variable manipulation, you are just allowing yourself a high probability of false discovery. Often the researchers will then write "more research is needed" in the conclusion. But here is another common problem of the academic culture: the original researchers have already had their moment of fame and moved on to new ideas. Who wants to do the follow-up research that supports other people's old studies? If you do, you are actually doing good science. But you won't be a star and you will struggle in your career. The original researchers will get all the glory.
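That fluke rate is easy to demonstrate; a minimal simulation sketch (pure noise, Python with NumPy/SciPy assumed):
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, n_vars = 50, 100
X = rng.normal(size=(n, n_vars))  # 100 independent predictors: pure noise
y = rng.normal(size=n)            # outcome, unrelated to every predictor

pvals = [stats.pearsonr(X[:, j], y)[1] for j in range(n_vars)]
print(f"'significant' at p<0.05: {sum(p < 0.05 for p in pvals)} of {n_vars}")
# Typically around 5 predictors pass, even though no real relationship exists.
```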
@meneldal · 5 months ago
I think it's okay to be fishing for some relations, but you should apply a stricter standard than 0.05 for those. I don't think it's bad if you test 30 things and find one with a significance of 0.005, but a 0.05 hit is likely just random chance.
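The arithmetic behind that intuition, as a quick sketch (independence of the tests assumed for simplicity):
```python
# Chance of at least one false positive among m = 30 independent tests:
m = 30
for alpha in (0.05, 0.005):
    print(f"alpha = {alpha}: P(at least one fluke) = {1 - (1 - alpha) ** m:.2f}")
# alpha = 0.05:  ~0.79 -- a lone "hit" among 30 tests is expected noise
# alpha = 0.005: ~0.14 -- still not proof, but far more suggestive
```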
@deusex9731 · 5 months ago
@@PeteJudo1 OK, I understand it way better now. So the issue is more the false representation of a "study" than the method itself. If everything is stated clearly as an exploration of an interesting effect, and from there they work in standard ways to reproduce it, this wouldn't be much of an issue?
@SugarBoxingCom · 5 months ago
Feynman summarized your video with a single quote: trying to find a theory in a pile of data.
@Batmans_Pet_Goldfish · 5 months ago
Thank you for speaking on this. P-hacking is part of what makes reading scientific studies so confusing for the layperson, and causes results to be so often misinterpreted by communicators.
@heyman620 · 5 months ago
No bro, you just don't know the field. Communicators often don't give a F about science, kind of like this channel lately (the dude just rides the fraud hype and diminishes the name of many honest scientists to the uneducated public, while he communicated pseudo-science for years; I liked this channel during the Gino thing but it got embarrassing). In my eyes, expecting to understand papers without knowledge of a field is just not a serious type of behavior and a severe case of the Dunning-Kruger effect, which has been reproduced multiple times (not p-hacked). Like, my field is CS, and I can't understand math papers. Or physics. Come on bro, a little self-reflection.
@forstuffjust7735 · 5 months ago
@@heyman620 Sadly I have to agree with you. While the academic world has plenty of shady stuff (heck, I would say most sources I used for my MSc thesis are not that honest), this channel has deteriorated into a clickbait rage-against-academia channel, because that's where the views are.
@Kagrenackle · 4 months ago
I'm a mathematician and I love that I don't have to do experiments or find "statistically significant" results. It's still publish or perish and it's difficult to establish important results but the logical nature of it makes it harder to fake.
@stephenmcinerney9457 · 5 months ago
2:01 Stopping rules / "Data peeking"
3:00 Deleting outliers / Data trimming
4:14-6:03 Ad
6:03 Variable manipulation
7:31 Excessive hypothesis testing
8:29 Excessive model fitting
11:36 Conclusion and acknowledgment
@extraleben6734 · 5 months ago
As a computer scientist specializing in driving simulation, I supervised diploma, master's, and doctoral theses for 6 years. Removing outliers from studies was standard procedure for the psychologists, and I saw quite a bit. Since that time, psychologists and doctors of all kinds are no longer highly regarded by me; it is a shame how I now look down on these fields. But I don't consider variable manipulation to be a problem, because it can certainly happen that one comes across something much more interesting in a study, and omitting that would be sad.
@Qkano · 5 months ago
Boy, was this kind of "fraud" common... but more often than not it was "innocent" / "well motivated", not because the researchers were intentionally setting out to cheat the system.

Personally, I had a rigid approach to it: if I collected a data point, it was listed in the results section. When there were good (supporting) reasons for rejecting a data point, they were listed in the results and the point flagged as excluded from the analysis. The reason could even be as simple as "so erroneous it had to be wrong", but still it was listed. Spurious results can occur in a number of ways that are unavoidable: operator error, instrument glitches, transcription (recording) errors, etc.

Clearly, if I were trying to estimate the distance to the moon and had 49 results at ca. 250,000 miles and one at 25,000 miles, few would argue the outlier had to be included in the average. So while my analysis might exclude outliers when discussing the results, their existence was never hidden from the reader.
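A minimal sketch of that record-keeping discipline (pandas assumed; the moon-distance readings are the commenter's hypothetical, the column names are illustrative):
```python
import pandas as pd

# Record every point; flag exclusions with a stated reason instead of
# silently deleting them, so the reader always sees the full dataset.
df = pd.DataFrame({"distance_miles": [249_800, 251_200, 250_400, 25_000]})
df["excluded"] = False
df["reason"] = ""
mask = df["distance_miles"] < 100_000
df.loc[mask, ["excluded", "reason"]] = [True, "order of magnitude below all other readings"]

print(df)  # the full record, exclusions visible
print("mean (flagged points excluded):",
      df.loc[~df["excluded"], "distance_miles"].mean())
```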
@srremus9781 · 5 months ago
From the chemical science field: it's a waste of resources if several groups perform the same experiments over and over and then don't publish what doesn't work out. Maybe you publish your procedure/idea and someone else finds the key step for success; that's part of the progress, in my opinion. At the very least, other scientists won't waste resources on it again if it's a dud. There should be a database for failed experiments: very short publications stating what you did, why, what you aimed at, and what came out.
@joerudnik9290 · 5 months ago
Absolutely, failure is valuable information!!!
@kayakMike1000 · 5 months ago
Scientists often run mathematical models and try to claim their dumb software is an experiment.
@AlvinRamasamy · 5 months ago
“If you graph the numbers of any system, patterns emerge.”
@sillysad3198 · 5 months ago
I was taught to delete exactly one outlier on each end. It was given a complicated justification, which is roughly: it is more LIKELY that we are deleting a measurement mistake, but if we delete more outliers that becomes less likely.
@FalkonNightsdale · 5 months ago
Exactly… I was studying statistics and we were told to cut 2.5% from each side…
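That fixed, pre-declared trim is a standard "trimmed mean"; a minimal sketch in SciPy (synthetic data, purely illustrative):
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
data = rng.normal(50, 10, 1000)

# Cut a pre-declared 2.5% from each tail. The key difference from p-hacking:
# the fraction is fixed before seeing the data, not tuned to get a result.
print("raw mean:    ", round(data.mean(), 3))
print("trimmed mean:", round(stats.trim_mean(data, proportiontocut=0.025), 3))
```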
@jivekiwi · 5 months ago
Thanks Pete. I'm a 44-year-old guy who reads a huge amount, yet I have learnt so much from your videos. You have brought up many disturbing aspects of research, which is a bit gutting, to be honest, but a sad truth is better than a promising lie. After the first one I watched, YouTube has been throwing many similar vids my way, and this needs to be more widely known. Even Dan Ariely! Deleted his book a week or so ago.
@markcarey67 · 5 months ago
I had a statistically significant amount of drugs in my system so I had to resort to pee hacking....
@expensivepink7 · 1 month ago
real
@adonm6998 · 5 months ago
"trust the science" . I do, its the corporations and scientists i dont trust
@GooseCee · 5 months ago
This video was EXTREMELY fascinating and I was so captivated the whole time! Good work :)
@weeb3277 · 13 days ago
I like how it happens so often there are patterns. Soon there will be best practices too.
@spagzs · 5 months ago
If you pitch a study that examines homosexuality in seagulls… you'll get a grant 😂
@danielastoica3354 · 5 months ago
Great job Pete!
@jasonmoy5452 · 5 months ago
Selective sampling can just be an interaction effect, which is perfectly fine as long as you justify it and are honest about it.
@allagnstall · 3 months ago
I'm no scientist. Forgive me, but are you prepared to say how _justify_ is operationally defined?
@mathijs58 · 5 months ago
Great video on an initially confusing term. Well explained, with recognizable examples. Now try to find some stock footage of "scientists" who are not so squeaky clean; real science is quite messy, even if you don't resort to p-hacking etc...
@Guishan_Lingyou · 5 months ago
I don't see any excuse for reputable journals not instituting mandatory preregistration of any paper that might be published: the hypothesis, methods, etc... should be given to the journal before the study is run. That would vastly reduce the freedom to use questionable, and non-transparent techniques to make results look more significant than they are.
@0.-.0 · 5 months ago
Thanks for your source in the description!
@WisdomThumbs · 3 months ago
Funny. Explaining this to my friends in 2021, and other types of scientific fraud, earned me a cussing out and the “contrarian” moniker.
@luszczi · 5 months ago
Often it's not fraudulent and researchers are just fooling themselves. "I KNOW this effect exists; I can show it if I just get rid of those outliers. Those outliers must be due to measurement error anyway."
@geokm7717 · 5 months ago
That makes lots of sense
@citrinedreaming · 5 months ago
It becomes fraudulent if or when it's published, though, and that's the issue. I get your point that there is a difference between intentional fraud for intentional profit vs. a scientist being delulu (I love that word, I'm sorry, it's too perfect), but the net effect is fraudulent data.
@straightfacts5352 · 5 months ago
The easiest person in the world to lie to is yourself.
@the19trier · 5 months ago
Now you get an even better topic and channel!!
@psychotropicalresearch5653 · 3 months ago
P-hunting: the illogical in pursuit of the indefensible [KG]. After Oscar Wilde on foxhunting: "the unspeakable in pursuit of the uneatable."
@kylejohnson8447 · 3 months ago
Absolutely baffled. I always assumed that if a result was surprising, the first thing these scientists would do is replicate the experiment before giving the conclusions any credit. How is it that this isn't the case?! As a mechanical engineering senior who wants to go to grad school, I can't imagine that if an experiment showed any sort of statistical significance for the first time it wouldn't be immediately tested by others. Is this isolated to behavioral science/sociology-type fields, or is this also a problem in STEM?
@klikkolee · 6 days ago
Regarding variable manipulation and excessive hypothesis testing: to me, they're different forms of the same idea. You acquire data to test a hypothesis, the hypothesis is not supported, but the data indicates that a similar hypothesis may be correct. If we just ignore that, then the line of inquiry dead-ends; there needs to be a way to still pursue those ideas. Is negating the increased chance of false positives just a matter of requiring a new, independent data set?
@Ken-er9cq · 4 months ago
Many studies are smaller than they should be, which means things like confidence intervals are large. Then the conclusion is going to be that there is no detectable effect, although there may be a large one; we just don't know. Journals should not publish this type of paper, because you are rewarding someone for doing bad science. However, if I get a result like "eating X increases the risk of Y by a factor of 1.01 (95% CI 0.98-1.04)", that tells me it can only have a small effect, if any. Compare that with 1.1 (95% CI 0.5-1.7), where it could have a sizeable effect in either direction.
@leannevandekew1996 · 5 months ago
Another video idea: Zimbardo from Stanford University, who published research on the Brown Bread Study and the Prison Study.
@takiyaazrin7562 · 5 months ago
Enlightenment about academia. You are a good channel.
@HellRaiZOR13 · 5 months ago
The main reason why I don't wanna go into academia and don't wanna become a scientist after I finish my PhD.
@laminarflow6072 · 1 month ago
Wait.....so you're saying I can be like a scientist too, I can just delete my data!?
@Patrizsche · 5 months ago
This video isn't about faking data, but faking results 😭😭😭
@poornoodle9851 · 5 months ago
When scientists pick and choose data, science devolves into belief… basically, it's religion.
@zackinator1439 · 6 days ago
I am not a professional in the field by any means, but I'm casually interested, and one thing I don't understand is why the example of clubbing affecting extrovertedness is wrong.

If you collect a sample of people who do and don't go clubbing and have them take a personality test, you can use that data to look for correlations between clubbing and personality type. If you think it may affect extroversion but it actually affects agreeableness, you still found a correlation that is just as valid as one for extroversion, no? If the test is a Big Five personality test that gives results for different aspects of personality, not just intro/extroversion, then it can be said to reasonably approximate your sample's personality trends. In both cases your sample is a mix of people who do and don't go clubbing, and your collected data is personality data. Other than changing the specific trait you thought might correlate, both cases fall under the same question: "Is there a correlation between clubbing and personality?" As long as your sample is large enough and as free of bias as reasonably possible, and none of the questions were tailored to one trait or the other, your question may have been ever-so-slightly different, but in the other case you would have collected the same data, from the same sample, and done the same statistics.

In the example of sample slicing, where you chop up the sample until it fits your hypothesis, that wouldn't necessarily work, because a subset of your sample is by definition smaller than the original and may not be large enough to be statistically significant. If you take a varied sample, find no correlation, and then it seems like there might be a correlation for a specific subgroup, you go and collect a new sample of that subgroup of appropriate size and bias.
@BrendenFP · 5 months ago
Researchers should be required to outsource their statistical analysis to (an) independent statistician(s). The person who makes a hypothesis should not be the one who (statistically) tests it.
@1789Bastille · 5 months ago
Can you interview Elsevier, Springer Nature, or Wiley about this?
@EvilDMMk3 · 5 months ago
Say you were doing a study and you noticed a strong correlation that wasn't your original hypothesis. What should you do? Clearly you should not publish falsely, but there might also be a real result there.
@PeteJudo1 · 5 months ago
You can call it out in your write-up, but make it clear that it was not the original intention of the study, and that it needs replication in its own randomised controlled trial to verify. What you should never do is pretend that it was the intention of the study all along, which unfortunately happens more often than we would like to admit.
@chinhhoang6304 · 5 months ago
You're definitely not wrong about the state of science being crooked to some extent. But you're definitely not right to say that some of those practices are "faking data". There's a huge difference between intentional p-hacking on the one hand and data exploration or robustness checks for certain research designs on the other, and it all depends on the nature of your research questions. Some practices might be fair in some designs but unfair in others. Don't overgeneralise things like that. Still, I like your videos. Keep improving!
@philidor9657 · 4 months ago
Can someone explain to me why "variable manipulation" is considered misconduct? The way I understood it, it seems totally reasonable to pivot your research to something else when you notice a result during an experiment that is more interesting than the one you're currently studying. It's really common in my field, chemistry, to discover interesting reactions while doing unrelated research and turn that into a paper instead... especially if the work you were doing before wasn't going well.
@zyrohnmng · 2 months ago
If, when doing your research, you collect data on many variables (let's say 20 variables in addition to the one you're testing), the probability that 1 of those 21 variables will show a statistically significant result when there is none is much, much higher than the chance of that happening for the 1 variable you were planning to test. The ethical thing would be to discard this collected dataset (reporting that your experiment failed), note that the other variable has potential promise, and collect a new set of data specifically to test that other variable. It's bad practice because it increases the odds of you getting a false positive.
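A minimal simulation of that advice (all noise, Python with NumPy/SciPy assumed): pick the best-looking variable in an exploratory dataset, then test only it on fresh data.
```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, m = 40, 21
X, y = rng.normal(size=(n, m)), rng.normal(size=n)  # 21 variables, all noise

# Exploratory pass: the best-looking variable often looks "significant"...
pv = [stats.pearsonr(X[:, j], y)[1] for j in range(m)]
best = int(np.argmin(pv))
print(f"exploratory: variable {best}, p = {pv[best]:.3f}")

# ...but tested alone on an independent dataset, it usually fails to replicate.
X2, y2 = rng.normal(size=(n, m)), rng.normal(size=n)
print(f"replication: variable {best}, p = {stats.pearsonr(X2[:, best], y2)[1]:.3f}")
```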
@Wanhope2 · 2 months ago
I strongly suspect that case can be comorbid with early stopping, unfortunately. Ideally, people would be comfortable with publishing negative results, and actually able to.
@vw9659 · 5 months ago
To imply that this is widespread is nonsense. Maybe in the social sciences, but certainly not in areas I know. Trial registration is widely practiced, which means that before you start the study you have to publish what you intend to do and how you intend to do it, including what statistical analysis you will use and what its statistical power is to answer the specified research questions only. And then you do the study. When you want to publish it in a decent journal, you have to demonstrate that it was originally registered and that you did the study exactly as designed. Also, in team-based research it would be very hard for any individual to tamper with the data. The senior investigators, for example, often don't know exactly where the raw data is, or how to find any given variable in it (there may be hundreds of variables).

In physics, chemistry and other hard sciences, you often don't know what you're going to find until you do the study. If no one has done that research before, you can't define how to analyse the data before you see the data; that's how new knowledge is discovered. If you analyse the data in a way that is not convincing, or draw conclusions that are not justified, other scientists will recognize that. And when they try to replicate the results, they won't be able to.

Most data fraud examples you have covered are lone-wolf researchers in "soft" disciplines, manipulation of things like exemplar images that have no actual relationship to the study's data, or rare graduate students or postdocs who may have fabricated data. Finally, scientific consensus is based on replication, not single studies. Anyone fabricating data to achieve a particular result knows that their study will never be replicated by others. You do see "null" studies that find no significant result published, particularly when they run counter to an initial study that suggested something different, and then the consensus will change to match what those later studies found. If that happens more than once for a given scientist, they will get a reputation for bad work.
@doctorlolchicken7478 · 4 months ago
Like everything else, the problem is incentives. Incentivize scientists to be biased and they will be. Your explanation of the various issues is very simplified; those are all real problems, but it's all a question of degree. Unfortunately, it is not easy to self-identify whether what you are doing is valid or you've gone too far. The only "rule" of best practice I can think of is that you must include everything you did in the paper. Also, you should sensitivity/stress-test your results; in other words, show how slight variations impact the conclusion. If your conclusion doesn't hold up under variations, chances are it's not a reliable result. In a sense, "p-hacking" is choosing to document only some of the research.
@halneufmille · 5 months ago
My feelings about 3, 4 and 6 are:
- You should openly say what your initial intent was in terms of variables and samples, and tell us it didn't work out.
- You should openly say all the other things you tried in terms of alternative variables or samples.
- The fact that there seems to be an effect for some other variable or subsample may be interesting if the statistical significance is high. It may constitute an interesting hypothesis for future research and be valuable for the advancement of science.
Just imagine if Fleming had said: well, it looks like this mold has some antibiotic properties, but since I didn't set out to study this from the start, I will just ignore it and not report it. If he had been out of time or funding, he wouldn't have discovered penicillin.
@brootalbap · 4 months ago
The majority of researchers p-hack in one or more ways. There are rewards for doing so and it hurts one's career not to.
@picahudsoniaunflocked5426 · 5 months ago
6:32 what of us whose whole being is clubbing???
@aayambasnet548 · 5 months ago
Can you explain why number 3 is wrong? Wouldn't seeing patterns where you did not expect them be the very basis of new science? If you unexpectedly see such patterns in physics and dig deeper into them, it could give a whole new paradigm. Of course, it may be a false positive like you mentioned, but you can't really say that focusing on such a correlation is entirely bad. If the researcher sees the correlation and then researches more about that correlation, I do not see anything wrong with it.
@metalslegend · 5 months ago
You have to report this in your paper then: "1. We found no associations between variables XY. 2. But we found associations between XZ, and we did this and this after that." But most papers just report the XZ story to begin with, even changing their original hypotheses for XY to similar ones for XZ. And that's super bad! Not only did you not report the XY story, you claimed that testing XZ was successful, which is not right; XZ came up randomly while testing XY.
@joinedupjon · 5 months ago
It's not obvious to a lot of people why it's super bad... The p-value is the chance of getting that correlation by sheer fluke. If you throw enough data against the wall, you'll eventually get a p < 0.05 by fluke alone.
@stephenmcinerney9457 · 2 months ago
@@joinedupjon Yes, the "multiple testing problem". It's not allowed to keep shopping multiple hypotheses against one dataset until you find one that satisfies the magical p < 0.05.
@echen1716 · 3 months ago
I don't agree with number 3. It can be an interesting second-order question and an interesting additional finding.
@gladiatorzz2061 · 5 months ago
I think overfitting is the most insidious of these. It can be very difficult to detect because science is inherently messy. Worse, it implies a causal relationship where one does not exist. In contrast, many of the others can be identified as anomalous results or as special cases.
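A minimal sketch of why that is (synthetic data, NumPy assumed): a high-degree polynomial "explains" the sample almost perfectly yet tends to do worse on fresh data from the same process.
```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0, 1, 12)
y = 2 * x + rng.normal(0, 0.3, x.size)  # truth: a simple linear relation

x_new = np.linspace(0.04, 0.96, 12)     # fresh draws from the same process
y_new = 2 * x_new + rng.normal(0, 0.3, x_new.size)

for degree in (1, 9):
    coeffs = np.polyfit(x, y, degree)
    fit_mse = np.mean((np.polyval(coeffs, x) - y) ** 2)
    new_mse = np.mean((np.polyval(coeffs, x_new) - y_new) ** 2)
    print(f"degree {degree}: in-sample MSE {fit_mse:.3f}, fresh-data MSE {new_mse:.3f}")
# The high-degree fit hugs the noise (tiny in-sample error) and typically
# generalizes worse, despite "fitting" the original data better.
```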
@charlescrawford9972 · 1 month ago
Excessive Hypothesis Testing: Gerrymandering for science!
@pedromenchik1961 · 5 months ago
Not the chair tricking us into thinking that Pete has super buff shoulders
@PeteJudo1 · 5 months ago
I occasionally do lateral raises :P
@user-to2gh7sg3l · 5 months ago
whoa... Wait a minute... What the heck is Butt Cricket? That needs some explaining.... Is it like Crotch Polo?
@conradsieber7883 · 5 months ago
Journals should be responsible for auditing a sample of the studies they publish...
@dadsonworldwide3238 · 5 months ago
They're following the structuralism in place, though. We predetermine a theory of everything, then change the interpretation of evidence to fit the mythology we've made. You can't follow the evidence where it leads and appease a grand unified evolutionary theory of everything; cause and effect won't permit it. Causal unity here pushes effective complexity there. The Modernization Act in America, and basically across the West, created a Chaldean-minded structuralism, granting simplicity to the top of the hierarchy and pushing the complexity and division down upon many different disciplines. Within each discipline there are two competing cults, two opposite archetypal minds. Then under them we have actual industry, where engineers, workers, lab techs and mechanics all absorb the complexity of this classical physical lawism. This is antithetical not only to the classical American founding; it also traps and causes serious division between industries working with the same elements but forced into rationalizing the system they work for, for the benefit of all. It creates different terminology and language that requires mediating translator-bishops to bring them together for collaborations. This is the 1900s Modernization Act structuralism; it is a pagan model we have imposed upon ourselves. The unity and simplicity clearly need to be in the field, following the evidence where it goes, not in fitting it into grand theories of everything.
@dadsonworldwide3238 · 5 months ago
Of course this means many boomers 60+ love this structuralism, but why wouldn't they? It allows them to deterministically imagine a simple form, and all Chaldean-minded philosophical arguments to bring evidence back under its theory of everything have 5,000 years of record to plagiarize and use if measurements challenge this belief system.
@templeodoom4634 · 4 months ago
All my time in academia taught me is that academic vigilantism should be a paid profession.
@TheSpiritualCamp · 4 months ago
I'm not an expert, so please can someone explain to me what is wrong with #3 ("variable manipulation")? If I experiment looking for one specific correlation (like clubbing and extraversion) but the data happens to show a significant result for another correlation (like clubbing and agreeableness), what is unethical about that? Why not consider it just a lucky finding? Just like the scientists who tried to develop a drug to cure angina and accidentally discovered Viagra. Is their invention less valuable because it wasn't the goal they tried to achieve in the first place?
@SquizzMe · 5 months ago
The number of people who blindly accept anything scientists publish is astonishing and frightening.
@huypt7739 · 2 months ago
Cold fusion was just around the corner...
@SeedsofEcofrog · 5 months ago
We must never forget when they coerced the children for use as paper-logical to temporarily and marginally "protect" adults.
@picahudsoniaunflocked5426 · 5 months ago
Statistics By Jim would maybe be a good interview?
@mackss9468 · 5 months ago
We must start publishing the NULLS!!! It’s still very important information to have.
@shawnmclean7707 · 5 months ago
The outliers are what we need to pay attention to. The rest of the data is just like following the masses; nothing really interesting there.
@user-to2gh7sg3l · 5 months ago
Isn't there an oath or something scientists should abide by? The "New science of today" forms a certain structural framework for research in the future.... Seems like one of the most important professions to not fabricate or polish results...
@blist14ant · 5 months ago
evil scientists
@jorgebuitrago1016 · 5 months ago
Fake data = Click
@vsm1456 · 5 months ago
I think some of these methods have similarity with common cognitive biases.
@davidnorman5488 · 5 months ago
How trustworthy are the graphs?
@metalslegend · 5 months ago
Any graph?
@QP9237 · 4 months ago
If they aren't clearly scaled/bounded, more often than not it's deceptive. Expecting the reader to do the formal scaling calculations for the data you presented is unreasonable and dubious at best, since the presentation method was your (the researcher's) deliberate choice. Think of every time Apple posts their stupid graphs comparing "how much faster x is than y" while presenting nothing more than a generalized curve without any bounds.
@two_horus7337 · 5 months ago
Hey, let me just put this here for the algorithm :)
@rafa1z1 · 5 months ago
Is this the reason why we see so many doctors on YouTube? Is YouTube a form of publishing?
@user-to2gh7sg3l · 5 months ago
Ah yes... test-subject gerrymandering. That will get you the results you want! Sigh...
@kayakMike1000 · 5 months ago
Climate scientists do this crap all the time.
@TigerTzu · 5 months ago
I don't understand why variable manipulation is a problem. Every explanation I've seen of why it's bad is just the presenter saying "it's bad science" "it causes false positives" (this video is no exception), but no one ever actually explains why essentially re-titling a study when you notice a correlation you didn't initially predict is a problem.
@ironyelegy · 5 months ago
Calling your hypotheticals actual data that represents real stuff does seem like a recipe for disaster, but I am no scientist
@Oler-yx7xj · 5 months ago
The problem is that the p-value (roughly, the probability of seeing a correlation that strong by pure chance) gets calculated incorrectly. If you look for multiple potential correlations, the chance of some one of them randomly appearing in the data is higher than if you only look for one correlation. Therefore the p-value is estimated lower than it really is, and your work is called significant even though it is not. It's like throwing 5 dice: the chance of getting at least one 6 is higher than if you throw 1 die.
@citrinedreaming · 5 months ago
(From the perspective of someone who isn't a data scientist.) I don't know if the re-titling specifically is the problem; I think going back and rewriting the experiment to better demonstrate the result is the problem. If you do an experiment and get a result, then that's the result of that experiment, of course. But if you take that result, rewrite the experiment (without actually running the new one), and publish it with the result of the experiment you actually did, that's fraud (or lying, misleading, inaccurate, false data, whatever term you like). Since many people don't read full papers, the title becomes important, which is why, by extension, it is problematic.
@bammeldammel · 5 months ago
Thank you for bringing this to the attention of the public. However, as a scientist I would ask you to be careful with wording, as I got the impression that you were often talking as if scientists in general were doing something wrong, while the black sheep are luckily still a tiny minority. I am still waiting for a journal to publish rigorously done studies that did not result in confirmation of the hypothesis.
@9adam4 · 5 months ago
I don't think there's any pressure to perish.
@sunway1374 · 5 months ago
I see what you mean; his English there is not correct, strictly speaking. But "publish or perish" can be considered a single cultural phenomenon, and the words are used together so often that most people would not find it strange or wrong when said like this. Still, I agree with you: it would be better to say "pressure of publish or perish" instead of "pressure to publish or perish."
@idcharles3739 · 5 months ago
Mainly what you are saying is that statistical significance isn't significant. For example, your first point about peeking: "if they had carried on, they might have found that the data took them away from statistical significance". Firstly, that could be applied to every experiment ever conducted; either they had statistical significance or they didn't. The point of statistics is that it's supposed to tell you when there's enough data, and if more data can disturb that conclusion, then there's something wrong with the concept of statistical significance in the first place. Ditto your idea that you should know what you're looking for before you start: if statistics works, it shouldn't matter what you're looking for. If the data you find is significant, the truth doesn't care whether you stumbled across it by accident.
@RUHappyATM · 5 months ago
Global Warming data fudge...OMG!
@joshuaryan1946 · 2 months ago
You are great, BUT FOUR SOLID MINUTES out of twelve are spent selling your advertisers, PLUS an interruption by a YouTube sponsor. This is ridiculous. And the result is, you go through your presentation so fast that several parts are hard to follow, with no examples to make them clear.
@Wanhope2 · 2 months ago
This right here! Need to spend more time considering the brutal ad ratio. Though I understand that creators are slaves to the algorithm's trends on video length.
@nilshorgby3080 · 3 months ago
If you have a large dataset and you find strong significance in a subset of the data, would it still be valid if the p-value is less than the originally specified significance level divided by the number of subsets you examined? For example, if I look at men and women separately, this doubles the chance of finding a false significance. If I have specified a significance level of 0.05 and get a p-value < 0.025 for women but not for men, would the result still be considered significant, since the p-value is less than 0.05 even when multiplied by the number of groups?
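That is the standard Bonferroni correction; a minimal sketch with statsmodels (the two p-values here are hypothetical):
```python
import numpy as np
from statsmodels.stats.multitest import multipletests

# Hypothetical p-values from testing the same effect in two pre-split
# subgroups (say, women and men).
pvals = np.array([0.02, 0.40])

# Bonferroni: a subgroup result counts only if p < alpha / (number of subsets).
reject, p_adj, _, _ = multipletests(pvals, alpha=0.05, method="bonferroni")
print(reject)  # [ True False] -- 0.02 < 0.05/2, so it survives the correction
print(p_adj)   # adjusted p-values: [0.04 0.8 ]
```
Holm's step-down method (method="holm") also controls the family-wise error rate and is uniformly more powerful than plain Bonferroni.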
@finite-element · 5 months ago
Number 5 is p-hacking? This uploader might need to learn a little more math. Neural networks can generalize pretty well with nonlinear decision boundaries, especially when the relation under study is a multidimensional nonlinear one. Higher-order polynomial features are legit; kernel machines exist for a reason.
@expensivepink7
@expensivepink7 Ай бұрын
is anything real😪