"If we’re going to rely on science as a means for reaching the truth — and it’s still the best tool we have — it’s important that we understand and respect just how difficult it is to get a rigorous result. I could pontificate about all the reasons why science is arduous, but instead I’m going to let you experience one of them for yourself. Welcome to the wild world of p-hacking."

caeli:

Oh man, p-hacking is so rampant. STORY TIME

A professor I worked with in undergrad a couple of years ago was absolutely terrible about this. I had my own project that I had designed and collected data for. We had a couple of dependent measures, accuracy and reaction time, and predictions that pertained to accuracy but not reaction time. When I started analyzing the data, it was clear that nothing was happening with accuracy -- there was a numerical but not significant difference between the relevant conditions, and it was not at all consistent across subjects. Basically, all the signs were there that the effect didn't exist and that we should conclude there was no difference, and there weren't any weird confounds in the experiment. However, there was a mildly interesting reaction time effect, and I turned that into the cornerstone of the project.

Then my professor lost his shit and became obsessed with finding an effect in accuracy. Here's a (non-comprehensive) list of things he wanted me to do:

1. Exclude participants on some weird ad-hoc criterion that just so happened to overlap with the group of participants who did not show an accuracy effect. SKETCHINESS LEVEL: 5000

2. Run more subjects. At first this seems innocent, but peeking at the data and then deciding whether to run more subjects based on what you see inflates your risk of Type I errors (i.e., false positives) -- see the little simulation after this list.

3. Try different analyses. Logistic regression doesn't reveal a significant difference...but what about an ANOVA? That's unwarranted for the type of data I collected. Or how about we instead fit curves to the data and compare the parameter values of the curves in each condition? That's relatively standard practice in the field, but my data didn't meet the assumptions of such an analysis.
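To make point 2 concrete, here's a quick simulation I threw together for this comment (my own sketch, nothing to do with that project -- the batch size and number of peeks are made up). Both groups are drawn from the same distribution, so the null is true by construction, yet re-testing after every new batch of subjects and stopping at the first significant result pushes the false-positive rate well past the nominal 5%:

```python
# Sketch of why "run more subjects and re-test" (optional stopping) inflates
# Type I error. Needs numpy and scipy. All parameter choices are illustrative.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_sims = 5000        # simulated "experiments"
batch = 10           # subjects added per group at each peek
max_batches = 10     # give up after this many peeks
alpha = 0.05

fp_fixed = 0   # analyze once, at the final sample size
fp_peek = 0    # re-test after every batch, stop at the first p < .05

for _ in range(n_sims):
    # Both conditions come from the SAME distribution: the null is true.
    a = rng.normal(size=batch * max_batches)
    b = rng.normal(size=batch * max_batches)

    # Fixed-N analysis: one test at the end.
    if stats.ttest_ind(a, b).pvalue < alpha:
        fp_fixed += 1

    # Optional stopping: test after each batch, declare victory at first "hit".
    for k in range(1, max_batches + 1):
        if stats.ttest_ind(a[:k * batch], b[:k * batch]).pvalue < alpha:
            fp_peek += 1
            break

print(f"fixed-N false positive rate:      {fp_fixed / n_sims:.3f}")  # roughly 0.05
print(f"peek-and-add false positive rate: {fp_peek / n_sims:.3f}")   # noticeably higher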

This is all from a tenured professor who has been doing research for over 15 years. If I hadn't already been aware of why these are such huge issues, I might have contributed to bad science by following these sketchy-ass suggestions. Now I doubt his past work and feel really weird about everything. Luckily my current group is awesome about recognizing and trying to avoid this kind of stuff.

EDIT: Let me also add that even the seemingly most innocent things are not okay. I mentioned that I found an unpredicted effect in reaction time and turned it into the cornerstone of my project. That is also sketchy!! Analyzing more than you originally planned to also inflates Type I errors, since with more comparisons comes a higher chance that something comes out significant by chance alone. Post-hoc analyses like that are better used as guidance for future experiments, not as conclusions of the current one.
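Rough numbers on why that is (assuming the tests are independent, which is a simplification -- accuracy and RT on the same trials are correlated, so the real inflation is somewhat smaller, but it's still there):

```python
# Chance of at least one false positive when you run k tests at alpha = .05
# and there is truly nothing going on in any of them (independence assumed).
alpha = 0.05
for k in (1, 2, 5, 10):
    print(f"{k:2d} comparisons -> {1 - (1 - alpha) ** k:.3f}")
# 1 -> 0.050, 2 -> 0.098, 5 -> 0.226, 10 -> 0.401
```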

/sciencerant

