The 6 New Deadly Sins of Research

The new deadly sins

When researchers have wiggle room that allows them to snoop or fish around with their methods and data, their results don’t hold water. That extra flexibility lets them keep trying methods and analyses until something produces the desired result. For example, researchers may make outcome-friendly decisions when facing questions such as:

  • Should more data be collected?
  • Should there be control variables?
  • Should some outliers be dropped from the analysis?
  • Which outcome measure is most “important”?

Researchers in almost every industry have to make decisions like these: psychology, medicine, health, marketing, advertising, and the list goes on. These little uncertainties mark instances where researchers can massage their data and methods to capitalize on chance findings. The worst part is that most people don’t even recognize when they’re doing something wrong; it just feels like they’re making decisions that naturally arise during the research process. Researchers only recently became aware of the scope of the problem because, for the last 80 years, wiggle room was the norm.

Academics now call this a “researcher degree of freedom.” It’s similar to “fishing” or “snooping.” Regardless of what you call it, wiggle room that allows you to capitalize on chance findings is a problem that can doom your research.

How serious of a problem are the new sins?

Serious enough to end careers.

It turns out that a lot of classic psychological research was built on shaky ground, with researchers having lots of wiggle room. When psychologists realized the scope of the problem, they deemed it a “crisis.” Perhaps nobody was more affected than Amy Cuddy, a Harvard psychologist with the second most viewed TED Talk ever. The axe came down hard on Cuddy, whose work proved too shaky to replicate. The cause of those shaky findings? Wiggle room in the research process.

But psychology isn’t the only field with this problem. One fMRI study showed that by using wiggle room to your advantage, you could find statistically significant brain activation in a dead fish.


The finding was total BS, and that’s exactly the point. If you can find statistically meaningful fMRI results with a dead fish, what other research findings might be bogus? To date, similar problems have been found in immunology, economics, marketing, medicine, and biology. If you do research, regardless of industry or discipline, you need to understand research wiggle room.

 

The 6 new deadly sins of research

1. Reporting only “significant” outcome measures. As a researcher, you want to get the most out of your participants. Let’s suppose you ask 10 different questions about how likable a product is and whether people will buy it, but only 1 of the 10 turns out to be statistically “significant.” You might then put this one significant result at the forefront of your analysis and conveniently forget about the others. This is problematic because if you have 10 outcome measures, there’s a good chance that at least one of them will be statistically significant purely by chance, even if there’s no real effect there.

Remedy: Pick in advance which dependent variables truly matter. If you have lots of outcome measures, statistically adjust for how many variables were assessed when determining statistical “significance.” You could also consider conducting a miniature meta-analysis that looks at all your outcome measures taken together.
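
To see why this matters, here is a minimal Python sketch of the arithmetic, along with a simple Bonferroni-style adjustment. The 10 outcomes and the .05 threshold are illustrative assumptions, not values from any particular study:

```python
# Minimal sketch: why many outcome measures inflate false positives, plus a
# Bonferroni-style adjustment. The 10 outcomes and alpha = .05 are
# illustrative assumptions.
alpha = 0.05      # conventional per-test false-positive rate
n_outcomes = 10   # hypothetical number of outcome measures

# Chance of at least one "significant" result when no real effect exists
p_at_least_one = 1 - (1 - alpha) ** n_outcomes
print(f"Chance of at least one fluke across {n_outcomes} outcomes: {p_at_least_one:.0%}")  # ~40%

# Bonferroni-style fix: require each test to clear a stricter threshold
adjusted_alpha = alpha / n_outcomes
print(f"Adjusted per-test threshold: {adjusted_alpha}")  # 0.005
```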

 

2. Stopping the study early when the data look good. Suppose that at the beginning of the study you planned to run 50 participants per ad, which seems reasonable. But you check at 30 and find a significant difference between marketing strategies A and B. Congratulations! Time to pack up the study and collect your paycheck. Except, maybe not. By giving yourself multiple opportunities to stop the study, you increased your chances of finding a significant effect purely by chance. Here’s another example: imagine I say we’re going to flip a coin to determine who will buy the next round of drinks. We agree to flip for the best out of seven. But then, once the score is 2-1 in my favor, I say, “Actually, let’s just do the best out of three.” You’d feel cheated, and you should. The game had wiggle room, and I used it to my advantage.

Remedy: Determine your sample size in advance and stick to it. If you look at the results before the study is over, don’t terminate data collection just because the results look good.
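
If you want to see the problem for yourself, here is a rough simulation sketch. The sample sizes, peek points, and number of simulated studies are illustrative assumptions; the point is that even when ads A and B are identical, stopping at the first “significant” peek pushes the false-positive rate well above 5%:

```python
# Rough simulation: "stop as soon as p < .05" inflates false positives even
# when ads A and B are identical. Peek points and sizes are illustrative.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_sims, n_max, peeks = 2000, 50, (30, 40, 50)  # peek at 30, 40, and 50 per ad

false_positives = 0
for _ in range(n_sims):
    a = rng.normal(0, 1, n_max)  # both "ads" drawn from the same distribution
    b = rng.normal(0, 1, n_max)
    # Declare victory at the first peek where the result looks good
    if any(ttest_ind(a[:n], b[:n]).pvalue < 0.05 for n in peeks):
        false_positives += 1

print(f"False-positive rate with peeking: {false_positives / n_sims:.1%}")  # well above 5%
```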


3. Continuing the study when the data do NOT look good. Suppose you originally planned to stop the study after showing ad A to 50 people and ad B to 50 people. You accomplish this and make it to 50 without peeking. Great job! But when you analyze the results, you find no difference between ads A and B. You then go to 60 per ad and still find no difference. You then go to 70 and find the desired effect. However, you changed the rules on the fly and capitalized on chance.

Remedy: Collect only as much data as you originally specified. If you end up collecting more than that, cut your alpha (the significance threshold) in half for each time you “peek” at the data. For example, if we originally accepted 90% confidence (a 10% false-positive rate, alpha = .10) at N = 50, we can keep collecting to N = 75 if we tighten to 95% confidence (a 5% false-positive rate, alpha = .05) to account for the fact that we had two chances to stop the study at a convenient point.
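
As a minimal sketch of that adjustment (the .10 starting threshold and the two looks are just the example numbers from above):

```python
# Minimal sketch of the "cut your alpha for each peek" rule described above.
original_alpha = 0.10  # 90% confidence planned for the first stopping point (N = 50)
n_looks = 2            # we looked twice: at N = 50 and again at N = 75

# Halving the threshold for the extra look keeps the overall chance of a
# fluke "significant" result near the level we originally accepted.
adjusted_alpha = original_alpha / n_looks
print(adjusted_alpha)  # 0.05 -> demand 95% confidence at the second look
```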

 

4. Excluding responses (such as outliers). Sometimes researchers get messy data and decide to “clean” or “remove” some of it. This is usually fine. But when researchers drop responses on the fly, it creates a big problem. By inventing a convenient explanation for dropping the data that looks “funny,” researchers create extra wiggle room to capitalize on chance findings.

Remedy: Determine in advance exactly which criteria you will use to exclude participants. If you use multiple attention checks, decide in advance whether you will analyze people who pass one of them or only those who pass both. Specify that participants will only be included if they actually finish the study. It’s best to pick a rule and stick with it.
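
Here is a minimal sketch of what a pre-specified exclusion rule can look like in code. The column names (“finished,” “attention_check_1,” “attention_check_2,” “response_time_sec”) and the 60-second cutoff are hypothetical; the point is that the rules live in one place and were written down before the data arrived:

```python
# Minimal sketch: pre-registered exclusion rules applied in one place.
# Column names and the time cutoff are hypothetical examples.
import pandas as pd

def apply_preregistered_exclusions(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only participants who meet the rules written down in advance."""
    keep = (
        df["finished"]                                        # completed the study
        & df["attention_check_1"] & df["attention_check_2"]   # passed BOTH checks
        & (df["response_time_sec"] >= 60)                     # pre-set minimum time on task
    )
    return df[keep]
```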

 

5. Splitting the analysis based on moderators. Suppose we run a study and find no significant differences between ads A and B. Bummer. But someone in the company meeting suggests that there might be a difference by age or gender. We run the analysis split by different age groups and find nothing. Bummer. We then analyze the data for each gender separately. We find that there’s still no effect among men, but women like ad A (the funny ad) more than ad B (the serious ad). This is problematic because we simply got lucky and reported a convenient chance finding.

Remedy: Specify which moderators matter in advance. To be fair, there’s absolutely nothing wrong with exploratory data analysis, such as analyses that test for moderators. The problem is when people run exploratory analyses and then present them as if they were planned. This creates a false sense of how certain we are of the results.
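
If a moderator such as gender really is specified in advance, one clean way to test it is as a single interaction term rather than by splitting the sample after the fact. A minimal sketch, with made-up column names and simulated data:

```python
# Minimal sketch: a pre-specified moderator tested as one interaction term,
# rather than post-hoc subgroup splitting. Data and column names are made up.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "ad": rng.choice(["A", "B"], size=200),
    "gender": rng.choice(["man", "woman"], size=200),
    "liking": rng.normal(5, 1, size=200),
})

# The ad:gender coefficient is the single, planned moderation test
model = smf.ols("liking ~ ad * gender", data=df).fit()
print(model.summary())
```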

 

6. Running analyses with arbitrary “control” variables. Statistically controlling for a variable will always shift the results around, even if the control variable isn’t strongly correlated with anything. Suppose you find no significant difference between two marketing strategies. But when you statistically control for participant age, suddenly the effect shows itself. Eureka! You tell yourself that age is a reasonable variable to control for anyway, since people of different ages will have different opinions about the ads. See the sin here?

Remedy: Specify which variables will be controlled for in advance. If you realize variables need to be controlled for after the study begins (which certainly happens), report the results both with and without the variable controlled for. That way, people will know you’re not just fishing around for results.
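
A minimal sketch of the “report it both ways” idea, again with made-up data and column names:

```python
# Minimal sketch: fit the model with and without the unplanned covariate and
# report both. Data and column names are made up for illustration.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
df = pd.DataFrame({
    "ad": rng.choice(["A", "B"], size=200),
    "age": rng.integers(18, 65, size=200),
    "purchase_intent": rng.normal(5, 1, size=200),
})

uncontrolled = smf.ols("purchase_intent ~ ad", data=df).fit()
controlled = smf.ols("purchase_intent ~ ad + age", data=df).fit()  # age added post hoc

# Readers can see for themselves whether the "effect" depends on the covariate
print(uncontrolled.params, controlled.params, sep="\n\n")
```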


Conclusion

Wiggle room in the decision-making process of a study or experiment can kill its validity. Most wiggle-room problems can be eliminated by specifying what you will do in advance. However, research is an imperfect process, and you will almost always have to make decisions on the fly. When this happens, be transparent about what you did, and be prepared to show the results in multiple ways.

If you’re an academic researcher, you can use websites such as AsPredicted and the Open Science Framework to pre-register your study methods and hypotheses. If you’re in industry, explicitly write down in advance how you will handle these decisions. Because research is such a chaotic process, run your research design past others and ask them to play devil’s advocate about what you’ll do and how you’ll handle certain issues when they arise. Having the plan in writing will keep you more faithful to the original analysis plan.

Remember: if you torture your data long enough, it will confess to anything.

 

Want to design a water-tight study?

Contact our team to set up a free consultation today.

 

