Our mission at COS is to make science more reproducible and dependable. Our strategy to achieve that mission is to make the research process more transparent so that those who follow can understand and then build upon your discovery. One reason why transparency into the research process increases reproducibility is the simple clarity that comes from documenting important materials that are all too often lost. Preserved data, code, and methods allow for others to stand on your shoulders and to push knowledge into new areas.
However, the other method by which transparency increases reproducibility is by bringing clarity not to objects, but to decisions. The decisions that we make when analyzing a data set, and the timing of those decisions, affect the ability to make an inference from the results.
One common example of the need for this transparency comes from undisclosed flexibility in data analysis. Any sufficiently large dataset offers many possible ways to measure the relationship between a predictor and an outcome. A great demonstration of the dangers of unreported flexibility comes from Simmons, Nelson, and Simonsohn’s (2011) “False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant.” In it they report, using data collected from real research participants, that “people were nearly a year-and-a-half younger after listening to ‘When I’m Sixty-Four’ (adjusted M = 20.1 years) rather than to ‘Kalimba’ (adjusted M = 21.5 years), F(1, 17) = 4.92, p = .040.” They then go on to show how they reached this impossible result: they collected many other variables and repeatedly tested for significance until one of the many tests came back significant. If you want to play around with a large data set to see how different combinations of variables can lead to different and surprising results, try this excellent tool from 538.com.
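The inflation that comes from testing many outcomes can be made concrete with a small simulation. The sketch below is an illustration under simplifying assumptions, not a reproduction of the analysis in Simmons, Nelson, and Simonsohn: it draws two groups from the same distribution (so there is no true effect), measures several independent outcome variables, and counts how often at least one comparison reaches p < .05. The function names, the sample size of 20 per group, and the use of a z-test with known unit variance are all choices made for this example.

```python
import random
from statistics import NormalDist, fmean


def p_value_two_groups(a, b):
    """Two-sided z-test for a difference in means.

    Assumes equal group sizes and a known standard deviation of 1,
    which holds here only because we generate the data ourselves.
    """
    n = len(a)
    z = (fmean(a) - fmean(b)) / (2 / n) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def any_significant(rng, n_per_group, n_outcomes, alpha=0.05):
    """Return True if at least one of the null outcomes is 'significant'."""
    for _ in range(n_outcomes):
        # Both groups come from the same distribution: no real effect exists.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if p_value_two_groups(a, b) < alpha:
            return True
    return False


def false_positive_rate(n_outcomes, n_studies=2000, n_per_group=20, seed=1):
    """Share of simulated studies that could report some significant result."""
    rng = random.Random(seed)
    hits = sum(
        any_significant(rng, n_per_group, n_outcomes) for _ in range(n_studies)
    )
    return hits / n_studies


if __name__ == "__main__":
    for k in (1, 5, 10):
        simulated = false_positive_rate(k)
        expected = 1 - 0.95 ** k  # analytic rate for k independent tests
        print(f"{k:>2} outcomes: simulated {simulated:.3f}, expected {expected:.3f}")
```

With a single prespecified outcome the rate of false positives stays near the nominal 5%. With ten independent outcomes and the freedom to report whichever one "worked," it approaches 40%. Real outcome measures are usually correlated, so the exact figure in practice will differ, but the direction of the problem is the same.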
Besides unreported flexibility in data analysis, using a dataset to influence how a hypothesis will be tested can also undermine your ability to make any meaningful inference from your research. In this situation, the precise hypothesis one tests is subtly altered by the incoming data. This practice is known as “hypothesizing after the results are known” (HARKing; Kerr, 1998), and any assertion that results from it is mired in circular reasoning. A trend seen in a sample is confirmed by that same sample, so the hypothesis suggested by the data cannot be used to make more general inferences to another population.
However, even knowing that these data-led decisions affect the credibility of our results, few of us can clearly recall when each decision was made as we worked through a tough problem. Even if our memories were perfect, the context of the decisions would be lost to future scholars if it were not documented. Of course, our memories are not perfect, and each of us is subject to motivated reasoning and hindsight bias that cloud our ability to distinguish data-led exploration from the precise tests specified a priori.
Preregistration documents the process. Preregistration keeps you honest with yourself, and as Richard Feynman reminds us, the person who is easiest to trick is you.
When creating a preregistration, you create a time-stamped record of your ideas as they exist at that time. Including an analysis plan ensures that your ideas are precisely and accurately documented. Creating that document makes clear when later decisions are made; it does not prevent you from making or implementing them.
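The idea of a time-stamped record can be sketched in a few lines. The example below is purely illustrative and is not how any registry works: it fingerprints the text of an analysis plan with SHA-256 and stores the time at which it was frozen, so that any later change to the plan is detectable. The function names and the example plan text are hypothetical. Note that a timestamp you generate on your own machine is not independent evidence; providing that independence is the job of a public registry.

```python
import hashlib
from datetime import datetime, timezone


def fingerprint(plan_text):
    """Return the SHA-256 hex digest of the plan text."""
    return hashlib.sha256(plan_text.encode("utf-8")).hexdigest()


def freeze_plan(plan_text, now=None):
    """Record what the plan said and when it was frozen (UTC)."""
    now = now or datetime.now(timezone.utc)
    return {"sha256": fingerprint(plan_text), "frozen_at": now.isoformat()}


def matches(plan_text, record):
    """Check whether a plan is identical to the one that was frozen."""
    return fingerprint(plan_text) == record["sha256"]


if __name__ == "__main__":
    # Hypothetical plan text, written before any data are collected.
    plan = "H1: Group A scores higher than group B on outcome Y (two-sided test)."
    record = freeze_plan(plan)
    print(record["frozen_at"], matches(plan, record))

    # An analysis decided on after seeing the data no longer matches the
    # frozen plan, which marks it as exploratory rather than forbidding it.
    revised = plan + " Restrict the analysis to weekday sessions."
    print(matches(revised, record))
```

The record does not stop anyone from changing the analysis. It only makes the change visible, which is the distinction between confirmation and exploration that preregistration is meant to preserve.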
The most frequent concern I hear about preregistration is that it will stifle exploration; that data-led analyses are how we push knowledge into new areas. I agree that exploration is critical. Preregistration simply creates the line in the sand where confirmation and exploration meet. Crossing that line is a signal to you and to your peers that you are in new, unexpected areas. Perhaps the effect you are measuring only occurs on certain days; if so, that explanation deserves to be put to the test.
If preregistration were widely implemented prior to any data collection effort, the result would be a more functional marketplace of ideas. As any economist will tell you, a properly functioning marketplace requires transparency so that the individual players can accurately value the items in that marketplace. The ideas in the marketplace of science are either the results of hypothesis-testing, confirmatory analyses or the results of hypothesis-generating, exploratory analyses. Though both have value, their values are not equal. Right now, no one can accurately judge the value of most ideas in the published literature. Not the reader, not the peer reviewers, and not even the original author.
In On Liberty, John Stuart Mill laid out the rationale for allowing the marketplace of ideas to exist (though I do not think that term was yet in use). His rationale for fostering a truly free and open debate of ideas and counterarguments is threefold: 1) it allows false ideas to be countered, 2) it allows true ideas to be strengthened through the exercise of argument, and, most important of all, 3) it allows partially true concepts to be improved. This rationale lays out why no idea should be stifled, except through counterargument. Our vision for open science mirrors this rationale: ideas must be debated, and transparency into the process of science allows that to happen.
Preregistration allows the argument to have meaning. With the status quo, the credibility of most new ideas is hard to judge: are the reported assertions the result of confirmatory hypothesis tests, or are they the result of data exploration, deserving of more study? We envision a future where scholarly communication is more than just the advertisement at the end of the study: it is a place where ideas can be freely tested, and the work can be used by the community to advance knowledge. If you want to be part of that future, start your preregistration now.
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-Positive Psychology: Undisclosed Flexibility in Data Collection and Analysis Allows Presenting Anything as Significant. Psychological Science, 22(11), 1359–1366. https://doi.org/10.1177/0956797611417632