A discussion about a grad student’s dilemma on The Black Goat “It’s So Complicated” podcast caught my attention last week (June 28, 2017). The student wrote about a situation in which the collaborating mentor was advocating some questionable research practices, perhaps with the innocent intention of improving publishability, without recognizing implications for the credibility of the evidence. Part of the discussion centered on how awareness and tolerance of questionable research practices vary across researchers.
They also discussed what the student could do without creating conflict or accusing the collaborator of ill intent. Alexa Tullett, one of the hosts, made a very interesting point: if some of the motivation behind advocating questionable research practices is increasing publishability of findings, then providing concrete examples showing that research can be published without applying those questionable research practices could change behavior.
For example, a mentor might encourage a student to recast exploratory, unexpected findings with a narrative that implies that they were derived from theory and expected all along. This recommendation might derive from the widespread belief that hypothesize-test-support narratives are better for reading and publishing. If the student had concrete examples of successful publications with narratives that embraced unexpected discovery, then sharing those could facilitate conversation about how to present one’s own unexpected results as accurately as possible.
Alexa’s suggestion could increase collaborative efforts toward best practices. Here are a few examples from my publication history that might be useful for fostering productive discussions like this.
Questionable claim: To be publishable or readable, narratives must be written implying or stating that the findings were anticipated in advance.
Counterclaim: Introductions can be used effectively to introduce a surprising result that is investigated further in the reported studies.
Illustrative paper: Nosek, B. A., Banaji, M. R., & Greenwald, A. G. (2002). Math = Male, Me = Female, therefore Math ≠ Me. Journal of Personality and Social Psychology, 83(1), 44-59.
In my master’s thesis, Nosek, Banaji, and Greenwald (2002), we reported evidence that men and women differ in their implicit attitudes toward math and science. This idea originated from an unexpected discovery. In what was initially conceptualized as a control task, participants completed an Implicit Association Test (IAT) categorizing numbers and letters along with pleasant and unpleasant words. In exploratory analysis, a gender difference was observed. Here is how we wrote about it in the introduction:
The paper embraced the serendipity and avoided over-claiming from the exploratory results by reporting those studies in the introduction rather than as the main studies of the paper. [Were I writing the paper today, I would not have included p-values associated with exploratory findings.]
Questionable claim: To be publishable or readable, narratives must report just one way of analyzing the data (the one that “looks best” might be most likely to be selected).
Counterclaim: Reporting multiple analyses can make transparent which analyses were conducted, provide clear information about the robustness of the result under different assumptions, and give the reader a fair understanding of the range of observed effects in the data, all without losing a functional narrative.
Illustrative paper: Nosek, B. A., Smyth, F. L., Sriram, N., Lindner, N. M., Devos, T., Ayala, A., Bar-Anan, Y., Bergh, R., Cai, H., Gonsalkorale, K., Kesebir, S., Maliszewski, N., Neto, F., Olli, E., Park, J., Schnabel, K., Shiomura, K., Tulbure, B., Wiers, R. W., Somogyi, M., Akrami, N., Ekehammar, B., Vianello, M., Banaji, M. R., & Greenwald, A. G. (2009). National differences in gender-science stereotypes predict national sex differences in science and math achievement. Proceedings of the National Academy of Sciences, 106, 10593-10597.
In Nosek, Smyth, et al. (2009), we reported evidence across cultures that gender differences in implicit stereotypes associating science with men more than women were associated with gender differences in science and math achievement. We were particularly concerned about the possibility of spurious results because of the small number of countries with sufficient data to estimate the two key variables. There were many choices to make about which dependent variable to use, whether to weight the data by sample sizes, whether to drop outliers, and whether to include covariates. We didn’t have the term p-hacking in 2009, but we recognized the implications of these choices for the confidence in statistical inference. Instead of reporting one combination of analysis choices, we reported all of them. The key summary appeared in Table 1:
The table reports six analysis strategies for the same finding (rows) for four different dependent variables (columns), yielding a total of 24 analyses of the same research question. By vote counting, 20 of the 24 analyses revealed a significant effect (p < .05). In the paper, we concluded that the positive relationship was robust. Further, transparently reporting all of the results allows the reader to evaluate our choices and our inference and, possibly, come to a different conclusion.
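For readers who want to try this kind of reporting on their own data, here is a minimal sketch in Python of the general idea: run every combination of analysis strategy and dependent variable, keep all of the results, and then count how many cross the significance threshold. It is not the analysis code from the 2009 paper. The data are synthetic, and the names (`predictor`, `dv_a`, `dv_b`, `keep_all`, `drop_extremes`, `run_all`) and the use of a permutation test on a simple correlation are assumptions made purely for illustration.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def permutation_p(xs, ys, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the correlation of xs and ys."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Hypothetical analysis strategies: each one decides which rows to keep.
def keep_all(rows):
    return rows


def drop_extremes(rows):
    """Drop rows whose predictor is more than 2 SD from the mean."""
    xs = [r["predictor"] for r in rows]
    mean, sd = statistics.fmean(xs), statistics.pstdev(xs)
    return [r for r in rows if abs(r["predictor"] - mean) <= 2 * sd]


def run_all(rows, strategies, outcomes):
    """Run every strategy x outcome combination and keep every p-value."""
    results = {}
    for s_name, strategy in strategies.items():
        kept = strategy(rows)
        xs = [r["predictor"] for r in kept]
        for dv in outcomes:
            ys = [r[dv] for r in kept]
            results[(s_name, dv)] = permutation_p(xs, ys)
    return results


# Synthetic data only: dv_a is built to track the predictor, dv_b is noise.
rng = random.Random(1)
rows = []
for _ in range(30):
    x = rng.gauss(0, 1)
    rows.append({
        "predictor": x,
        "dv_a": x + 0.3 * rng.gauss(0, 1),
        "dv_b": rng.gauss(0, 1),
    })

strategies = {"all_rows": keep_all, "drop_extremes": drop_extremes}
outcomes = ["dv_a", "dv_b"]
results = run_all(rows, strategies, outcomes)

# Report every cell, then the vote count across all of them.
for (s_name, dv), p in sorted(results.items()):
    print(f"{s_name:>14} x {dv}: p = {p:.3f}")
n_sig = sum(p < 0.05 for p in results.values())
print(f"{n_sig} of {len(results)} analyses with p < .05")
```

The point of the design is that nothing is filtered before reporting: the full grid is printed, and the vote count is only a summary on top of it, so a reader can still inspect any single cell.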
Questionable claim: To be publishable or readable, the theory must be in the introduction. Theory is what sets up the research.
Counterclaim: Sometimes there are clear questions, but weak theory. Good papers can be motivated by interesting questions, and the implications for theory might follow the evidence. Strong narratives can be achieved with the theoretical work occurring at the end.
Illustrative paper: Nosek, B. A., & Hansen, J. J. (2008). The associations in our heads belong to us: Searching for attitudes and knowledge in implicit evaluation. Cognition and Emotion, 22, 553-594.
In many ways, I think that Nosek and Hansen (2008) is the best paper that I have ever written. There are two stand-out features: (1) it contains more than 150 replications, and (2) most of the theoretical work occurs in the General Discussion. The introduction sets up a decidedly empirical question: does the Implicit Association Test assess what we would explicitly consider cultural knowledge as distinct from what we would explicitly consider “personal” attitudes? From the introduction:
The introduction also anticipates that there are different theoretical interpretations of the same answer to that empirical question. Both perspectives anticipate that cultural knowledge is related to the IAT, and they differ in the implications of such a relationship.
We conducted the research assuming that we would find empirical support for the shared assumption and then we would spend the General Discussion focused on how the two theoretical perspectives deal with that. Remarkably, we found little to no support for the shared empirical expectation:
This led to a different theoretical discussion about how the perspectives can be updated to make sense of a lack of relationship between cultural knowledge and the IAT:
From my perspective, of all the papers I have written, this one is the most theoretically challenging and generative for research in implicit social cognition. Almost all of that work was in the General Discussion.
At minimum, these three papers are existence proofs that it is possible to publish papers that violate some of the prevailing norms about how papers must be written. More optimistically, they may provide concrete exemplars for publishing papers that adhere more closely to how the research actually occurred rather than trying to fit an idealized narrative about how science is supposed to occur and be reported.
Next time: Publishing replications, publishing null results, and publishing results contrary to prevailing wisdom. I also hope that others will be inspired by Alexa’s suggestion to share their own examples of trying to embody good practices and still meet prevailing standards for publishing.