Building a Central Service for Preprints

April 30th, 2017, Brian Nosek


Preprints are hot.  The week before last, we described our status and plans for providing open infrastructure to support the growing preprint ecosystem. Last week, we learned that the Chan Zuckerberg Initiative is backing bioRxiv, an important preprint service run by Cold Spring Harbor Laboratory.  

This week, we are sharing our response to ASAPbio’s Request for Applications (RFA) to establish a central preprint service for the life sciences. We are very pleased to submit this proposal to expand collaborations with Authorea, Manuscripts, Collaborative Knowledge Foundation, CiteSeerX, and Pandoc.RFS.

Our proposal offers a community-centered approach to the central service, which for the purposes of the proposal we call “The Commons.”

From the proposal:

"The Commons will connect preprint services in a community-based model. For the typical user discovery interface, The Commons will facilitate discovery of preprints on various hosted preprint services and guide users to engage with the preprint on that hosting service. For preservation and data mining research applications, The Commons will store full-text of preprints that qualify for storage. Our approach provides multiple benefits by following two important principles: standards and best practices are identified most efficiently by facilitating an open marketplace for innovation, but they are adopted most effectively by facilitating collaboration among the community of stakeholders. So with the Commons, interoperability, programmatic access, preservation, and discoverability are accrued through centralization, and innovation, independence, and sharing of best practices are accrued through decentralization."

In keeping with the community-based approach, we are very grateful to have received letters of support for the proposal from 12 preprint services: arXiv, bioRxiv, AgriXiv, MarXiv, MindrXiv, PaleorXiv, BITSS, SocArXiv, SciELO, PsyArXiv, engrXiv, and FocUS Archive.

Also, The Commons will be built on the OSF, which is already designed and implemented as an open infrastructure for connecting, managing, and preserving the research life-cycle.  

This means that:

"the underlying tools and services powering The Commons are independent of the interfaces. This fosters economy of scale and mutual benefits across stakeholder communities. Feature enhancements to The Commons will extend to the other services built on OSF (e.g., preprint services, registries, repositories, research management) and visa versa. For example, the RFA requests integration between preprints and a data repository to encourage open data. OSF is a data archiving service, but it is also a commons of open repositories. Authors will be able to connect their preprints including data in general and domain-specific biomedical repositories."

We were also very pleased to receive 15 letters of support from biomedically relevant repositories to foster this connectivity: Dryad, Protein Data Bank, TalkBank, NIAGADS, NeuroMorpho, NAHDAP, ICPSR, Figshare, FlyBase, Mouse Phenome Database, Dataverse, Protocols, Sage Bionetworks (Synapse), DIP, and VectorBase.

Finally, The Commons should be a public good and sustained accordingly. This is closely aligned with our mission and approach for infrastructure development and sustainability as described in our strategic plan. We also thank the five additional organizations that provided letters of support for us and our approach to preprints: HHMI, Association of Research Libraries, Electrochemical Society, Global Biological Standards Institute, and the Health Research Alliance.

We won’t know the outcome of ASAPbio’s review process for a couple of months.  Even so, with a well-defined roadmap, active team, and supportive community, we will continue to deliver open infrastructure to foster this growing community of preprint services.