Journals! What Are They Good For?

NP-Complete Breakfast

2017-10-10

Here’s my publishing manifesto, or something like that. Not that I have any of this stuff figured out, of course. But I feel like I need to plant a flag somewhere, so I have something to point at.

Step 1: PPPR

I’m not going over the reasons that the current scientific publishing system is broken. (I’m assuming you’re some kind of scientist. If you aren’t, then all of this might be pretty confusing.) I’ll start by assuming that you, the reader, are basically on board (or at least pretty familiar) with these ideas:

Our work should always be open-access.
In fact, we should publish pre-prints so that our work gets out before the review process is finished.
And hey, if we’re going to put our work out there as a pre-print, we might as well do the reviewing in public, too.

For brevity I’ll refer to that train of thought as the Post-Publication Peer Review (PPPR) scheme. You can make a strong argument that this scheme leads to all kinds of benefits: faster scientific progress, a more level playing field, wider dissemination of knowledge, and so on. (There are also some criticisms and concerns about how viable the PPPR scheme is, but I don’t think I can do them justice so I won’t try to summarize them. You can read a more thorough discussion in this blog post from Michael Eisen, or honestly in almost any random blog post from Michael Eisen.)

There have been some cool ideas that grow out of this, like the concept of an overlay journal. This is a journal that points to existing preprints rather than re-formatting and re-publishing them on its own servers or even (if you can imagine such a thing) on paper. An example of an overlay journal that points to arXiv preprints is Discrete Analysis, but they certainly aren’t the first or the last.

Step 2: ???

I have drunk copious amounts of the PPPR Kool-Aid™: I’m convinced that we need to publish early and do so in an open manner. My main concern (and the only thing that makes this post different from any other post about PPPR) is about what PPPR doesn’t fix, and about what arises from the rubble of the current system of journals.

Getting Scooped Still Sucks

One of the main concerns people have with pre-prints is that posting one will get them scooped: they’ll put their research out there, some other lab will read it, and that lab will race to publish the same work “first” in a refereed journal. (Some of this belongs in the possibly-mythical post about academia that I’m planning to write.)

This is a pretty silly concern because you can’t scoop something that’s already published (and pre-prints are published). Furthermore, malicious scooping is probably quite rare; it’s more likely that the other lab just doesn’t know what you’re working on. (I’ll get to that problem in a minute.)

But those answers sidestep a bigger issue, which is that scooping shouldn’t be a thing in the first place. Research projects take years, but being a month late to publication is called “being scooped”. That creates a crazy environment for doing research, and we all are worse off for it:

It’s ridiculously stressful, particularly for less-established scientists and trainees. Through no fault of their own, their career trajectory can be permanently altered. This kind of thing drives people out of science.
It discourages labs without major resources from working in competitive areas. These are often the areas where we’d benefit most from fresh ideas.
When there is an “obvious” advance to be made, there’s a mad rush to get there first and claim credit. This is a waste of time and resources, it creates dangerous incentives for doing bad science, and it leads to widely-covered multi-million-dollar patent disputes.

PPPR doesn’t address this issue, so far as I can see. Instead of being scooped when you see the work in print, it happens when you see the work as a preprint. If we assume that preprints are deposited at about the same point that papers get submitted for review, this means a difference of several months but rarely more. In the cases when getting scooped really hurts (after multiple years of effort), it’ll still really hurt. (The focus on priority is still relevant to questions of intellectual property and patents and whatnot. There’s a whole secondary discussion to be had about those issues. They’ll probably continue to exist regardless of the system we adopt, but first-to-file may have obviated some of the problems there.)

Whatever system we have, it shouldn’t punish people for finding similar results simultaneously. If anything we should celebrate such discoveries: they are wholly independent replications of the finding! We don’t get enough of that as it is.

Redundant Work is a Waste

While it’s great to get independent validation of a result, it would likely be even better if those groups knew they were working on the same topic, and instead of competing with each other to publish first, they formed a loose collaboration and worked together. This doesn’t need to be a formal endeavor—tight collaboration requires a lot of work to maintain, and when done poorly it leads to a lot of headaches. A loose collaboration is more about sharing protocols, preliminary and negative results, and ideas. This can be common within an institution but is typically rare when there is no pre-existing relationship between labs.

Often a type of collaboration can happen late in the game: two groups will learn of each other and try to publish together. That might not involve any real exchange of knowledge, and barely qualifies as collaboration in my mind. (It might be better described as collusion.) Those groups are trying to solve the prisoner’s dilemma that comes from the threat of scooping: if one publishes first they get more glory, but the arbitrary delays of publication mean they risk losing it all to the other group. Publishing together is safer. But most of the hard work was done in parallel, without any communication between labs. This is a waste of a lot of people’s time and money, and leaves a more confusing scientific literature for everyone to look at later.

Step 3: Non-profit!

Here is where I am supposed to lay out a proposal for solving those problems. Unfortunately I don’t really have one. I do have some vague ideas about how I wish research worked, and I’ll outline those ideas here.

Open Notebooks

Rather than doing our work privately and then publishing a complete story at the end, we should be transparent about the (usually messy!) process, and highlight the results we think are notable enough to share with the community.

I’m hardly the first person to suggest this; many scientists are already doing it. (Which is super impressive: it’s scary to put your work out there that way.) To change the culture of research, however, it can’t just be a handful of idealists: it needs to be the standard practice. Unfortunately there isn’t a whole lot of reason to do it. The potential benefits for other people are easy to imagine:

Access to data as soon as they are collected allows other researchers to build on results quickly.
Many eyes on data helps prevent errors and allows for novel interpretations, free from the bias or preferred outcome of the experimenter.
Public notebooks prevent p-hacking and all kinds of other shady practices.
It also provides an honest account of the progression of the research, instead of the Newspeak-like “we have always been working on protein X” narrative of a paper.

That last point is important: being more open about how research is done is good for public engagement and it’s good for the mental health of trainees, who get to see that their own experience is the norm. Research is hard and the research community should stop pretending otherwise.

(To be clear, I realize that the paper lab notebook isn’t going away. It’s just too useful, and paper tends to last longer than bits despite our best efforts. We still need a way to share our progress as it happens.)

What’s less clear is why any individual would pursue this strategy. Beyond gaining some recognition and reputation for having the chutzpah to be open, it seems to present much more risk than reward. This is where the whole ecosystem needs to change, in a major way: we need a system that allows open collaboration at any level of engagement, and can aggregate that collaboration and contribution into a CV that the researcher can point to in the future. Luckily, we have a reasonably good model for what this could look like.

The GitHub Model

Open source software provides a model, albeit imperfect, for how to build this ecosystem. GitHub in particular is the de facto clearinghouse for a developer to display their credentials. (To be honest I don’t know how true this is nowadays.) Every contribution is recorded: from personal projects and major contributions to public resources, all the way down to opening a bug report on an obscure repository. Moreover, the quality of the work can be inspected: one can read the code that they are writing and judge its quality. One can see how they interact with other projects and whether they provide helpful feedback or pointless criticism.

The equivalent ecosystem for researchers is really quite similar: a site (or a federation of sites!) where researchers can deposit their data, analyze and discuss their results, and collaborate with others at many levels. (Just as with code, there will always be “private repos” of work that is not yet ready to be released. But as the ecosystem evolves I think these will become less common.) Everyone starts with a mess and refines it slowly: everyone mislabels data and makes mistakes and recalculates. These mistakes are normal and don’t need to be hidden, and starting from scratch in the open allows us to build bootstrapping materials that get projects started faster. Collaboration could range from helpful trouble-shooting to in-depth peer review: with open data and reproducible analysis pipelines, peer review can involve independent validation and reproduction of the analyses. Every researcher always has an up-to-date CV that includes their own work, their collaborations big and small, and their service in the form of peer review and any other contributions to the scientific community.
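To make the "always up-to-date CV" idea a little more concrete, here is a rough sketch of how such a site might roll a log of contributions up into a per-researcher summary. Everything in it is made up for illustration: the `Contribution` record, the `build_cv` function, the kinds of contribution, and the sample names and projects. No such system exists yet, and a real one would need far more than this.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Contribution:
    """One recorded action on the (hypothetical) research site."""
    researcher: str
    kind: str      # assumed categories: "dataset", "analysis", "review", ...
    project: str
    day: date


def build_cv(contributions, researcher):
    """Group one researcher's contributions by kind, newest first."""
    cv = defaultdict(list)
    for c in sorted(contributions, key=lambda c: c.day, reverse=True):
        if c.researcher == researcher:
            cv[c.kind].append((c.day.isoformat(), c.project))
    return dict(cv)


# Invented sample log: big and small contributions sit side by side.
log = [
    Contribution("ada", "dataset", "protein-x-binding", date(2017, 3, 2)),
    Contribution("ada", "review", "kinase-screen", date(2017, 6, 14)),
    Contribution("bo", "analysis", "protein-x-binding", date(2017, 4, 9)),
    Contribution("ada", "troubleshooting", "kinase-screen", date(2017, 5, 1)),
    Contribution("ada", "review", "protein-x-binding", date(2017, 9, 30)),
]

for kind, entries in build_cv(log, "ada").items():
    print(kind, entries)
```

The point of the sketch is only that peer review and small bits of trouble-shooting land in the same record as datasets and analyses, so they can all be counted as service later.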

Journals: what are they good for? Journalism!

To return to the title of this post: whither the existing journals in this future utopia? It’s a common expectation that PPPR will make journals obsolete, because the ecosystem of preprints and public review will destroy the need for subscriptions. (This is mentioned in Eisen’s post, for instance.) I don’t know if it would, but I definitely don’t think it should do such a thing. Journals are legitimately useful in many ways: they highlight notable research (as best they can); they report on news relevant to their target community; and they provide a secondary perspective, typically from an eminent third party, for particularly interesting work. Journals also tend to publish reviews which can provide a useful overview of a given topic; typically these reviews are invited publications and thus fall outside of the typical peer-review process.

All of these services are useful for researchers, and it’s not difficult to imagine a journal that provides only those services: one that highlights notable work (in preprint form or even straight from a public notebook) rather than trying to filter out the most “impactful” work from an ocean of research. A journal that focused on science journalism would be less biased by famous names and cozy relationships (but surely still biased, just like any media can be). It would be reacting to impactful research rather than decreeing what research should be impactful.

In this scenario it seems likely there would be far fewer journals—given the vast number of predatory and/or poorly-edited journals, this is probably a good thing. Journals that provide a value-add beyond “you’re published” will still be viable—certainly the likes of Science and Nature will stick around to report on matters of interest to the entire scientific community. This will include reporting on the most notable research being done, but their role will be quite different: they will be obligated to discuss the work that the community deems notable, rather than being the gatekeepers who selected it.