Damiano Mazza -- On scientific publications

On publications in theoretical computer science

27 January 2012

Today's digital means make typesetting, formatting, editing and distributing scientific articles considerably easier, more efficient and cheaper than in the past, at least in theoretical computer science and neighboring disciplines (e.g. mathematics, theoretical physics). Moreover, these digital means are naturally oriented towards the open distribution and retrieval of articles, in stark contrast with the needs of traditional publishing companies. In this scenario, the traditional publishing process seems more of an obstacle to the diffusion of scientific knowledge than a means of facilitating it. (For more information on open access journals, and why they should supplant pay-per-view ones, take a look here and here.)

Personally, I see the future of publishing in theoretical computer science (and, perhaps, other scientific disciplines as well) in open-access, interactive paper repositories. What I have in mind is an integration of currently existing electronic preprint archives, such as arXiv or its French cousin HAL, with online submission and interactive discussion systems, such as EasyChair. A rough description of the publishing process may be the following:

the author submits his/her paper to the repository, much like papers are submitted to arXiv;
the submitted paper is automatically associated with a discussion forum, on which reviews may be written, anonymously or not, and through which the author may respond, perhaps also improving his/her original submission;
in the meantime, the paper may be freely presented at international conferences, whose role in the dissemination of scientific work will always remain essential;
the discussion forum will stay open forever (or until the author decides, for some reason, to withdraw the submission), with the paper always open to access, review, criticism and improvement, even decades after its "publication".

In theory, at this point there would be no need to distinguish between a "preprint" and a "publication"; simply looking at a submission's forum, its evolution, the amount and quality of discussion it produced, etc., is, in my opinion, a perfect way of gauging its scientific value.

Although I do believe that the above scheme can eventually be adopted by the international research community, I realize that the great majority of contemporary computer scientists would consider it idealistic at best and unacceptable in practice. However, this does not make the idea of open interactive publication repositories less appealing. In fact, the above scheme may take the following, more conservative form:

Submissions to the repository are free but are handled by an editorial board, just like today's journals. The editors have the right to refuse a submission for being out of scope, irrelevant or outright ridiculous (think of the random amateur computer scientist submitting his/her "proof" of P=NP).
If the submission is approved by the editors, it is immediately made available online, under the status of "preprint", with its related discussion forum open for comments. In the meantime, the editors assign the submission to a couple of referees; their (anonymous) reviews will help populate the discussion forum and may actually constitute the starting point of the discussion.
Again, it is debatable whether write-access to the forum should be completely open. To prevent the random amateur computer scientist from posting his/her claims about P being equal to NP in the forum related to some random (but otherwise perfectly respectable) paper, we may imagine that write-access is restricted to people who are authorized by a recognized research institution. We may also imagine that the forum is somehow moderated by assistant editors and such.
As with the above scheme, the "preprint" may (and should) be presented and discussed at international conferences. Acceptance criteria to such conferences may perhaps benefit from the discussion which is already going on in the forum.
Whenever the editor in charge of the submission, helped by the discussion forum, judges that its scientific value is high enough, its status is updated to "publication". Otherwise, the paper remains a "preprint" forever, or until the author decides to withdraw it. In any case, the preprint/publication remains freely accessible and open to further comments, reviews, modifications, etc.
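
The status changes described in the steps above can be read as a small state machine. The following Python sketch is only an illustration under stated assumptions: the names `Status`, `TRANSITIONS` and `Submission` are hypothetical, and the choice to treat "publication" as a final status (with no withdrawal from it) is an assumption, since the text only mentions withdrawal of a preprint.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    SUBMITTED = "submitted"      # received, waiting for the editors
    REJECTED = "rejected"        # refused by the editors (out of scope, etc.)
    PREPRINT = "preprint"        # approved, online, forum open
    PUBLICATION = "publication"  # promoted by the editor in charge
    WITHDRAWN = "withdrawn"      # withdrawn by the author


# Allowed status changes, read off the steps above.
# Assumption: "publication" is treated as final in this sketch.
TRANSITIONS = {
    Status.SUBMITTED: {Status.REJECTED, Status.PREPRINT},
    Status.PREPRINT: {Status.PUBLICATION, Status.WITHDRAWN},
    Status.PUBLICATION: set(),
    Status.REJECTED: set(),
    Status.WITHDRAWN: set(),
}


@dataclass
class Submission:
    title: str
    status: Status = Status.SUBMITTED
    forum: list = field(default_factory=list)  # list of (author, text) pairs

    def move_to(self, new_status):
        """Change status, refusing any change the scheme does not allow."""
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.status = new_status

    def forum_is_open(self):
        """The forum is open for preprints and publications alike."""
        return self.status in {Status.PREPRINT, Status.PUBLICATION}

    def comment(self, author, text):
        """Add a review or comment to the forum, if it is open."""
        if not self.forum_is_open():
            raise ValueError("forum is not open")
        self.forum.append((author, text))
```

For instance, a submission approved by the editors moves to `Status.PREPRINT`, collects referee reviews through `comment`, and may later be moved to `Status.PUBLICATION`; its forum accepts comments in both statuses, which mirrors the point that the paper remains open to discussion after "publication". Access control and moderation of the forum are deliberately left out of the sketch.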

Who pays for all this? Open interactive repositories, if they come to exist at all, will most likely be sponsored, hosted and maintained by academic institutions. The cost of managing the system (personnel and equipment) will probably be a fraction of what universities and other research institutions presently spend on access to scientific journals. In the end, the system may actually turn out to be the cheaper option.

At the very worst, the above scheme reproduces exactly the present-day process of submission, refereeing and acceptance of a journal paper, with all of its pros and cons, except for three points:

1. the paper is immediately available to everyone;
2. the reviews are public (although still anonymous);
3. research institutions may end up saving some money.

Point 3 does not need any comment. Point 1 is actually a very minor difference with respect to today's reality: many papers submitted to theoretical computer science journals are already concurrently available on arXiv or similar preprint archives (or even on the author's web page). Point 2 is more radical. It may meet with opposition from some, but I think that a review written in the knowledge that it may be read by people other than the author and the editor will be written with more care. This, in my opinion, cannot hurt the peer-reviewing process.

At best, the use of open interactive repositories may substantially improve the efficiency, quality and equity of publishing (the growing importance of online scientific interaction is discussed in [1]). If it works properly, the system can entirely supplant both journals and conference proceedings. The time and energy currently dedicated to refereeing dozens of conference papers may be shifted to online discussion forums, restoring conferences and workshops to their natural role as vehicles of personal interaction. The system will also have a strong impact on the evaluation of a researcher's work: a publication, which is currently little more than a bare bibliographic reference, will come together with a sample of the interaction (or lack thereof) it generated, giving richer, more constructive and more flexible elements of evaluation than bibliometric indices. (On this matter, it will be important to ensure that comments in forums do not become "grade assignments"; I would be horrified to see a discussion forum about a scientific paper filled with "thumbs up" or "thumbs down" icons like a YouTube video. Of course, the idea that a scientific paper may be assigned a grade or absolute index of any sort is simply ridiculous.)

References

[1] Hendler, James (2008). "Reinventing Academic Publishing - Part 3". IEEE Intelligent Systems 23(1):2-3.