The study doesn’t analyze why these specific journals disappeared, or assess their quality, but it found that over 50% of them had an academic affiliation. As for topics, over 50% of the vanished journals covered the social sciences and humanities, although health, physical science, mathematics and life sciences were also represented.
“There is usually an immense amount of time contributed by a lot of different people behind every article,” from the authors to the editors to the peer reviewers, Laakso told CNN.
“For all that work to be nullified and cut off from ever making an impact on the world, for such a trivial reason as not having a backup system in place for PDF files is not something that should be accepted,” Laakso added.
The study, published as a pre-print, is available on arXiv, an open access archive of scholarly articles.
Tracking vanished journals
With little documentation available on what content falls offline, researchers said they had to do some “detective work” to gather data, something they believe speaks to the need for better tools to capture this phenomenon.
In absolute numbers, the study found that only a small proportion of open access journals disappeared within the past two decades, but the authors warn against reading that with optimism.
“We think that more journals might be at risk of vanishing in the future,” Lisa Matthias, a Ph.D. candidate at the Free University of Berlin and a co-author of the study, told CNN.
The study identified 900 “inactive” journals that may be at risk of vanishing, since over three-quarters of the journals that ended up falling offline did so within five years of their last publication.
In an email to CNN, the Directory of Open Access Journals said the study “reinforces our view that DOAJ must help those journals, indexed with us, to preserve their content, and we need to find a model where, depending on their economic profile, the cost of doing so is not always passed back to the journal.”
‘An ever-shifting set of sands’
Why does digital content disappear from the internet? There are plenty of reasons, ranging from technological advances that make webpages obsolete, to Web hosting bills going unpaid.
“The Web is an ever-shifting set of sands,” Kahle said.
The issue affects all kinds of digital content, but when it comes to scholarly literature, there are still gaps in knowledge about what is even out there to be saved.
In 2018, the Internet Archive set out to find and archive all journal articles available online, and it more recently received funding from the Mellon Foundation to pursue this goal, Kahle explained.
“By our analysis, 18%, or over 3 million, open access articles since 1945 are not independently archived, either by us or by other preservation organizations,” Kahle said. The Internet Archive and the authors of the study on vanishing open access journals have joined forces to address the problem.
The cost of preserving knowledge
According to Ruttenberg, the study on vanished open access journals is “a wake-up call for us to pay more attention.”
What is needed, according to Ruttenberg, are coordinated approaches as the scientific community moves from a commercially dominated mode of publishing to open access.
“This story is about resource allocation and coordination,” Ruttenberg said.
Subscription-based digital scholarly content is not exempt from the issue of vanishing from the Web, but content from smaller or more independent open access publishers lacks some of the protections and resources that commercial content is more likely to enjoy.
“The publishing technologies employed to address preservation and archiving are mostly US or European initiatives where the solutions come with a price,” the Directory of Open Access Journals told CNN in an email.
“For traditional commercial or society publishers, the fees to implement such a service and then deposit in them are negligible, compared to the income from subscriptions or open access publication charges. For small, scholar-led publishers or for single journals, often with no steady revenue stream, the fees can be prohibitive,” DOAJ explained.
There are also technical issues to consider.
“To get the content into a service can require specialized knowledge and often involves some form of testing and sampling. The individuals running these journals may not have the time, skills or funding to be able to do this,” DOAJ explained.
The value of open access content
Internet Archive founder Brewster Kahle cautioned that blind spots in how open access journals were preserved in the past shouldn’t be taken to mean that commercial publishers are better equipped to handle preservation than open access publishers.
“Those guys are designed to be archived, they are designed to be picked up and used for new types of research,” Kahle told CNN.
“When you can gather these materials, you can start to do studies on the whole body of knowledge. You can do what’s called meta-science, or the science of science,” he said.
Such studies make it possible to detect biases or new patterns.
A work in progress
“The challenge in the transition is to make sure that we end up with the infrastructure for libraries to be able to coordinate their investments in open content, the same way that we have all kinds of tools to coordinate our investments in subscriptions or purchased content,” Ruttenberg said.
At a time when the pandemic has so many people turning to online resources for their learning, the conversation about open access knowledge is all the more relevant.
“Covid and the mass transition to virtual research and learning is a huge demonstration of the need for open access,” Ruttenberg said.