Open access and the Internet ArchiveJanuary 6th, 2017 by Andrew
Late last year, I wanted to find out when the first article was published by F1000 Research. I idly thought, oh, rather than try and decipher their URLs or click “back” through their list of articles fifty times, I’ll go and look at the Internet Archive. To my utter astonishment, they’re not on it. From their robots.txt, buried among a list of (apparently) SEO-related crawler blocks –
The Internet Archive is well-behaved, and honours this restriction. Good for them. But putting the restriction there in the first place is baffling – surely a core goal of making articles open-access is to enable distribution, to ensure content is widely spread. And before we say “but of course F1000 won’t go away”, it is worth remembering that of 250 independently-run OA journals in existence in 2002, 40% had ceased publishing by 2013, and almost 10% had disappeared from the web entirely (see Björk et al 2016, table 1). Permanence is not always predictable, and backups are cheap.
Their stated backup policy is that articles (and presumably reviews?) are stored at PMC, Portico, and in the British Library. That’s great. But that’s just the articles. Allowing the IA to index the site content costs nothing, it provides an extra backup, and it ensures that the “context” of the journal – authorial instructions, for example, or fees – remains available. This can be very important for other purposes – I couldn’t have done my work on Elsevier embargoes without IA copies of odd documents from their website, for example.
And… well, it’s a bit symbolic. If you’re making a great thing of being open, you should take that to its logical conclusion and allow people to make copies of your stuff. Don’t lock it away from indexing and crawling. PLOS One have Internet Archive copies. So do Nature Communications, Scientific Reports, BMJ Open, Open Library of the Humanities, PeerJ. In fact, every prominent all-OA title I’ve checked happily allows this. Why not F1000? Is it an oversight? A misunderstanding? I find it hard to imagine it would be a deliberate move on their part…