Archive for November, 2015

Page and colour charges: they’re still a thing

Tuesday, November 24th, 2015

So, I have a paper out! Very exciting – this is my first ‘proper’ academic publication (and it came out the day after my birthday, so there’s that, too.)

Gray, Andrew (2015). Considering Non-Open Access Publication Charges in the “Total Cost of Publication”. Publications 2015, 3(4), 248-262; doi:10.3390/publications3040248

Recent research has tried to calculate the “total cost of publication” in the British academic sector, bringing together the costs of journal subscriptions, the article processing charges (APCs) paid to publish open-access content, and the indirect costs of handling open-access mandates. This study adds an estimate for the other publication charges (predominantly page and colour charges) currently paid by research institutions, a significant element which has been neglected by recent studies. When these charges are included in the calculation, the total cost to institutions as of 2013/14 is around 18.5% over and above the cost of journal subscriptions—11% from APCs, 5.5% from indirect costs, and 2% from other publication charges. For the British academic sector as a whole, this represents a total cost of publication around £213 million against a conservatively estimated journal spend of £180 million, with non-APC publication charges representing around £3.6 million. A case study is presented to show that these costs may be unexpectedly high for individual institutions, depending on disciplinary focus. The feasibility of collecting this data on a widespread basis is discussed, along with the possibility of using it to inform future subscription negotiations with publishers.

The problem

So what’s this all about, then?

We (in the UK particularly) have spent a lot of effort trying to reduce the cost of the scholarly publishing system, which is remarkably high; British university libraries collectively spend £180,000,000 per year on subscriptions, comparable to the entire budget of one of the smaller research councils. The major driver here is open access – trying to make research available to read without charges – and so there has been a lot of interest in trying to arrange matters so that the costs of publishing open access don’t rise faster than the corresponding reduction in subscriptions. The general term for this is the “total cost of publication” (TCP) – ie, the costs of all the parts of the system, including both direct spending and indirect management costs (it’s surprising how much it costs to shuffle paperwork).

This is a sensible goal – it keeps the net cost under control – but the focus on OA costs and subscriptions misses out some other contributions to the balance sheet.

Historically, a lot of the cost of scholarly publishing was borne by authors or their institutions through publication charges – page charges, colour charges, submission charges, and a few other oddities. These became less common (for various reasons, and there’s an interesting history to be written) through the 1980s, and – outside of open-access article processing charges – compulsory publication charges are now rare for most journals in most fields. To many researchers (including a lot of those who’ve helped set OA policy), they simply don’t exist as a significant concern.

However, during 2013-14 it became rapidly apparent to me that my institution was spending a lot of money on page charges, which didn’t fit with what was being reported elsewhere, and didn’t fit with the general recommendations from the funding bodies on how to allocate costs. These charges were not being taken into consideration in the various TCP offsetting schemes, with the effect that we were seeing a lot of spending going direct to publishers, but outside the carefully constructed framework for controlling costs.

The study

I dug back through the recent literature on the costs of journal publishing – there had been a flurry of studies in the early 2000s as people began to work out how to handle OA costs – and tried to determine what the levels of other “publication charges” had been just before OA spending took off. It turned out to be tricky to come up with a firm estimate, but my best guess was that non-OA publication charges were around 3-5% of subscription costs in 2004-5, and had dropped since then. By now (ie 2013/14), it’s probably around 2%, assuming a continual gentle decline.

Firstly, this is quite a lot of money. If British universities spend £180,000,000 per year, then 2% is a further £3,600,000 – comparable to forty or fifty PhD studentships. It’s particularly striking when we bear in mind that this is money many institutions may not realise they are spending.
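The sums behind these headline figures can be checked with a few lines of shell arithmetic. This is only a sketch: the variable names are illustrative, and the shares are the rounded ones quoted in the abstract (11% APCs, 5.5% indirect costs, 2% other charges), written in tenths of a percent so that everything stays in integers.

```shell
#!/bin/sh
# Rounded figures quoted in the post; variable names are illustrative only.
subs=180000000      # estimated annual subscription spend, in pounds

# Shares of subscription spend, in tenths of a percent (11%, 5.5%, 2%).
apc=110
indirect=55
other=20

# Divide before multiplying so the intermediate values stay small.
per_mille=$(( subs / 1000 ))
other_cost=$(( per_mille * other ))
extra_cost=$(( per_mille * (apc + indirect + other) ))
total_cost=$(( subs + extra_cost ))

echo "non-APC publication charges: $other_cost"
echo "total cost of publication:   $total_cost"
```

This gives £3,600,000 for the other publication charges and £213,300,000 overall, which is where the "around £213 million" comes from.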

Secondly, it’s clear that the cost is distributed very erratically. My own institution spent the equivalent of 15-18% of its subscription budget on non-OA publication charges, driven mainly by very heavy page charges in certain well-used earth sciences journals. (From another angle, Frank Norman has since reported that his institution, in biomedicine, had non-OA publication charges equal to about 10% of subscriptions, and in the early 2000s it was three times that.) Given the disciplinary concentration, it’s likely that spending in universities is similarly patchy – individual departments may have dramatically higher publication costs than the overall average.

Thirdly, this spending is, currently, invisible to policymakers. Of the 29 institutions who provided article-level spending records for 3,721 papers in 2014, only fifteen individual papers could be identified as having page or colour charges (mostly at Leeds), with another ten mentioned in the general reports. Twenty-five papers is clearly not going to get us anywhere near the overall spending estimates. This data isn’t being collected centrally by RCUK/JISC – who are otherwise doing sterling work on tracking APCs – and it’s not clear if it even gets collected centrally by universities. The majority of non-OA publication charges may just disappear into the morass of “miscellaneous spending” in grant budgets.

Where next?

Firstly, we need to get a good idea of what’s actually being spent. My 2% estimate is a pretty wide one – I wouldn’t be surprised if it was 1% or 3%, or further away. The methodology we used was quite time-consuming – effectively identifying every paper with possible charges and chasing the authors to confirm – but it did work. Perhaps a better method, for larger institutions, would be sampling the departments with probable concentrations of page charges, or it might be that some institutions have robust enough finance systems that a lot of cases can be identified with a bit of research. Perhaps we can even obtain this information direct from publishers. Whatever method is used, the existing RCUK/JISC APC reporting infrastructure offers a good way to report it to a central body for aggregation, deduplication, and republication.

Secondly, we need to account for non-OA publication charges as part of the total cost of publication. They are smaller than APCs, but they are very significant for some institutions. While it may not be appropriate to use the same offsetting schemes, if they’re not brought into the equation there will be a risk that publishers are tempted to increase them dramatically – an extra revenue stream which is not capped and controlled in the way that subscriptions and APCs are. There’s no sign that anyone is doing this now – and most of the major commercial publishers no longer use page charges – but it remains a concern.

Lastly – the “more research is needed” section – there are two big questions still outstanding for the total cost of publication, even with this new element added.

  • What about the indirect costs of subscription publishing? We have a good handle on the indirect costs of running repositories and handling OA payments, but we have no idea what the infrastructure to keep a subscription system working costs us. This might include, for example, things like – the cost of staff time to manage subscriptions; the cost of staff time to run authentication and proxy servers; the cash cost of third-party authentication services like Athens; the cost to the publishers of maintaining security barriers; the cost in wasted researcher time trying to obtain material; &c.
  • If everything is expressed as a proportion of subscription spending, how much is that? My £180,000,000 figure is an inflation-adjusted estimate, based on data from SCONUL in 2010/11. There have been more recent SCONUL surveys, but their results have not been published. A firm understanding of how much we actually spend is vital to make sense of these results.

Watching the Antarctic days roll by

Sunday, November 22nd, 2015

[Note: this post embeds some very large gif files. Cancel now if on a slow connection…]

A while ago, I was playing with imagemagick (it’s an amazing tool) and trying to make animated gifs. It worked, sort of. One of the things I’d been meaning to try for a while – but never quite got around to – was animating webcam images. Last week, I finally got around to it.

At work, we have a webcam pointed at the Halley VI Antarctic station. It’s turned on year-round, sending back one picture hourly, fairly reliably. Being on a pole in the middle of Antarctica, it’s also free from the major problem that arises when trying to animate webcams – someone moving them around every now and again.

And the pictures are remarkable. Halley VI is an imposing-looking building at the best of times, but on a dark morning, looming out of a snowstorm, it’s like something from a film.

Twenty days in late November 2014 – note the sun tracking by the top of the image each day.

Ten days at the end of January 2015, with 24-hour daylight and a lot of activity around the station.

One shot each day (at 12.30pm UK time, so about 10am? local solar time), chained over 373 days – so slightly more than a full year. It opens in mid-November 2014, about the time the first aircraft arrive and the summer activities begin, passes through the (very busy) summer season, then quietens down as winter approaches. The nights appear as momentary flashes, then get longer and longer until they’re permanently dark in June/July. Then it slowly returns…

The code for this is pretty simple. Assemble all the files in a single directory – either sourced locally or downloaded with wget/curl – and ensure they’re named in a sequential way. All of these, for example, were of the form halley-2015-01-02-12-30.jpg – the 12.30 shot on January 2nd.
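The reason the naming matters is that the shell expands images/*.jpg in sorted order, so zero-padded dates and times keep the frames chronological. A minimal sketch of a helper that builds such names follows; the function name frame_name and the WEBCAM_URL placeholder are my own inventions for illustration, not part of the original setup.

```shell
#!/bin/sh
# Build a zero-padded, sortable filename for one frame.
# Arguments are plain decimal numbers without leading zeros (e.g. 1, not 01),
# because printf's %d would read a leading zero as octal.
frame_name() {
    printf 'halley-%04d-%02d-%02d-%02d-%02d.jpg\n' "$1" "$2" "$3" "$4" "$5"
}

frame_name 2015 1 2 12 30    # halley-2015-01-02-12-30.jpg

# Hypothetical download step; WEBCAM_URL is a placeholder, not a real address:
# curl -fsS -o "images/$(frame_name 2015 1 2 12 30)" "$WEBCAM_URL"
```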

Make sure to delete any that returned error messages in the download or are below a certain size. I had one or two zero-content frames that made the system hiccup a bit, and find images/*.jpg -size 0 -delete is good for handling these.
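For the "below a certain size" case, the same find approach can be given a byte threshold. The sketch below wraps it in a small function; the name prune_small_frames and the 1000-byte threshold in the example are arbitrary choices of mine, so pick a cutoff that suits your own images.

```shell
#!/bin/sh
# Delete .jpg files under DIR that are smaller than MIN_BYTES, printing each
# path as it goes. The "c" suffix makes find count in bytes rather than blocks.
prune_small_frames() {
    dir=$1
    min_bytes=$2
    find "$dir" -type f -name '*.jpg' -size "-${min_bytes}c" -print -delete
}

# Example: drop anything under 1000 bytes (an arbitrary threshold).
# prune_small_frames images 1000
```

Running the find without -delete first shows what would be removed, which is a sensible check before deleting anything.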

Then run:

convert -resize 500x500 images/*.jpg animation.gif

That’s it. The resize is to prevent it getting disgustingly large; adding -layers optimize shaves a little more off the filesize. Even so, though, you’ll find that assembling more than a few hundred frames makes your system quite unhappy (it may lock up) and the resulting gif is far too large to be useful. For the images above, some examples of filters on the merge:

convert -resize 500x500 images/halley-2015-01-2*.jpg animation.gif

convert -resize 500x500 images/*12-30.jpg animation.gif

– so it only pulled together the frames we were interested in. Of course, you could do a simpler (or more complex) merge by copying the relevant ones to a separate directory and just merging everything there.
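The copy-to-a-separate-directory approach might look something like this. It is a sketch only: collect_frames and the noon directory are hypothetical names, and the pattern is passed in quotes so that it is expanded inside the function rather than by the calling shell.

```shell
#!/bin/sh
# Copy frames whose names match a shell pattern from SRC into DEST.
# PATTERN is left unquoted in the loop on purpose so the shell expands it.
collect_frames() {
    src=$1
    pattern=$2
    dest=$3
    mkdir -p "$dest"
    for f in "$src"/$pattern; do
        # If nothing matches, the unexpanded pattern is skipped here.
        if [ -e "$f" ]; then
            cp "$f" "$dest"/
        fi
    done
    return 0
}

# Example: gather the 12.30 frames, then merge everything in that directory.
# collect_frames images '*12-30.jpg' noon
# convert -resize 500x500 noon/*.jpg animation.gif
```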

Given the size problems of gifs, making a larger one is probably best left to video. Here’s the entire year, using every frame (23 MB):

A year at Halley VI

Note how short the day/night pulses get towards the ends of the spring/autumn.

For this, you don’t have to resize, and you can produce it at the full size of the webcam images (in this case, 1920×1080):

mencoder mf://images/*.jpg -mf w=1920:h=1080:fps=25:type=jpg -ovc lavc -lavcopts vcodec=mpeg4:mbd=2:trell -oac copy -o halley.avi

The key part here is the images list (you can filter again as before) and the fps value, shown here as 25; I ran it at various speeds and found 40fps seemed to be a happy medium, while 25fps is just a little jerky. The version above is reduced to 512px wide:

mencoder mf://images/*.jpg -mf w=1920:h=1080:fps=25:type=jpg -vf scale=512:288 -ovc lavc -lavcopts vcodec=mpeg4:mbd=2:trell -oac copy -o halley.avi
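The 288 in the scale filter is just the height that preserves the 1920×1080 aspect ratio at 512px wide, and the frame rate sets how long the video runs. Both are quick to work out in shell; in this sketch the function names are mine, and the 8760-frame count assumes one frame per hour for a full 365 days with none missing, which the real set may not match.

```shell
#!/bin/sh
# Height that keeps the aspect ratio when scaling to a new width.
# Integer division, so the result is truncated if it does not divide evenly.
scaled_height() {
    src_w=$1
    src_h=$2
    new_w=$3
    echo $(( new_w * src_h / src_w ))
}

# Whole seconds of video for a given number of frames and frame rate.
playback_seconds() {
    echo $(( $1 / $2 ))
}

scaled_height 1920 1080 512     # 288, as in the scale=512:288 filter
playback_seconds 8760 25        # 350
playback_seconds 8760 40        # 219
```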