Today I came across a lovely project from JSTOR & the Folger Library – a set of Shakespeare’s plays, each line annotated by the number of times it is cited/discussed by articles within JSTOR.
“This is awesome”, I thought, “I wonder what happens if you graph it?”
So, without further ado, here’s the “JSTOR citation intensity” for three arbitrarily selected plays:
Blue is the number of citations per line; red marks lines with no citations. In no particular order, a few things that immediately jumped out at me –
- basically no-one seems to care about the late middle – the end of Act 2 and the start of Act 3 – of A Midsummer Night’s Dream;
- “… a tale / told by an idiot, full of sound and fury, / signifying nothing” (Macbeth, 5.5) is apparently more popular than anything else in these three plays;
- Othello has far fewer “very popular” lines than the other two.
Macbeth has the most popular bits, and is also the most densely cited – only 25.1% of its lines were never cited, against 30.3% in Othello and 36.9% in A Midsummer Night’s Dream.
I have no idea if these are actually interesting thoughts – my academic engagement with Shakespeare more or less reached its high-water mark sixteen years ago! – but I liked them…
How to generate these numbers? Copy-paste the page into a blank text file (text), then use the following bash command to clean it all up –
grep "FTLN " text | sed 's/^.*FTLN/FTLN/g' | cut -b 10- | sed 's/[A-Z]/ /g' | cut -f 1 -d " " | sed 's/text//g' > numberedextracts
Paste the contents of numberedextracts into a spreadsheet against a column numbered 1-4000 or so, and graph away…
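If you would rather skip the spreadsheet, the numbering and the "never cited" percentages can probably be done in the shell as well. This is only a rough sketch: the function names number_lines and uncited_percent are made up for illustration, and it assumes numberedextracts holds one citation count per line of the play, with a blank line or a 0 meaning the line was never cited. If the cleaned-up file looks different, the awk conditions will need adjusting.

```shell
# Assumption: the input file has one citation count per line of the play,
# and a blank line or 0 means "never cited".

# number_lines FILE: print "line-number<TAB>count", treating blanks as 0,
# so the result can be plotted or imported as two columns.
number_lines() {
  awk '{ print NR "\t" ($1 == "" ? 0 : $1) }' "$1"
}

# uncited_percent FILE: percentage of lines whose count is 0 or blank,
# printed to one decimal place.
uncited_percent() {
  awk '($1 + 0) == 0 { zero++ }
       END { if (NR > 0) printf "%.1f\n", 100 * zero / NR }' "$1"
}
```

Running number_lines numberedextracts > numbered.tsv gives a two-column file that most plotting tools will accept, and uncited_percent numberedextracts prints a single figure per play, comparable to the percentages quoted above.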