J.K. Rowling and computational linguistics

One area that interests some computational linguists is what is sometimes called stylometrics. In this field, linguistic features of texts are analysed to make attributions of authorship. The use of computers has made it possible to apply these techniques in a precise and quantitative way.

Although J.K. Rowling has outed herself as the author of the novel The Cuckoo’s Calling, published under the pseudonym Robert Galbraith, a leader in the field of stylometrics looked at the evidence while the question was still a matter for speculation. Patrick Juola has written an account of his investigation, published via the wonderful Language Log.

Juola looked at a rather small body of data (by the standards of computational linguists these days), but he did include samples from a number of other writers for comparison (known in the trade as ‘distractors’). His results, as he explains scrupulously, do not lead to a conclusive attribution of authorship, but they do point to Rowling as the most likely author, and they pretty much rule out all the other authors included.
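To give a flavour of the kind of comparison involved, here is a minimal sketch in Python. It is not Juola's method: the feature set (a short list of function words), the similarity measure (cosine similarity) and the names `FUNCTION_WORDS`, `profile` and `rank_candidates` are all assumptions chosen for illustration, and the sample texts are invented toy strings. The idea is simply to turn each text into a profile of relative word frequencies and then rank the candidate authors, distractors included, by how closely their profiles match the disputed text.

```python
import math
import re
from collections import Counter

# Illustrative feature set only: a handful of common English function words.
# Real stylometric studies use much larger and more varied feature sets.
FUNCTION_WORDS = ["the", "a", "of", "and", "to", "in",
                  "it", "was", "that", "he", "she", "but"]


def profile(text):
    """Return the relative frequency of each function word in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    total = len(tokens) or 1  # avoid division by zero for empty text
    return [counts[word] / total for word in FUNCTION_WORDS]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def rank_candidates(disputed, candidates):
    """Rank candidate authors by similarity to the disputed text.

    `candidates` maps an author label to a sample of that author's writing.
    Returns (label, similarity) pairs, most similar first.
    """
    target = profile(disputed)
    scores = [(name, cosine_similarity(target, profile(sample)))
              for name, sample in candidates.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Invented toy samples, far too short for any real conclusion.
    samples = {
        "Author A": "The detective looked at the street and the rain.",
        "Author B": "It was a matter of time, but she said that he knew.",
    }
    disputed_text = "The man walked to the door and looked at the sky."
    for name, score in rank_candidates(disputed_text, samples):
        print(name, round(score, 3))
```

A ranking like this only says which candidate is closest among those included, which is why the choice of distractors matters, and why a result on a small body of data is better read as "most likely" than as proof.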

It’s an interesting read, giving a flavour of what people do with computers and language, and also the limitations of this sort of approach. Any students (or potential students) reading this and feeling twinges of interest should note that I teach an introduction to computational linguistics every second year (offered again in 2014). Feel free to enrol!
