Science News on Stylometry and Oz

Ran across this 2003 article from Science News about stylometry — using mathematical models driven by computers to determine authorship of disputed works.

For example, two years after L. Frank Baum died, The Royal Book of Oz was published and billed as Baum’s final Oz novel. But stylometric analysis of the book suggests that it was probably written by Ruth Plumly Thompson, as Oz fanatics had long suspected.

According to Science News,

Binongo’s work on The Royal Book of Oz is a good example. He started by collecting other samples of Baum’s and Thompson’s writings and breaking the samples into 5,000-word chunks. He then found the 50 most frequently used words in the body of texts and counted how often each word appeared in each chunk. This process distilled each chunk to 50 numbers.

Just as two numbers specify a point in two-dimensional space, and three numbers a point in three-dimensional space, the 50 numbers associated with each chunk of text specify a point in 50-dimensional space. Any differences in the scatter of Baum’s and Thompson’s points could be potential clues to the writers’ different styles.

. . .

There’s no guarantee that a pattern will show up in this plane. In the case of the Oz books, however, a pattern leaps out. The Baum texts cluster in one half of the plane, while the Thompson texts sit in the other half, showing what Binongo calls a clear “stylistic gulf.”

When chunks of The Royal Book of Oz are plotted in the same plane, they all land squarely in Thompson’s half.

“With this unerring consistency, we have confidence in our identification of Thompson as the author of the 15th book,” Binongo said in the spring issue of Chance.
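The counting step described in the excerpt is simple enough to sketch. What follows is only a rough illustration in Python, not Binongo's actual code: the function names, the tokenizer, and the tiny sample sentence are all my own assumptions, and the chunk size and word count are shrunk from 5,000 and 50 so the output fits on a screen. It also stops at the frequency vectors and does not attempt the projection onto a plane.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase runs of letters and apostrophes count as words.
    return re.findall(r"[a-z']+", text.lower())


def split_into_chunks(words, size):
    # Fixed-size chunks; a trailing partial chunk is dropped (an assumption).
    return [words[i:i + size] for i in range(0, len(words) - size + 1, size)]


def most_frequent_words(chunks, n):
    # The n most frequent words across all chunks, ties broken alphabetically.
    totals = Counter()
    for chunk in chunks:
        totals.update(chunk)
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:n]]


def chunk_vectors(chunks, vocab):
    # Each chunk becomes one list of counts, one number per vocabulary word.
    vectors = []
    for chunk in chunks:
        counts = Counter(chunk)
        vectors.append([counts[word] for word in vocab])
    return vectors


# Toy input: 5-word chunks and a 3-word vocabulary instead of 5,000 and 50.
sample = "the cat and the dog and the bird saw the cat"
chunks = split_into_chunks(tokenize(sample), 5)
vocab = most_frequent_words(chunks, 3)
vectors = chunk_vectors(chunks, vocab)
```

With real texts you would call split_into_chunks with 5000 and most_frequent_words with 50, which gives one 50-number point per chunk, as in the article.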

The article details other such finds, along with a good outline of both the strengths and weaknesses of using this sort of statistical model to establish authorship (essentially, the more undisputed material there is by an author and the better preserved the text in question, the more reliable the technique is).


Statistical tests are unraveling knotty literary mysteries. Erica Klarreich, Science News, December 20, 2003.