So there was a discussion amongst the members of the Fantasy Inn earlier this week, stemming from a furious argument on Twitter over the use of the Oxford comma. FYI, the Oxford comma is absolutely a vital piece of punctuation.
Anyway, upon researching which authors did and didn’t utilize the best punctuation mark of all time, we discovered that a certain widely read author actually preferred not to use ANY commas at all in their lists. This author would simply break up every item with an “and”, which seemed to me an interesting stylistic choice, and one that I hadn’t actually picked up on when initially reading the book.
Is using so many “and”s overkill? How many can you use in one novel anyway? What is the average number? Does a higher number correlate with denser prose?
These were the sorts of questions that went through my head as I started this investigation. I decided to take note of how many “and”s were in some of the most popular fantasy novels, and compare that number to the overall word count of the book. This allowed me to calculate what percentage of the total word count was made up of “and”s, and how many Words Per “And” (WPA) each novel had.
For clarity’s sake, a book with a WPA of 29 would mean that every 30th word in the book was an “and”, on average.
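For anyone who wants to try this on their own manuscript, here is a minimal sketch of the calculation. It is not the spreadsheet used for this post: the function name word_stats, the tokenizing regex, and the exact formula are my assumptions. To match the "WPA of 29 means every 30th word" reading above, it counts the other words per "and", so the figures may differ slightly from a plain total-divided-by-count.

```python
import re

# Assumption: a "word" is a run of letters, optionally with an inner apostrophe
# (so "didn't" counts as one word). Other tokenizers will give other totals.
WORD_RE = re.compile(r"[A-Za-z]+(?:['’][A-Za-z]+)*")


def word_stats(text, target="and"):
    """Count how often `target` appears and relate it to the total word count."""
    words = [w.lower() for w in WORD_RE.findall(text)]
    total = len(words)
    hits = sum(1 for w in words if w == target)
    if hits == 0:
        # No occurrences: the ratio is undefined, so report None.
        return {"total": total, "hits": 0, "percent": 0.0, "words_per_hit": None}
    return {
        "total": total,
        "hits": hits,
        # Share of the whole text taken up by the target word.
        "percent": 100.0 * hits / total,
        # Other words per occurrence: 29.0 means every 30th word is the target.
        "words_per_hit": (total - hits) / hits,
    }


if __name__ == "__main__":
    print(word_stats("bread and butter and jam"))
```

Passing target="the" gives the equivalent figures for any other word.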
The results can be seen in the table and the graph below:
So the results here are fairly interesting. I had assumed that a higher WPA (meaning fewer “and”s as a total percentage) would equate to simpler, easier-to-digest prose. Brandon Sanderson’s WPA of 59.11 would seem to back that up, along with Sir Terry Pratchett’s WPA of 51.14.
However, Steven Erikson’s Gardens of the Moon—the first book in the Malazan Book of the Fallen—has a WPA of 50.01. Malazan is commonly considered to be a series with notably dense prose, with Gardens of the Moon thought to be a particularly difficult read. Of course, a lot of the difficulty in Malazan is a result of the lack of information given to the reader, so the dense prose is not entirely to blame.
There can be quite a difference between books by the same author, too. The WPA scores for Robin Hobb’s Assassin’s Apprentice (28.94) and Ship of Magic (39.14) are significantly different. Is this down to the contrast between First Person and Third Person? Or is Fitz just a very observant guy?
Something else that I thought interesting was the difference between J.R.R. Tolkien and Mark Lawrence. Tolkien is well known for his (perhaps overly) descriptive language, whereas Lawrence is commonly praised for his concise and evocative descriptions, built upon strong word choice. Here, the difference between the two authors is readily apparent. Tolkien has the lowest WPA of the bunch at 25.19 (with nearly 20,000 “and”s in The Lord of the Rings!), whereas Lawrence has a relatively high WPA for Prince of Thorns at 48.34. Some of this could be down to Tolkien’s unique style: approximately 10% of his “and”s directly follow a period or semicolon. Still, it is interesting to highlight the difference in WPA between two well-known “descriptive” writers.
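The "follows a period or semicolon" figure can be approximated with a regular expression. This is only a rough sketch under my own assumptions (the function name and the handling of quotation marks are made up for illustration), not the method used to get the 10% above.

```python
import re

# Assumption: quotation marks or brackets may sit between the stop and the "and".
CLOSERS = "\"'”’)"
OPENERS = "\"'“‘("

# A period or semicolon, optional closing marks, whitespace, optional opening marks, then "and".
AFTER_STOP_RE = re.compile(
    r"[.;][" + re.escape(CLOSERS) + r"]*\s+[" + re.escape(OPENERS) + r"]*and\b",
    re.IGNORECASE,
)
ANY_AND_RE = re.compile(r"\band\b", re.IGNORECASE)


def share_of_ands_after_stop(text):
    """Return the fraction of "and"s that directly follow a period or semicolon."""
    total = len(ANY_AND_RE.findall(text))
    if total == 0:
        return 0.0
    after_stop = len(AFTER_STOP_RE.findall(text))
    return after_stop / total


if __name__ == "__main__":
    sample = "He ran. And she ran; and they ran and ran."
    print(share_of_ands_after_stop(sample))
```

Abbreviations such as "Mr." or an ellipsis would also count as a period here, so treat the result as an estimate.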
What does all this mean? Well… I can’t really draw any conclusions from the data, and I’m not too sure that there are many to be drawn. This data is interesting, but perhaps largely useless.
Brandon Sanderson and J.R.R. Tolkien are at opposite ends of the scale here (a scale which has a fairly massive spread of nearly 34), and those two guys sure seem to have done alright sales-wise. So either approach can work for you, depending on your writing style.
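As a quick check on that spread, here is the arithmetic using only the WPA values quoted in this post. The full table has more books, but since Sanderson and Tolkien sit at the two ends, the result should be the same. The dictionary and function names are just for illustration.

```python
# WPA values quoted in the text of this post (not the full table).
quoted_wpa = {
    "Brandon Sanderson": 59.11,
    "Terry Pratchett": 51.14,
    "Gardens of the Moon": 50.01,
    "Prince of Thorns": 48.34,
    "Ship of Magic": 39.14,
    "Assassin's Apprentice": 28.94,
    "The Lord of the Rings": 25.19,
}


def spread(scores):
    """Difference between the highest and lowest score."""
    values = list(scores.values())
    return max(values) - min(values)


if __name__ == "__main__":
    print(round(spread(quoted_wpa), 2))
```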
What About “The”?
I was talking to a friend—Daniel E. Olesen of The Eagle’s Flight fame—about this blog post shortly before it was going to be published, and he made the point that it would be interesting to compare the spread for “and” and “the”. His hypothesis was that while the usage of the former would vary wildly depending on the authors’ level of description, the usage of the latter would be much more “locked”, as “the” is such a necessary word in the English language.
So, back to the spreadsheet I went…
As you can see, the spread for this count is much narrower than the previous one, which supports Daniel’s hypothesis. I suspect that the books with a lower Words Per “The” (WPT) score are those which commonly have to refer to a significant in-world proper noun.
This would make sense with The Tower of Babel in Senlin Ascends, and also with The Grey King in The Lies of Locke Lamora. I imagine that the Ring has a significant effect on the WPT score of The Lord of the Rings, too.
Gardens of the Moon and A Darker Shade of Magic are outliers here. Off the top of my head, I can’t imagine anything from these books that would lead to such low scores. Perhaps just a quirk of the authors’ writing styles?
Finally, to round off this fairly large and fairly pointless blog post, we can consider the combined count of “and”s and “the”s as a percentage of the total word count. It’s a bit bizarre to consider that these two words make up almost 10% of The Lord of the Rings. That’s over 45,000 words in total, which is about half the length of the average adult novel!
I guess if there is a point to this post… it would be to make you less of a self-conscious writer, as counter-intuitive as that sounds. There’s a list of some great, successful authors up there, and the data shows just how diverse they are. Next time you think you’re over-using a word, remember that Tolkien wrote half a novel with just two words.