Once upon a time in a faraway land – or, to be exact, some 550 years ago in England – a non-native English speaker from Flanders, who was a compositor (typesetter) for the printer William Caxton, decided to add the letter <h> to the word ghost. Or so the story goes, according to, among others, the OED1:
In English, the usual spelling in the late 15th century was gost; the compositor was probably following the spelling of Flemish gheest.
You still see this spelling convention in various European languages (other than English), in which a consonant letter can be read as either ‘hard’ or ‘soft’ depending on the following vowel. Usually, in these cases, the consonant becomes ‘soft’ or palatalized before front vowels (i, y, e, æ), and remains ‘hard’ before back vowels (u, o, a). For instance, Italian caffè /kaffe/ ‘coffee’, but cinque /tʃiŋkwe/ ‘five’. When the hard sound is wanted before a front vowel – /ki/ instead of /tʃi/ – an <h> is added: chiave /kjave/ ‘key’.
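For readers who like their spelling rules executable, here is a minimal Python sketch of the Italian hard/soft convention just described. The function name `hard_or_soft` and the set `FRONT_VOWELS` are my own inventions for illustration. The rule is deliberately simplified: it assumes only the letter immediately after <c> or <g> matters, and it ignores digraphs such as <gn>, <gli> and <sc> as well as doubled consonants.

```python
# Simplified sketch of the Italian hard/soft rule for <c> and <g>.
# Assumption: only the next letter is inspected; <gn>, <gli>, <sc>
# and doubled consonants are NOT handled.

FRONT_VOWELS = set("eièéì")  # letters that trigger the 'soft' reading


def hard_or_soft(word):
    """Return a list of (letter, 'hard' | 'soft') for each <c>/<g> in word."""
    word = word.lower()
    result = []
    for i, letter in enumerate(word):
        if letter not in ("c", "g"):
            continue
        following = word[i + 1] if i + 1 < len(word) else ""
        # A following <h>, a back vowel, or anything else keeps the consonant hard.
        quality = "soft" if following in FRONT_VOWELS else "hard"
        result.append((letter, quality))
    return result


print(hard_or_soft("caffè"))   # <c> before <a>
print(hard_or_soft("cinque"))  # <c> before <i>
print(hard_or_soft("chiave"))  # <c> protected by <h>
```

The point to notice is the third case: the <h> does nothing except sit between the consonant and the front vowel, which is exactly the job it is supposed to be doing in Flemish gheest.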
So it makes sense for Flemish to have inserted an <h> after <g> for a hard /g/ sound in gheest, preventing it from being read as something like /jiːst/ (forgive my lack of Flemish). …but it makes No Sense for that <h> to be inserted before the back vowel <o> in English gost!
Well, finding the source of that <h> is beyond the scope of this short blog post, but I wanted to have a look at what was going on in the spellings of GHOST in 15th- and 16th-century English.
According to the OED (NB this is the first edition of 1899; I’m sure they have much more up-to-date information by now, although the entry itself has yet to be revised), the <gh>-spelling “remained rare until the middle of the 16th cent., and was not completely established before about 1590”. If we have a look in EEBO, using the ever-wonderful EEBO-TCP N-gram Browser, we see that the printed record agrees with the second part of this account, but not with the first: rather than being rare, the <gh>-spelling dominates over the simple <g>.
This is, of course, just the printed record; previous readers of this blog will know that my next question is: what about in manuscript texts? So I looked in the CEEC (Corpus of Early English Correspondence). Unfortunately, GHOST is a fairly rare word, and I found only 68 instances in the entire corpus of 5.2m words. But although the numbers are small, they do appear to tell the received story: gost hangs in there during the 16th century, but by the end of the century, ghost is the winner.
But I also found a few instances of goost – so without an <h>, but with a doubled <o>. In the 16th-century CEEC data, it isn’t a contest between <g> and <gh>, but a three-way battle.
This prompted me to go back to EEBO to have another look. And whadda you know:
If we add the <go> numbers to those of <g>, <gh> doesn’t take the lead until the 1550s.
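The comparison above boils down to sorting each token into one of three spelling classes and counting per decade. Here is a rough Python sketch of that bookkeeping. The function names and the regular expressions are my own assumptions about what counts as a variant (for instance, a hypothetical ghoost would be filed under <gh>), and the sample records are made-up tokens purely for illustration, not figures from EEBO or CEEC.

```python
import re
from collections import Counter, defaultdict


def spelling_class(token):
    """Classify a GHOST-type token as 'gh', 'go' (doubled <o>) or 'g'.

    Returns None for anything that is not a GHOST spelling.
    """
    t = token.lower()
    if re.match(r"gho+st", t):
        return "gh"
    if re.match(r"goo+st", t):
        return "go"
    if re.match(r"gost", t):
        return "g"
    return None


def tally_by_decade(records):
    """Count spelling classes per decade from (year, token) pairs."""
    counts = defaultdict(Counter)
    for year, token in records:
        cls = spelling_class(token)
        if cls is not None:
            counts[year // 10 * 10][cls] += 1
    return counts


def gh_leads(counter):
    """True if <gh> outnumbers <g> and <go> combined."""
    return counter["gh"] > counter["g"] + counter["go"]


# Made-up tokens for illustration only (NOT real corpus data).
toy_records = [
    (1495, "gost"), (1501, "goost"), (1503, "ghost"), (1508, "gostly"),
    (1552, "ghost"), (1555, "ghoste"), (1557, "goost"), (1558, "gospel"),
]

for decade, counter in sorted(tally_by_decade(toy_records).items()):
    print(decade, dict(counter), "gh leads:", gh_leads(counter))
```

Matching from the start of the token means derived forms such as gostly are counted along with the noun, while look-alikes such as gospel fall through to None.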
I wanted to have a better look at what happens in the manuscript record, particularly in the 1500s, but as far as I can tell, there isn’t really any suitable corpus out there. After dipping into various sources I’ve used before (like ETED: nil hits), I found two (semi-)diplomatic editions of wills on the ever-wonderful British History Online: Lincoln Wills (Lincoln Record Society vols 5, 10 & 24), and London Consistory Court Wills, 1492-1547 (London Record Society vol 2). The Lincoln wills end in 1532, and the London wills in 1547, but together they cover nearly half the century where the main action seems to be. The total number of hits is 64, so proportionally far more than in CEEC – undoubtedly due to the text type (more on this below).
This looks remarkably like the results for CEEC, except <gh> appears but once.
Obviously, a word like GHOST strongly correlates with text type. Sadly, in Early Modern English this doesn’t mean ghost stories! 37 of the 68 hits in CEEC are for the Holy Ghost, and many if not most of the adjectival uses are in ‘my ghostly father’, meaning spiritual father, and thus chaplain or confessor. In wills, these two concepts of course occur quite frequently, explaining the presence of GHOST.
How does this g[h]ost story end? With the revelation that the killer is known to be <gh>, but there are two bodies instead of just one, as <g> was joined in its long struggle for life by <go>.
…but is there a ghost in this story? I’d say it’s the <h> in ghost – doomed to curse learners of English spelling to remember its presence without a sensible rule allowing them to anticipate it. And it’s far from the only ghost in English orthography…