Tuesday 30 June 2015

Parallels Between Cryptanalysis and Crossword Solving

Fialka Cipher Machine

During World War II, recruiters for Bletchley Park looked for crossword solvers to join them, as it was believed that crossword experts would make good codebreakers. [The story is that in 1942, a challenge was issued to solve a Daily Telegraph crossword in under 12 minutes. Shortlisted candidates were invited for a live crossword test, after which six solvers were recruited.]

Bletchley Park's successes during the war suggest that their crossword-inclined hiring policy worked. But have you wondered if this correlation was incidental? Interestingly, NSA's chief cryptologist William Friedman seems to have thought so – in the treatise Military Cryptanalysis (originally dated 1938, 4th Edition dated 1952), under the heading "Mental equipment necessary for cryptanalytic work", he made a strong case against correlating cryptanalysis with crossword solving.

The present author deems it advisable to add that the kind of work involved in solving cryptograms is not at all similar to that involved in solving crossword puzzles, for example. The wide vogue the latter have had and continue to have is due to the appeal they make to the quite common interest in mysteries of one sort or another; but in solving a crossword puzzle there is usually no necessity for performing any preliminary labor, and palpable results become evident after the first minute or two of attention.

He went on to elaborate on the points of difference between the two skills (section I, p2).

Now try this: exclude that specific warning and replace 'cryptanalytic' with 'crossword solving', 'cryptogram' with 'puzzle', etc. in the manual's description of the "mental equipment". Do you also see that large excerpts of the text could pass for a guide for crossword solvers?

I was especially struck by these parallels while reading Simon Singh's The Code Book, a narrative about the evolution of cryptography. In the hunt for structure and hidden meanings in text, in the battle of wits between codemakers and codebreakers, the similarities between cryptanalysis and crossword solving are hard to miss. A few examples of cryptanalytic techniques and their crossword parallels:

Frequency analysis

Every letter of a language has a distinct character. The frequency with which the letter appears in text, the way it positions itself with respect to other letters in the alphabet, the types of letter clusters it appears in – all make the letter uniquely identifiable.

The frequency distribution of letters (singly or in sequences), a powerful tool for cracking substitution ciphers, has similar applicability in crossword solving – though we usually process this information instinctively rather than formally.
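To make the idea concrete, here is a minimal Python sketch of the counting step behind frequency analysis. The function name ngram_frequencies and the sample sentence are my own illustrations, not anything from The Code Book; it simply tallies single letters or short letter sequences in a piece of text.

```python
from collections import Counter


def ngram_frequencies(text, n=1):
    """Count letter sequences of length n, most common first.

    Only the letters A-Z are kept; spaces and punctuation are dropped,
    so sequences may run across word boundaries (as they would in a
    ciphertext written without spaces).
    """
    letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
    grams = [letters[i:i + n] for i in range(len(letters) - n + 1)]
    return Counter(grams).most_common()


if __name__ == "__main__":
    sample = "Every letter of a language has a distinct character"
    print(ngram_frequencies(sample)[:5])       # commonest single letters
    print(ngram_frequencies(sample, 2)[:5])    # commonest letter pairs
```

On a substitution cipher, the commonest ciphertext symbols would be the first candidates for E, T, A and so on; a solver does much the same thing informally when guessing at a half-filled slot.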

The appearance of more than one low-frequency letter (J K Q X Z) alerts the solver that the grid could be a pangram.

Longer letter sequence frequencies provide strong hints about possible answers in the crossword grid. Solvers watch for clusters like IN-, RE-, TH-, -ING, -TION in answers since they occur commonly in English. When tackling a checked answer slot, say one that starts with ?H- or ends with -T?C, one can eliminate many letters from matching the ? because the resulting clusters do not appear in natural English words.

Wartime cryptanalysts exploited the knowledge that certain words occur more frequently in military communication than in normal conversation. Likewise, experienced crossword solvers are well up on crosswordese and expressions typical of crosswords: for example, phrases normally used with 'your' in the real world tend to appear with ONE'S in crosswords (e.g. PUT ONE'S FOOT IN ONE'S MOUTH).
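The elimination described above for slots like ?H- and -T?C can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function letters_for_unknown and the tiny word list are hypothetical, and a real solver would check against a full dictionary.

```python
import re


def letters_for_unknown(fragment, words, at_start=True):
    """Return the letters that can fill the single '?' in fragment.

    fragment contains letters and exactly one '?', e.g. "?H" or "T?C".
    It is matched at the start of each word (at_start=True) or at the
    end (at_start=False). The word list is whatever the caller supplies.
    """
    body = fragment.upper().replace("?", "([A-Z])")
    regex = re.compile("^" + body if at_start else body + "$")
    found = set()
    for word in words:
        match = regex.search(word.upper())
        if match:
            found.add(match.group(1))
    return sorted(found)


if __name__ == "__main__":
    # Hypothetical mini word list, for illustration only.
    words = ["THEME", "SHORT", "WHALE", "CHESS", "PHONE",
             "RHYME", "GHOST", "ARCTIC", "MYSTIC", "BALTIC"]
    print(letters_for_unknown("?H", words))                   # openers before H
    print(letters_for_unknown("T?C", words, at_start=False))  # fillers in -T?C
```

Any letter that never shows up in the result can be struck off as a candidate for the unknown square, which is the formal version of what we do by instinct.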

Cillies

At Bletchley Park, the term cilly was used for predictable message keys that the cryptanalyst could guess based on the radio operator's habits: for example, an operator might have the habit of using consecutive letters such as QWE on the keyboard. Another type of cilly was the repetition of a message key by an operator, presumably the initials of a loved one.

Deciphering messages based on hunches about cillies is similar to cracking crossword clues based on the setter's signature style. An Anax puzzle is more penetrable if you are aware that he often disguises verbal indicators as nouns; an AfterDark puzzle might unravel faster if you look out for something special in his grid.

Cribs

Cryptanalysts used the term crib to refer to any known or suspected plaintext in an enciphered message. They would foretell cribs from the time and source of a message (e.g. a 6am message from X station would be a weather report), and this knowledge would give them a way to break the rest of the code.

Guessing a crossword's theme and possible answers based on the date and setter/publication is similar to using a suspected crib in cryptanalysis. We apply such information all the time to figure out a puzzle even before we have read any clues. Factors like an out-of-turn appearance by a setter, a special date of publication or a crossword numbering milestone give us ideas about what to expect in the grid fills.

Vigenère Cipher with Running Key

The Vigenère Cipher with running key (i.e. key as long as the ciphertext) was considered unbreakable when it was first designed. Codebreakers soon found that this cipher too, like its predecessors, does not guarantee security if the key has structure and consists of recognizable words.

The process of breaking the Vigenère Cipher with running key goes like this: start with a crib (e.g. the), refer to the Vigenère Square to derive the key and check if the derived key looks meaningful. If it does, fill in other gaps and proceed to derive the rest of the message by alternating between the plaintext and the key. If you reach an impasse (i.e. meaningless text in either the plaintext or key), backtrack and resume on another track till you end with meaningful key as well as plaintext.
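The first step of that process, trying a crib at each position and reading off the key it implies, can be sketched in Python as follows. This is a rough illustration, not the method as printed in The Code Book: the function names are mine, the plaintext and key in the demo are made up, and the code assumes uppercase A-Z text with no spaces, using the usual Vigenère Square arithmetic (cipher letter = plain letter + key letter, mod 26).

```python
A = ord("A")


def encrypt(plaintext, key):
    """Vigenère encipherment: add each key letter to its plaintext letter, mod 26."""
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plaintext, key)
    )


def key_fragment(ciphertext, crib, position):
    """Key letters implied if crib is the plaintext starting at position."""
    segment = ciphertext[position:position + len(crib)]
    return "".join(
        chr((ord(c) - ord(p)) % 26 + A) for c, p in zip(segment, crib)
    )


def drag_crib(ciphertext, crib):
    """Try the crib at every position; return (position, implied key fragment) pairs."""
    last_start = len(ciphertext) - len(crib)
    return [(i, key_fragment(ciphertext, crib, i)) for i in range(last_start + 1)]


if __name__ == "__main__":
    # Made-up message and running key, for illustration only.
    ciphertext = encrypt("MEETATTHEGATE", "CANADABRAZILS")
    for position, fragment in drag_crib(ciphertext, "THE"):
        print(position, fragment)
```

The judgment call that follows is left to the human: scan the printed fragments for ones that look like parts of real words, extend the guess on the key side, decipher a little more plaintext, and backtrack when either side turns to gibberish.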

The snapshot below from The Code Book shows three routes a cryptanalyst might take to decipher a message with this approach. In the first, the plaintext is gibberish and cannot be right. The second looks possible but if we proceed on that route, it soon ends in deadlock. The last, with the plaintext string 'at the', seems the most promising – and it does turn out to be the right one.

Deciphering the Vigenère Cipher 

The cipher's plaintext is analogous to crossword answers in one direction (Across or Down), the key to its intersecting answers in the perpendicular direction. When we use a grid-centric approach to solving, we essentially do the same as cracking the Vigenère Cipher: derive answers based on factors like enumeration (e.g. 4,2,7 might be NEXT TO NOTHING), and validate that they fit into the grid generating workable crossings.

In the grid below, without glancing at the clues, one can tell that something is not right with the filled-in Down answers as this grid will end in deadlock at 14 Across. At this point the solver would revise the Down answers, just as a cryptanalyst would discard the key trial if it led to such a pattern in the plaintext.

Partly Solved Grid

While cryptanalysis shares its approach with crossword solving, some forms of encipherment have shades of cryptic clueing. Take the Navajo code, for example, which was based on a Native American language and formally developed for secret communication by US Marines. This language had a complex structure which made it unintelligible to anyone who did not know the rules for parsing it. The code used creative substitutions for military terms (e.g. platoons = mud clans, mortars = guns that squat), similar to lateral clue definitions. Some words that could not be translated directly to Navajo were split and encoded using a phonetic alphabet or through homophones. Nicknames were added to the Navajo lexicon to refer to nations: Australia became 'Rolled Hat', Spain was 'Sheep Pain' – somewhat like rhyming slang.

5 comments

Kishore said...

Heil Hitler!


No, I am not adulating Adolf. I just remembered that this was one of the cillies that helped break the Enigma, since a large number of messages had this either at the beginning or end or both ends of the messages. Once this realisation occurred, cracking the code was a wee bit easier.

Shuchi said...

@Kishore: You had me shocked for a moment there!

Kishore said...

Yeah, Shuchi, I can see it took you over two and a half hours to recover from it ...

Deepak Gopinath said...

Kishore, it takes time to decipher your messages

Kishore said...

And that, Deepak, is the incontrovertible truth which makes me, perforce, to concur with your observation, which is in line with similar results noted by other interlocutors in the past and will continue to be confirmed again and again by future correspondents.