I bet your answer is a vehement "No" (unless you are Spiffytrix's friend). Grid fills, maybe; anagram suggestions, perhaps; but entire clues? Not possible. A cryptic clue isn't a Sudoku grid, to be churned out by software.
That's what I thought, which is why Enigma took me by surprise. Enigma, the brainchild of David Hardcastle, is a computer program that auto-generates cryptic clues for any input word. David built the program over a period of four years as part of his PhD thesis in Computer Science at Birkbeck, University of London.
In his thesis [p245], David says:
there is a widely held (and probably well-founded) belief that computers can generate English language but not “natural” English language. A key goal for ENIGMA is to challenge that belief, and for the system to generate clues with fluent surface texts.
How Enigma Works
Given a word, the first step is to figure out all the ways the word can be clued using the puzzle rubrics configured in the system. The user can then select a particular type from the list and generate clues using that type.
Let's see this work for the input word VIEWERS. The system comes up with a number of "clue plans".
"Exp" represents the number of possible combinations through which the clue could be expressed, as a power of ten: 4 means 10^4 = 10,000, and so on.
We select the fifth clue plan, anagram(WIVES) around ER, and click the Generate button. The system then generates clues, shown ordered by rank.
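A plan of the shape "anagram(WIVES) around ER" can be found by a simple search over the answer's letters. The sketch below is only an illustration, not Enigma's actual code: the function name `container_plans`, the tiny `lexicon` and the `fillers` set are all assumptions made for the example.

```python
def container_plans(answer, lexicon, fillers):
    """Find 'anagram(WORD) around FILLER' plans for an answer.

    lexicon and fillers are hypothetical stand-ins for a real system's
    word lists; they are not taken from Enigma.
    """
    answer = answer.upper()
    # Index the lexicon by sorted letters so that anagrams share a key.
    by_letters = {}
    for word in lexicon:
        key = "".join(sorted(word.upper()))
        by_letters.setdefault(key, []).append(word.upper())

    plans = []
    # The inner part must sit strictly inside the answer, so at least one
    # letter of the outer part remains on each side of it.
    for start in range(1, len(answer) - 1):
        for end in range(start + 1, len(answer)):
            inner = answer[start:end]
            if inner not in fillers:
                continue
            outer = answer[:start] + answer[end:]
            for word in by_letters.get("".join(sorted(outer)), []):
                if word != outer:  # a true anagram, not the same spelling
                    plans.append((word, inner))
    return plans


for word, inner in container_plans("VIEWERS", ["WIVES", "SEWER"], {"ER"}):
    print(f"anagram({word}) around {inner}")
```

Removing ER from VIEWERS leaves VIEWS, whose letters rearrange to WIVES, so this toy search recovers the plan selected above.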
Enigma generates the clues by treating the clue plan as a set of chunks and generating text for one chunk at a time. For example, (WIVES)* translates to chunks of text like 'strange wives', 'wives about', 'reorder wives' etc. The system discards those chunks that don't work syntactically (e.g. 'wives problem') or don't work semantically (e.g. 'jumbled wives').
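To make the generate-then-filter idea concrete, here is a toy sketch. Everything in it is assumed for illustration: the indicator table, the part-of-speech tags and the "kinds of noun" used as a crude semantic check are invented, and Enigma's real lexical resources are far richer.

```python
# Hypothetical indicator table: (word, part of speech, kinds of noun it suits).
INDICATORS = [
    ("strange", "adj", {"person", "thing"}),
    ("battered", "adj", {"person", "thing"}),
    ("jumbled", "adj", {"thing"}),   # assumed: people are not 'jumbled'
    ("reorder", "verb", {"person", "thing"}),
    ("about", "adv", {"person", "thing"}),
    ("problem", "noun", {"person", "thing"}),
]

# Hypothetical semantic class for each noun to be anagrammed.
NOUN_KINDS = {"wives": "person"}


def anagram_chunks(noun):
    """Return candidate 'anagram of noun' phrases that pass both filters."""
    kind = NOUN_KINDS[noun]
    chunks = []
    for word, pos, suits in INDICATORS:
        if pos == "noun":
            continue  # syntactic filter: rejects 'wives problem'
        if kind not in suits:
            continue  # semantic filter: rejects 'jumbled wives'
        # In this toy grammar adverbs follow the noun, everything else precedes it.
        chunks.append(f"{noun} {word}" if pos == "adv" else f"{word} {noun}")
    return chunks


print(anagram_chunks("wives"))
```

The two `continue` branches correspond to the two kinds of rejection described above: one on grammatical shape, one on whether the pairing makes sense.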
The (rather unfortunate) phrase 'battered wives' scores above other similar alternatives such as 'fancy wives' since it is matched as a "collocate" i.e. a phrase in English with these exact words. The system recognises thematic associations between words by computing word distance between pairs of words in a 100 million word corpus (the British National Corpus) and using a statistical algorithm to determine whether or not a given pair of words is unusually correlated in the text.
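The thesis has the details of the statistical measure Enigma actually uses; the sketch below swaps in a simple stand-in, a pointwise-mutual-information style score over a word window, purely to show the idea of "unusually correlated". The toy corpus, the window size and the function name `association` are assumptions for the example.

```python
import math
from collections import Counter

# A tiny made-up corpus standing in for the British National Corpus.
TOKENS = (
    "the shelter helps battered wives and battered wives find housing "
    "while a fancy hat sat on the shelf far from any talk of wives"
).split()


def association(tokens, a, b, window=3):
    """Log-ratio of observed to expected co-occurrences of b near a."""
    counts = Counter(tokens)
    observed = 0
    for i, token in enumerate(tokens):
        if token == a:
            # Words within `window` positions on either side of this `a`.
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            observed += neighbours.count(b)
    if observed == 0:
        return float("-inf")  # never seen together
    # Expected count if b were scattered independently of a.
    expected = counts[a] * 2 * window * counts[b] / len(tokens)
    return math.log2(observed / expected)


print(association(TOKENS, "battered", "wives"))
print(association(TOKENS, "fancy", "wives"))
```

In this toy corpus "battered" scores higher against "wives" than "fancy" does, which is the kind of signal that would push 'battered wives' up the ranking.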
Next the system finds ways of representing ER, such as hospital department, pause, etc.
Then the system explores all frames representing 'A in B' with all the combinations of the 'anagram of WIVES' chunk and the 'ER' chunk to try to build a new, meaningful chunk for that whole piece of the clue.
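The combination step can be pictured as filling templates with every pairing of chunks. This is a minimal sketch under stated assumptions: the three frame strings and the name `combine` are hypothetical, and the real system also checks each result for syntactic and semantic fit rather than keeping them all.

```python
from itertools import product

# Hypothetical surface frames for the container relation 'A in B'.
FRAMES = [
    "{inner} in {outer}",
    "{outer} around {inner}",
    "{outer} holding {inner}",
]


def combine(outer_chunks, inner_chunks, frames=FRAMES):
    """Fill every frame with every pairing of outer and inner chunks."""
    return [
        frame.format(outer=outer, inner=inner)
        for frame, outer, inner in product(frames, outer_chunks, inner_chunks)
    ]


candidates = combine(
    ["battered wives", "strange wives"],   # chunks for anagram(WIVES)
    ["pause", "hospital department"],      # chunks for ER
)
for text in candidates:
    print(text)
```

Even with three frames and two chunks on each side there are already twelve candidates, which gives a feel for why the number of possible expressions of a clue plan grows so quickly.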
The auto-generated explanation for this clue is:
How Good Are Enigma-Generated Clues?
When Enigma was built, David conducted an evaluation in two ways –
1. Turing-style test – For the same light, two clues were provided – an Enigma-generated clue and a Sun newspaper clue. 30 pairs of such clues were presented and solvers were asked to pick the Enigma-generated clue from each pair.
2. Domain expert assessment - Crossword compilers Jonathan Crowther and Don Manley, editors Kate Fassett and Mike Hutchinson, and expert solvers provided their feedback on clue quality.
60 people participated in the Turing-style test and on average they correctly guessed the clue from the Sun newspaper 70% of the time. The best score (parity with the newspaper) was 50% and the worst (obvious to tell apart) 100%. The pairs (now marked with which is Enigma-generated) are here.
The domain experts were harsher critics of the system and found most clues lacking in human wit. The surface reading was right some of the time but not all the time. Another criticism concerned originality. For example, ENIGMA gave a high score to the clue "Drain fresh ewers (5)" for SEWER; however, this has been used often in cryptic clues, and a human setter familiar with crosswords would avoid it. While this similarity with human output is something of a success, ENIGMA's inability to recognise originality, which comes naturally to human setters, is a failing.
The major drawback of Enigma is its dependency on the encoded semantics, which in its current state is rather shallow. For example, in the VIEWERS clue, a human compiler would have spotted 'without hesitation' as a construct for '… around ER' and, using the same elements, would have written "Witnesses battered wives without hesitation (7)", a much more fluent clue. This Enigma has not managed, because 'B without A' has not been encoded.
The problem gets compounded when the clue length increases, with the surface becoming nonsensical. The system has no strategic planning component to organise the surface beyond clause and sub-clause level.
Read the detailed evaluation here [p228-262].
The domain experts were asked if any of the clues of a set of 42 were of publishable quality. Mike Hutchinson highlighted 10 clues, Jonathan Crowther highlighted 8 and Sandy Balfour 9. One can conclude that, while Enigma is no threat to human crossword setters, it does get it right at least some of the time.
A Puzzle to Try
Have a go at this crossword with clues generated entirely by Enigma: