Tuesday, February 3, 2015

Buzzer Trivia

What quirks and patterns do Buzzer's crosswords for The Hindu reveal? We've examined puzzles by Gridman, Neyartha, Sankalak, and Arden earlier – now Buzzer takes center stage.

The first thing that leaps out is the personal touch in Buzzer's work, starting from his first-ever THC clue:
Excited like a new setter shedding hesitation (5) ABUZZ; A BUZZ[er]

Buzzer's themed puzzles too reflect a certain signature style – the THC setters theme and the B-clues theme are strong examples. Running a few data analysis trials gives us riveting new points for rumination.

Clue Volume & Length

Buzzer-CluePerPuzzle

Buzzer is extremely economical with word usage in clues, measuring an average of only 6.40 words per clue. And his crosswords, on average, accommodate fewer clues than usual in a 15x15 blocked grid: 27.39 clues per puzzle.
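For readers who'd like to try this on their own clue lists, here is a minimal sketch of how the two averages could be computed. The sample puzzles, the `clue_word_count` helper and the rule of counting whitespace-separated words are assumptions for illustration, not necessarily how the figures above were produced.

```python
import re

# Hypothetical sample: each puzzle is a list of clue strings with enumerations.
puzzles = [
    ["K-kid? (10)", "Señor's accent (5)", "Excellent landlord I say (7,6)"],
    ["A number reportedly eat a painkiller (7)"],
]

def clue_word_count(clue: str) -> int:
    """Count whitespace-separated words, ignoring the trailing enumeration."""
    text = re.sub(r"\s*\([\d,\-\s]+\)\s*$", "", clue)
    return len(text.split())

# Flatten all clues across puzzles and average their word counts.
all_counts = [clue_word_count(c) for p in puzzles for c in p]
avg_words_per_clue = sum(all_counts) / len(all_counts)

# Average number of clues in a puzzle.
avg_clues_per_puzzle = sum(len(p) for p in puzzles) / len(puzzles)

print(avg_words_per_clue, avg_clues_per_puzzle)
```

Note that a hyphenated token such as "K-kid?" counts as one word under this rule; a different tokenising choice would shift the averages slightly.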

Buzzer-THC-Setters-Comparison

Buzzer's shortest clue is a compact CD:
K-kid? (10) GRANDCHILD

His longest is this whopper with 17 words / 98 characters:
Mary Kom's own story is hard to digest, right with couple of setbacks but showing pluck ultimately (11) UNBREAKABLE
UNBEARABLE (hard to digest), with R moved back two places and K (pluck, ultimately) inserted

Something interesting can be seen when Buzzer's clue length distribution is placed next to that of other setters. In a batch of 1000 clues by Arden, Sankalak, Gridman and Buzzer, this is the kind of clue length distribution we'd find, with clues ranging from 1 word long to 16 words long.

Buzzer-ClueLengthDistribution
[Graph based on 134 puzzles by Arden, 72 puzzles by Sankalak, 528 puzzles by Gridman, 69 puzzles by Buzzer.]

Although Gridman and Buzzer have a near-identical average clue length (~6.4), Buzzer's clues are concentrated in the band of 5-8 words whereas Gridman's clues are scattered across other lengths. [For the statistically inclined, Buzzer's clue length variance is 3.79; Gridman's is a much higher 5.60.]
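To see how two setters can share an average and still differ in spread, here is a small sketch with made-up word counts. The numbers are invented, and the use of population variance (`statistics.pvariance`) is an assumption; the figures quoted above may have been computed as sample variance instead.

```python
from statistics import mean, pvariance

# Invented clue lengths (in words) for two hypothetical setters.
tight = [5, 6, 6, 7, 7, 8, 6, 6]     # clustered in the 5-8 band
spread = [2, 3, 10, 11, 6, 6, 4, 9]  # same total, scattered lengths

# Both lists have the same mean clue length...
print(mean(tight), mean(spread))

# ...but the scattered list has a much larger variance.
print(pvariance(tight), pvariance(spread))
```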

Solution Length & Vocabulary Freshness

77% of Buzzer's clues have single word answers. Of the 23% multi-word answers, the longest is an 8-word/30-character solution spread across two 15-letter grid slots. [Given a length of (4,2,3,6,3,6,2,4), guess the answer without the clue :-)]
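The word and letter counts of a solution can be read straight off the enumeration. The sketch below is one way to do it; `parse_enumeration` is a hypothetical helper, and treating hyphenated parts as a single word (while commas separate words) is an assumption that may differ from the counting used for the charts here.

```python
import re

def parse_enumeration(enum: str) -> tuple[int, int]:
    """Return (word_count, letter_count) for an enumeration like '(7,6)'.

    Assumption: commas separate words; hyphenated parts stay in one word.
    """
    inner = enum.strip().strip("()")
    words = [w for w in inner.split(",") if w.strip()]
    letters = sum(int(n) for n in re.findall(r"\d+", inner))
    return len(words), letters

print(parse_enumeration("(4,2,3,6,3,6,2,4)"))  # the long answer mentioned above
print(parse_enumeration("(7,6)"))
```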

Here's what Buzzer's split of solution by number of words looks like, in the context of similar data of other setters.

Buzzer-SolutionWordSplit

The percentage of solutions with over two words is higher for Buzzer than the other three setters.

It's still early days to measure Buzzer's solution word repetitions, but if we extrapolate slightly to 1963 clues (the base taken in the analysis for Sankalak's vocabulary freshness), we can benchmark the word repetition counts against other setters.

THCSetters-WordRepetition

Buzzer has been giving us new words to solve over 96% of the time, just short of Sankalak's 97%. The only word he might want to put a cap on is AJAR!
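A rough way to measure this kind of freshness is to count distinct solutions against total clues. The sketch below uses an invented list of answers, and the definition of "fresh" as distinct solutions divided by total clues is an assumption; the percentages above may rest on a slightly different definition.

```python
from collections import Counter

# Invented run of solutions, in order of appearance.
solutions = ["AJAR", "TILDE", "AJAR", "ANODYNE", "MUSEUM", "AJAR", "EARDRUM", "TILDE"]

counts = Counter(solutions)

# Share of clues that introduce a word not seen before in the list.
fresh_share = len(counts) / len(solutions)

# Words that turn up more than once, with their counts.
repeated = {word: n for word, n in counts.items() if n > 1}

print(fresh_share, repeated)
```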

Clue Type Insights

Buzzer has tracked the wordplay used in each clue, which gives us valuable insight into his style. 72% of his clues are based on a single clue type (e.g. anagram only); the remaining 28% use a combination of more than one clue type (e.g. anagram + container).

Buzzer's overall clue type distribution is below - the dark bar shows the percentage of usage in clues based on a single clue type, the light bar shows the percentage of usage in clues based on a combination of clue types:

Buzzer-ClueTypeDistribution

Buzzer's clue type distribution is as one would expect in a typical cryptic crossword: charades, anagrams and containment are the most used, in that order, together accounting for over 60% of wordplay. One striking feature is the relatively high use of partial homophones – close to a third of Buzzer's homophones appear in combination clue types, such as:

[Charade + Homophone] A number reportedly eat a painkiller (7) ANODYNE
A NO (number) DYNE (~dine; eat)

There's a generous helping of definitions by example (DBE): 13 clues in 69 puzzles are "pure" DBE cryptic definitions or reverse wordplay clues. Many others have the normal wordplay + definition structure, in which the definition segment is by example.

[CD type] Señor's accent (5) TILDE
[Reverse wordplay type] Ordinary tree as seen in street? (6-2-3-4) MIDDLE-OF-THE-ROAD; s[TREE]t
[Normal wordplay + DBE] Quiet function held inside the Louvre perhaps (6) M(USE)UM

In case you've wondered whether there is any correlation between the length of the answer and the number of words per clue, Buzzer's data says no. When the number of letters in the solution, arranged from smallest to largest, is plotted against the clue length, the line for clue length does not ascend.
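One way to put a number on "no correlation" is the Pearson coefficient between solution length and clue length. This is only a sketch: the data points are invented, and the post itself relies on a plotted line rather than this statistic.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented pairs: letters in the solution vs words in the clue.
solution_letters = [4, 5, 7, 9, 11, 13, 15]
clue_words = [6, 7, 6, 5, 7, 6, 6]

r = pearson(solution_letters, clue_words)
print(r)  # a value near 0 means little linear relationship
```

A value close to +1 would mean longer answers reliably come with longer clues; a value near 0 matches the flat line described above.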

Check out the solution length (letters) vs clue length (words) graphs for Buzzer's clues based on a single clue type: Acrostics (21 clues), Anagrams (290 clues), Charades (303 clues), Cryptic Definitions (87 clues).

Buzzer-ClueLengthSolutionLength

If anything, there's a dip in clue length with an increase in charade solution length – that's because these clues have used bigger charade segments, as in:
Excellent landlord I say (7,6) CAPITAL LETTER

Another (rather expected) observation is the longer clue lengths for acrostics.

Clue Text Wordle

The clue text wordle is turning out to be a tool to showcase words that are the cornerstone of cryptic clue writing: "one", "around", "time" and "old", in particular, since they have appeared among the most-used words in every setter's clue database tested.

The larger the text, the more frequent the appearance of the word in clues by Buzzer:

Buzzer-ClueTextWordle
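Behind a wordle is a plain word-frequency count. Here is a minimal sketch of that count on a handful of clues quoted in this post; the lower-casing and the simple tokenising pattern are assumptions, and a real run would probably also drop very common words before drawing the cloud.

```python
import re
from collections import Counter

# A few clues from this post, without their enumerations.
clues = [
    "Quiet function held inside the Louvre perhaps",
    "Ordinary tree as seen in street?",
    "One among listeners is a murder suspect",
    "A number reportedly eat a painkiller",
]

# Lower-case each clue and pull out word tokens (letters, digits, apostrophes).
words = [w for clue in clues for w in re.findall(r"[\w']+", clue.lower())]
freq = Counter(words)

print(freq.most_common(3))
print(freq["perhaps"], freq["may"])  # the "perhaps" vs "may" comparison
```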

For all the differences in style, the word usage patterns of our setters, when aggregated, are quite alike.

To extend the observation about "may" vs "perhaps", though:
Gridman and Sankalak, as noted before, favour "may" over "perhaps". Arden is not categorical in preferring either, but Buzzer, with a prominent "perhaps" and a minuscule "may" on the wordle, has his loyalties firmly with "perhaps"!


5 comments

Shrikanth T said...

As usual, brilliant analysis and presentation. Lot of insights about one of my most favorite setters. Surprising to learn that the length of Buzzer's clue is on the shorter side compared to other setters. I had been thinking that his clues appear longer. One more info that will be interesting to know is his length of the definition. Buzzer in my opinion uses multi-word definitions as opposed to synonyms which makes the clues that much more interesting.

Kishore said...

Bravo! And right on schedule! The THCC world is abuzz in anticipation of the weekend.

Shuchi said...

@Shrikanth: Thank you! Buzzer's tracking of clue types has been very helpful. That's a great point about length of definition, and it also suggests that words in a clue may not be proportional to letters in the answer. If only we could automate the identification of definitions in clues! For now, can't find an easy way to confirm the estimate about Buzzer's longer definitions. You may be right, though :-)

@Kishore: I'm looking forward to the big gathering too.

Shrikanth T said...

I can't forget this clue easily. 'One among listeners is a murder suspect' (7). The definition is 3 words as against the wordplay which is again 3 and an anagrind. Brilliant. And the solution is 7 letters too.

Rangarajan Ramanujam said...

One among listeners is a murder suspect (7)
EARDRUM

Absolutely amazing information about Buzzer's crosswords - one of my favourites