Tuesday, January 27, 2015

Arden Trivia

Since 2010, The Hindu Crossword has seen a shift in its puzzle contribution format. From a small group of long-time setters who supplied 6+ crosswords per month, the crossword team expanded to include new setters who set just a puzzle or two per month. A few of them moved on after a year or so with The Hindu, our dear Sankalak passed away, and newer setters came on board.

Two THC setters who debuted in July 2011 have created a sizeable number of puzzles by now, and I thought it would be an interesting exercise to take a closer look at patterns in their puzzles.

The spotlight today is on Arden's work – a set of 134 crosswords he has created to date for The Hindu.

Clue Volume & Length

[Chart: Arden's clues per puzzle.]

Many observe that Arden's style is similar to Sankalak's, and clue volume metrics apparently second that. Of the three setters' works analysed on the blog earlier, Arden's averages of clues per puzzle (29.36) and words per clue (7.20) are closest to Sankalak's.


A careful reading reveals something more. Though their averages are similar, Arden tends to write more clues in the 3-word to 8-word range, while Sankalak's clues show a wider length variation – more clues under 3 words as well as over 8 words. [When Sankalak Trivia was published, Sankalak had remarked that the biggest surprise to him was the finding about clue length, as his general attempt was to keep clue words to the minimum.]

To put things in perspective: in a batch of 1000 clues from each of Arden, Sankalak and Gridman, here's the kind of clue length distribution we'd find, ranging from 1-word clues to 16-word clues.


[Graph based on 134 puzzles by Arden, 72 puzzles by Sankalak, 528 puzzles by Gridman.]
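
A distribution like this can be reproduced from raw clue text. The sketch below is only an illustration, not the method used for the graph: it assumes clues are available as plain strings without the enumeration, and the helper names `word_count` and `lengths_per_1000` are hypothetical. It counts a token as a word only if it contains a letter or digit, which is the rule that gives 16 words for Arden's longest clue.

```python
from collections import Counter


def word_count(clue):
    """Count tokens that contain at least one letter or digit.

    A standalone dash is therefore not counted as a word (an assumption
    that matches the 16-word figure quoted for THC 11197).
    """
    return sum(1 for token in clue.split() if any(ch.isalnum() for ch in token))


def lengths_per_1000(clues):
    """Scale a setter's clue-length counts to a batch of 1000 clues."""
    counts = Counter(word_count(clue) for clue in clues)
    total = sum(counts.values())
    return {n: 1000 * counts[n] / total for n in sorted(counts)}


longest = "We in India have too much of it — but dismissing opponents in a game is tough"
print(word_count("Queer game"))  # 2
print(word_count(longest))       # 16
```

Scaling every setter to the same batch size is what makes 134, 72 and 528 puzzles comparable on one chart.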

No surprise that Gridman, with an average clue length under 6.5 words, has the tallest bars for clues of fewer than 6 words, with the pattern reversing beyond that mark.

Arden's shortest clue is a two-word double definition:
THC 10822: Queer game (5) RUMMY

His longest clue by word count [16 words]:
THC 11197: We in India have too much of it — but dismissing opponents in a game is tough (7) H(OODL[es])UM

…and by character count [88 characters]:
THC 10534: Initially rechargeable energy storage system follows a characteristic, which she betrays (9) TRAIT R E S S

Solution Length

Arden tends to put in fairly long solutions in his grids. He uses no 3-letter solutions, and often clubs two slots in the grid with a "See <clue ref>" to make a bigger grid entry.

The graph below shows the percentage distribution of Arden's solution lengths, which vary from 4 to 21 letters.


Interestingly, despite the leaning towards lengthy solutions, Arden does not write too many clues for 11-letter words/phrases. This is another trait in common with Sankalak's graph, which shows a dip for 11-letter answers.


Arden's longest solutions are of 21 letters, spanning more than one clue slot in the grid.

THC 10853: A thing doctor gave for care is a worry (1,6,2,5,7) A MATTER (OF GRAVE)* CONCERN
THC 11069: It's FM's headache now - bill to enumerate shortfall (7,7,7) CURRENT AC+COUNT DEFICIT

Since Arden writes more clues leading to 8-14 letter solutions than Sankalak/Gridman, does he also use more phrases as solutions?

No, says the data.


Arden tends to opt for single word grid fills where another setter might clue a phrase. 1 in 8 of Arden's clues has a multi-word solution; in Gridman's clues, the ratio is close to 1 in 4.
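
Both the solution length and the single-word/phrase split can be read off a clue's enumeration. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and an enumeration is treated as multi-word only when it contains a comma (so a hyphenated entry such as "(4-4)" would count as a single word here, which may differ from how the figures above were compiled).

```python
import re


def parse_enumeration(enum):
    """Return (total letters, is_multiword) for an enumeration like '(7,7,7)'."""
    parts = [int(n) for n in re.findall(r"\d+", enum)]
    return sum(parts), "," in enum


def multiword_share(enums):
    """Fraction of enumerations that denote a multi-word solution."""
    flags = [parse_enumeration(e)[1] for e in enums]
    return sum(flags) / len(flags)


print(parse_enumeration("(1,6,2,5,7)"))  # (21, True)
print(parse_enumeration("(5)"))          # (5, False)
```

A share of 0.125 corresponds to the "1 in 8" ratio, and 0.25 to "1 in 4".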

Clue Text Wordle

Which words does Arden frequently use in his clue text? This wordle gives us that information – the bigger the text size, the higher the occurrence of the word in Arden's clues.

*For meaningful results, common words like articles & prepositions have not been included in the visualization.

What do we find?

  • "One" is the most prominent word, a feature of Sankalak's and Gridman's wordles too. Neyartha's looks different in that respect.
  • Arden uses "get" and its variants frequently: in charades, as a container indicator, and as a connector between wordplay and definition.
  • While Gridman and Sankalak clearly favour "may" over "perhaps", Arden goes for "perhaps" almost as much as "may"!
  • "Time" shows up often in Arden's clues, giving T/AGE/ERA in the answer. Guess who else used "time" as much?
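
For anyone curious about the counting behind a wordle, here is a rough sketch of the idea: tally every word across the clues and skip the common ones. The stop-word list is a small made-up sample, not the list used for the visualization above, and `word_frequencies` is a hypothetical helper name.

```python
import re
from collections import Counter

# Illustrative stop words only; a real list would be much longer.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "for", "with", "by", "at", "from", "is", "it"}


def word_frequencies(clues, stop_words=STOP_WORDS):
    """Count lower-cased words across all clues, ignoring stop words."""
    counts = Counter()
    for clue in clues:
        for word in re.findall(r"[a-z']+", clue.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts


sample = ["Queer game", "One game in a time", "One time"]
freq = word_frequencies(sample)
print(freq["one"], freq["queer"])  # 2 1
```

The larger a word's count, the bigger it would be drawn. Note that this simple version does not merge variants such as "get" and "gets".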

Bhavan said...

Fascinating insight into the mechanics of Arden's clue writing.

Personally I think the setters you've picked for comparison have the intangible and unmeasurable quality of entertaining their solvers. I know it brings a smile to my lips whenever I see one of these three names as the byline for a THC.

Shrikanth T said...

Wonderful analysis. I second Bhavan's thought on the choice of setters. :)

On the matter of 'length of solutions', it could be influenced by an external factor rather than the setter's choice alone. In THC, setters choose a set of grid patterns which they recycle. Since the pattern repeats over so many crosswords, the results seen here may be somewhat skewed. I don't know whether a free choice of patterns would give us a different idea (on this parameter alone).