# Selective Crossword Submissions

In my previous post on crossword science, I looked at whether Joe got
better at crosswords by doing several hundred crosswords in one
weekend. Today, I want to look not at *getting* better at crosswords,
but at *pretending to be better* at crosswords.

## The model

If you recall, my model of crossword performance contains four factors:

- The difficulty of that day's crossword
- The skill of the solver
- Crosswords on Saturdays are harder
- Beginners underperform their true skill

I was interested in determining whether any solver was pretending to
be more skillful than they truly were. In particular, some avid
crossworders suspected Sam, who has posted several stunning crossword
times, of either lying about his times or only posting the good
ones. (For the record, I did not believe this likely.) Now that I
had a rigorous model of crossword performance, it was possible to test
these claims statistically. In particular, I wanted to see whether
anyone was submitting selectively: doing a crossword, and deciding
whether or not to post it based on their time.

The way I thought about it, a *selective submitter* would post their
good times but not their mediocre ones. Now, for any user, observing
their own crossword time is easy, but observing the crossword's
difficulty is harder, especially if other users haven't posted their
times yet. So, users who try to post only good times won't be able to
correct for the day's difficulty, and will end up posting both the
times when they overperformed and the times for easy crosswords. Thus,
for a selective submitter, the days they post will consistently be
easier than the days they could have posted but didn't.
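To make that argument concrete, here is a small simulation sketch. It is a toy, not the model from the previous post: I assume a solver's time on a given day is just the day's difficulty plus some independent luck, both standard normal, and that a selective submitter posts whenever their time beats a fixed `threshold`. The function name and all parameters are made up for illustration.

```python
import random
from statistics import mean


def simulate_selective_submitter(n_days=5000, threshold=0.0, seed=0):
    """Toy model: return mean difficulty of (posted days, skipped days)."""
    rng = random.Random(seed)
    posted, skipped = [], []
    for _ in range(n_days):
        difficulty = rng.gauss(0.0, 1.0)  # day difficulty, unseen by the solver
        luck = rng.gauss(0.0, 1.0)        # how much they over/underperformed
        time = difficulty + luck          # higher means slower
        # The solver only sees `time`, so they cannot separate the two effects.
        if time < threshold:
            posted.append(difficulty)
        else:
            skipped.append(difficulty)
    return mean(posted), mean(skipped)


posted_mean, skipped_mean = simulate_selective_submitter()
print(posted_mean, skipped_mean)
```

Under these assumptions, the posted days come out easier on average than the skipped ones, even though the solver never looked at the difficulty directly.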

## Searching for selective submitters

This suggested a plan to find selective submitters. First, I would create, for each user, a list of days when they submitted and a list of days when they could have but didn't submit. Then, I'd compute the average difficulty of both lists. Finally, I'd check whether the means were sufficiently far apart to be statistically significant.

For the first step, listing the submitted dates was easy, but listing
dates when a user could have but didn't submit was trickier. Who knows
what crosswords someone *could have* done! But I decided that we could
use someone's first and last crosswords as a pretty good proxy for the
range of dates where they were interested in doing the crossword;
dates in between that they didn't do were potential crossword dates. I
also threw out dates where they hadn't done the crossword for a week
prior. If you take a week-long break, you've probably forgotten about
the crossword or something. In code, the algorithm looked like this:

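```python
# First and last dates on which this user submitted a time
mi, ma = min(submitted_dates), max(submitted_dates)
out = []
since_good = 0  # days since the last submitted crossword
for d in sorted(all_dates):
    if mi < d < ma:
        if d in submitted_dates:
            since_good = 0
        else:
            since_good += 1
        # Keep the date unless the user has been away for over a week
        if since_good <= 7:
            out.append(d)
```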

Here I step through the set of all dates and use the `since_good`
variable to track how many days it's been since the last submitted
crossword.
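For anyone who wants to try this on their own data, here is the same loop wrapped in a function, with a tiny made-up example. The function name `potential_dates`, the `max_gap` parameter, and the dates are all hypothetical. I'm also assuming the output keeps both submitted and unsubmitted dates, to be split into two lists afterwards.

```python
from datetime import date, timedelta


def potential_dates(submitted_dates, all_dates, max_gap=7):
    """Dates strictly between a user's first and last submissions,
    skipping stretches of more than `max_gap` days without a submission."""
    submitted = set(submitted_dates)
    mi, ma = min(submitted), max(submitted)
    out = []
    since_good = 0  # days since the last submitted crossword
    for d in sorted(all_dates):
        if mi < d < ma:
            if d in submitted:
                since_good = 0
            else:
                since_good += 1
            if since_good <= max_gap:
                out.append(d)
    return out


# Hypothetical user: 30 days of crosswords, submissions on days 0, 1, 3, 20, 29.
start = date(2024, 1, 1)
all_dates = [start + timedelta(days=i) for i in range(30)]
submitted = [all_dates[i] for i in (0, 1, 3, 20, 29)]
result = potential_dates(submitted, all_dates)
print(len(result))
```

Note that the strict comparison `mi < d < ma` leaves out the user's first and last submissions themselves, and the long gap between day 3 and day 20 is only counted for its first week.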

With this set of potential dates in mind, I built two lists (`play`
and `dont`) of submitted and unsubmitted dates. Now, it turns out that
the difficulty of crosswords is, in my model, normally distributed. If
the user is not a selective submitter, each list is just a set of
random samples from that normal distribution, so its mean is also
normally distributed. If `play` has length \(P\) and `dont` has length
\(D\), then
\[
m_D - m_P \sim \mathcal{N}\left(0, \sigma \sqrt{\frac1P + \frac1D}\right)
\]
where \(\sigma\) is the standard deviation of the difficulty
distribution and where \(m_P\) and \(m_D\) are the averages of `play`
and `dont`.

So, it's enough to compute those averages, take their difference, and divide by \(\sigma\) times that fairly complicated square root term. That gives a standard normal score, which converts directly into a \(p\) value for whether a given player is a selective submitter.
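Here is a minimal sketch of that calculation using only the standard library. I'm assuming `play` and `dont` hold the model's difficulty estimates for the corresponding days, that `sigma` is known, and that the test is one-sided (we only care whether unsubmitted days were *harder*). The function name and the numbers are invented for the example.

```python
from math import sqrt
from statistics import NormalDist, mean


def selective_submission_p_value(play, dont, sigma):
    """One-sided p value for 'unsubmitted days were harder than submitted ones'.

    play, dont: difficulties of submitted / unsubmitted days (assumed inputs)
    sigma: standard deviation of the difficulty distribution
    """
    m_p, m_d = mean(play), mean(dont)
    # Standard deviation of m_D - m_P when there is no selective submission
    spread = sigma * sqrt(1 / len(play) + 1 / len(dont))
    z = (m_d - m_p) / spread
    # Probability of a gap at least this large arising by chance
    return 1 - NormalDist().cdf(z)


# Made-up difficulties for one user
play = [-0.4, -0.1, 0.0, -0.3, -0.2, -0.5, -0.1, -0.2]
dont = [0.3, 0.5, 0.1, 0.4, 0.2, 0.6, 0.3, 0.4]
p = selective_submission_p_value(play, dont, sigma=0.5)
print(p)
```

A small `p` means the gap between the two averages would be surprising if the user were posting every crossword they attempted.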

## Results

When I ran the numbers for every crossword player, searching for a \(p\) value of less than 1%, I actually found several players. Aha! Caught red-handed!

Actually, instead of publishing the list of names immediately to shame
them, I interviewed several of them about how they did crosswords and
how they used the crossword submission bot. It quickly became clear
that the majority simply did not know that the bot lets you submit a
`fail` time, indicating that you were unable to solve the crossword,
so they would simply get stuck and not have a time to submit. I let
them know that the bot has that function. One also said that they
wanted to save the crosswords for later, so they could go back and
solve them, and didn't want to submit a time until then.

In short, far from exposing cheaters, this exercise uncovered several sharp edges in the crossword bot that we could potentially work on in the future.

That said, if anyone has any ideas how to work selective submission into the model, so we don't overestimate these crossworders' skills, let me know.
