Wordle is a daily word game available at https://www.powerlanguage.co.uk/wordle/.
We started playing Wordle on the ol’ group chat. All five of us were pretty good at getting the Wordle of the day within six guesses, and the obvious way to compete is by comparing how many guesses it took each of us to achieve Wordle.
But a single day’s number of guesses is an unsatisfying competition. Did Aileen make logically superior guesses, or did she get lucky?
Wordle provides a measure for this: Do your Wordles every day, and your Wordle statistics will become stronger and stronger. But the thing is, we’ve just started playing Wordle! Ten samples isn’t great statistics!
We are a team, in the group chat. How can we help one another get better at Wordle? The obvious answer is, of course, play more Wordle. Vincent rapidly cloned the game and primed it with a set of commonly used five-letter words. This allowed us to really pump Wordle iron and improve our natural ability to guess five-letter words, but it’s slow as a method of building an objectively good strategy.
The second answer came with the word AEONS, from Ian’s brother Graham. AEONS seems like a great starting word! One I wouldn’t have thought of. Look at all those vowels! And an S at the end, that seems helpful, right?
Can we quantify if one starting word is better than another?
Of course we can. Vincent noted that the author of Wordle appeared to be using the Scrabble legal words as the dictionary for valid guesses. So I loaded up the list in Matlab, trimmed it to the 12,972 five-letter Scrabble-legal words, and wrote some Matlab code to see how many of the 12,972 words are eliminated by the guess AEONS.
(At this point, some of you are thinking “wow it’s kind of cool to be able to just code things up like that,” and some of you are thinking “Matlab? Not even, like, Python? Bet that’s slow.” To the first group, it’s true: it’s fun. To the second, someone paid a lot of money for me to have a Matlab license so I wouldn’t have to learn real programming.)
So, AEONS. If you play AEONS and the win word is PEONS, you don’t yet know that the win is PEONS. But you DO know that it has to be one of JEONS*, NEONS, or PEONS. If the win word is CRIMP, AEONS reveals no letters, and there are 565 words that are all still valid candidates, from BIRTH to ZIPPY.
(* Wordle itself doesn’t draw win words from the full Scrabble-legal list, which includes such charmers as AALII and ZOOEA.)
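For anyone who wants to follow along without a Matlab license, here is a minimal Python sketch of the "which words are still candidates" check. It is not the script from this post: the function names (`feedback`, `remaining`) and the seven-word stand-in list are made up for illustration, and the handling of repeated letters is my assumption about how Wordle colors them.

```python
from collections import Counter

def feedback(guess, answer):
    """Return Wordle-style marks: 'g' green, 'y' yellow, 'b' black."""
    marks = ["b"] * len(guess)
    # Answer letters that are not matched green stay available for yellows.
    spare = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "g"
    for i, g in enumerate(guess):
        if marks[i] == "b" and spare[g] > 0:
            marks[i] = "y"
            spare[g] -= 1
    return "".join(marks)

def remaining(guess, answer, words):
    """Words that would have produced the same marks as the true answer."""
    target = feedback(guess, answer)
    return [w for w in words if feedback(guess, w) == target]

# Tiny stand-in list; the post uses all 12,972 Scrabble-legal words.
words = ["AEONS", "JEONS", "NEONS", "PEONS", "CRIMP", "BIRTH", "ZIPPY"]
print(remaining("AEONS", "PEONS", words))
print(len(remaining("AEONS", "CRIMP", words)))
```

A word survives only if it would have produced exactly the same colors as the real answer, which is why AEONS itself drops out when the win word is PEONS.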
If you play AEONS against each of the 12,972 Scrabble-legal 5-letter words (SL5LW):
- On average (the mean of the 12,972 cases), you’ll have 350 candidates remaining. Or did you want the median? 246 remaining.
- At worst, you could have 861, which occurs with words like VITAL, where only the A is detected
- In 16 cases, you’ve fully solved the puzzle: AEONS, AEROS, AESIR, ALOES, AMINO, ANOAS, AROSE, ASANA, AWETO, BEANO, GENOA, KAONS, REALO, SEGNO, SENNA, SLOAN. (I don’t expect REALO to be the win word, but AROSE is a Good Word.)
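The mean, median, worst case and solved count above can be sketched in Python like this. Again, this is a stand-in for the Matlab script, with hypothetical names (`first_guess_stats`) and a toy word list. The trick is that every answer sharing a color pattern is left with the same candidates, so grouping by pattern avoids comparing every word against every other word.

```python
from collections import Counter
from statistics import mean, median

def feedback(guess, answer):
    """Return Wordle-style marks: 'g' green, 'y' yellow, 'b' black."""
    marks = ["b"] * len(guess)
    spare = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "g"
    for i, g in enumerate(guess):
        if marks[i] == "b" and spare[g] > 0:
            marks[i] = "y"
            spare[g] -= 1
    return "".join(marks)

def first_guess_stats(guess, words):
    """Summarize candidates remaining after `guess`, over every possible answer."""
    patterns = [feedback(guess, w) for w in words]
    groups = Counter(patterns)
    # Each answer is left with as many candidates as share its pattern.
    left = [groups[p] for p in patterns]
    return {
        "mean": mean(left),
        "median": median(left),
        "worst": max(left),
        "solved": sum(1 for n in left if n == 1),
    }

# Toy list for illustration only.
words = ["AEONS", "JEONS", "NEONS", "PEONS", "AROSE", "CRIMP", "BIRTH"]
stats = first_guess_stats("AEONS", words)
print(stats)
```

Note that "solved" here counts the guess itself when it is in the list, which matches AEONS appearing among the 16 solved cases.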
We submitted all the first-guess words we’d tried and several second-guess words.
Then we really drilled down and looked at the most common letters in each position:
>> mode(wordz)
ans = 'SAAES'
Let’s eliminate one S and one A. S appears 1565 times in position one and 3958 times in position five, so it keeps position five. Ditto A and position two. The mode of the remaining list is CARES, but T is a more common letter overall than C, so let’s try… TARES.
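A rough Python equivalent of that column-wise mode, assuming the words are plain uppercase strings. The helper names and the five-word list are invented for the example, and ties are broken by first appearance here, which may differ from what Matlab's `mode` does.

```python
from collections import Counter

def position_counts(words):
    """One Counter per letter position, like counting down each column."""
    return [Counter(column) for column in zip(*words)]

def positional_mode(words):
    """Most common letter in each position, joined into a (possibly fake) word."""
    return "".join(c.most_common(1)[0][0] for c in position_counts(words))

# Toy list for illustration; the post runs this on all 12,972 words.
words = ["TARES", "CARES", "SALTS", "SANES", "BOOST"]
counts = position_counts(words)
print(positional_mode(words))
print(counts[0]["S"], counts[4]["S"])  # how often S opens vs. closes a word
```

Keeping the per-position counts around is what lets you settle arguments like "S keeps position five", since you can compare the same letter across positions.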
TARES (mean remaining: 302) is a bit better than AEONS (mean remaining: 350). Does all that fussing about letter position matter? How about RATES? RATES (mean remaining: 311) is better than AEONS but marginally worse than TARES. Because this is coded in Matlab*, each word takes 3 minutes to check. (* Also, unprofessionally.)
(Update: Extreme Wordlephiles may know that this guy https://notfunatparties.substack.com/p/wordle-solver found that RAISE is the best word. He used a more correct list of around 2,000 win words, among other differences, and I think that list omits words like OMITS and WORDS.)
So now the whole group chat is cooking with gas. No one is disadvantaged just by not having thought of the finest first word. People who prefer fishing for vowels play AEONS and people who like to shake out common consonants are playing TARES and people focusing on playing the funniest word are playing PENIS, which is a fairly good word. I started playing the previous day’s win word as my first word, just to keep on my toes.
Back to the question of who is the best person. Is it me, for identifying TARES? That’s what some people might say, but that is not how we are determining the best person.
We can use the Matlab script to see how many candidate words remained after each guess. For example, take this total boner of a guess string for the win word BOOST:
- After FAIRY, there were 2451 words left. And that was merely unlucky. Usually there would be more like 1000.
- CHATS doesn’t seem like a great followup because it reuses A, and we already know there’s no A. But it does well, reducing the field to 97 words.
- ADIEU (The I! The A! Again!) trims that to 21: BOOST, GLOST, SLOOT, SMOLT, SMOOT, SMOWT, SNOOT, SOTOL, SPOOT, STOLN, STOMP, STONG, STONK, STONN, STOOK, STOOL, STOOP, STOPT, STOWN, STOWP, SWOPT, of which only BOOST, SNOOT, STOMP, STOOL and STOOP are Good Words (maybe STONK if the author is feeling sassy)
- SMELT eliminates 20 of these in one fell swoop, leaving only BOOST.
- The player could not think of the word BOOST, but QUEST revealed the position of the S, which helps the ol’ human think box free-associate to the answer
- BOOST. Hooray!
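The guess-by-guess counts above come from filtering the candidate list after every guess. Here is a Python sketch of that loop, not the original Matlab. The names are hypothetical, and to keep it small it starts from the 21 words listed after ADIEU rather than the full dictionary.

```python
from collections import Counter

def feedback(guess, answer):
    """Return Wordle-style marks: 'g' green, 'y' yellow, 'b' black."""
    marks = ["b"] * len(guess)
    spare = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "g"
    for i, g in enumerate(guess):
        if marks[i] == "b" and spare[g] > 0:
            marks[i] = "y"
            spare[g] -= 1
    return "".join(marks)

def trace(guesses, answer, words):
    """Number of candidates left after each guess in turn."""
    candidates = list(words)
    counts = []
    for guess in guesses:
        target = feedback(guess, answer)
        candidates = [w for w in candidates if feedback(guess, w) == target]
        counts.append(len(candidates))
    return counts

# The 21 candidates the post lists after FAIRY, CHATS, ADIEU.
words = ["BOOST", "GLOST", "SLOOT", "SMOLT", "SMOOT", "SMOWT", "SNOOT",
         "SOTOL", "SPOOT", "STOLN", "STOMP", "STONG", "STONK", "STONN",
         "STOOK", "STOOL", "STOOP", "STOPT", "STOWN", "STOWP", "SWOPT"]
print(trace(["SMELT", "QUEST", "BOOST"], "BOOST", words))
```

SMELT does so well on this list because 19 of the 21 words start with S, so a yellow S in position one wipes them all out at once.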
Consider now this stronger series, PENIS CROAK STOUT BOOST:
You can see the initial lead, reducing the field to 502 words, the consistent progress of the next two guesses, and the transition to guessing the real answer: STOUT reduces the field to BOOST, GHOST, and GLOST, only two of them Good Words. What’s interesting is that, while PENIS CROAK STOUT BOOST seems to me to be a much better string, both strings narrowed the field to a single word in the same number of moves. And some of the guesses I thought were pretty bad, like CHATS (you already knew about the A!), were actually very helpful.
Graphing our series allows us to compete on a word-by-word basis and introduce concepts like “early leads.” We can now quantify how much each guess narrowed the field and introduce a new statistic: the percent reduction of the field for a person’s median guess, for example, or the average words remaining at guess three. There’s like four times as much data now.
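As a small worked example of the percent-reduction statistic, here it is in Python, using the field sizes quoted earlier for FAIRY, CHATS, ADIEU and SMELT. The function name is made up, and treating the full 12,972-word list as the starting field is my assumption.

```python
from statistics import median

def percent_reductions(counts):
    """Percent of the field each guess removed; counts[0] is the starting field."""
    return [100.0 * (before - after) / before
            for before, after in zip(counts, counts[1:])]

# Starting field, then words left after FAIRY, CHATS, ADIEU, SMELT.
field = [12972, 2451, 97, 21, 1]
cuts = percent_reductions(field)
print([round(c, 1) for c in cuts])
print(round(median(cuts), 1))  # percent reduction for this player's median guess
```

Measuring each guess against the field it inherited, rather than the original 12,972, is what lets a late guess like SMELT get credit for clearing 20 of 21 words.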
The group chat has not fully established the optimum strategy for Wordle, but are there opening salvos (sets of two or three words) that can give you a quick advantage? Of course there are!
The first such set we tried was DOILY CHATS RUMEN, which includes common letters, no repeats, all six vowels, and S and Y in the fifth position (where they disproportionately appear). Even with no additional strategy, this is a super good combination. It buys you a complete solution for a whopping 37% of SL5LWs, leaving an average of just 4.6 candidate words.
We have still not beaten DOILY CHATS RUMEN with any pre-determined three-word salvo. We do recommend playing it as RUMEN CHATS DOILY, to maximize data at each step. Vincent tried it against several reasonable win words in the clone program and we liked it.
How about a set of four words? We haven’t tried a ton of fours, but TAMPS CHUNK WORDY BILGE solves 74% of SL5LWs. This leads me to believe that a decent strategy could guarantee a win within six guesses, even without any luck.
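Scoring a salvo works the same way as scoring a single word, except the colors from all the salvo words together form one signature per win word. A hedged Python sketch follows, with hypothetical names and a four-word toy list, so the numbers it prints are not the ones from this post.

```python
from collections import Counter

def feedback(guess, answer):
    """Return Wordle-style marks: 'g' green, 'y' yellow, 'b' black."""
    marks = ["b"] * len(guess)
    spare = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "g"
    for i, g in enumerate(guess):
        if marks[i] == "b" and spare[g] > 0:
            marks[i] = "y"
            spare[g] -= 1
    return "".join(marks)

def salvo_stats(salvo, words):
    """Return (fraction of answers fully solved, mean candidates remaining)."""
    # Win words with identical marks for every salvo word cannot be told apart.
    signatures = [tuple(feedback(g, w) for g in salvo) for w in words]
    groups = Counter(signatures)
    left = [groups[s] for s in signatures]
    solved = sum(1 for n in left if n == 1) / len(words)
    return solved, sum(left) / len(words)

# Toy list for illustration only.
words = ["BOOST", "GHOST", "GLOST", "SNOOT"]
print(salvo_stats(["CHATS"], words))
print(salvo_stats(["CHATS", "DOILY"], words))
```

Because the signature ignores the order of the salvo words, this scores DOILY CHATS RUMEN and RUMEN CHATS DOILY identically; the order only matters for how much you know at each intermediate step.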
Anyhow: no one is playing four pre-determined words. How about two?
For two-word salvos, we didn’t think we were going to beat TARES CHONK, but then we discovered something terrible: CHONK isn’t a word. This is incredible in a world where CHOTT is a word. The best we’ve found is TARES FILMY.
So, you can get pretty far on two pre-determined guesses, and very often solve the puzzle after three. But three seems wasteful. Experienced Wordlers know you’re usually no longer just fishing for letters by word three. I intuitively play in three phases: Fishing for letters, directed probing, true guesses. FISHY PROBE GUESS. Three doesn’t feel like the right length of the fishing phase.
I don’t want to just program a Wordle-solving robot (though if you want that, here’s the guy to beat: https://notfunatparties.substack.com/p/wordle-solver) – I want to know how I, a B+ crossword player, can best play. So let’s narrow the question: If I start with TARES and follow these rules for my second guess, do I beat TARES FILMY?
- Try all the yellow letters in new positions
- Do not repeat any black or green letters
- No double letters
So, first-thought-best-thought, here is my second guess for each of the 32 possible combinations of yellow letters:
I could have had the computer pick words, but these are words I thought up, so they’re representative of what I would actually play. It’s also not quiiite correctly implemented, because of a green letter issue: if the win word is BUSTS, the next guess off this list is LIGHT, but I wouldn’t guess the T where I know the S is; I’d pick something like PINTO. Whatever. Probably still good enough to capture the trend.
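The three second-guess rules are easy to check mechanically. This Python sketch only tests whether a candidate second word obeys them. It does not reproduce my actual lookup table, the names (`follows_rules`) are made up, and it assumes the first word has no repeated letters (true for TARES).

```python
from collections import Counter

def feedback(guess, answer):
    """Return Wordle-style marks: 'g' green, 'y' yellow, 'b' black."""
    marks = ["b"] * len(guess)
    spare = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            marks[i] = "g"
    for i, g in enumerate(guess):
        if marks[i] == "b" and spare[g] > 0:
            marks[i] = "y"
            spare[g] -= 1
    return "".join(marks)

def follows_rules(first, marks, second):
    """Check a second guess against the three rules, given the first guess's marks."""
    # Rule: no double letters.
    if len(set(second)) != len(second):
        return False
    for i, (letter, mark) in enumerate(zip(first, marks)):
        if mark == "y":
            # Rule: every yellow letter is tried again, in a new position.
            if letter not in second or second[i] == letter:
                return False
        elif letter in second:
            # Rule: black and green letters are not repeated.
            return False
    return True

marks = feedback("TARES", "BUSTS")
# The lookup key would be the set of yellow letters: 2**5 = 32 possible sets.
yellows = frozenset(l for l, m in zip("TARES", marks) if m == "y")
print(marks, sorted(yellows))
for second in ["LIGHT", "PINTO", "TONIC", "TIGHT"]:
    print(second, follows_rules("TARES", marks, second))
```

LIGHT passes even though it parks the T on the known green S, which is exactly the green letter issue: the rules as written say nothing about where the new letters land.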
How does this compare to opening with a straight TARES FILMY? It’s better, but it’s not a revolution in Wordleing. (Now you are thinking, “But Olive, what if some of those were simply better generic combos than TARES FILMY or TARES MOLDY? Shouldn’t you check that the mixed strategy does better than each individual combination before concluding that this is a strategy improvement?” Excellent, excellent, welcome to the group chat.)
At this point, the group chat has moved on from MetaWordle to Forensic Wordle, where we try to reverse engineer each other’s guess sequences from the emoji grid and a group-chat-induced mind meld.
We’re fun at parties.