Wordle With Friends

Wordle is a daily word game available at https://www.powerlanguage.co.uk/wordle/.

We started playing Wordle on the ol’ group chat. All five of us were pretty good at getting the Wordle of the day within six guesses, and the obvious way to compete is by comparing how many guesses it took each of us to achieve Wordle.

Two screenshots of the game Wordle, showing Aileen winning in three moves and Olive winning in four moves.
Aileen vs. Olive. Aileen Wordled in fewer guesses, and is therefore a better person.

But a single day’s number of guesses is an unsatisfying competition. Did Aileen make logically superior guesses, or did she get lucky?

Wordle provides a measure for this: Do your Wordles every day, and your Wordle statistics will become stronger and stronger. But the thing is, we’ve just started playing Wordle! Ten samples isn’t great statistics!

A screenshot of three sets of statistics displayed by Wordle after the daily game. The statistics show that Aileen typically solves the Wordle in three guesses.
Vincent vs. Ian vs. Aileen. Aileen reliably wins in fewer guesses, so she is a better person than Vincent and Ian. But is Vincent a better person than Ian? We have to know.

We are a team, in the group chat. How can we help one another get better at Wordle? The obvious answer is, of course, play more Wordle. Vincent rapidly cloned the game and primed it with a set of commonly used five-letter words. This allowed us to really pump Wordle iron and improve our natural ability to guess five-letter words, but it’s slow as a method of building an objectively good strategy.

The second answer came with the word AEONS, from Ian’s brother Graham. AEONS seems like a great starting word! One I wouldn’t have thought of. Look at all those vowels! And an S at the end, that seems helpful, right?

Can we quantify if one starting word is better than another?

Of course we can. Vincent noted that the author of Wordle appeared to be using the Scrabble legal words as the dictionary for valid guesses. So I loaded up the list in Matlab, trimmed it to the 12,972 five-letter Scrabble-legal words, and wrote some Matlab code to see how many of the 12,972 words are eliminated by the guess AEONS.

(At this point, some of you are thinking “wow it’s kind of cool to be able to just code things up like that,” and some of you are thinking “Matlab? Not even, like, Python? Bet that’s slow.” To the first group, it’s true: it’s fun. To the second, someone paid a lot of money for me to have a Matlab license so I wouldn’t have to learn real programming.)

So, AEONS. If you play AEONS and the win word is PEONS, you don’t yet know that the win is PEONS. But you DO know that it has to be one of JEONS*, NEONS, or PEONS. If the win word is CRIMP, AEONS reveals no letters, and there are 565 words that are all still valid candidates, from BIRTH to ZIPPY.

(* Wordle itself doesn’t draw win words from the full Scrabble-legal list, which includes such charmers as AALII and ZOOEA)
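
Here is a minimal Python sketch of that elimination step. It is not the Matlab script described above: the function names `feedback` and `remaining` and the eight-word toy list are made up for illustration, and the duplicate-letter handling is the usual Wordle convention, which is an assumption about how the original script scored repeats.

```python
from collections import Counter

def feedback(guess, answer):
    """Score a guess: G = green, Y = yellow, B = black (letter not available)."""
    pattern = ["B"] * len(guess)
    unmatched = Counter()  # answer letters not already claimed by a green
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if pattern[i] == "B" and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def remaining(guess, answer, words):
    """Words that would have produced the same colors as the real answer."""
    target = feedback(guess, answer)
    return [w for w in words if feedback(guess, w) == target]

# Toy list for illustration only; the post uses all 12,972 Scrabble-legal words.
WORDS = ["AEONS", "JEONS", "NEONS", "PEONS", "CRIMP", "BIRTH", "ZIPPY", "VITAL"]

print(remaining("AEONS", "PEONS", WORDS))  # ['JEONS', 'NEONS', 'PEONS']
print(remaining("AEONS", "CRIMP", WORDS))  # ['CRIMP', 'BIRTH', 'ZIPPY']
```

A word stays a candidate exactly when it would have lit up the same colors as the real win word, which is why PEONS can't be told apart from JEONS or NEONS after AEONS.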

Wordle sequences for AEONS PEONS NEONS and CRIMP BIRTH ZIPPY AEONS.

If you play AEONS against each of the 12,972 Scrabble-legal 5-letter words (SL5LW):

  • On average (the mean of the 12,972 cases), you’ll have 350 candidates remaining. Or did you want the median? 246 remaining.
  • At worst, you could have 861, which occurs with words like VITAL, where only the A is detected
  • In 16 cases, you’ve fully solved the puzzle: AEONS, AEROS, AESIR, ALOES, AMINO, ANOAS, AROSE, ASANA, AWETO, BEANO, GENOA, KAONS, REALO, SEGNO, SENNA, SLOAN. (I don’t expect REALO to be the win word, but AROSE is a Good Word.)
The Wordle clue color patterns from AEONS-AEONS, AEONS-AMINO, AEONS-AROSE, and AEONS-REALO
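
The mean, median, worst case, and "fully solved" counts above can be sketched in Python like this. Again, this is an illustration rather than the original Matlab: `guess_stats` and the toy word list are hypothetical, and the trick of grouping win words by color pattern is just one convenient way to get the numbers.

```python
from collections import Counter
from statistics import mean, median

def feedback(guess, answer):
    """Score a guess: G = green, Y = yellow, B = black."""
    pattern = ["B"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if pattern[i] == "B" and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def guess_stats(guess, words):
    """Play `guess` against every possible win word and summarize what's left."""
    patterns = [feedback(guess, w) for w in words]
    bucket_size = Counter(patterns)
    # A win word leaves as many candidates as there are words sharing its pattern.
    left = [bucket_size[p] for p in patterns]
    return {
        "mean": mean(left),
        "median": median(left),
        "worst": max(left),
        "solved": sum(1 for n in left if n == 1),
    }

# Toy list for illustration only.
WORDS = ["AEONS", "JEONS", "NEONS", "PEONS", "CRIMP", "BIRTH", "ZIPPY", "VITAL"]
print(guess_stats("AEONS", WORDS))
```

On the toy list, AEONS leaves a mean of 2.5 candidates with a worst case of 3, and "solves" two words (AEONS itself and VITAL). The guess itself counts as a solved case, as it does in the list of 16 above.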

We submitted all the first-guess words we’d tried and several second-guess words.

A spreadsheet of the top rated opening words: TARES, RATES, RAISE, AEONS, CARES, OASIS, PENIS, STEAK, EARLY, SMELT, LUTOS, ADIEU, STRAW, SPORT, HAIRY, RUMEN
Top rated opening words.

Then we really drilled down and looked at the most common letters in each position:

>> mode(wordz)
ans = 'SAAES'

Let’s eliminate one S and one A. S appears 1565 times in position one and 3958 times in position five, so it keeps position five. Ditto A and position two. The mode of the remaining list is CARES, but T is a more common letter overall than C, so let’s try… TARES.
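
For non-Matlab people, a rough Python equivalent of the per-position letter count might look like the sketch below. The names `position_counts` and `positional_mode` and the five-word list are invented for illustration. Ties are broken by first appearance here, which may differ from what Matlab's `mode` does.

```python
from collections import Counter

def position_counts(words):
    """One Counter per letter position (column) across the word list."""
    return [Counter(column) for column in zip(*words)]

def positional_mode(words):
    """Most common letter in each position, joined into a string."""
    return "".join(c.most_common(1)[0][0] for c in position_counts(words))

# Toy list for illustration only.
WORDS = ["TARES", "CARES", "RATES", "SAUTE", "STEAK"]
counts = position_counts(WORDS)

print(positional_mode(WORDS))           # SARES
print(counts[0]["S"], counts[4]["S"])   # 2 3
```

The per-position counters are what let you decide which slot a repeated letter like S should keep.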

TARES (mean remaining: 302) is a bit better than AEONS (mean remaining: 350). Does all that fussing about letter position matter? How about RATES? RATES (mean remaining: 311) is better than AEONS but marginally worse than TARES. Because this is coded in Matlab*, each word takes 3 minutes to check. (* Also, unprofessionally.)

(Update: Extreme Wordlephiles may know that this guy https://notfunatparties.substack.com/p/wordle-solver found that RAISE is the best word. He used a more correct list of around 2,000 win words, among other differences, and I think that list omits words like OMITS and WORDS.)

TARES.

So now the whole group chat is cooking with gas. No one is disadvantaged just by not having thought of the finest first word. People who prefer fishing for vowels play AEONS and people who like to shake out common consonants are playing TARES and people focusing on playing the funniest word are playing PENIS, which is a fairly good word. I started playing the previous day’s win word as my first word, just to keep on my toes.

Back to the question of who is the best person. Is it me, for identifying TARES? That’s what some people might say, but that is not how we are determining the best person.

We can use the Matlab script to see how many candidate words remained after each guess. For example, take this total boner of a guess string for the win word BOOST:

The Wordle guess sequence FAIRY CHATS ADIEU SMELT QUEST BOOST.
A plot of the number of words remaining after each word in the sequence FAIRY CHATS ADIEU SMELT QUEST BOOST
  1. After FAIRY, there were 2451 words left. And that was merely unlucky. Usually there would be more like 1000.
  2. CHATS doesn’t seem like a great followup because it reuses A, and we already know there’s no A. But it does well, reducing the field to 97 words.
  3. ADIEU (The I! The A! Again!) trims that to 21: BOOST, GLOST, SLOOT, SMOLT, SMOOT, SMOWT, SNOOT, SOTOL, SPOOT, STOLN, STOMP, STONG, STONK, STONN, STOOK, STOOL, STOOP, STOPT, STOWN, STOWP, SWOPT, of which only BOOST, SNOOT, STOMP, STOOL and STOOP are Good Words (maybe STONK if the author is feeling sassy)
  4. SMELT eliminates 20 of these in one fell swoop, leaving only BOOST.
  5. The player could not think of the word BOOST, but QUEST revealed the position of the S, which helps the ol’ human think box free-associate to the answer
  6. BOOST. Hooray!
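
The bookkeeping behind that list is just repeated filtering. Below is a Python sketch, not the original Matlab, with a hypothetical helper name `remaining_after_each`. To keep it small, it starts from the 21 words left after ADIEU in step 3 rather than the full Scrabble list.

```python
from collections import Counter

def feedback(guess, answer):
    """Score a guess: G = green, Y = yellow, B = black."""
    pattern = ["B"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if pattern[i] == "B" and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def remaining_after_each(guesses, answer, words):
    """Number of candidate words still standing after each guess in turn."""
    candidates = list(words)
    counts = []
    for guess in guesses:
        target = feedback(guess, answer)
        candidates = [w for w in candidates if feedback(guess, w) == target]
        counts.append(len(candidates))
    return counts

# The 21 candidates listed in step 3.
AFTER_ADIEU = [
    "BOOST", "GLOST", "SLOOT", "SMOLT", "SMOOT", "SMOWT", "SNOOT",
    "SOTOL", "SPOOT", "STOLN", "STOMP", "STONG", "STONK", "STONN",
    "STOOK", "STOOL", "STOOP", "STOPT", "STOWN", "STOWP", "SWOPT",
]

print(len(AFTER_ADIEU))                                                   # 21
print(remaining_after_each(["SMELT", "QUEST", "BOOST"], "BOOST", AFTER_ADIEU))  # [1, 1, 1]
```

Under this scoring, SMELT does cut the 21 down to BOOST alone, matching step 4.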

Consider now this stronger series:

The guess sequence PENIS CROAK STOUT BOOST.
A graph showing that, at every guess, the person who guessed PENIS CROAK STOUT BOOST had fewer remaining candidate words than the person who guesses FAIRY CHATS ADIEU SMELT QUEST BOOST.
My dream is that someone will clone this functionality for the web, so I can stop typing our daily guesses into Matlab.

You can see the initial lead, reducing the field to 502 words, the consistent progress of the next two guesses, and the transition to guessing the real answer: STOUT reduces the field to BOOST, GHOST, and GLOST, only two of which are Good Words. What’s interesting is that, while PENIS CROAK STOUT BOOST seems to me to be a much better string, both strings narrowed the field to a single word in the same number of moves. And some of the guesses I thought were pretty bad, like CHATS (you already knew about the A!), were actually very helpful.

Graphing our series allows us to compete on a word-by-word basis and introduce concepts like “early leads.” We can now quantify how much each guess narrowed the field and introduce a new statistic: the percent reduction of the field for a person’s median guess, for example, or the average words remaining at guess three. There’s like four times as much data now.
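
As one possible reading of "percent reduction of the field," here is a small Python sketch using the counts quoted earlier for FAIRY CHATS ADIEU SMELT QUEST BOOST. The function name `percent_reductions` is made up, and taking the median across one player's guesses is an assumption about how the statistic would be defined.

```python
from statistics import median

def percent_reductions(counts):
    """Percent of the field eliminated by each guess.

    `counts` starts with the field size before any guess, followed by the
    number of candidates left after each guess.
    """
    return [
        100.0 * (before - after) / before
        for before, after in zip(counts, counts[1:])
    ]

# 12,972 words to start, then the counts after each guess (from the post).
counts = [12972, 2451, 97, 21, 1, 1, 1]
reductions = percent_reductions(counts)

for guess, r in zip(["FAIRY", "CHATS", "ADIEU", "SMELT", "QUEST", "BOOST"], reductions):
    print(f"{guess}: {r:.1f}% of the field eliminated")
print(f"median guess: {median(reductions):.1f}%")
```

Guesses made after the field is already down to one word score 0%, so whether to count them is a judgment call the group chat would have to argue about.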

Higher strategy

The group chat has not fully established the optimum strategy for Wordle, but are there opening salvos (sets of two or three words) that can give you a quick advantage? Of course there are!

A black t-shirt with the guess sequence DOILY CHATS RUMEN. The letters I, E, and N are yellow and the letter S is green.
Vistaprint.com, your source for fake t-shirt images.

The first such set we tried was DOILY CHATS RUMEN, which includes common letters, no repeats, all six vowels, and S and Y in the fifth position (where they disproportionately appear). Even with no additional strategy, this is a super good combination. It buys you a complete solution for a whopping 37% of SL5LWs, leaving an average of just 4.6 candidate words.
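
Scoring a fixed salvo works the same way as scoring a single word, except each win word is tagged with the color patterns from all of the guesses. The Python sketch below is hypothetical (the name `salvo_stats` and the toy list are invented, and it is not the Matlab that produced the 37% and 4.6 figures). It averages candidates over every win word, including the solved ones, which is an assumption about how the figure above was computed.

```python
from collections import Counter
from statistics import mean

def feedback(guess, answer):
    """Score a guess: G = green, Y = yellow, B = black."""
    pattern = ["B"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if pattern[i] == "B" and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def salvo_stats(salvo, words):
    """Return (fraction of win words fully solved, mean candidates remaining)."""
    # Each win word's "signature" is the tuple of color patterns it produces.
    signatures = [tuple(feedback(g, w) for g in salvo) for w in words]
    bucket_size = Counter(signatures)
    left = [bucket_size[s] for s in signatures]
    solved = sum(1 for n in left if n == 1) / len(words)
    return solved, mean(left)

# Toy list and toy salvo for illustration only.
WORDS = ["AEONS", "JEONS", "NEONS", "PEONS", "CRIMP", "BIRTH", "ZIPPY", "VITAL"]
solved, avg_left = salvo_stats(["AEONS", "CRIMP"], WORDS)
print(solved, avg_left)  # 0.75 1.25
```

Because the signature is built from all of the guesses, the final numbers don't depend on the order the salvo is played in. Only the intermediate counts do.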

A spreadsheet showing the relative performance of five different sets of three words each.

We have still not beaten DOILY CHATS RUMEN with any pre-determined three-word salvo. We do recommend playing it as RUMEN CHATS DOILY, to maximize data at each step. Vincent tried it against several reasonable win words in the clone program and we liked it.

How about a set of four words? We haven’t tried a ton of fours, but TAMPS CHUNK WORDY BILGE solves 74% of SL5LWs. This leads me to believe that a decent strategy could guarantee a win within six guesses, even without any luck.

Anyhow: no one is playing four pre-determined words. How about two?

A spreadsheet showing the relative performance of thirteen different sets of two words each.
Since writing this, we’ve actually found that TARES MOLDY is better than TARES FILMY, but I’m not redoing all the pictures.

For two-word salvos, we didn’t think we were going to beat TARES CHONK, but then we discovered something terrible: CHONK isn’t a word. This is incredible in a world where CHOTT is a word. The best we’ve found is TARES FILMY.

So, you can get pretty far on two pre-determined guesses, and very often solve the puzzle after three. But three seems wasteful. Experienced Wordlers know you’re usually no longer just fishing for letters by word three. I intuitively play in three phases: Fishing for letters, directed probing, true guesses. FISHY PROBE GUESS. Three doesn’t feel like the right length of the fishing phase.

A black t-shirt printed with the Wordle guesses FISHY PROBE GUESS.

I don’t want to just program a Wordle-solving robot (though if you want that, here’s the guy to beat: https://notfunatparties.substack.com/p/wordle-solver) – I want to know how I, a B+ crossword player, can best play. So let’s narrow the question: If I start with TARES and follow these rules for my second guess, do I beat TARES FILMY?

Rules:

  • Try all the yellow letters in new positions
  • Do not repeat any black or green letters
  • No double letters

So, first-thought-best-thought, here is my second guess for each of the 32 possible combinations of yellow letters:

A table showing 32 different possible second-guess words for use after TARES.

I could have had the computer pick words, but these are words I thought up, so they’re representative of what I would actually play. It’s also not quiiite correctly implemented, because of a green letter issue: if the win word is BUSTS, the next guess off this list is LIGHT, but I wouldn’t guess the T where I know the S is; I’d pick something like PINTO. Whatever. Probably still good enough to capture the trend.
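
The three rules, plus the green-letter wrinkle, can be written down as a checker. This is a hypothetical Python sketch (`follows_rules` and `yellow_on_green` are invented names, and the hand-picked 32-word table itself is not reproduced). The 32 comes from TARES having five distinct letters, so there are 2^5 possible sets of yellow letters. The checker assumes the first guess has no repeated letters, which holds for TARES.

```python
from collections import Counter

def feedback(guess, answer):
    """Score a guess: G = green, Y = yellow, B = black."""
    pattern = ["B"] * len(guess)
    unmatched = Counter()
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            pattern[i] = "G"
        else:
            unmatched[a] += 1
    for i, g in enumerate(guess):
        if pattern[i] == "B" and unmatched[g] > 0:
            pattern[i] = "Y"
            unmatched[g] -= 1
    return "".join(pattern)

def follows_rules(first, pattern, second):
    """Check a second guess against the three rules listed above."""
    yellow = [(i, c) for i, (c, p) in enumerate(zip(first, pattern)) if p == "Y"]
    banned = {c for c, p in zip(first, pattern) if p in "BG"}
    # Rule 1: every yellow letter is reused, but not in its old position.
    moved = all(c in second and second[i] != c for i, c in yellow)
    # Rule 2: no black or green letters from the first guess.
    no_repeats = not (banned & set(second))
    # Rule 3: no double letters.
    no_doubles = len(set(second)) == len(second)
    return moved and no_repeats and no_doubles

def yellow_on_green(first, pattern, second):
    """True if the second guess parks a yellow letter on a known green slot."""
    yellow_letters = {c for c, p in zip(first, pattern) if p == "Y"}
    return any(p == "G" and s in yellow_letters for p, s in zip(pattern, second))

pattern = feedback("TARES", "BUSTS")                 # T yellow, S green
print(pattern)                                       # YBBBG
print(follows_rules("TARES", pattern, "LIGHT"))      # True
print(yellow_on_green("TARES", pattern, "LIGHT"))    # True: T sits where the S is
print(yellow_on_green("TARES", pattern, "PINTO"))    # False
```

LIGHT passes the three rules as written but wastes the T on the slot already known to hold S, which is the gap described above; PINTO passes both checks.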

A black shirt with the Wordle guess series TARES CLONE.

How does this compare to opening with a straight TARES FILMY? It’s better, but it’s not a revolution in Wordleing. (Now you are thinking, “But Olive, what if some of those were simply better generic combos than TARES FILMY or TARES MOLDY? Shouldn’t you check that the mixed strategy does better than each individual combination before concluding that this is a strategy improvement?” Excellent, excellent, welcome to the group chat.)

A table showing the adaptive strategy on the two-word sequence chart. It has a mean remaining words score of 29, while TARES FILMY has 36.

At this point, the group chat has moved on from MetaWordle to Forensic Wordle, where we try to reverse engineer each other’s guess sequences from the emoji grid and a group-chat-induced mind meld.

A graph of several word sequences for the win word TIGER.
TIRED.
A text conversation between two people. Grey: WEARY, DERTH, GREET, TOPER, TIGER. Green: LOL we were just commenting how like three of us played "tired." What is DERTH? Grey: I should have done better than GREET. Green: wait, also what's TOPER? Those just look like misspellings of "dearth" and "torpor." Grey: Derth is an antiquated and therefore misspelling of dearth. I was thinking "topper," so I'm batting 1000 today. Green: Well here's what's weird, GREET was actually an excellent guess.

We’re fun at parties.

2 responses to “Wordle With Friends”

    • SAUTE is a very good word (mean remaining: 392 SL5LWs, median: 282, worst case: 988). Looks like it ranks between CARES and OASIS, with a pleasing culinary theme. Excellent.
