A Crisis comes to Wordle: Reusing old words

27 comments

The original Wordle came with a pre-baked ordered list of 2315 "secret" words, off which the daily secret word was looked up (I think based on local time). The list was right there in the JavaScript code of the game (alongside the list of 12972 allowed guess words). It covered dates from 2021-06-19 to 2027-10-20.
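A rough sketch of that kind of date-indexed lookup, assuming day 0 is 2021-06-19 and one word is consumed per day in list order. The function names are made up for illustration and are not taken from the game's code:

```python
from datetime import date

# Assumption: day 0 is 2021-06-19 and the list is consumed in order, one word per day.
EPOCH = date(2021, 6, 19)
LIST_LENGTH = 2315

def day_index(today: date) -> int:
    """Days elapsed since the epoch; doubles as the index into the ordered list."""
    return (today - EPOCH).days

def secret_for(today: date, secrets: list[str]) -> str:
    """Look up the word for a date; raises IndexError outside the covered range."""
    index = day_index(today)
    if not 0 <= index < len(secrets):
        raise IndexError(f"no word scheduled for {today}")
    return secrets[index]

print(day_index(date(2027, 10, 20)))  # 2314, the last index of a 2315-entry list
```

The date range quoted above is consistent with this: 2027-10-20 lands exactly on the final index.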

Then in January 2022, the NYT bought Wordle, and started tweaking both lists, first shrinking the secret word list to 2309 entries, but leaving the logic otherwise intact. Fast forward to today, I looked up the current code [1], and it seems that there are now 14855 allowed words. The first 12546 are ordered alphabetically (0: "aahed", 12545: "zymic"), and the next 2309 are not. This may suggest that the latter are the secret words, but the logic for picking them has changed: I found no obvious sequence, when compared to the last few days' secret words. So it's either a more complex sequence, or the secret word is picked server-side.
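For what it's worth, finding where the alphabetical part ends is simple to script. This is only a sketch with placeholder words (not the real NYT array), and the split point is a guess at where the secret words start, nothing more:

```python
def sorted_prefix_length(words: list[str]) -> int:
    """Length of the longest prefix of `words` that is in non-decreasing order."""
    for i in range(1, len(words)):
        if words[i] < words[i - 1]:
            return i
    return len(words)

# Toy stand-in for the combined array (placeholder data).
combined = ["aahed", "abaca", "zymic", "cigar", "rebut", "sissy"]
split = sorted_prefix_length(combined)
guess_only, candidate_secrets = combined[:split], combined[split:]
print(split)  # 3
```

If the first unsorted word happened to sort after the last alphabetical one, this would overshoot by a word or more, so the boundary is worth checking by eye.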

In any case, I guess they decided to re-shuffle the list now at day 1689 / 2309 in order to avoid giving particularly assiduous players an additional bit of information: they can exclude all previous secret words. (To be accurate, I think this would be 1.897 bits, but my information theory is rusty.)
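The 1.897 figure can be reproduced as below, assuming every secret word is equally likely and a player has perfect recall of all 1689 past answers (both are simplifications):

```python
import math

total_secrets = 2309   # size of the secret list after the NYT trimmed it
already_used = 1689    # day number quoted above
remaining = total_secrets - already_used  # 620 words still unused

bits_full = math.log2(total_secrets)  # uncertainty with no exclusions
bits_left = math.log2(remaining)      # uncertainty after excluding used words
saved = bits_full - bits_left         # same as log2(2309 / 620)

print(f"{saved:.3f} bits")  # 1.897 bits
```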

[1] https://www.nytimes.com/games-assets/v2/9003.896ec900f2a1ce8...

Wordle now credits an "Editor" for each day, so I guess that person is the one who hand-picks the word for that day?

If I remember correctly, the original version of Wordle used a word list that was run past the creator's wife, who had learned English later in life. The result was a really accessible game - none of the words felt like ones you wouldn't know. It probably makes more sense to reuse words than to risk losing that accessibility.

(I kept a copy of original wordle, and it seems to have 2,315 words that are possible answers.)

It’s this. There are many five letter words that are not “wordley”. Words such as, idk, bokeh, are technically part of the lexicon but would never appear as a solution. The wordle bot will even tell you this if you guess them — “good guess, but unlikely to appear as a solution”. The crossword has a similar sort of unwritten rule, maybe not as strict, but really hard technical words seldom appear.

> The crossword has a similar sort of unwritten rule, maybe not as strict, but really hard technical words seldom appear.

Not my experience at all.

Ask me how I know what an EPEE is

EPEE is a common fill word from a lexicon informally known as crosswordese.

https://en.wikipedia.org/wiki/Crosswordese

Really no harder than memorizing all the 2 and 3 letter words in Scrabble and many players will pick most up in a few months.

I didn’t know it was called crosswordese! I wonder what the most common term used is. As a very occasional player, for some reason ARIA, IBIS, and VENI/VIDI/VICI stick out, but I’m sure it’s actually one with an E.

VENI/VIDI/VICI are easy for anyone who studied Latin (as indeed used to be common), and ARIA is similarly easy for anyone who knows about opera. Basically, the crossword is for snobs.

I agree that crosswords often include cultural references that lean towards certain demographics / assuming particular education, and that can feel exclusionary if you don’t share that background - and there's even an argument to suggest snobbery might be behind those choices.

But I disagree that that makes it for snobs. Snobbery is more about an attitude of looking down on others or their tastes, whereas knowing Latin or being a fan of opera is really just about exposure.

Sure, there exist some (too many) opera fans who would say something like "it's real art compared to pop or hip hop being low class trash", but that's not a defining part of liking opera and plenty of people who like opera aren't snobs. Ironically it's a different form of snobbery (sometimes called reverse snobbery though personally I hate that term), to dismiss anyone who learned Latin or who likes opera as being a snob!

Major crossword offenders:

ERR, ORCA, OBOE, ALOE, ORE, ODE

The middle 4 are all fairly common words. "Ode" isn't super common, but I hear it in "An ode to..." phrases. And "err" I've only ever heard in 1 phrase: "To err is human."

See also: "Err on the side of caution."

Epee is not an obscure word. It's an olympic event for goodness sake.

[deleted]

An épée is one of three types of sword used in the three styles of Western fencing. As such, it's about as technical as, say, the words "touchdown" or "mitt".

It's also just the regular French word that means "sword". But although crossword puzzles frequently ask you to know common French words, I've never seen one clue the answer EPEE that way.

> EPEE

They love that one.

If you took fencing at an Ivy League school for your PR requirement you would know all about foil, saber, and epee fencing. Not everyone gets to row crew.

Wholly offtopic but just posting because I thought it was awesome...

During Covid I saw an ad for a fencing school about how it was the best sport during Covid.

You wear a mask

You keep your distance

And if someone doesn't, you stick em with the pointy end

:)

It's actually a terrible sport for covid, involving heavy breathing in close proximity to other people indoors.

Any outdoor sport would be better.

And any 4 letter instrument is usually OBOE and a fish related clue is EELS

> Ask me how I know what an EPEE is

That’s when you’re like, only tangentially involved with the making of a movie or tv show, but too famous to go without a credit?

Ah yes, good old ARA Parseghian. That guy.

IMO scrabble would be improved by a similar limitation. There's too many nonsense words.

Scrabble is a competitive game, not a puzzle, and therefore subject to a different set of constraints. (Players in a competitive game are trying to win; a puzzle author, if they're any good at their job, is ultimately trying to lose.)

In particular, you have to consider the equilibrium. If you only allow a subset of words in Scrabble, this replaces the competitive advantage from knowing lots of words that no one uses in real life, with a competitive advantage from knowing the exact contours of the border between acceptable and unacceptable words. I would argue that this is even worse; at least if you learn lots of Scrabble words you're learning something about the real world.

By contrast, Wordle can self-impose whatever constraints they want on solutions, and people don't have to know what those constraints are in order to solve the puzzle. (It can help a little on the margin, which in a perfect world would not be the case, but it's much less of a problem for the puzzle-solving experience than the Scrabble equivalent would be.)

Ya that's a good point for competitive scrabble. However today I think a lot of people's main exposure to Scrabble comes from WordsWithFriends (and recently, the new NYT games version). In those games, there's no penalty for getting a wrong word, it just won't let you play it. In that context, I at least think it would be nice to have a setting with a more limited list... it could be like Chess timed variants.

It's obviously an impossible challenge to draw those contours in language. Wordle did pretty well though! And going the other direction, just allowing everything that could possibly be a word, just starts getting ridiculous.

Even in casual Scrabble-like games, I expect using a restricted set of words would create a lot of "come on, that's totally a real word, why can't I use it" moments. Most people know at least a few uncommon words that most other people don't (because it's different words for each person).

The Wordle list of legal guesses is not substantially curated; AFAIK basically all five-letter words legal in Scrabble are on it (except on offensiveness grounds, which was a highly controversial decision). If this were not the case, I predict you'd get user dissatisfaction as per above. Wordle's list of possible answers is much more curated, but that's my point; it can err on the side of conservatism, because users won't notice if a word that they'd expect to be on there is missing, whereas they will notice if such a word is not allowed as a guess.

Will Anderson has an excellent Scrabble related channel on YouTube, would recommend to anyone who is interested

Wouldn’t that make Scrabble only harder and more annoying to play? With that limitation you’ll get situations where you play a perfectly valid word, but it gets rejected because it’s not in the list of approved words. To get good at that version of the game, you’ll have to study the Scrabble word list instead of the dictionary.

With Wordle the limitation is only put on the words the game generates as answers. You can use obscure words to guess, they just won’t be the answer.

this is already the case with scrabble; there is a strictly defined scrabble word list that determines whether a word is acceptable or not, and it often leaves out words that you might find in some other dictionary that is not the official scrabble one (collins for most of the world, or a custom dictionary for american scrabble)

Ah ok. Shows what I know about Scrabble.

Caulk is in there, I would say that’s fairly technical. My wife didn’t know it.

I am not a native speaker but how does your wife name the caulk in the shower? Silicone? Or do you maintain it in such pristine condition that no word was ever spoken about it?

Yeah, silicone or just sealant. Maybe it’s an Americanism.

I don't think so - wooden ships have been caulked to seal the planks for a long time.

Yes, that's correct! Took her about a year off and on, he had made a little app for her to go through and categorize everything.

As an aside, for about $200, you can ask a true/false question of every word in the English language with a frontier LLM, and get mostly good answers. I make word games in my free time and was sort of shocked when I realized how cheap intelligence has been getting.

$200? Does this use reasoning? Does it involve forgetting to use KV caching?

This should cost well under $1. Process the prompt. Then, for each word, input that word and then the end of prompt token, get your one token of output (maybe two if your favorite model wants to start with a start-of-reply token), and that’s it.

Yes, it uses reasoning. I tried without it, and at the time with OpenAI's api, it was not giving such good answers. Reasoning improved it a fair amount.

Yes, there’s no point using technically correct words if hardly anyone knows them.

Language or the way we use it is often used to exclude "undesired", so there is a point in using them. Not a very nice point, but a point nevertheless.

Sure there is, as long as your audience does.

Also they seem to never use vulgar words like my opener, penis.

This may well be why the game became such a hit among everyone.

1. Wordle's word list is going to be a lot more curated than TFA's word list because people want to guess words they use or have heard of, not "aahed".

2. Only a tiny group of people care to "card count" Wordle to rule out words that have already been played because they think that sort of min/maxing is fun. Most people don't even think about that, so whether Wordle reuses words every few years is trivial to them.

I will say that having used the same starter word the whole time that has not come up yet, it's a little disappointing that it may now take even longer to appear.

My favourite starter word has come and gone. So I’m in the opposite situation where I feel relieved to be able to go back to using it.

Have you checked it didn't come up before you started?

You may want to swap out aahed if that's what you're rocking.

> Wordle's word list is going to be a lot more curated than TFA's word list because people want to guess words they use or have heard of, not "aahed"

The Times sure doesn't think that about the people who do Letter Boxed. One LB had "polymethylmethacrylate" in its dictionary.

I've saved the daily dictionaries since 2024-03-30 and that's the longest word out of the 93,393 total distinct words in the 674 dictionaries I've saved. They average 1199.47 words per dictionary.
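Roughly how numbers like these fall out of a pile of saved dictionaries. This is a sketch on toy data; the `summarize` helper and the in-memory layout (one set of words per day) are assumptions, not the actual scripts used:

```python
def summarize(dictionaries: list[set[str]]) -> tuple[int, str, float]:
    """Return (distinct word count, longest word, average words per dictionary)."""
    distinct = set().union(*dictionaries)
    longest = max(distinct, key=len)
    average = sum(len(d) for d in dictionaries) / len(dictionaries)
    return len(distinct), longest, average

# Toy stand-ins for the saved daily dictionaries (placeholder data).
saved = [
    {"ant", "plant", "planet"},
    {"ant", "tan", "polymethylmethacrylate"},
]
count, longest, average = summarize(saved)
print(count, longest, average)
```

Words that repeat across days count once toward the distinct total but every time toward the per-dictionary average, which is why the two numbers can differ so much.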

They have some truly ridiculous words, such as "troughgeng". WTF is a troughgeng? Googling that gives a couple of pages in Chinese (or a similar looking language) and a Scottish dictionary entry for "Throu" which in one of the examples of "throu" as an adverb lists a bunch of phrases it is used in, including:

> (8) througang, throw-, throoging, trough-geng, -geong (Sh., Ork.), (i) a going over or through; a passage (I.Sc. 1972); specif. (ii) a narration, a recital (of a story); (iii) a full rotation of crops, a shift; (iv) a thoroughfare, lane, passageway, corridor open at either end (Sc. 1808 Jam.; Sh. 1908 Jak. (1928); Rxb. 1923 Watson W.-B.; Ork., w.Lth., wm.Sc. 1972). Also attrib.; (v) = (5); (vi) energy, drive (Bnff. 1866 Gregor D. Bnff. 192);

The Wordle list is available here (in addition to many other places): https://github.com/pseudosavant/ps-web-tools/blob/main/wordl...

Has anyone confirmed if they still use only this original list? I would think the NY Times could change the word list however they choose.

They changed some words pretty much right after the acquisition. There was some controversy when they started doing "themed" words (like Christmas stuff in December) vs more "random" words. Some words were also removed for having negative vibes/political liability

They removed WENCH from the list of upcoming solutions fairly quickly, but forgot to add it back to the list of available words so you couldn't use it as a guess for a little while. It made it back to the list eventually.

I believe these lists are more like what is described in the blog post. Dictionary of words, filtered to 5 letter words, no plurals, etc. It most likely has 99%+ of the words, but maybe some they don't actually use in Wordle.

> people want to guess words they use or have heard of, not "aahed"

That isn't a correct diagnosis; people have heard of aahed. You'll find it naturally in the expression "[someone] oohed and aahed".

People don't want aahed, and their instinct that it shouldn't count is reasonable, but unfamiliarity isn't the problem with it.

Ooh and aah aren't words, they're sounds (onomatopoeia). A sound is just a sequence of letters used for their phonological values.

You can spell the sound "ah" however you like: ah, ahh, aah, aahh, there's no wrong way to spell it.

If you write "the washing machine tringged when it finished", 'tring' is not a word, even though it's following the rules of English morphology, you could have written any sequence of letters that most faithfully reproduces the sound of the washing machine. You could have written katrigged or puh-tringged.

It's true that onomatopoeia isn't always a word, but in the particular case of "aah", I think that particular choice of letters is conventionalized enough that it is a word.

Ooh and aah most certainly are words. Is meow not a word? Can I spell it miough and sit smugly correct?

That is false; the fact that you can conjugate aah (or tring) into the past tense is sufficient to prove it's a word.

"Crisis" is a massively overblown word for this. And the "wordle community" is a drop in the bucket of regular players, and not remotely representative.

I did have a similar reaction personally to the "exciting news" framing but I'm not actually sure it's wrong. The original list of words was an excellent list, and it's been over 4 years.

> "Crisis" is a massively overblown word for this.

Given that it is Wordle, “panic” would be a far more appropriate word.

Alarm, dread, scare, shock, start, worry.

Alarm is a good guess. On average I can solve a wordle in 3.6 turns when I start with this guess.

Repeated letters are wasted utility. Wouldn't that make it suboptimal for a first move?

Suboptimal - likely. There is some utility: a green letter is more useful than a yellow. Checking for a in two locations when a is a very commonly used letter is __useful__. Still likely much more useful to check for the presence of a fifth letter than a chance at knowing more precisely the location of an a.
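To make the repeated-letter point concrete, here is a sketch of the usual green/yellow/gray scoring with duplicate handling. The `feedback` function and the G/Y/- encoding are my own illustration, not the game's code:

```python
from collections import Counter

def feedback(guess: str, answer: str) -> str:
    """Return one mark per letter: G = green, Y = yellow, - = gray."""
    result = ["-"] * len(guess)
    # Answer letters that were not matched exactly remain available for yellows.
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g != a and unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)

print(feedback("alarm", "plant"))  # -GG--
print(feedback("alarm", "llama"))  # YGG-Y
```

Against "plant", the second "a" goes green and the first goes gray, so the repeated letter tells you both where the "a" is and that there is only one. That is information, just usually less than a fifth distinct letter would give.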

I used to use alert, until it was the word one day (got it in one!). Then I switched.

Apparently I should switch back, since it could be the word again.

I always used the previous day's word as the starting word. IMHO that should have been how the game worked all along.

There seems to be a progression of Wordle strategies.

Playing with a set start word (or words, e.g. "SIREN OCTAL DUMPY" or people who go the "AUDIO ADIEU" route).

(Many people also go down the rabbit hole of looking for "optimal" starting words or choices based on the original word lists.)

Then, once you've played that for a while, you find it's not that much of a challenge unless you end up in one of the forms of madness like _A_E_, and you'll switch to playing in "hard mode" (e.g. correct/green must be played again in the same place in all subsequent tries, yellow letters must also be reused each time).

The hard mode starting with the same word gets a bit boring, so people move on to varying the start word each day, either pulling them from a list or just using the answer for the day before.

There's no "correct" approach obviously, people can play the game however they want and extract the fun/anger however they want.
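For anyone curious, the hard mode rule as described above can be checked in a few lines. This is a sketch of that description only (greens stay put, yellows get reused), with a made-up function name and a G/Y/- pattern string as the assumed input; the real game may enforce it differently:

```python
from collections import Counter

def obeys_hard_mode(next_guess: str, prev_guess: str, prev_pattern: str) -> bool:
    """Check `next_guess` against the hints revealed by one earlier guess.

    `prev_pattern` has one mark per letter: G (green), Y (yellow), - (gray).
    """
    required = Counter()
    for i, (letter, mark) in enumerate(zip(prev_guess, prev_pattern)):
        if mark == "G":
            if next_guess[i] != letter:
                return False  # a green letter must stay in the same position
            required[letter] += 1
        elif mark == "Y":
            required[letter] += 1  # a yellow letter must be reused somewhere
    available = Counter(next_guess)
    return all(available[letter] >= n for letter, n in required.items())

print(obeys_hard_mode("plant", "alarm", "-GG--"))  # True
print(obeys_hard_mode("stone", "alarm", "-GG--"))  # False
```

Note this does not forbid reusing gray letters or replaying a yellow in the same spot, which matches the description above but makes it looser than a fully "consistent" guess.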

Why would you rob yourself of the chance of a wordle-in-one?

Because a wordle-in-one is meaningless. It doesn't mean you're any good at Wordle, the way a hole-in-one suggests you're good at golf. It definitely doesn't mean that you're a "Genius" as the game puts it, because you were operating with zero information and didn't employ any skill or intuition. It just means you burned some luck points on something that doesn't matter.

I used to use “stare” or “stale” as the starting guess when I played Wordle, thinking you’d want to start off with the most common letters, like R-S-T-L-N-E from Wheel of Fortune.

"stale" was used a while back - since then I've been starting with "slate"

But now you can use it again!

It seems about right. They reshuffled the deck about three-quarters of the way through (1689 ÷ 2315 ≈ 73.0%). Blackjack shoes are typically shuffled around the same point. Different games, but similar considerations in this respect.

For my game redactle.net, I blacklist the Wikipedia article for 2 years. I figure there is a tradeoff between novelty and allowing the pool of articles to shrink. The Wikipedia vital level 4 category has 10k articles and probably half of them actually meet the criteria (length, number of languages etc) for making the cut.

As someone who recently built a daily word game[1], I 100% get it. I can say from first hand experience: there's an awful lot of words that are totally valid but not fun.

I spent approximately as much time on building the word list as I did developing the game. The author's technique of just grabbing a word list and spellchecking it is completely not sufficient, you will get so many weird unfamiliar words in there. In the end I was able to whittle down my list to about 24,000 using various automatic methods, but from that point I just had to do a manual review on the remaining list, which meant I got to see a lot of words, and many of them felt very obscure and/or not fun.

1: shameless plug: https://wheybags.com/turntiles

> So that does beg the question:

Since we're being pedantic about words here, it would be better to say that it "raises the question" or "prompts the question"!

Seems like a good post to plug a recent find and my new favourite -

https://puzzlist.com/stackdown

It's from the person who made https://wafflegame.net if you are familiar with it, one of many that came on the tails of the original Wordle.

In comparison, the Stackdown is less rushed and way more rewarding when solved. Also, more interesting in structure.

That's cool to see. I made a mobile game, Downwordly, that has the same mechanic in its puzzle mode. It came out almost five years ago and still has a decent set of versus players.

I'm more proud of a later word game that you can play free at https://wellwordgame.com/en If you give it a try, let me know what you think!

Hopefully this is an ok place to plug my own word game, https://spellrush.com/. It's very different from Wordle but that was a conscious decision since there are so many clones out there these days. Really wanted to put a fresh spin on word games.

Stackdown seems very hard. Took me over 10 min for today's puzzle.

In https://squareword.org (2D variant) I was also running into this problem. It's a bit different though, since I need to find valid 5x5 squares, with 5 words down and 5 across. Surprisingly, there is quite a limited number of such squares.

I've been able to solve it by slowly injecting more challenging words over time, which has the side effect of also introducing a difficulty gradient. Players seem happy so far :)

They just need more bits of entropy - going from IPv4 to IPv6 involved quadrupling it, but this transition is much more minor. They could just go to 6 characters for now, and go to 7 later.

6 characters would be vastly harder. You'd need more than six rows for sure.

I've been waiting years for my word to be my first guess and still nothing... it's been my opener for years. I know my word hasn't been used as I've checked the list of used words.

So for me, reusing words is not what I want to hear.

I've used my own tool (https://pseudosavant.github.io/ps-web-tools/wordle-solver/) for understanding how many words are left after each guess. It'll show hints if you want them too, but they are disabled by default. I like understanding how my guesses reduce the word space well (or not).

It uses the list of all of the words that can be in Wordle, and there are so many words I can't imagine anyone guessing. And I come from a family with large vocabularies.
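The core idea (how many words survive a guess) fits in a short sketch. This is not the tool's actual code; the function names, the G/Y/- encoding and the five-word candidate list are placeholders for illustration:

```python
from collections import Counter

def feedback(guess: str, answer: str) -> str:
    """Return one mark per letter: G = green, Y = yellow, - = gray."""
    result = ["-"] * len(guess)
    unmatched = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
        elif unmatched[g] > 0:
            result[i] = "Y"
            unmatched[g] -= 1
    return "".join(result)

def remaining(candidates: list[str], guess: str, pattern: str) -> list[str]:
    """Keep only the candidates that would have produced `pattern` for `guess`."""
    return [word for word in candidates if feedback(guess, word) == pattern]

# Placeholder candidate list, far smaller than any real word list.
candidates = ["plant", "llama", "cabal", "stone", "abide"]
print(remaining(candidates, "alarm", "-GG--"))  # ['plant']

# How evenly a guess splits the candidates: more, smaller buckets is better.
buckets = Counter(feedback("alarm", word) for word in candidates)
print(len(buckets))  # 5
```

Counting the buckets is a cheap way to see whether a guess reduces the word space well: a guess that leaves most candidates in one big bucket told you very little.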

My friend and I labored over the word lists for our word game subletters.fun. We wanted the word pairs and at least one optimal path for each word pair to be from words on one list, which were simpler words that we would expect everyone to be familiar with. But players could use their own more advanced vocabulary to solve the puzzles on their own without feeling restricted. Then we bundled literally 10 years of unique word pairs into the game and shipped it.

At the risk of being accused of obscurantism, I would like to know more of the words on the 5-letter list that are excluded by Microsoft Word.

Why not add a character (for fun?) The weekend game can be 6 characters and the regular one 5 characters?

The analysis misses a point. Wordle uses two lists of five letter words: words that are in the dictionary, and can be used in a guess; and those that can be used as the daily secret word. The latter list is smaller, and sticks to more common words. Wordle has been around for 1550 days, so they have used 67% of the possible words. In another couple of years, they have to either start using uncommon words, or recycle. There's no rush, so it's unclear why this is happening now.

> Wordle has been around for 1550 days

I'm confused. Today's Wordle is #1,688.

I did an approximate calculation.

[deleted]

This is lame. The original creator of Wordle would’ve been more Spiny.

If anyone's looking for new word games, I built The Daily Baffle which might appeal to some of you. Check it out at dailybaffle.com!

I am guessing a high percentage of Wordle players prefer a Wordle version which uses common words, and the New York Times would prefer to cater to those, rather than a smaller group of enthusiasts.

Maybe it should be „forked“

I'm surprised they weren't reusing words already.

Obviously a finite resource will run out after a while.

Connections is better anyway.

It's a very different kind of game. I don't think it's at all comparable.

My favorite right now is https://tiledwords.com/, not affiliated to it in any way, I just enjoy it.

Hey, thanks! I’m glad you’re enjoying it! (I’m the creator)

It’s become tradition in my house to play tiled words with my wife just before bed. It’s the last thing we do together before falling asleep each night! Thanks for bringing us together with a bit of joy!

That’s awesome, thanks for letting me know!

I recommend anything at https://www.merriam-webster.com/games for these sorts of games. Lots of Wordle variations and all ad-free.

I find Quordle a much better game than Wordle, since there is some real strategy involved, but still not overly much.

Connections is infuriating.

Not only are they using regional specific knowledge, but they use regional relative concepts.

Many people do not agree that ant rhymes with aunt.

The recent Homophones of words meaning brutal.

Gorey, Grimm, Grizzly, Scarry.

I am guessing that Grimm is an eponym, which makes it nebulous at best; eponyms take a lot of use to be regarded in objective terms rather than as invoking an arbitrary property of the name holder. Kafkaesque rises to that use. I don't think Grimm does.

I have no idea if Scarry is supposed to be a homonym for scary. Which it neither sounds like nor means brutal.

Perhaps there is another word that means brutal that sounds like however the person who makes connections thinks Scarry is pronounced.

In which case it would be a homonym of a synonym of brutal.

I also do not live in the same country as Only Connect, yet do not have such issues with their walls.

The real problem is that while you might be wrong about an answer, once you lose faith that the puzzle setter is right, you can never be sure if your guess is wrong or they are wrong. It is no longer a puzzle and you are playing 'what have I got in my pocket?'.

'Grimm' is a homophone of 'grim', 'Grizzly' is a homophone of 'grisly', 'Scarry' is a homophone in US English of 'scary', 'Gorey' is a homophone of 'gory'.

'Gory', 'grisly', 'grim' and 'scary' do all roughly mean brutal.

'Grimm' as the name of the brothers is a red herring connection, with Gorey and Scarry also names of children's authors.

Gory, grisly and grim can be seen as synonymous on an axis maybe close to brutal. They refer to the appearance. Brutal evokes the action that happened. The other words are about how things ended up.

An autopsy can be gory, grisly and depending on circumstances, grim. It is not brutal.

Scary is about a state of mind.

so you have appearance, appearance, appearance, and state-of-mind being considered similar to an action descriptor.

Isn't the point of homophones that they sound like the equivalent word, thus gory, grim, grisly, scary?

I think the confusion is about what "Gorey, Grimm, Scarry" mean. They, along with "Silverstein" in that game, are last names of children's authors.

And that would be OK as a clue if Silverstein was a red herring, Grizzly was also a children's author and Scarry sounded like scary (and also meant something in the same ballpark as Gory, Grim, and Grisly)

Richard Scarry's surname is indeed pronounced "scary," rather than (as I assumed for many years) "scarr-ry."

That is, it rhymes with Harry, Larry, carry, parry, tarry, and marry, rather than... uh, starry, I guess?

Where I come from, Scarry rhymes with Harry, but Harry does not rhyme with scary.

  Harry does not rhyme with hairy
  Scarry does not rhyme with scary
  Marry does not rhyme with Mary. Nor with merry!
You can probably triangulate my childhood home with that information. :)

is "valew" related to the Brazilian "valeu", expressing gratitude/satisfaction?

Depends on your point of view.

The most direct thing we can say is "no, because there is no such word as valew". It's not in Merriam-Webster, it's not in Samuel Johnson's 18th-century dictionary, it's not in the Collins dictionary (for British English).

It is in the Oxford English Dictionary, where it is noted as a "[spelling] variant of value" from the 14th century. It has never been a word with any other meaning than that of value, and using it now would be a pure error if someone used it, which obviously nobody will ever do. Accepting it in Wordle makes as much sense as accepting vvest on the theory that that was an acceptable spelling of west in the past.

There is an etymological connection between Portuguese valeu and English value, in that they both descend from Latin valeo, but value has no sense of gratitude or satisfaction. (I'm guessing the blog author was misled by https://en.wiktionary.org/wiki/valew#Portuguese , which says that valew is Portuguese internet slang for valeu.)

thanks a lot!

Every now and then I play quordle, octordle, and once a thousand-word variation (which breaks down gameplaywise to just getting every letter at every spot).

A bit of reuse of the same word in the one-word version can't hurt I think

Yet the current word list apparently doesn't include "Irish" (even though Welsh, Scots and Brits are all valid). ¯\_(ツ)_/¯

I start with the same word every day. I hate to change it, because I want the joy of getting the wordle in 1 someday.

It doesn't beg the question, it raises it. Begging the question is a type of logical fallacy in which you assume the truth of your conclusion. It doesn't mean something "begs for the question to be asked."

I have no idea why this incorrect use of the term drives me so nuts; however, you'd think a blog post about English words and Wordle wouldn't make this mistake.

I agree with you. But it's clear that "begging the question" is going the way of "literally," and there's (sadly) nothing we can do about it.

I suppose some time in the future, someone will invent a new phrase meaning "assuming your conclusion".

At what point did dictionaries providing descriptive views of the English language turn into a prescriptive one that emboldens people to just point to repeated wrong usage rather than admit they were wrong?

assuming your conclusion, why would we need a new phrase?

[deleted]

Well, I for one won't be party to it. I think informing everyone I can is my drop in the bucket in the fight against the incorrect usage of words. :-)

When you win that battle, would you please fight iOS predictive text vs proper apostrophe use next?

I think the idea was NYT was trying to imply they were running out.

To me, "begging the question" doesn't mean assuming the conclusion in particular, it just means that some of the premises used are less obvious than they are being passed off as. Assuming the conclusion is merely an especially egregious form of that.

I was objecting to the incorrect use of the phrase at the end of the article.