"Upon closer inspection, there are 2 civilians in lethal range."
"--try again."
"Thank you for the correction; there are in fact 3 civilians in lethal range."
"Nope."
"I apologize for the confusion. To avoid further error, I have just ensured that 0 living civilians remain in range."
Someday, with enough server farms cranking away through gigawatts of power and oceans of cooling water, we’ll achieve the dream of building a computer that can count to three.
These LLMs aren't trained to learn words character by character; instead, multiple characters are grouped together into a token. They can only spell words out if they've seen them spelled out, and they'll only answer questions about the letters correctly if they invoke the spelled-out version.
The fact that they can't reliably fall back on a spelled-out version (and that the companies haven't recognized this as an issue and set things up to query a dictionary program for spelled-out versions when questions like this come in) means they fail badly and hilariously.
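To make the tokenization point concrete, here's a toy sketch (not any real LLM's tokenizer, just an illustration of the idea): a vocabulary where a common word like "strawberry" has been merged into a single opaque token, while the spelled-out version becomes one token per letter.

```python
# Toy vocabulary: whole common words get one ID (as BPE-style merges
# tend to produce), while individual letters get their own IDs.
vocab = {"strawberry": 101, "s": 1, "t": 2, "r": 3, "a": 4,
         "w": 5, "b": 6, "e": 7, "y": 8}

def tokenize(text):
    # Split on spaces; each known piece maps to exactly one token ID.
    return [vocab[piece] for piece in text.split()]

print(tokenize("strawberry"))           # one token: [101]
print(tokenize("s t r a w b e r r y"))  # ten tokens, one per letter
```

From the model's side of the vocabulary, `[101]` carries no record of which letters it was built from, which is why the spelled-out form behaves so differently.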
Bing chat gets it wrong the first time, but there's a guardrail that, when you ask it to count again, does the spelled-out version and gets it right; guardrails probably add a little latency and cost to the query.
The other day I tried to conjure an analog clock displaying a specific time, and it never came close while insisting it was showing me the right time, so... I literally wouldn't ask it the time of day.
The funny thing is that AI can regurgitate code that works at least half of the time. So if you asked it to write a regex and run it, it could do that; of course, it has no understanding that this is the same thing as counting r's.
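The irony is easy to demonstrate: code the model could plausibly write solves the problem instantly, because code operates on characters rather than tokens. A minimal sketch:

```python
import re

word = "strawberry"

# A regex count, the kind of snippet an LLM could produce on request.
regex_count = len(re.findall("r", word))

# Plain string counting gives the same answer without regex.
plain_count = word.count("r")

print(regex_count, plain_count)  # both print 3
```

Either one-liner counts the r's correctly every time; the failure is in the model answering from tokens, not in anything hard about the task.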
What’s going on here is that the input text is passed through a tokenizer that turns the whole word into a vector before it’s passed to a neural net. The inner neural net can’t even see the character-level details so it’s forced to guess. This also accounts for challenges with getting it to rhyme.
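That "can't even see the character-level details" claim can be sketched with a toy embedding lookup (illustrative only, with made-up dimensions and values, not any real model's pipeline): by the time the input reaches the inner network, each token has already become a dense vector.

```python
# Toy embedding table: token ID -> learned vector (values are arbitrary
# stand-ins here). Assume "strawberry" was tokenized to the single ID 101.
embedding = {101: [0.12, -0.87, 0.44, 0.05]}

def inner_net_input(token_ids):
    # This is all the downstream network receives: dense vectors with
    # no character structure left to inspect.
    return [embedding[t] for t in token_ids]

print(inner_net_input([101]))  # [[0.12, -0.87, 0.44, 0.05]]
```

Nothing in that vector says "ends in -erry," which is also why rhyme questions hit the same wall.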