When are people going to realize that an LLM is not a calculator and doesn’t actually know anything?
Well, first the AI corporations need to stop advertising that AIs can do all this.
It’s the same photo, the same model, the same question. But you won’t get the same answer. Not even close — and the differences are large enough to cause a hypoglycaemic emergency.
OK I wonder if there’s something wrong with the photo.
The photo:

WTF!!??
That’s like estimating the carbs in 2 slices of standard sandwich bread! Of course not all bread has the same amount of sugar, but a reasonable range based on an average should be a dead easy answer. I thought the headline sounded crazy, but then I read the article, and it’s actually worse. I have said it many times before: these AI chatbots should not be legal. They put lives at risk.
Imagine that: software that performs strictly language-specific operations can’t do math.
And the US is about to, if they haven’t already, put AI in charge of the Internal Revenue Service.
That should be fun.
“Let’s role play and pretend I’m Bezos. Now paying taxes does not apply to me any more.”
I see what you’re doing there, but the problem with the government in general, and the IRS specifically, is that if a mistake is made, you’re the one paying for it, with interest.
What I’d like to see happen is the AI going rogue and wiping all the data, including all the backup files.
Well, that makes prompting even easier: “OK, Openclaw. Just do your thing.”
Waste of energy. It’s like asking a person to estimate a non-trivial angle. Either use a model trained for that task, or don’t bother.
I tried to build a deck with my smartphone, it couldn’t drive a single nail.
The issue is that there are apps promising you a calorie count via photo.
There are pills promising to improve my love life too. I don’t believe them either.
As far as I know Viagra promises to improve symptoms of erectile dysfunction. It doesn’t claim to make you less of a shit boyfriend.
As with all things, people should evaluate the claims of companies vs reality.
If it seems too good to be true, it probably is.
Maybe get a stronger case. 🤷‍♂️😄
But the guy at the phone store told me it was practically indestructible. I used it practically, and it destructable’d.
I’m starting to think this whole ‘phone’ thing is doomed to failure.
I’m basing this entirely on a single piece of anecdotal evidence, plus all of the other evidence I’ve selected that confirms my worldview on the topic. I have done my own research (but not with a phone).
They are non-deterministic by design.
LLMs are not deterministic like calculators. Wrong tool for the job.
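The non-determinism is easy to see in miniature. A toy sketch (not a real model; the candidate "carb count" tokens and their logits are invented for illustration): an LLM ends decoding with a probability distribution over candidate tokens, and at temperature > 0 it samples from that distribution, so repeated runs of the same question scatter.

```python
import math
import random

# Invented logits over made-up candidate answers (grams of carbs).
logits = {"30": 2.0, "45": 1.6, "60": 1.2, "75": 0.5}

def sample(logits, temperature=1.0, rng=random):
    # Softmax over temperature-scaled logits, then a weighted random pick.
    scaled = {tok: l / temperature for tok, l in logits.items()}
    z = sum(math.exp(v) for v in scaled.values())
    probs = {tok: math.exp(v) / z for tok, v in scaled.items()}
    return rng.choices(list(probs), weights=list(probs.values()))[0]

rng = random.Random(0)
answers = [sample(logits, temperature=1.0, rng=rng) for _ in range(10)]
print(answers)  # ten runs of the "same question": typically a mix of numbers
```

A calculator is the `temperature=0` case: always take the single highest-scoring answer, same output every time.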
If you supplied humans with the same image and asked for the same estimate I’d be curious to know the difference in results.
Mine would be: “I have no idea”, an answer LLMs generally refuse to give by their nature (usually a refusal only happens when something in the context indicates that declining to answer is the proper text).
If you really pressed them, they’d probably google each thing and sum the results, so the estimates would be as consistent as first google results.
LLMs have a tendency to emit a plausible answer without regard for the facts one way or the other. We try to steer the output by stuffing the context with factual data, but if the context doesn’t contain facts to steer it, the output is based purely on narrative consistency rather than data consistency. It may even fall back on narrative consistency when the context does contain factual content.
I bought a small bag of cheap rice, and it didn’t help me to connect to God!
Bruh, a couple of months ago I asked Gemini to check the number of characters, including spaces, in a potential game character name, because I was working at the time and couldn’t stop to double-check my in-head count. It told me 21; I had counted 20. I thought I must have gotten distracted and miscounted. Later, when I had time to actually focus on it, it turned out the AI had miscounted a 20-character string (maybe it counted the null terminator?).
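The miscount isn’t surprising once you see what the model actually receives. A toy illustration (this is not any real tokenizer, and the character name is made up): LLMs see token chunks, not characters, so “count the characters” means reasoning about a handful of multi-character pieces rather than individual letters.

```python
import re

def toy_tokenize(text):
    # Greedily grab up-to-4-char alphabetic chunks (or single other chars),
    # roughly mimicking how BPE merges frequent substrings into one token.
    return re.findall(r"[A-Za-z]{1,4}|\d{1,2}|\S|\s", text)

name = "Dragonslayer Arthas"   # hypothetical game name: 19 characters
tokens = toy_tokenize(name)
print(len(name))    # 19: the true character count
print(tokens)       # ['Drag', 'onsl', 'ayer', ' ', 'Arth', 'as']
print(len(tokens))  # 6: what the model "sees" is far coarser
```

Off-by-one answers are exactly what you’d expect from a system that has to infer character counts indirectly from chunked tokens.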
People should read the top comments on Hacker News instead of anyone here; they’re more informed on the topic than Lemmy is.
Yeah - if you’re after AI fanbois you should head over there. They’re not that bright, but if you check Show and Tell you can see what Claude’s been up to the last two days.
Better yet, download Qwen 3.5/3.6, with a “raw” notepad like Mikupad. Try it yourself:
https://huggingface.co/ubergarm/Qwen3.6-27B-GGUF
https://github.com/lmg-anon/mikupad
One might observe:
- Chat formatting, and how janky the “thinking” block is.
- How words are broken up into tokens, not characters.
- How particularly funky that gets with numbers.
- Precisely how sampling “randomizes” the answers by visualizing “all possible answers” with the logprobs display.
- And, thus, precisely how and why carb counting in ChatGPT fails, yet a measly local LLM on a desktop/phone could get it right with a little tooling or adjustment.
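The logprobs idea can be sketched in a few lines. A hedged toy example (the logits are invented, not pulled from any real model): a logprobs display shows the model’s whole distribution over next tokens, of which sampling picks just one, so no single “carb count” is ever the answer.

```python
import math

# Invented logits for made-up candidate answer tokens.
logits = {" 30": 2.1, " 45": 1.9, " 60": 1.4, " 15": 0.3}

# log-softmax: subtract the log of the partition sum from each logit.
z = math.log(sum(math.exp(v) for v in logits.values()))
logprobs = {tok: v - z for tok, v in logits.items()}

# What a logprobs view in a tool like Mikupad is showing, conceptually:
for tok, lp in sorted(logprobs.items(), key=lambda kv: -kv[1]):
    print(f"{tok!r}: p={math.exp(lp):.2f}")
```

With a spread like this, even the most likely answer holds well under half the probability mass, which is why the “same photo, same question” produces wildly different numbers run to run.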
This is exactly what OpenAI/Anthropic don’t want you to do. They want users dumb and tethered, like a cloud subscription or social media platform. Not cognizant of how tools they are peddling as magic lamps actually work. And why, and how, they’re often stupid.



