Amazon must solve hallucination problem before launching AI-enabled Alexa

I used to work on Alexa, and I just don't see it being converted to an LLM solution for responses at all. A core problem is that users expect responses to be quite reliable outside of general knowledge, and the currently hyped LLMs just don't provide that. Also, general knowledge has never even been a big draw for average consumers in a voice assistant. Having Alexa work off of an LLM is just not going to be particularly reliable anytime soon and is rife with risk for consumer harm in various ways. Another massive problem LLMs face for voice assistants is response time, both time to first response and length of response. If a voice response isn't highly reactive and to the point, users get pretty pissed off and decrease engagement.

We just aren't anywhere near having LLMs be reliable enough with low enough latency and to-the-point responses to launch as a voice assistant replacement for Alexa/Siri/Google Assistant.
 
Upvote
13 (13 / 0)

MobiusPizza

Ars Scholae Palatinae
1,363
I used to work on Alexa, and I just don't see it being converted to an LLM solution at all. A core problem is that users expect responses to be quite reliable outside of general knowledge, and the currently hyped LLMs just don't provide that. Also, general knowledge has never even been a big draw for average consumers in a voice assistant. Having Alexa work off of an LLM is just not going to be particularly reliable anytime soon and is rife with risk for consumer harm in various ways. Another massive problem LLMs face for voice assistants is response time, both time to first response and length of response. If a voice response isn't highly reactive and to the point, users get pretty pissed off and decrease engagement.

We just aren't anywhere near having LLMs be reliable enough with low enough latency and to-the-point responses to launch as a voice assistant replacement for Alexa/Siri/Google Assistant.

As a customer, I just want something simple: for these assistants to have better natural-language processing and to understand commands phrased with variations in sentence structure, grammar, or object names. I am not looking for a chatbot to have philosophical discussions with me. As others have posted, currently if you say "turn downstairs lights off" and "turn the lights off downstairs", one will work and one will fail. It's useless.
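A toy sketch of why this word-order failure happens: if commands are matched against exact phrase patterns, any reordering misses, whereas an order-insensitive matcher handles both. The patterns, names, and matching strategy below are invented for illustration and are not Alexa's actual pipeline.

```python
# Hypothetical command table: one "canonical" phrase per action.
KNOWN_COMMANDS = {
    "turn off the downstairs lights": ("lights_downstairs", "off"),
}

def rigid_match(utterance):
    # Exact string lookup: any change in word order fails.
    return KNOWN_COMMANDS.get(utterance.lower())

def bag_of_words_match(utterance):
    # Order-insensitive matching: compare word sets, not exact strings.
    words = set(utterance.lower().split())
    for pattern, action in KNOWN_COMMANDS.items():
        if set(pattern.split()) == words:
            return action
    return None

print(rigid_match("turn off the lights downstairs"))         # None: reordering breaks it
print(bag_of_words_match("turn off the lights downstairs"))  # ('lights_downstairs', 'off')
```

Real assistants sit somewhere between these extremes (slot-filling grammars with limited flexibility), which is why some reorderings work and others inexplicably don't.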
 
Upvote
4 (4 / 0)

FSTargetDrone

Ars Scholae Palatinae
748
I think they have seriously overestimated how much people want that.

I have Amazon Echos in my home; the majority of use is home-automation activation, weather, and timers. My wife and I also use them to communicate between our two home offices, which are on different floors.
Yeah the vast majority of what we use Alexa for is smart device control and then timers.
Lots of food timers. I have no interest in using it for anything beyond that. I don’t want to have conversations with a computer.
 
Upvote
3 (3 / 0)

thekaj

Ars Legatus Legionis
48,270
Subscriptor++
This is just going to turn into the next generation of the virtual-assistant failure. It involves the exact same players: Microsoft, Apple, Amazon, and Google. And they're all throwing an insane amount of money at the issue. Apple once again seems to be the only one mostly viewing it as something to graft into existing products to make them more attractive, while the others are once again betting that these things will be their own products that pay for themselves.

So in a few years' time, major shareholders are going to start paying attention to how much money has been invested in these things and demanding to know when they can expect to actually start making money, let alone recoup the costs. At which point there will be a lot of uncomfortable shifting in chairs among the MS, Amazon, and Google finance people.
 
Upvote
9 (9 / 0)
Funny how Apple’s caution seems to be borne out by Amazon’s experience.

Personally, I’d rather have tiny, incremental, predictable improvements in on-device agents with abilities to do real-world things.

There’s always ChatGPT if you want to…chat.
The problem with that in the voice assistant space is that it's not enough to launch anything. Look at what Apple has been doing so far with Apple Intelligence, which isn't even within the voice assistant space: it's all so minimal that it isn't particularly useful. Now Apple is building a reputation of delivering near-zero value with Apple Intelligence, so consumers are beginning to conclude that Apple can't do anything with AI.
 
Upvote
3 (3 / 0)
If regular Alexa couldn't generate any meaningful subscription revenue or sales commissions, does the team seriously expect GenAI Alexa (which is presumably far more expensive per query) to be different? It seems like Amazon is doubling down on a failed monetization strategy.
I think you are missing the point of what Alexa is aiming to achieve by using LLMs for voice interactions, specifically regarding Alexa responses. It wants to leverage the general-knowledge successes of things like ChatGPT in a way that customers perceive as valuable enough to pay for monthly, likely by offering a one- to three-month free trial when it becomes available. In a perfect world, that would be a killer revenue stream. Do I think they'll succeed? Absolutely not.
 
Upvote
6 (6 / 0)
I would have agreed with you, but we've crossed the threshold where 'AI' means absolutely anything the marketing department decides, which may or may not be a step in any direction whatsoever.

E.g. any sort of eye tracking for camera autofocus now seems to be called AI; we had lesser versions of eye tracking autofocus at least a decade ago and nobody felt the need to call it AI but at some point the tag got attached. Including by Sony, which people generally regard as having the best eye-tracking autofocus in the industry.

I really think they just kept improving what they had and put a new label to it.
The thing is: there is a massive number of applications for 'AI' because it's an extremely broad technical space, including the example you gave using visual models. 'AI' has been a thing for a very long time, but most recently it is getting a lot of attention because of large language models (LLMs), which are a smaller but still huge subset of 'AI'.
 
Upvote
1 (1 / 0)

markgo

Ars Praefectus
3,776
Subscriptor++
The problem with that in the voice assistant space is that it's not enough to launch anything. Look at what Apple has been doing so far with Apple Intelligence, which isn't even within the voice assistant space: it's all so minimal that it isn't particularly useful. Now Apple is building a reputation of delivering near-zero value with Apple Intelligence, so consumers are beginning to conclude that Apple can't do anything with AI.
No one wants to be the first to hook up AI to things that affect the real world even in small ways. Wonder why.
 
Upvote
2 (2 / 0)

jdale

Ars Legatus Legionis
18,261
Subscriptor
I used to work on Alexa, and I just don't see it being converted to an LLM solution for responses at all. A core problem is that users expect responses to be quite reliable outside of general knowledge, and the currently hyped LLMs just don't provide that. Also, general knowledge has never even been a big draw for average consumers in a voice assistant. Having Alexa work off of an LLM is just not going to be particularly reliable anytime soon and is rife with risk for consumer harm in various ways. Another massive problem LLMs face for voice assistants is response time, both time to first response and length of response. If a voice response isn't highly reactive and to the point, users get pretty pissed off and decrease engagement.

We just aren't anywhere near having LLMs be reliable enough with low enough latency and to-the-point responses to launch as a voice assistant replacement for Alexa/Siri/Google Assistant.
I don't think they can solve this problem, but I disagree that it will block implementation. Alexa is a money-loser, and there is a very significant part of the American public that does not care about accuracy. Yes, it hallucinates, yes, it's going to be slower than the current system, yes, it's going to actually make responses to simple tasks less accurate. And they will release it anyway and we will get to see videos on your choice of social media about the most amusing results.
 
Upvote
1 (1 / 0)

Fatesrider

Ars Legatus Legionis
24,977
Subscriptor
On one hand, I’m glad that at least one massive company seems to care about its products producing nonsense.
On the other hand… isn't avoiding hallucinations basically impossible, given that LLMs are mainly very clever autocomplete? Even with some effort to provide sources, as Google tries to, they're terrible at picking up tone and will cite sarcasm as fact.
Yeah, my impression of the way AI is done is that there's no way to avoid hallucinations. It's literally baked into the processes used to create responses. And input affects output in subtle ways that won't necessarily parse properly the first time, resulting in some odd response that's a dead giveaway that it came from an AI and not a human.

They'll have to do AI differently than they have, and train it in FAR different ways (such as making sure factual information is used, and that it can parse the difference between fact and "popular belief").

Right now, it's not really "good enough" to fool most people (though it fools many, because they're none too bright at picking up the discrepancies). So before I bend the knee to our AI overlords, they'd better be able to handle human input with output that is always responsive in a human-convincing way. None of them I've tried so far have met that benchmark.
 
Upvote
3 (3 / 0)

Uncivil Servant

Ars Scholae Palatinae
4,667
Subscriptor
We got an Echo Dot for $5 on a punt circa 2018 because it was bundled with something or another. My pronunciation is about as straightforward and uninflected as southern England can produce and my history in publishing means that I chide myself if I even so much as split an infinitive. That's despite having spent now almost half my adult life in the US; people even ask when my natural-born-American child moved here because he's done such an excellent job of learning my British accent.

Alexa couldn't even understand the prompt for a cookie recipe that I recited directly from its literature. Due to its general uselessness we passed on the Echo Dot to somebody else shortly thereafter.

So either the speech recognition is calibrated by geolocation or Alexa was having a really bad few days.

Damn, if it can't handle South Anglian then it definitely can't handle Tihdewottah's 17th century South Anglian.
 
Upvote
0 (0 / 0)

cibyr

Seniorius Lurkius
22
Subscriptor++
At this point, whenever I see a new product release and the main feature is "now includes AI" I'm 100% not interested in that product. It's a step backwards IMO. Sure hope this trend of cramming AI into everything ends sooner rather than later.
When was Alexa not an AI product?
 
Upvote
1 (1 / 0)

orwelldesign

Ars Tribunus Angusticlavius
7,307
Subscriptor++
'Your plastic pal that's fun to be with' was sarcasm. NOT a road map.

I am unsure how that has slipped past the tech bros. Of course, there is the whole problem with '1984', so I suppose I shouldn't be surprised.

Just out of curiosity... What whole problem with 1984? Be specific.

I mean, I've read an awful lot of dystopias and near-dystopias and cyberpunk grim futures, and, when this comes up, I'm always curious specifically what is meant.

Thanks in advance for your thoughtful reply.
 
Upvote
1 (1 / 0)

Maestro4k

Ars Tribunus Militum
1,537
If history is any guide, they will invest heavily, still have hallucinations, give into the sunk cost fallacy, hype the shit out of their new "Alexa AI", and ignore the inevitable tsunami of complaints from people whose Alexa routinely gives false information and accidentally orders 10000 widgets.
Personally, I'm looking forward to Alexa AI accidentally ordering someone a five-gallon drum of sexual lube. The prudishness of American society will make that significantly worse for Amazon to deal with than an accidental order for too many widgets. It might even cause them to pull the plug on Alexa AI overnight.
 
Upvote
0 (0 / 0)

Alexstarfire

Ars Scholae Palatinae
720
Is it even possible to fix? Feels like you'd have to come up with an entirely new AI system to fix it.

The flaw is that generative AIs are too much like humans. They make up shit randomly, don't give you the same response to the same inquiry, and are also super confident about it. However, they need to be more like reference material, like encyclopedias: 100% true information, or at least as true as current understanding allows.
 
Upvote
6 (6 / 0)
Personally, I'm looking forward to Alexa AI accidentally ordering someone a five-gallon drum of sexual lube. The prudishness of American society will make that significantly worse for Amazon to deal with than an accidental order for too many widgets. It might even cause them to pull the plug on Alexa AI overnight.
"Alexa, turn on the lights"

Alexa: "Okay, flooding the enrichment center with a deadly neurotoxin"
 
Upvote
5 (5 / 0)

FSTargetDrone

Ars Scholae Palatinae
748
You know what, Alexa is so bad at just understanding something simple, how bad could the hallucinations be comparatively?

Alexa can't understand that "turn off the downstairs lights" is functionally the same as "turn off the lights downstairs". Yes, I know one is grammatically better than the other, but functionally this is something simple.
I’ve set my various routines to recognize a few different phrases and word orders I might use for a particular action.

For example, if I want to turn on the projector down in our den, I can say:

“Turn On Den TV”
“Turn On Den Projector”
“Turn Den TV On”
“Turn Den Projector On”

And so forth. A few minutes of adding the different voice commands I’m likely to use as presets and we’ve had almost no problems.

I also only recently discovered that I can simply say “turn off the lights” and Alexa will only turn off the lights in the room I’m in at the time. Of course, this has required having an Echo in nearly every room, but that was relatively cheap to do, as most of the Echo devices I have were the very cheap and very useful Echo Flex model. I hadn’t even thought about being able to do that when I bought all of the Echo devices several years back.

I wouldn’t even mind paying a nominal fee to simply continue to use the system as I do now, almost exclusively for smart device control, multi-room music streaming, setting timers/alarms, and asking about the weather or the odd question. I’m quite surprised there has been no subscription for that to this point.

Most of my devices are controlled locally via Hubitat, so if Alexa went away tomorrow or the price were too high, I could do without it, though it does seem possible to do local voice control (I think). Most of the light control is done with timers or motion sensors anyway, so I don’t even ask to adjust individual lights much anymore. When I turn on the TV in a room, the lights dim down after a minute. If someone gets up in the middle of the night to use the bathroom or go down to the kitchen, motion sensors detect it and some specific lights come on dimly, then go off or drop to minimum brightness a few minutes after motion is no longer detected.
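The motion-to-dim-light rule described above boils down to a small state machine: motion sets the lights to a dim level, and a timeout with no further motion drops them back to off. Here's a minimal sketch of that logic; the class name, brightness level, and timeout are invented for the example and don't correspond to any real Hubitat rule.

```python
OFF_DELAY_S = 180   # hypothetical: 3 minutes with no motion before lights drop
DIM_LEVEL = 20      # hypothetical: percent brightness on motion

class MotionLightRule:
    """Toy model of a motion-triggered dim-lighting automation."""

    def __init__(self):
        self.last_motion = None
        self.light_level = 0  # percent brightness

    def on_motion(self, now):
        # Motion event: bring the lights up dimly and restart the timeout.
        self.last_motion = now
        self.light_level = DIM_LEVEL

    def tick(self, now):
        # Called periodically by the automation loop; turns lights off
        # once the no-motion timeout has elapsed.
        if self.last_motion is not None and now - self.last_motion >= OFF_DELAY_S:
            self.light_level = 0

rule = MotionLightRule()
rule.on_motion(now=0)
rule.tick(now=60)    # within the delay: lights stay dim
print(rule.light_level)   # 20
rule.tick(now=200)   # past the delay: lights go off
print(rule.light_level)   # 0
```

Hub platforms like Hubitat or Home Assistant express the same idea declaratively (trigger, condition, delayed action) rather than in code, which is what makes these rules survivable without a cloud assistant.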

I love it, but I could do without Alexa specifically.
 
Upvote
0 (0 / 0)

FSTargetDrone

Ars Scholae Palatinae
748
Apple and Amazon are two of the biggest players in home voice assistants (hardware and software). Both have struggled for years to develop an (arguably) compelling voice product. Each is taking a very different approach.
It's still very early in the AI wars. One can argue that there is no market for more advanced products, but these companies seem to believe otherwise, and they have the resources to continue to evolve.
Whatever its faults, in my experience Alexa is miles ahead of Apple in terms of voice control as it stands right now. I’ve had no problems at all controlling a myriad of smart device brands with Alexa/Hubitat, but Apple/HomeKit is very fussy. Half the time Siri cannot understand what I’m saying about a specific device, but that’s rarely an issue with Alexa.
 
Upvote
2 (2 / 0)

jballou

Ars Scholae Palatinae
889
Just out of curiosity... What whole problem with 1984? Be specific.

I mean, I've read an awful lot of dystopias and near-dystopias and cyberpunk grim futures, and, when this comes up, I'm always curious specifically what is meant.

Thanks in advance for your thoughtful reply.
The same thing that's wrong with Blade Runner or the Torment Nexus: Musk et al. look at the wild wealth disparity of a cyberpunk dystopia as a preferable outcome.

It doesn’t help when the majority of idiots who worship them also reference 1984 as a reason to oppose universal healthcare but have never read or critically analyzed the actual story.
 
Upvote
5 (5 / 0)

Uncivil Servant

Ars Scholae Palatinae
4,667
Subscriptor
The same thing that's wrong with Blade Runner or the Torment Nexus: Musk et al. look at the wild wealth disparity of a cyberpunk dystopia as a preferable outcome.

It doesn’t help when the majority of idiots who worship them also reference 1984 as a reason to oppose universal healthcare but have never read or critically analyzed the actual story.

Hmmm, now what do I have to do to get them to read The Traitor Baru Cormorant, one of the few dystopias that accurately depicts the role of policy advisors in an empire?

(or maybe not, it would probably turn their "Deep State" conspiracy theories up to 11, but Dickinson is quite astute in pointing out that advisors tend to come from minorities and the fringes of the empire for good reasons)
 
Upvote
1 (1 / 0)
On one hand, I’m glad that at least one massive company seems to care about its products producing nonsense.
On the other hand… isn’t avoiding hallucinations basically impossible, given how LLMs are mainly very clever autocomplete?
Different LLMs produce different numbers and types of hallucinations. And various post-training and prompting techniques fudge those numbers around as well... So who knows, maybe they will settle on some threshold, like 0.3% of some sample of queries, and then also try to disable it from doing and saying the kinds of things that get them sued.

(Or they will just turn it on when they think it's profitable, and use that as an excuse.)

Also... I think it would depend a lot on what you want to do with Alexa...? LLMs do some things well and some things terribly.
 
Upvote
0 (0 / 0)
I am SICK to death of everyone calling all this stuff AI. It is NOT, I repeat, NOT AI if you just write software that is more and more complex, capable, and large. You can make it as complex and sophisticated and capable as you want, but it is still just deterministic SOFTWARE, doing what it is programmed to do. It may appear to be intelligent and appear to go off in some strange humanistic directions sometimes, but it is still just very, very, very, very complex and capable software. "AI" is just a buzzword to make something appear more important or different than it is. Remember the other famous buzzword/acronym, Y2K.
 
Upvote
0 (1 / -1)

orwelldesign

Ars Tribunus Angusticlavius
7,307
Subscriptor++
I am SICK to death of everyone calling all this stuff AI. It is NOT, I repeat, NOT AI if you just write software that is more and more complex, capable, and large. You can make it as complex and sophisticated and capable as you want, but it is still just deterministic SOFTWARE, doing what it is programmed to do. It may appear to be intelligent and appear to go off in some strange humanistic directions sometimes, but it is still just very, very, very, very complex and capable software. "AI" is just a buzzword to make something appear more important or different than it is. Remember the other famous buzzword/acronym, Y2K.

I'm sure there are better examples than Y2K.

You know why mostly nothing happened? Because a shit-ton of people worked their collective asses off to fix things that could have been capital-P problems.

It's like with vaccines -- they worked so well people forgot the horrors of those communicable and preventable diseases.

I agree about the AI bit; we're nowhere near Wintermute. But y2k isn't the best comparison.
 
Upvote
3 (3 / 0)

Matthew J.

Ars Tribunus Angusticlavius
7,832
Subscriptor++
Maybe it's time to start investing in a couple of these. Pretty much the only thing I use Alexa for anymore is controlling Home Assistant, and occasionally asking it to play music from my Plex server. I really don't want to deal with a whole bunch of AI nonsense in my "smart" speaker.

https://www.home-assistant.io/voice-pe/
 
Upvote
0 (0 / 0)