Amazon must solve hallucination problem before launching AI-enabled Alexa

I used to work on Alexa, and I just don't see it being converted to an LLM solution for responses at all. A core problem is that users expect responses to be quite reliable outside of general knowledge, and the currently hyped LLMs just don't provide that. Also, general knowledge has never even been a big draw for average consumers in a voice assistant. Having Alexa work off of an LLM is just not going to be particularly reliable anytime soon and is rife with risk for consumer harm in various ways. Another massive problem LLMs face for voice assistants is response time, both time to first response and length of response. If a voice response isn't highly reactive and to the point, users get pretty pissed off and decrease engagement.

We just aren't anywhere near having LLMs be reliable enough with low enough latency and to-the-point responses to launch as a voice assistant replacement for Alexa/Siri/Google Assistant.
 
Upvote
13 (13 / 0)

MobiusPizza

Ars Scholae Palatinae
1,363
I used to work on Alexa, and I just don't see it being converted to an LLM solution at all. A core problem is that users expect responses to be quite reliable outside of general knowledge, and the currently hyped LLMs just don't provide that. Also, general knowledge has never even been a big draw for average consumers in a voice assistant. Having Alexa work off of an LLM is just not going to be particularly reliable anytime soon and is rife with risk for consumer harm in various ways. Another massive problem LLMs face for voice assistants is response time, both time to first response and length of response. If a voice response isn't highly reactive and to the point, users get pretty pissed off and decrease engagement.

We just aren't anywhere near having LLMs be reliable enough with low enough latency and to-the-point responses to launch as a voice assistant replacement for Alexa/Siri/Google Assistant.

As a customer, I just want something simple: for these assistants to have better natural-language processing and to understand commands phrased with variations in sentence structure, grammar, or object names. I am not looking for a chatbot to have philosophical discussions with me. As others have posted, currently if you say "turn downstairs lights off" and "turn the lights off downstairs", one will work and one will fail. It's useless.
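A toy sketch of why this word-order failure happens: if commands are matched against exact phrase patterns, any reordering misses, whereas an order-insensitive matcher handles both. The patterns, names, and matching strategy below are invented for illustration and are not Alexa's actual pipeline.

```python
# Hypothetical command table: one "canonical" phrase per action.
KNOWN_COMMANDS = {
    "turn off the downstairs lights": ("lights_downstairs", "off"),
}

def rigid_match(utterance):
    # Exact string lookup: any change in word order fails.
    return KNOWN_COMMANDS.get(utterance.lower())

def bag_of_words_match(utterance):
    # Order-insensitive matching: compare word sets, not exact strings.
    words = set(utterance.lower().split())
    for pattern, action in KNOWN_COMMANDS.items():
        if set(pattern.split()) == words:
            return action
    return None

print(rigid_match("turn off the lights downstairs"))         # None: reordering breaks it
print(bag_of_words_match("turn off the lights downstairs"))  # ('lights_downstairs', 'off')
```

Real assistants sit somewhere between these extremes (slot-filling grammars with limited flexibility), which is why some reorderings work and others inexplicably don't.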
 
Upvote
4 (4 / 0)

FSTargetDrone

Ars Scholae Palatinae
748
I think they have seriously overestimated how much people want that.

I have Amazon Echos in my home; the majority of use is home-automation activation, weather, and timers. My wife and I also use them to communicate between our two home offices, which are on different floors.
Yeah the vast majority of what we use Alexa for is smart device control and then timers.
Lots of food timers. I have no interest in using it for anything beyond that. I don’t want to have conversations with a computer.
 
Upvote
3 (3 / 0)

thekaj

Ars Legatus Legionis
48,270
Subscriptor++
This is just going to turn into the next generation of the virtual-assistant failure. It involves the exact same players: Microsoft, Apple, Amazon, and Google. And they're all throwing an insane amount of money at the issue. Apple once again seems to be the only one mostly viewing it as something to graft into existing products to make them more attractive, while the others are once again betting that these things will be their own products that pay for themselves.

So in a few years' time, major shareholders are going to start paying attention to how much money has been invested in these things and demanding to know when they can expect to actually start making money, let alone recoup the costs. At which point there will be a lot of uncomfortable shifting in chairs among the MS, Amazon, and Google finance people.
 
Upvote
9 (9 / 0)
Funny how Apple’s caution seems to be borne out by Amazon’s experience.

Personally, I’d rather have tiny, incremental, predictable improvements in on-device agents with abilities to do real-world things.

There’s always ChatGPT if you want to…chat.
The problem with that in the voice assistant space is that it's not enough to launch anything. Look at what Apple has been doing so far with Apple Intelligence, which isn't even within the voice assistant space: it's all so minimal that it isn't particularly useful. Now Apple is building a reputation of delivering near-zero value with Apple Intelligence, so consumers are beginning to conclude that Apple can't do anything with AI.
 
Upvote
3 (3 / 0)
If regular Alexa couldn't generate any meaningful subscription revenue or sales commissions, does the team seriously expect GenAI Alexa (which is presumably far more expensive per query) to be different? It seems like Amazon is doubling down on a failed monetization strategy.
I think you are missing the point of what Alexa is aiming to achieve by using LLMs for voice interactions, specifically regarding Alexa responses. It wants to leverage the general-knowledge successes of things like ChatGPT in a way that customers perceive as valuable enough to pay for monthly, likely by offering a one- to three-month free trial when it becomes available. In a perfect world, that would be a killer revenue stream. Do I think they'll succeed? Absolutely not.
 
Upvote
6 (6 / 0)
I would have agreed with you, but we've crossed the threshold where 'AI' means absolutely anything the marketing department decides, which may or may not be a step in any direction whatsoever.

E.g. any sort of eye tracking for camera autofocus now seems to be called AI; we had lesser versions of eye tracking autofocus at least a decade ago and nobody felt the need to call it AI but at some point the tag got attached. Including by Sony, which people generally regard as having the best eye-tracking autofocus in the industry.

I really think they just kept improving what they had and put a new label to it.
The thing is: there is a massive number of applications for 'AI' because it's an extremely broad technical space, including the example you gave using visual models. 'AI' has been a thing for a very long time, but most recently it is getting a lot of attention because of large language models (LLMs), which are a smaller but still huge subset of 'AI'.
 
Upvote
1 (1 / 0)

markgo

Ars Praefectus
3,776
Subscriptor++
The problem with that in the voice assistant space is that it's not enough to launch anything. Look at what Apple has been doing so far with Apple Intelligence, which isn't even within the voice assistant space: it's all so minimal that it isn't particularly useful. Now Apple is building a reputation of delivering near-zero value with Apple Intelligence, so consumers are beginning to conclude that Apple can't do anything with AI.
No one wants to be the first to hook up AI to things that affect the real world even in small ways. Wonder why.
 
Upvote
2 (2 / 0)

jdale

Ars Legatus Legionis
18,261
Subscriptor
I used to work on Alexa, and I just don't see it being converted to an LLM solution for responses at all. A core problem is that users expect responses to be quite reliable outside of general knowledge, and the currently hyped LLMs just don't provide that. Also, general knowledge has never even been a big draw for average consumers in a voice assistant. Having Alexa work off of an LLM is just not going to be particularly reliable anytime soon and is rife with risk for consumer harm in various ways. Another massive problem LLMs face for voice assistants is response time, both time to first response and length of response. If a voice response isn't highly reactive and to the point, users get pretty pissed off and decrease engagement.

We just aren't anywhere near having LLMs be reliable enough with low enough latency and to-the-point responses to launch as a voice assistant replacement for Alexa/Siri/Google Assistant.
I don't think they can solve this problem, but I disagree that it will block implementation. Alexa is a money-loser, and there is a very significant part of the American public that does not care about accuracy. Yes, it hallucinates, yes, it's going to be slower than the current system, yes, it's going to actually make responses to simple tasks less accurate. And they will release it anyway and we will get to see videos on your choice of social media about the most amusing results.
 
Upvote
1 (1 / 0)

Fatesrider

Ars Legatus Legionis
24,977
Subscriptor
On one hand, I’m glad that at least one massive company seems to care about its products producing nonsense.
On the other hand… isn't avoiding hallucinations basically impossible, given that LLMs are mainly very clever autocomplete? Even with some effort to provide sources, as Google tries to, they're terrible at picking up tone and will cite sarcasm as fact.
Yeah, my impression of the way AI is done is that there's no way to avoid hallucinations. It's literally baked into the processes used to create responses. And input affects output in subtle ways that won't necessarily parse properly the first time, resulting in some odd response that's a dead giveaway that it came from an AI and not a human.

They'll have to do AI differently than they have, and train it in FAR different ways (such as making sure factual information is used, and that it can parse the difference between fact and "popular belief").

Right now, it's not really "good enough" to fool most people (though it fools many, because they're none too bright at picking up the discrepancies). So before I bend the knee to our AI overlords, they'd better be able to handle human input with output that is always responsive in a human-convincing way. None of them I've tried so far have met that benchmark.
 
Upvote
3 (3 / 0)

Uncivil Servant

Ars Scholae Palatinae
4,667
Subscriptor
We got an Echo Dot for $5 on a punt circa 2018 because it was bundled with something or another. My pronunciation is about as straightforward and uninflected as southern England can produce and my history in publishing means that I chide myself if I even so much as split an infinitive. That's despite having spent now almost half my adult life in the US; people even ask when my natural-born-American child moved here because he's done such an excellent job of learning my British accent.

Alexa couldn't even understand the prompt for a cookie recipe that I recited directly from its literature. Due to its general uselessness we passed on the Echo Dot to somebody else shortly thereafter.

So either the speech recognition is calibrated by geolocation or Alexa was having a really bad few days.

Damn, if it can't handle South Anglian then it definitely can't handle Tihdewottah's 17th century South Anglian.
 
Upvote
0 (0 / 0)

cibyr

Seniorius Lurkius
22
Subscriptor++
At this point, whenever I see a new product release and the main feature is "now includes AI" I'm 100% not interested in that product. It's a step backwards IMO. Sure hope this trend of cramming AI into everything ends sooner rather than later.
When was Alexa not an AI product?
 
Upvote
1 (1 / 0)

orwelldesign

Ars Tribunus Angusticlavius
7,307
Subscriptor++
'Your plastic pal that's fun to be with' was sarcasm. NOT a road map.

I am unsure how that has slipped past the tech bros. Of course, there is the whole problem with '1984', so I suppose I shouldn't be surprised.

Just out of curiosity... What whole problem with 1984? Be specific.

I mean, I've read an awful lot of dystopias and near-dystopias and cyberpunk grim futures, and, when this comes up, I'm always curious specifically what is meant.

Thanks in advance for your thoughtful reply.
 
Upvote
1 (1 / 0)

Maestro4k

Ars Tribunus Militum
1,537
If history is any guide, they will invest heavily, still have hallucinations, give into the sunk cost fallacy, hype the shit out of their new "Alexa AI", and ignore the inevitable tsunami of complaints from people whose Alexa routinely gives false information and accidentally orders 10000 widgets.
Personally, I'm looking forward to Alexa AI accidentally ordering someone a five-gallon drum of sexual lube. The prudishness of American society will make that significantly worse for Amazon to deal with than an accidental order for too many widgets. It might even cause them to pull the plug on Alexa AI overnight.
 
Upvote
0 (0 / 0)

Alexstarfire

Ars Scholae Palatinae
720
Is it even possible to fix? Feels like you'd have to come up with an entirely new AI system to fix it.

The flaw is that generative AIs are too much like humans. They make up shit randomly, don't give you the same response to the same inquiry, and are also super confident about it. However, they need to be more like reference material, like encyclopedias: 100% true information, or at least as true as current understanding allows.
 
Upvote
6 (6 / 0)
Personally, I'm looking forward to Alexa AI accidentally ordering someone a five-gallon drum of sexual lube. The prudishness of American society will make that significantly worse for Amazon to deal with than an accidental order for too many widgets. It might even cause them to pull the plug on Alexa AI overnight.
"Alexa, turn on the lights"

Alexa: "Okay, flooding the enrichment center with a deadly neurotoxin"
 
Upvote
5 (5 / 0)

FSTargetDrone

Ars Scholae Palatinae
748
You know what, Alexa is so bad at just understanding something simple, how bad could the hallucinations be comparatively?

Alexa can't understand that "turn off the downstairs lights" is functionally the same as "turn off the lights downstairs". Yes, I know one is grammatically better than the other, but functionally this is something simple.
I’ve set my various routines to recognize a few different phrases and word orders I might use for a particular action.

For example, if I want to turn on the projector down in our den, I can say:

“Turn On Den TV”
“Turn On Den Projector”
“Turn Den TV On”
“Turn Den Projector On”

And so forth. A few minutes of adding the different voice commands I’m likely to use as presets and we’ve had almost no problems.

I also only recently discovered that I can simply say “turn off the lights” and Alexa will only turn off the lights in the room I’m in at the time. Of course, this has required having an Echo in nearly every room, but that was relatively cheap to do, as most of the Echo devices I have were the very cheap and very useful Echo Flex model. I hadn’t even thought about being able to do that when I bought all of the Echo devices several years back.

I wouldn’t even mind paying a nominal fee to simply continue to use the system as I do now, almost exclusively for smart device control, multi-room music streaming, setting timers/alarms, and asking about the weather or the odd question. I’m quite surprised there has been no subscription for that to this point.

Most of my devices are controlled locally via Hubitat, so if Alexa went away tomorrow or the price were too high, I could do without it, though it does seem possible to do local voice control (I think). Most of the light control is done with timers or motion sensors anyway, so I don’t even ask to adjust individual lights much anymore. When I turn on the TV in a room, the lights dim down after a minute. If someone gets up in the middle of the night to use the bathroom or go down to the kitchen, motion sensors detect it and some specific lights come on dimly, then go off or drop to minimum brightness a few minutes after motion is no longer detected.
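The motion-to-dim-light rule described above boils down to a small state machine: motion sets the lights to a dim level, and a timeout with no further motion drops them back to off. Here's a minimal sketch of that logic; the class name, brightness level, and timeout are invented for the example and don't correspond to any real Hubitat rule.

```python
OFF_DELAY_S = 180   # hypothetical: 3 minutes with no motion before lights drop
DIM_LEVEL = 20      # hypothetical: percent brightness on motion

class MotionLightRule:
    """Toy model of a motion-triggered dim-lighting automation."""

    def __init__(self):
        self.last_motion = None
        self.light_level = 0  # percent brightness

    def on_motion(self, now):
        # Motion event: bring the lights up dimly and restart the timeout.
        self.last_motion = now
        self.light_level = DIM_LEVEL

    def tick(self, now):
        # Called periodically by the automation loop; turns lights off
        # once the no-motion timeout has elapsed.
        if self.last_motion is not None and now - self.last_motion >= OFF_DELAY_S:
            self.light_level = 0

rule = MotionLightRule()
rule.on_motion(now=0)
rule.tick(now=60)    # within the delay: lights stay dim
print(rule.light_level)   # 20
rule.tick(now=200)   # past the delay: lights go off
print(rule.light_level)   # 0
```

Hub platforms like Hubitat or Home Assistant express the same idea declaratively (trigger, condition, delayed action) rather than in code, which is what makes these rules survivable without a cloud assistant.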

I love it, but I could do without Alexa specifically.
 
Upvote
0 (0 / 0)

FSTargetDrone

Ars Scholae Palatinae
748
Apple and Amazon are two of the biggest players in home voice assistants (hardware and software). Both have struggled for years to develop an (arguably) compelling voice product. Each is taking a very different approach.
It's still very early in the AI wars. One can argue that there is no market for more advanced products, but these companies seem to believe otherwise, and they have the resources to continue to evolve.
Whatever its faults, in my experience Alexa is miles ahead of Apple in terms of voice control as it stands right now. I’ve had no problems at all controlling a myriad of smart device brands with Alexa/Hubitat, but Apple/HomeKit is very fussy. Half the time Siri cannot understand what I’m saying about a specific device, but that’s rarely an issue with Alexa.
 
Upvote
2 (2 / 0)

jballou

Ars Scholae Palatinae
889
Just out of curiosity... What whole problem with 1984? Be specific.

I mean, I've read an awful lot of dystopias and near-dystopias and cyberpunk grim futures, and, when this comes up, I'm always curious specifically what is meant.

Thanks in advance for your thoughtful reply.
The same thing that's wrong with Blade Runner or the Torment Nexus: Musk et al. look at the wild wealth disparity of a cyberpunk dystopia as a preferable outcome.

It doesn’t help when the majority of idiots who worship them also reference 1984 as a reason to oppose universal healthcare but have never read or critically analyzed the actual story.
 
Upvote
5 (5 / 0)

Uncivil Servant

Ars Scholae Palatinae
4,667
Subscriptor
The same thing that's wrong with Blade Runner or the Torment Nexus: Musk et al. look at the wild wealth disparity of a cyberpunk dystopia as a preferable outcome.

It doesn’t help when the majority of idiots who worship them also reference 1984 as a reason to oppose universal healthcare but have never read or critically analyzed the actual story.

Hmmm, now what do I have to do to get them to read The Traitor Baru Cormorant, one of the few dystopias that accurately depicts the role of policy advisors in an empire?

(or maybe not, it would probably turn their "Deep State" conspiracy theories up to 11, but Dickinson is quite astute in pointing out that advisors tend to come from minorities and the fringes of the empire for good reasons)
 
Upvote
1 (1 / 0)
On one hand, I’m glad that at least one massive company seems to care about its products producing nonsense.
On the other hand… isn’t avoiding hallucinations basically impossible, given how LLMs are mainly very clever autocomplete?
Different LLMs produce different numbers and types of hallucinations. And various post-training and prompting techniques fudge those numbers around as well... So who knows, maybe they will settle on some threshold, like 0.3% of some sample of queries, and then also try to disable it from doing and saying the kinds of things that get them sued.

(Or they will just turn it on when they think it's profitable, and use that as an excuse.)

Also... I think it would depend a lot on what you want to do with Alexa...? LLMs do some things well and some things terribly.
 
Upvote
0 (0 / 0)
I am SICK to death of everyone calling all this stuff AI. It is NOT, I repeat, NOT AI if you just write software that is more and more complex, capable, and large. You can make it as complex and sophisticated and capable as you want, but it is still just deterministic SOFTWARE, doing what it is programmed to do. It may appear to be intelligent and appear to go off in some strange humanistic directions sometimes, but it is still just very, very, very, very complex and capable software. "AI" is just a buzzword to make something appear more important or different than it is. Remember the other famous buzzword/acronym, Y2K.
 
Upvote
0 (1 / -1)

orwelldesign

Ars Tribunus Angusticlavius
7,307
Subscriptor++
I am SICK to death of everyone calling all this stuff AI. It is NOT, I repeat, NOT AI if you just write software that is more and more complex, capable, and large. You can make it as complex and sophisticated and capable as you want, but it is still just deterministic SOFTWARE, doing what it is programmed to do. It may appear to be intelligent and appear to go off in some strange humanistic directions sometimes, but it is still just very, very, very, very complex and capable software. "AI" is just a buzzword to make something appear more important or different than it is. Remember the other famous buzzword/acronym, Y2K.

I'm sure there are better examples than Y2K.

You know why mostly nothing happened? Because a shit-ton of people worked their collective asses off to fix things that could have been capital-P problems.

It's like with vaccines -- they worked so well people forgot the horrors of those communicable and preventable diseases.

I agree about the AI bit; we're nowhere near Wintermute. But y2k isn't the best comparison.
 
Upvote
3 (3 / 0)

Matthew J.

Ars Tribunus Angusticlavius
7,832
Subscriptor++
Maybe it's time to start investing in a couple of these. Pretty much the only thing I use Alexa for anymore is controlling Home Assistant, and occasionally asking it to play music from my Plex server. I really don't want to deal with a whole bunch of AI nonsense in my "smart" speaker.

https://www.home-assistant.io/voice-pe/
 
Upvote
0 (0 / 0)