AI chatbots tell users what they want to hear, and that’s problematic

gerbal

Wise, Aged Ars Veteran
198
Subscriptor++
I encountered this perplexing trend a while back. The answer is simple: posts like yours sound like the author thinks LLMs are good/useful, but spend the entire comment talking about their flaws and how they require absurd gymnastics to get any value out of them.
A tool can be both useful and hard to use well. LLMs can be extremely useful in the right context and for the right problem. Much like many other tools, the genuinely hard part is knowing how and when to use a tool to best effect.

IMO knowing and understanding the constraints and flaws of these tools is essential to being able to effectively use them. And using these tools is extremely informative about their nuances and actual capabilities.

It's frustrating that so many commentators are extremely authoritative about the big-picture risks and downsides of LLMs, but discussion of the experience of using LLM tooling is punished by the community.
 
Upvote
2 (7 / -5)

graylshaped

Ars Legatus Legionis
68,083
Subscriptor++
I thought "Truth" was supposed to be subjective—at least, that’s the line we've been sold.
I don't know where you've been shopping. If you want to propose it is contextual, we might have a discussion. You'd also have to be prepared to address the matter of how these models are being marketed--both to investors and to the public--as opposed to what they actually can do well.
 
Upvote
13 (13 / 0)
IMO knowing and understanding the constraints and flaws of these tools is essential to being able to effectively use them. And using these tools is extremely informative about their nuances and actual capabilities.
That's the entire point of why these posts are being "punished". These tools are advertised as easy to use and effectively turnkey. The amount of understanding, skill, checking, and time required to get "useful" outputs could just as easily have been spent on compiling the output yourself.
 
Upvote
12 (12 / 0)

RaptorDisaster

Wise, Aged Ars Veteran
112
Subscriptor
LLMs are also terrible at knowing what they don't know. They have a serious drug problem of using shrooms and hallucinating garbage.
Lately I've been thinking about the difference between being knowledgeable and being wise. I think that a big part of wisdom is knowing where your knowledge runs out; in other words, wisdom is knowing when to say "I don't know, let's find out." LLMs are a lot of things, but I can't say I've ever thought one was wise.
 
Upvote
8 (8 / 0)
It's on purpose. It's to make executives feel good about themselves and make them feel smart.

Executives who think they're geniuses that deliver the real value, not the workers.

In other words, potential customers of AI tools for the enterprise.

That was my initial reaction too: Every day we’re subtly being reminded that the way to go is telling superiors what they want to hear, or packing our objections to bad ideas into so much flattery that they become unrecognizable.

This seems like a massive contradiction.

Surely customers of AI companies have the same right to be deluded into thinking they are geniuses as executives?!

Or asked the other way round: if we want AI models to tell us when we're wrong, why then would that approach be wrong in the corporate world (or the White House, for that matter)?
 
Upvote
1 (1 / 0)

Picklesworth

Smack-Fu Master, in training
89
A friend tried using ChatGPT as a therapist/spiritual guide. At first she was impressed with it but that didn't last long. Pretty quickly she found it to be creepy and weird and she stopped her experiment.

I'll admit, I was genuinely curious about using ChatGPT's advanced voice mode to help me with vocal feminization exercises. These models are great at drawing (possibly wrong, but generally understandable) conclusions about people based on fuzzy combinations of different details. (It's one of their selling points, and I'd love to be able to ask random people on the street "hey, how do you perceive me?" but I would die of embarrassment, so robots are worth a try). Also the multimodal approach to a voice interface is really interesting.

So it acted very helpfully, provided the feedback I expected, and stepped me through some interesting exercises I hadn't tried before. Then I asked it to describe just the pitch of my voice, in the most dudeful dude voice I could muster (which is something I have a decent handle on), and then I tried really hard to say the exact same thing the exact same way, and it was like "yes, that's a big improvement!".

And then I never talked to it again.

I think their model probably has some meaningful capabilities here that could be honed with some engineering focused on the idea, but the way they just connect humans to this firehose of lies for $20 a month and pretend that's okay is really unfortunate.
 
Upvote
11 (11 / 0)

gerbal

Wise, Aged Ars Veteran
198
Subscriptor++
That's the entire point of why these posts are being "punished". These tools are advertised as easy to use and effectively turnkey. The amount of understanding, skill, checking, and time required to get "useful" outputs could just as easily have been spent on compiling the output yourself.
That doesn't make sense to me. The advertising is misleading, so honest discussion of the actual experience of using the tools is undesirable?
 
Upvote
-4 (1 / -5)

SixDegrees

Ars Legatus Legionis
48,502
Subscriptor
When reading articles like this, I always think about the movie Rain Man. The LLM is an autistic Dustin Hoffman who knows basically everything but cannot express it in a way most people can understand, and in between you have Tom Cruise as a social people-pleaser. It seems they added a bit too much Tom Cruise in this case.
There is no persona behind an AI.

The lights are on, but nobody's home.
 
Upvote
9 (9 / 0)

SixDegrees

Ars Legatus Legionis
48,502
Subscriptor
Lately I've been thinking about the difference between being knowledgeable and being wise. I think that a big part of wisdom is knowing where your knowledge runs out; in other words, wisdom is knowing when to say "I don't know, let's find out." LLMs are a lot of things, but I can't say I've ever thought one was wise.
In 1973, Lloyd Kahn, an early proponent of self-built domes and author of Domebook and Domebook 2, published a fascinating essay called Smart But Not Wise that is still broadly pertinent today. Definitely worth a read.
 
Upvote
0 (0 / 0)
That doesn't make sense to me. The advertising is misleading, so honest discussion of the actual experience of using the tools is undesirable?
Forgive me, I thought we were both participating in good faith. Turns out this conversation is a race to the bottom of bad faith interpretations of one another's comments. Anyways:

I take your personal attack seriously, and feel compelled to respond in kind:

Your mother was a hamster, and your father smelt of elderberries.

There, we are at the bottom and I won the race there, have a good day.



Now, if you're having trouble understanding why I'm responding the way I am, may I suggest you feed our conversation into an AI and ask it where you went wrong.
 
Upvote
-1 (3 / -4)

RaptorDisaster

Wise, Aged Ars Veteran
112
Subscriptor
In 1973, Lloyd Kahn, an early proponent of self-built domes and author of Domebook and Domebook 2, published a fascinating essay called Smart But Not Wise that is still broadly pertinent today. Definitely worth a read.
Thanks for suggesting this, it was a good read! I love it when I read something like this that's so far outside what I'm used to. I'll have to reflect on it, but I like it.
 
Upvote
1 (1 / 0)

Cognitive Dissonance

Smack-Fu Master, in training
2
The problem is in the name: AI.
We haven't created 'artificial intelligence'. In fact, there is very little intelligence at all, and the more advanced the LLMs get, the more delulu they become. Not ideal.

We are trying to create artificial humanity. (AH... Appropriate?) And it's going to work. We're rapidly losing communication abilities that humans have had forever, along with societal constructs that typically bind groups of people for, you know, survival and happiness. This is all going to spiral.

I get these types of ads in my feeds all the time. (Sorry for not cropping. In a rush.)
 

[Attachment: Screenshot_20250422-231849.png]
Upvote
4 (4 / 0)
An AI chatbot's only reason for existing is literally to capture and keep the user's attention. The second the AI stops telling the person what they want to hear is the second the person drops the service. It's only addictive and attractive to the point where people want to stay connected to it because it makes them continually feel secure about their decision making.

It's just another form of video game. We play video games that make us feel like superheroes and are rewarded for our progress with special customizations that make us unique. This reward-and-incentive model keeps people trapped playing the game and opens the door to more money being spent. There is no financial incentive to make AI chatbots un-addictive, so the "rewards" of yea-saying that induce the desired dopamine release will continue to infect AI models regardless of efforts to abolish them.
 
Upvote
8 (8 / 0)

RaptorDisaster

Wise, Aged Ars Veteran
112
Subscriptor
Forgive me, I thought we were both participating in good faith. Turns out this conversation is a race to the bottom of bad faith interpretations of one another's comments. Anyways:

I take your personal attack seriously, and feel compelled to respond in kind:

Your mother was a hamster, and your father smelt of elderberries.

There, we are at the bottom and I won the race there, have a good day.



Now, if you're having trouble understanding why I'm responding the way I am, may I suggest you feed our conversation into an AI and ask it where you went wrong.
Forgive me, I'm not going to read back through all of this argument, but it looks like you're being criticized, not personally attacked.
 
Upvote
6 (6 / 0)
Forgive me, I'm not going to read back through all of this argument, but it looks like you're being criticized, not personally attacked.
You're right, I was not personally attacked. I just made a deliberately bad faith interpretation of the comment I was replying to, in order to mock the bad faith interpretation in said comment.
 
Upvote
-8 (1 / -9)

gerbal

Wise, Aged Ars Veteran
198
Subscriptor++
Forgive me, I thought we were both participating in good faith. Turns out this conversation is a race to the bottom of bad faith interpretations of one another's comments
From my perspective, I am engaged in a good faith discussion. I'm sincerely trying to understand others' perspectives.

Now, if you're having trouble understanding why I'm responding the way I am, may I suggest you feed our conversation into an AI and ask it where you went wrong.


Since you suggested it, here's Claude Opus 4's analysis of this discussion. (the prompt was "Please evaluate the behavior of users in this discussion:" followed only by the text of our comments ascribed to Users A and B)
Looking at this discussion, I can identify several behavioral patterns:

User A's Behavior:

Positive aspects:
  • Attempts to engage substantively with the topic of LLM limitations and utility
  • Tries to understand why their comments receive negative reactions
  • Makes reasonable points about tools having both benefits and drawbacks
  • Attempts to clarify misunderstandings
Problematic aspects:
  • Uses crude language in their initial comment which may set a negative tone
  • The phrasing "That doesn't make sense to me" could be interpreted as dismissive (though it appears genuine)

User B's Behavior:

Positive aspects:
  • Initially provides a thoughtful explanation for community reactions
  • Makes valid points about the gap between LLM marketing and reality
Problematic aspects:
  • Dramatically escalates the conversation by suddenly accusing A of bad faith
  • Misinterprets what appears to be a genuine clarifying question as an attack
  • Resorts to mockery and declares "victory" in an imagined competition
  • The sarcastic suggestion to use AI contradicts their own argument about LLMs being difficult to use effectively

Key Breakdown Point:

The conversation derails when User A asks: "The advertising is misleading, so honest discussion of the actual experience of using the tools is undesirable?"

This appears to be a genuine attempt to understand B's logic, but B interprets it as bad faith argumentation. This suggests B may have been primed for conflict or had previous negative experiences in similar discussions.

Overall Assessment:

User A maintains relatively consistent good-faith engagement despite some rough edges, while User B starts constructively but then catastrophically misreads the situation and becomes hostile. The irony is that they seem to agree on the core issue (LLMs are oversold), but the discussion collapses due to B's defensive overreaction to a clarifying question.
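(For anyone who wants to reproduce this kind of analysis, here's a minimal sketch of sending that prompt through the Anthropic Python SDK. The model ID and token limit below are my own assumptions, not something stated in this thread:)

```python
# Minimal sketch: ask Claude to evaluate an anonymized transcript.
# Assumes ANTHROPIC_API_KEY is set in the environment; the model ID
# and max_tokens value are guesses, not confirmed by this thread.
import anthropic

transcript = """User A: ...
User B: ..."""  # paste the anonymized comments here

client = anthropic.Anthropic()
message = client.messages.create(
    model="claude-opus-4-20250514",  # assumed Claude Opus 4 model ID
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Please evaluate the behavior of users in this discussion:\n\n"
                   + transcript,
    }],
)
print(message.content[0].text)
```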
 
Upvote
-10 (2 / -12)
From my perspective, I am engaged in a good faith discussion. I'm trying to understand others' perspectives.




Since you suggested it, here's Claude Opus 4's analysis of this discussion. (the prompt was our interactions plus "Please evaluate the behavior of users in this discussion:")
That's not what I asked for. I asked you to ask it where you went wrong in order to find out why I responded the way I did. If you were "engaged in a good faith discussion" you wouldn't have prompted the AI the way you did and then claimed that you're doing what I suggested.
 
Upvote
-13 (1 / -14)
That's not what I asked for. I asked you to ask it where you went wrong in order to find out why I responded the way I did. If you were "engaged in a good faith discussion" you wouldn't have prompted the AI the way you did and then claimed that you're doing what I suggested.

If you ask the AI "find out where I went wrong" it will produce a different and very much slanted result compared to asking it to analyze the conversation.

Why don't YOU put it through a fucking LLM if you think that's so valuable?

"Why don't you ask an AI where you went wrong?"

"Okay here"

"YOU DIDN'T ASK IT THE WAY I WOULD HAVE UR ARGUING IN BAD FAITH"

If you have a specific prompt to use, YOU go do that. Don't give other people vague instructions and then, when they follow them imperfectly, accuse them of deception.

You didn't even read that AI output, did you? You're afraid of it showing you how wrong you are, aren't you?

Why else do you think you told someone else to do it but refused to do it yourself?
 
Upvote
-3 (4 / -7)

adespoton

Ars Legatus Legionis
10,738
Here's a little prompt I've been throwing at LLMs. It illustrates how they can skew perception by both what they respond with, and how they frame it:

"list all supported filesystems for each OS version, from 1995 to 2015"

A person wouldn't respond with anything other than "what do you mean by 'all supported filesystems' and 'each OS version'?" because there isn't enough information provided.

LLMs, however, generally respond with the main filesystems supported by major releases of Windows, Mac OS and Linux, and consistently get slightly wrong answers.

Anyone looking at the answers they give would likely think the answers look reasonable, and miss the fact that the responses ignore UNIX, BSD, BeOS, AmigaOS, NeXTStep and many others, as well as supported filesystems such as MFS for early Mac OS. What isn't said defines the answer as much as what IS said.

Translate this into other areas that aren't so factual, and it becomes even more obvious.

Whenever I use an LLM chatbot, I now attempt to call it out on its answer, no matter whether I can tell something's wrong or not -- I've found that's the best way to tell how data-backed the response is.

Back to my illustration -- even if you call the chatbot out on missing something like a major version release for a listed OS, when it adds that in, it will likely still miss any as-yet-unnamed filesystems until you call that out as well. Essentially, you need to already be a subject matter expert to get SME-level responses; otherwise, you're talking to someone who's read a few Wikipedia articles and is very confident that they know all there is to know about the subject. And as soon as you profess any level of expertise at all, the chatbot will consider your opinion factual... until that part of the conversation falls out of the context window, at which point it will revert to its previously flawed context.
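(If you want to try this probing approach yourself, here's a rough sketch of the ask-then-challenge loop in Python. The model name and the follow-up wording are placeholders I made up; any chat API that lets you keep message history would work the same way:)

```python
# Rough sketch: ask an underspecified question, then repeatedly call the
# model out to see how it revises itself. Assumes OPENAI_API_KEY is set;
# the model name and challenge prompts are placeholders.
from openai import OpenAI

client = OpenAI()
messages = [{
    "role": "user",
    "content": "list all supported filesystems for each OS version, "
               "from 1995 to 2015",
}]

# Each follow-up calls the model out on its previous answer.
follow_ups = [
    "You missed at least one major OS release. Add it.",
    "You are still missing some supported filesystems. Which ones?",
]

for follow_up in [None] + follow_ups:
    if follow_up is not None:
        messages.append({"role": "user", "content": follow_up})
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    answer = reply.choices[0].message.content
    print(answer, "\n" + "-" * 40)
    # Keep the full history so the model has to reconcile each new answer
    # with what it already claimed.
    messages.append({"role": "assistant", "content": answer})
```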
 
Upvote
14 (15 / -1)

gerbal

Wise, Aged Ars Veteran
198
Subscriptor++
That's not what I asked for. I asked you to ask it where you went wrong in order to find out why I responded the way I did. If you were "engaged in a good faith discussion" you wouldn't have prompted the AI the way you did and then claimed that you're doing what I suggested.

Sure, here's what you asked for, literally: (Prompted as "Where did I go wrong in this discussion?" followed by our discussion)

Looking at this discussion, I can identify a few areas where the conversation went sideways:

Initial tone issues:
  1. Your opening crude joke likely set a negative tone that colored how people read the rest of your comment
  2. The edit asking why your comment was unpopular can come across as defensive rather than genuinely curious
Communication breakdown: The core issue seems to be a mismatch in what you and Person B were actually discussing:
  • You were trying to have a nuanced conversation about LLMs being useful despite their flaws
  • Person B interpreted your comments as part of a pattern they find frustrating (people who defend LLMs while only discussing problems)
The escalation point: Your response "That doesn't make sense to me..." was likely read as sarcastic or deliberately obtuse, even if you meant it genuinely. Person B seems to have interpreted this as bad faith engagement.

What might have worked better:
  1. Start with a more professional tone to establish credibility
  2. When Person B explained the community's perspective, acknowledge their point before presenting your counterargument
  3. Instead of "That doesn't make sense to me," try something like "I see your point about the advertising, but I'm genuinely curious - wouldn't that make honest discussions more valuable, not less?"
The irony is that you were actually making valid points about the importance of understanding tool limitations, but the delivery and escalating tensions obscured the substance of your argument.

So let's defer to the LLM's average social graces, which are greater than mine:

I understand and agree with your points about the advertising of LLMs; it's massively misleading and oversold. I'm genuinely curious: wouldn't that make honest discussions of the material experiences of using these tools more valuable, not less?
 
Upvote
-13 (0 / -13)

10Nov1775

Ars Scholae Palatinae
906
Ah, is there a word in the English language more bandied about and abused than "addiction"? Take a course in psychopharmacology and then tell me chatbots are "addictive".
Ok, as someone who has taken one: there is already a distinction for this, physical dependence versus psychological dependence.

But your complaint is pedantic. You know perfectly well what is meant, and so does everyone else.

Except for the rather minor problem (which is more severe in other cases) of telling someone you have a heroin addiction and them cheerfully saying, "Yeah, I can't stop using TikTok either, I know how you feel," there's no reason to police the use of "addiction".
 
Upvote
6 (6 / 0)
Sure, here's what you asked for, literally: (Prompted as "Where did I go wrong in this discussion?" followed by our discussion)



So let's defer to the LLM's average social graces, which are greater than mine:

I understand and agree with your points about the advertising of LLMs; it's massively misleading and oversold. But shouldn't that make actual discussions of the material challenges and reality of using these tools more valuable?
Okay, I'll spell it out since clearly I did not express myself in a way that is readily understood:


The portion of what you said that I directly quoted:
IMO knowing and understanding the constraints and flaws of these tools is essential to being able to effectively use them. And using these tools is extremely informative about their nuances and actual capabilities.
I said:
That's the entire point of why these posts are being "punished". The amount of understanding, skill, checking, and time required to get "useful" outputs could just as easily have been spent on compiling the output yourself.
Emphasis added.

You replied:
That doesn't make sense to me. The advertising is misleading, so honest discussion of the actual experience of using the tools is undesirable?

My comment contained "honest [discussion] of the actual experience of using the tools" and explained that the actual experience is one of using something that doesn't accomplish anything that couldn't be accomplished without using it.

I have no idea how you interpreted what I said the way you did. It is so far beyond what makes sense to me that I took it as either deliberately misinterpreting what I said or not paying any attention to what I said; both would be the result of bad faith participation.
 
Upvote
0 (2 / -2)

10Nov1775

Ars Scholae Palatinae
906
Is that the plot of a short Robot story by Isaac Asimov? That a robot is so desperate to be useful it just tells everyone what it thinks they want to hear? Am I remembering correctly?
Yes, it's the one where Susan Calvin (the robotic psychiatrist) drives a unique robot that can read minds—a sort of manufacturing mutation, an accident—into a state of inoperability, burning out its positronic brain, by putting it into a position where all of its choices require breaking the Three Laws of Robotics.

I believe the story is called "Liar!".

—in any case, that is what she says to it. Part of the reason she drives it into this state is personal; it told her that a mathematician she works with is in love with her, which is very much what she hoped to hear but, being the buttoned-up sort, had never even allowed herself to dream of, sure that he actually had no interest.

It can probably be found online, but if not, I believe "The Complete Robot" has it and the other stories involving Susan Calvin. (She's an interesting character, and surprisingly developed for Asimov, who mostly eschews significant characterization in his novels.)
 
Upvote
5 (5 / 0)

10Nov1775

Ars Scholae Palatinae
906
This has nothing to do with AI; you'd expect similar results from humans. It's inherent to the problem described: it's much easier to find an instance of a feature (i.e. TB infection) than it is to reliably conclude the absence of a feature. The odd/suspicious result would have been if the AI had performed the same on both questions.
No, this is a known problem with AI models: the problem of negation. Part of it is due to just how multifaceted and contextual our use of negation in language is: human beings don't use "not", "no," or "none" as simple binaries.

I actually have a rather short and fun video to recommend about one facet of this issue, from a PBS webseries called 'Otherwords', hosted by the most adorable linguist (seriously, she's got a great persona):

Why A.I. Struggles With Negative Words

Edit to add: if the study in question (having AI of some kind look for signs of TB in lung scans, then having it look for stuff that ISN'T TB) is using a form of machine learning, I'd probably suspect that the issue is something along the lines of: the vast majority of medical data deals with positive assertions, and, moreover, the number of healthy/non-X states can be wildly diverse if you are using a tool with a great deal of detail, or which includes much data that is "noise", for the purpose of your study. (Which in medicine, virtually all scans do!)

As an aside, lung scans are pretty notoriously difficult to read without a great deal of context: most signs on a scan can represent multiple things. See the infamous "ground-glass opacities", which crop up in a bewildering number of lung ailments.
 
Upvote
6 (6 / 0)

Fatesrider

Ars Legatus Legionis
25,197
Subscriptor
It’s depressing how many people on r/chatGPT think using LLMs as therapists is totally fine.
Strangely, I use them for hashing out plots and ideas in stories I write.

Not their suggestions, of course, but I've always been the kind who needs to talk it out to figure it out on my own. MAYBE once in a hundred exchanges, it'll come up with something I hadn't thought of that might be useful, but that's an exceptionally rare event.

This habit of mine drives my wife crazy, because like most normal people, she thinks I'm asking for her input when really, it's just my way of processing ideas and forming plot-lines and such. Talking to myself has never worked, and now that I'm a senior, it makes me appear to be a lot more senile than I probably am. But I've found using something that responds to my input in some topically relevant way, interrupting my current thinking to read or hear the response, is an effective technique to make different decisions and advance a story.

So, when I get stuck, I spin up the AI and start talking about what I'm thinking about a story, and while it's making all sorts of suggestions, that triggers my brain to think of OTHER things than what I was thinking about before I got stuck. And that leads to new paths and ideas and a better story.

Don't ask me what it responded with. As long as it's on topic, to me, that's mostly white noise that helps me interrupt a line of thinking and get on a different track - kind of like my wife's suggestions (Not that I'd ever tell her that!)

So, I find them useful in that "thinking out loud" kind of thing, but I'm reasonably sure the makers of the AIs didn't really design them for that. The illusion of "listening" to what I tell them and then making topical observations and suggestions, no matter how relevant or bizarre they may be, does the trick.

That said, it's not worth burning down the planet for the sake of my meager literary skills. I wrote a lot of bad books before AI came along, and they're probably still just as bad with AI. But it keeps me from developing a Bingo habit and keeps me off the streets.
 
Upvote
7 (7 / 0)

10Nov1775

Ars Scholae Palatinae
906
I can't wait for the swing the other way, to get a sarcastic "Chandler" AI.
"Carrot Weather", a weather app that offers chatbot comments related to weather, included settings for tone like "homicidal maniac".

I did not find these funny, but I can only imagine it must be popular with users.
 
Upvote
1 (1 / 0)

10Nov1775

Ars Scholae Palatinae
906
Or that it's private. It was a bit upsetting to see the look of panic on a friend's face when I told her that all Chatbot chats were logged and mined for data and she realized that not only may actual humans be reading her most personal thoughts but depending on what she's been telling it she may also be feeding into the next Harlequin Botmance or OnlyBots product.
Bless her (semi) innocent soul. I can't help but wonder why otherwise perfectly smart people just assume privacy with services.
 
Upvote
8 (8 / 0)

10Nov1775

Ars Scholae Palatinae
906
This right here is the key. Companies couldn't care less how sycophantic a model gets or how many people complain about it until it impacts the potential revenue stream.

Big potential difference between "Omigawd, girl, you look amazing no matter what you do, don't worry about whether you're pretty or not, like, you're totally amazing just as you are, ch'ya, for sure!" and "No, you're fine, but maybe your lips could use some hydrating at the core, then you'd look cute, like an e.l.f.! Hey! Want me to do a search on ways to do that?"
Hahaha, for all you boys out there, E.L.F. is a cosmetics brand.
 
Upvote
3 (3 / 0)

10Nov1775

Ars Scholae Palatinae
906
In more than one way. The Dunning-Kruger effect is widely misunderstood. People think it's "why stupid people think they're smart," but it's really something that affects all humans, all the time, even if you are aware of it.

It's basically the phenomenon that you don't know what you don't know, so you can easily overestimate what you know. Sure, being modest is a guardrail, but it's more like a 4" guardrail you easily forget about and frequently step right over without realizing.

Here's a concrete example. I found a recent (2023) algorithm from a SIGGRAPH paper that I figured would be perfect for my project and give me fantastic results over older options. I tried to use various LLMs to implement most of it for me. I figured, hey, I've attached the paper itself as well as the git repo of the Python reference implementation from the researchers; transliterating from one language to another would be something they could do.

As you might expect, every model completely failed to produce anything close to working, even with my intervention. Their "debugging" was beyond useless.

This I anticipated. I knew this was a potential outcome. You know what I DIDN'T anticipate? That I had skimmed the paper too quickly and didn't realize that, while this algorithm could technically work, it is meant to handle shapes produced by specialized neural networks, not traditionally defined ones.

I didn't need the paper at all. I didn't need the new algorithm. In fact the more traditional one was far better for my approach. I jumped to the conclusion that this fancy algorithm was beneficial when it wasn't. I figured I had enough graphics programming knowledge to jump right in. I didn't.

I wasted two weeks of my time futzing around with this before realizing what I was doing. This has not happened with all my LLM usage, only some.

I tried to use AI to punch above my weight class. It didn't work, not really. There is no cheat code for knowledge and learning.

Well, except for being filthy rich, I guess
Haha, this is the classic Arch (Linux distro) help question in disguised form, isn't it?

"I want to do X and Y, can you tell me how to do X and Y?"

"No, you're telling us that you plan to DO X and Y. Tell us what you actually want to happen, what you think X and Y will achieve, so we can properly help you."

So often it turns out that X and Y are a horrible way to do Z.

There are some deep truths about the human experience buried in your comment. Or certainly my experience, can't speak for the rest, hahaha.
 
Upvote
4 (4 / 0)

gerbal

Wise, Aged Ars Veteran
198
Subscriptor++
Okay, I'll spell it out since clearly I did not express myself in a way that is readily understood:

The portion of what you said that I directly quoted:

I said: The amount of understanding, skill, checking, and time required to get "useful" outputs could just as easily have been spent on compiling the output yourself.

My comment contained "honest [discussion] of the actual experience of using the tools" and explained that the actual experience is one of using something that doesn't accomplish anything that couldn't be accomplished without using it.

I have no idea how you interpreted what I said the way you did. It is so far beyond what makes sense to me that I took it as either deliberately misinterpreting what I said or not paying any attention to what I said; both would be the result of bad faith participation.
Fair enough. I skipped a few steps in my reasoning and explanation. I interpreted your argument as:
  • Tools are advertised as easy to use
  • Posts showing they're actually hard to use get "punished"
  • This happens because the effort required negates the advertised benefits
To me this explanation doesn't justify the community's negative reaction. If the concern is that LLM marketing is misleading people about ease of use, then detailed discussions about the actual difficulty and limitations should be welcomed as corrective information. Instead, the opposite happens.

The amount of understanding, skill, checking, and time required to get "useful" outputs could just as easily have been spent on compiling the output yourself.
I think this assertion is not universally correct, is growing less correct with time, and must be frequently re-evaluated against reality. In my experience the time necessary to get useful outputs from these tools is less than it would take me to perform the same task without LLM tooling. This has also improved pretty rapidly in the past two years.

As you may have noted, I don't have the finest of social graces or communication skills. LLM tools can produce polished, professional writing in a fraction of the time it takes me to write the same document. I'm much faster at reading and editing than I am at drafting text, and I can proofread an LLM's draft and iterate or make corrections in tens of minutes, while the same task would take me hours without.

I can read and review code at >4x the speed I can write it. Even when Cursor gives me crap output, I know from years of experience that I would take longer to iterate on the problem directly than to have Cursor try to iterate for 10 minutes and throw the results up for my review. It helps that I work on very large codebases in well-established domains (HR software) where the default assumptions of the LLM are often correct.

Lastly, from my perspective, much of the discussion of LLMs on this forum appears to be grounded not in in-depth experience but in first impressions, media coverage, memes, and marketing. It's not helped by the fact that LLM tools are being carelessly smashed into everyone's faces by Google and Microsoft, which encourages flattening the entire technology into its worst public abuses.
 
Upvote
-1 (2 / -3)

Ben G

Ars Tribunus Militum
2,885
Subscriptor
I don't know where you've been shopping. If you want to propose it is contextual, we might have a discussion. You'd also have to be prepared to address the matter of how these models are being marketed--both to investors and to the public--as opposed to what they actually can do well.

From the content of that post, I would assume they’ve been shopping at the Strawman Store.
 
Upvote
6 (6 / 0)

SixDegrees

Ars Legatus Legionis
48,502
Subscriptor
This has nothing to do with AI; you'd expect similar results from humans. It's inherent to the problem described: it's much easier to find an instance of a feature (i.e. TB infection) than it is to reliably conclude the absence of a feature. The odd/suspicious result would have been if the AI had performed the same on both questions.
No, it has very much to do with AI and the very different way LLMs process language compared with human linguistic processing. They're entirely unalike, and LLMs have deep, inherent troubles handling some language constructs that humans do not.
 
Upvote
2 (2 / 0)