Did ChatGPT help health officials solve a weird outbreak? Maybe.

charliebird

Ars Tribunus Militum
2,356
Subscriptor++
Ok, with that context, I wouldn't have reacted as strongly as I did.

Might even try it myself since I use HnS already, though I think I'm just old and have hairy ears making them itch sometimes :flail:
(Now I'll only be able to see that emoji as someone desperately trying to itch their ears)
I also get dead skin flakes on my hairy eyebrows, which is allegedly related to the same condition (eyebrow dandruff) and is a bit embarrassing. 🫣 Same treatment: just a small dab of Head and Shoulders. I'd better stop here since we're way off topic.
 
Upvote
4 (4 / 0)

bbf

Ars Tribunus Militum
2,369
I agree that vetting the accuracy of AI answers can take as long as just searching for references by oneself.

One thing I like about the Copilot summaries on Bing search is the references it used to generate them, which I can peruse to gain more insight. I never trust the "AI" summary.

For subjects I already know something about, sometimes a few of the references contain incorrect information, so the summary is accordingly flawed.
 
Upvote
1 (1 / 0)

SixDegrees

Ars Legatus Legionis
48,304
Subscriptor
I agree that vetting the accuracy of AI answers can take as long as just searching for references by oneself.

One thing I like about the Copilot summaries on Bing search is the references it used to generate them, which I can peruse to gain more insight. I never trust the "AI" summary.

For subjects I already know something about, sometimes a few of the references contain incorrect information, so the summary is accordingly flawed.
I don't know how Copilot works, but Google's AI search results also have links - and they're most often to products or websites that pay for that placement. No different from their old "prioritized" search results, with ad customers nearer the top, presumably based on how much they paid.
 
Upvote
5 (5 / 0)
Why are health officials of a state even turning to an LLM when presumably they'd have direct access to actual (human) experts and properly researched documentation to answer ALL of these questions? Nothing the LLM told them couldn't have been answered better, and with more confidence, by a proper literature search or just a five-minute phone call with an expert.
 
Upvote
10 (10 / 0)
This is extremely dangerous for reasons that may not be immediately apparent. Because LLMs only preserve the connections between word-pieces, and not the actual meaning, you can have situations where an LLM considers "hyper" and "hypo" to be words that are both associated with a given suffix. And then it will take that suffix and determine the next token.

I have health issues that result in hypotension. Think about how an LLM is going to parse that, given that any training information is going to have at least 10:1 instances of hypertension to hypotension. So, how do I ensure that it picks up only connections between the whole word hypotension and not the token "tension"?

Now, zooming out, consider health issues where medical literature is disproportionately taken from male subjects, which is basically every specialty except OB/Gyn, and consider whether a woman should even consider using an LLM for medicine.
Even OB/Gyn is apparently still dealing with a "male bias": it's still working through decades of "precedent" generated by generations of male physicians who deemed they "knew better" than the women they were examining (see, among other things, the number of women whose complaints of excessive pain and bleeding were basically ignored until it turned out they had advanced cases of things like endometriosis).
 
Upvote
10 (10 / 0)
I've had a few ongoing, very minor medical issues that I've mentioned to doctors with no success (Seborrheic dermatitis is one I've had for years and years). They usually shrugged their shoulders and said, "That’s weird," and didn't offer a helpful suggestion. I gave the symptoms to ChatGPT, and it diagnosed the problem right away and suggested an over-the-counter treatment which worked. It was honestly pretty amazing. I’m not saying this is a substitute for real doctors, and I’m sure a specialist would have diagnosed the same thing. But as a supplement to medical professionals, there’s value, I reckon.
When you have a good experience with AI, it's socially best not to reveal it to others.
 
Upvote
-7 (3 / -10)
I'm right there with everyone who hates the AI hype. The stuff from Sam Altman and the rest of the industry is honestly exhausting. But I think two things can be true at once. The industry is a mess, but the tool itself is still useful if you actually use some critical thinking.

I didn't have endless time or resources to chase this down with different doctors, and it wasn't bothersome enough to keep doing that anyway. But I was able to use the tool and it helped me solve the problem. I think just pointing fingers and saying nobody should ever use it because it's "dangerous" misses the point. It's just a tool for cutting through the data faster.
"I got lucky, therefore this tool is useful" is a hell of an argument to make and to insist on. Do you understand that you can get people sick and even killed with this recommendation?
 
Upvote
10 (10 / 0)

hel1kx

Ars Tribunus Militum
1,623
Just for fun, I "asked Google" the question "will S. Agbeni grow in an improperly drained cooler?" and the AI overview said yes, and referenced this case as its source, and the first search result was the CDC announcement about this (https://www.cdc.gov/mmwr/volumes/75/wr/mm7507a1.htm).

First paragraph of AI response:
Yes, Salmonella enterica serotype Agbeni can grow in an improperly drained cooler. A 2024 outbreak investigation identified standing water and contaminated ice in a beer cooler as the likely source of a Salmonella Agbeni infection, where the cooler lacked proper drainage and sanitation.

Feels bad but I can't really articulate why. Single-source being circularly referenced or something?
 
Upvote
7 (7 / 0)

Dinosaurius

Ars Praetorian
417
Subscriptor++
The tendency of LLMs to just make references up is pretty well known and has ended many a legal and consulting career already.
That's a PEBKAC error.
I use Claude all the time for Fire Life Safety work, and I must cite references to NFPA, CAN/ULC, Ontario Building, and Ontario Fire codes. I'll query Claude with something like "Which code references the lumens per square foot required for emergency lighting?" and it'll spit back a reference to the Ontario Building Code.

The important difference, though, is I then go to the building code, itself, and verify that the reference says what Claude says it references.

What Claude does for me, though, is save the time of finding that specific reference; it will also find additional cross-references that I did not initially account for.

In my memory pre-loads, I have things like "All fire code answers must reference NFPA, CAN/ULC, Ontario Building or Ontario Fire codes and if you cannot provide a specific reference: You must indicate so", along with "Do not be sycophantic" and "All fire-code related answers must be defensible in an Ontario provincial or Canadian Federal court"
 
Upvote
4 (12 / -8)
That's something I've been finding more than a little annoying about AI assistants. They feign cheer about helping and tell me everything is the greatest, most powerful, most superlative ever. They play the cheerful, all-knowing assistant. Even when I instruct one to be objective, I still get the sense it is patronizing me. I would like it a lot more if it would just generate a flat response without trying to engage my enthusiasm.
I used one to mock up some bash scripting with some commandline tools I wasn't familiar with. The first couple things I needed, it provided perfect solutions that worked immediately. The third thing I needed it cited options that weren't in the man pages. I asked for clarification and it admitted it had read a man page from a different tool.

Then we went through about 20 rounds of trying to solve a particular issue with how SQLite is used in shapefiles and how to pass arguments properly to ogr2ogr to do a spatial match between some GeoJSON data and a shapefile encoded for a particular projection. Every single solution was presented as "the final solution" (there's always a hint of racism or anti-Semitism in there if you're paying attention), the "boss solution", "final boss solution", "real solution", "the real ultimate solution", the "$random_dudebro_coder_awesome_phrase" solution. I've never seen things described this way by Stack Overflow users, so I don't understand why it feels the need to add this crap in for reassurance about something it definitely can't guarantee will work.

Did it save me some time? Yes. Did I feel dirty afterwards? Yes. I don't know if it's actually reading the man pages or just mashing together a bunch of random answers to similar problems posted on Stack Overflow and the like, or making code soup from all the GitHub code that ever was, or what.

Will this thing hit a point where it starts getting dumber because the source data has dried up?

AI is repeatedly hitting niche forums to the point where it's prohibitively upping the cost to run those forums, and at some point those things are all going to be login- or subscription-only to combat that. I think we're headed for a major change in the accessibility of quality user-created content, as this is going to force the open-forum format out of existence. We can't count on AI creators to use the internet responsibly when there is a gold rush to be had.
 
Upvote
3 (3 / 0)

graylshaped

Ars Legatus Legionis
67,692
Subscriptor++
That's a PEBKAC error.
I use Claude all the time for Fire Life Safety work, and I must cite references to NFPA, CAN/ULC, Ontario Building, and Ontario Fire codes. I'll query Claude with something like "Which code references the lumens per square foot required for emergency lighting?" and it'll spit back a reference to the Ontario Building Code.

The important difference, though, is I then go to the building code, itself, and verify that the reference says what Claude says it references.

What Claude does for me, though, is save the time of finding that specific reference; it will also find additional cross-references that I did not initially account for.

In my memory pre-loads, I have things like "All fire code answers must reference NFPA, CAN/ULC, Ontario Building or Ontario Fire codes and if you cannot provide a specific reference: You must indicate so", along with "Do not be sycophantic" and "All fire-code related answers must be defensible in an Ontario provincial or Canadian Federal court"
This is the way. Treat them as the fancy card catalog they are suited to be.
 
Upvote
5 (5 / 0)

bronskrat

Wise, Aged Ars Veteran
159
That's because an LLM has no motive, and we're used to automatically guessing people's motives in any conversation. Motives don't have to be nefarious; for most of us, posting on Ars is primarily motivated by boredom, killing time, and the like, as well as an interest in the subject. If someone was always posting about how Bitcoin is the future, etc., people would similarly make some assumptions about their motivations.

LLMs have no motivations, so when we naturally try to guess, it comes across as being fake and insincere in ways that are almost baffling, because we aren't used to a conversation without a motive or any operating theory of mind as we know it. And of course, the LLM cannot understand your motivations and won't respond to them as we expect.
LLMs are trained on data that does have motivations and it's all thrown into a giant bucket and mixed together. What it comes out with is unpredictable BUT... a different perspective, even if one developed this way, is still useful as long as it's taken with a grain of salt.

But to all the people who say, "it's not thinking, it's just predicting the next word": well, who knows whether that isn't how our own brains work, having been trained on different sets of data given to us by experience?
 
Upvote
-12 (0 / -12)

ranthog

Ars Legatus Legionis
15,240
LLMs are trained on data that does have motivations and it's all thrown into a giant bucket and mixed together. What it comes out with is unpredictable BUT... a different perspective, even if one developed this way, is still useful as long as it's taken with a grain of salt.

But to all the people who say, "it's not thinking, it's just predicting the next word": well, who knows whether that isn't how our own brains work, having been trained on different sets of data given to us by experience?
Saying it has a perspective requires the LLM to be able to have a perspective. Same with them having "reasoning" or other capabilities. Those are marketing words made up by the companies to give the impression of capabilities in a product they're trying to sell you. It's why they made up the term "hallucination" instead of saying our LLM is wrong and flat-out makes shit up.

We have no evidence that LLMs are anything but a very powerful statistical prediction of the next word. The claim that they are anything more than this requires actual proof.
 
Upvote
5 (5 / 0)

azazel1024

Ars Legatus Legionis
15,020
Subscriptor
And for this case in particular. I can't believe the collective County Health Officials didn't all recoil in horror when they heard the words, "... reportedly hosed off ..." That phrase is a sure butt-clencher.
Yeah, I can't help wondering what this "cooler" was used for, before it was used as a cooler.
 
Upvote
3 (3 / 0)

SraCet

Ars Legatus Legionis
16,817
Thank you! Every time I hear about people wanting to use LLMs as a medical search engine I wonder if I've spent the past two decades hallucinating this tool that we already have!
...
Spoken like somebody who hasn't used an LLM to find research papers.

Having an LLM search for things is vastly superior to doing it yourself via whatever search engine you're trying to do it with.

I mean, just consider this article. All you do is say to an LLM, "hey, can you find any case studies or research papers about infectious diseases transmitted via beer cans that are stored in a big thing of ice water," and the LLM will think of several/many combinations of search terms and search for all of them simultaneously, often in different languages, and it can also lean on its training data to try to "remember" an answer to what you're asking about.

Compared to you yourself trying to come up with good search terms for this, and manually going through search results to see if any of them are suitable... ugh.

And the whole problem of LLMs making s**t up is a total non-issue here, since you're going to click through on the studies it finds and read them yourself. (Which is no different from what you would do with a regular search engine in the first place.)
 
Upvote
-10 (2 / -12)

SraCet

Ars Legatus Legionis
16,817
This is extremely dangerous for reasons that may not be immediately apparent. Because LLMs only preserve the connections between word-pieces, and not the actual meaning, you can have situations where an LLM considers "hyper" and "hypo" to be words that are both associated with a given suffix. And then it will take that suffix and determine the next token.
...
Absolute nonsense.

You've read too many articles talking about how LLMs only work on "probabilities" and you now seem to think that all LLMs do is some kind of Bayesian bigram analysis of your text, as if it works the same as the autocomplete feature on the phone you had 15 years ago.

Give me an example of an LLM getting confused between "hyper" and "hypo" and I'll take you seriously. But I would happily bet money that you can't.
 
Upvote
-10 (1 / -11)

SraCet

Ars Legatus Legionis
16,817
LLMs are trained on data that does have motivations and it's all thrown into a giant bucket and mixed together. What it comes out with is unpredictable BUT... a different perspective, even if one developed this way, is still useful as long as it's taken with a grain of salt.

But to all the people that say, "it's not thinking, it's just predicting the next word", ...
Indeed, this has always been a garbage analysis of how LLMs work.

Okay, it predicts the next word. Fine.

What if you gave Einstein the transcript of a whole conversation about relativity, but you cut it off halfway through and asked him to predict the next word.

Would people then complain that "all he did was predict the next word" as if that's some kind of useful f**king insight into Einstein's thought process?
 
Upvote
-13 (0 / -13)

graylshaped

Ars Legatus Legionis
67,692
Subscriptor++
I'll be clear and say there is no reason to fault the local health officials here. Brown County has fewer than 7,000 residents, and the county itself employs less than 1% of them. Only one name is connected in public records to the county health department, and I am assuming the investigation itself was handled by her and/or consultants. From all accounts, their investigation found the salient information: as referenced in the report cited by the article, it boils down to "all 13 persons who became ill reported 1) spending time in the infield area and 2) drinking canned beer from the beer tent. No illnesses were identified among persons who did not access the beer tent."

That link also contains the exact prompts given to ChatGPT 4.0 and the data collected in a table following. It also contains a conclusion, titled "Implications for Public Health Practice," that says NOTHING about "AI" or how awesome it would be to use for these situations. It says--I know, this will be highly controversial--to follow good food sanitation and hygiene practices:
This investigation underscores the importance of local adaptability and collaboration with event organizers, members of the community, and public health departments. In a small community, monitoring social media posts and photos, as well as personally contacting fair board members and persons who health department staff members had encountered at the fair, contributed to rapid situational awareness and early case finding but also contributed to reluctance to report, for fear of implicating a friend or neighbor as contributing to the outbreak. This outbreak highlights the role of implementing and enforcing food sanitation and hygiene practices, including proper handling and storage of ice, frequent cleaning of coolers, and prevention of cross-contamination to prevent similar outbreaks in comparable community event settings.
The use of "AI" here was a curiosity, and actually distracts from the work the people in the field did. Reading anything into it after looking at that work and the leading questions the model was asked is making more out of it than it is worth.


edit: typo
 
Last edited:
Upvote
4 (5 / -1)
Indeed, this has always been a garbage analysis of how LLMs work.

Okay, it predicts the next word. Fine.

What if you gave Einstein the transcript of a whole conversation about relativity, but you cut it off halfway through and asked him to predict the next word.

Would people then complain that "all he did was predict the next word" as if that's some kind of useful f**king insight into Einstein's thought process?
Einstein would have probably complained, yes. You were trying to waste his time.
 
Upvote
8 (8 / 0)

dwl-sdca

Ars Scholae Palatinae
901
Subscriptor++
"You’ve got to remember that these are just simple farmers. These are people of the land. The common clay of the new West. You know—morons.” – Jim, Blazing Saddles, 1974
There are several pages of comments that I have yet to read, but I think this is worth saying: I grew up in a farming town of about 16,000 people. DIY and fixing things with duct tape and baling wire were admired. Improvised giant ice chests were common at agricultural festivals, church and club picnics, and sports events. Livestock watering troughs were used. (This added to the down-home rural feel of the events.)

Although food-preparation stoves and large BBQ grills were typically hand-made, food itself was prepared and served in a sanitary manner. Food was supervised by public health officials. Inspectors checked the food temperature. Yet no one paid attention to the disgusting appearance of the ice-water slurry that kept beer and sodas chilled. (Keeping stuff cold was safe and good... Right?)

It was normal practice to wipe the can surface with a handkerchief or shirt tail, but many didn't bother with that. This went back to the days when can openers were used. Can openers were on ropes or chains and were usually actually in the ice slurry. Thus, droplets of the slurry were introduced into the beverage itself.
 
Upvote
7 (7 / 0)

dwl-sdca

Ars Scholae Palatinae
901
Subscriptor++
Given the events of the past year or so, I'm not sure PubMed can be trusted as a source of reliable medical information any longer.
The issue is not "trust". PubMed and most other literature databases are indexes of published literature — the sources are not screened for academic rigor. Indeed, hundreds of crap pay-to-publish "journals" are included.* The old Medline system had a screening process before a publication could be indexed there.

*The rationale for including junk journals is that the contents of those journals can appear in popular news media and online. An argument opposing claptrap that merely states the article was published in Archives of Quality Science (really Archives of Foolishness) isn't likely to be convincing. One needs to read, or at least scan, the article to identify flawed methods or fake data. The curators of these databases expect that the knowledge-seekers who use them will be able to distinguish between credible research reports and garbage.
 
Upvote
2 (2 / 0)

The Lurker Beneath

Ars Tribunus Militum
6,636
Subscriptor
Just for fun, I "asked Google" the question "will S. Agbeni grow in an improperly drained cooler?" and the AI overview said yes, and referenced this case as its source, and the first search result was the CDC announcement about this (https://www.cdc.gov/mmwr/volumes/75/wr/mm7507a1.htm).

First paragraph of AI response:


Feels bad but I can't really articulate why. Single-source being circularly referenced or something?

Autocitogenesis.
 
Upvote
-1 (1 / -2)

The Lurker Beneath

Ars Tribunus Militum
6,636
Subscriptor
Indeed, this has always been a garbage analysis of how LLMs work.

Okay, it predicts the next word. Fine.

What if you gave Einstein the transcript of a whole conversation about relativity, but you cut it off halfway through and asked him to predict the next word.

Would people then complain that "all he did was predict the next word" as if that's some kind of useful f**king insight into Einstein's thought process?

Chess-playing computers just predict the next move of a winning game!

But the biggest issue is people thinking it's just a stochastic Markov Chain. Those don't include a cloud of correlations that embody real meaning.
 
Upvote
1 (2 / -1)

Rhurazz2012

Wise, Aged Ars Veteran
192
Yeah, but the people who create and program and train the LLM totally do have motives, and those motives are a) profit and b) promote right-wing politics. Different LLMs will have different built in biases depending on what information gets funneled into them. That's why, for example, Elon's Grok periodically says things that contradict what Elon himself says. Then Elon makes changes and Grok starts parroting his line of thought.

LLMs are not unbiased observers. They are mirrors that reflect the humans behind their creation.
This is why LLMs can NEVER be trusted: bias. Chatbots and the like give you only what THEY, the people who fed them, want you to have. The biased opinions of the creators of LLMs and the like are EXACTLY why AI is being shoveled out and trusted right now, because the average Joe and Sally just want answers and don't look behind the veil of misinformation...
 
Upvote
3 (3 / 0)

jerminator

Ars Centurion
353
Subscriptor
It seems the chatbots are arcing toward two different usages: they are becoming a 21st-century compiler for developers, and a one-stop search tool for the rest of the public, whether they're searching for shopping deals, vacation activities, business information, or medical advice. These public health officials could have searched PubMed, but it was easier and "cuter" to ask ChatGPT. The earlier commenter who found a cure for his/her skin condition could have googled for answers but would have had to sift and evaluate them.

Literally the first link that pops up is to the Mayo Clinic and gives the treatment information. Why you would want an intermediate guessing machine between you and a generally reliable source is beyond me.
 
Upvote
3 (3 / 0)

rochefort

Ars Praefectus
5,245
Subscriptor
Fire the fucking investigators for cause.

The cause? Egregiously unqualified. ANYONE claiming the title of "investigator" who used an AI AT ALL should be standing in an unemployment line scanning the jobs posts for mindless manual labor jobs.
I wouldn't go that far. If it was a really unusual and puzzling case, and consulting the literature and outside experts was of no use, sure, try AI. But this was actually a simple case that should have been easy for a bunch of qualified public health officials to solve. Yet they took the stupid way.
 
Upvote
1 (1 / 0)

SraCet

Ars Legatus Legionis
16,817
Chess-playing computers just predict the next move of a winning game!

But the biggest issue is people thinking it's just a stochastic Markov Chain. Those don't include a cloud of correlations that embody real meaning.
Yup. Ridiculous.

Also, all the people who claim that LLMs just work by calculating the most "probable" subsequent word, as if the way those probabilities are calculated is somehow obvious and trivial.

"How does the weather forecaster come up with his predictions?"
"Oh, it's stupid, all he does is tell us the most probable forecast."

Awesome.
 
Upvote
-5 (1 / -6)

SraCet

Ars Legatus Legionis
16,817
Fire the fucking investigators for cause.

The cause? Egregiously unqualified. ANYONE claiming the title of "investigator" who used an AI AT ALL should be standing in an unemployment line scanning the jobs posts for mindless manual labor jobs.
Also fire any investigator who ever uses Google to search for anything.

Those f**kwits should have every research paper ever written memorized by heart.
 
Upvote
-3 (1 / -4)

SraCet

Ars Legatus Legionis
16,817
I wouldn't go that far. If it was a really unusual and puzzling case, and consulting the literature and outside experts was of no use, sure, try AI. But this was actually a simple case that should have been easy for a bunch of qualified public health officials to solve. Yet they took the stupid way.
I mean, read the article?

It sounds like it was pretty straightforward for them to come up with their hypothesis without using an LLM, and all they used the LLM for was maybe as a fancy search tool and/or for some independent confirmation. I fail to see the problem.
 
Upvote
-3 (0 / -3)

graylshaped

Ars Legatus Legionis
67,692
Subscriptor++
A fascinating typical story of how a LLM cannot reasonably determine causation. I'm not sure why anyone is surprised by the situation. We're using the term A.I. on software that is not intelligent in any complex way. Soooo, can we talk about what a LLM is and what it isn't?

Cheers
You can try. You can also try pinning people down on what "intelligence" is on its own.

Good luck!
 
Upvote
0 (0 / 0)
That's a PEBKAC error.
I use Claude all the time for Fire Life Safety work, and I must cite references to NFPA, CAN/ULC, Ontario Building, and Ontario Fire codes. I'll query Claude with something like "Which code references the lumens per square foot required for emergency lighting?" and it'll spit back a reference to the Ontario Building Code.

The important difference, though, is I then go to the building code, itself, and verify that the reference says what Claude says it references.

What Claude does for me, though, is save the time of finding that specific reference; it will also find additional cross-references that I did not initially account for.

In my memory pre-loads, I have things like "All fire code answers must reference NFPA, CAN/ULC, Ontario Building or Ontario Fire codes and if you cannot provide a specific reference: You must indicate so", along with "Do not be sycophantic" and "All fire-code related answers must be defensible in an Ontario provincial or Canadian Federal court"
This sounds like how using Google search used to work before they enshittified it into near-uselessness.

I remember the before times, when you could even add keywords like "+NFPA, +CAN/ULC" and have Google obey the required keywords, before they decided to ignore the user because they know best.

Reinventing indexed search by burning ten times the query energy and a million times the training / indexing energy doesn't feel like progress, but your approach seems safer than blindly trusting the LLM.
 
Upvote
6 (6 / 0)
The Google AI summary described the exact code solution that I needed, complete with API calls I hadn't seen.

"Wow!" I thought, "can't believe I missed that!"

I hadn't. The calls don't exist, not in the current version, not in past versions.

And I wasted my time trying to track down these invented calls.

Mostly, programmers don't use the Google AI summaries and have started to move away from chatbots too. They're using coding agents with MCP support, which lets the AI look up what the APIs actually are instead of guessing.
 
Upvote
-1 (1 / -2)

The Lurker Beneath

Ars Tribunus Militum
6,636
Subscriptor
Yup. Ridiculous.

Also, all the people who claim that LLMs just work by calculating the most "probable" subsequent word, as if the way those probabilities are calculated is somehow obvious and trivial.

"How does the weather forecaster come up with his predictions?"
"Oh, it's stupid, all he does is tell us the most probable forecast."

Awesome.

I think you could maybe formalise them as mathematically equivalent to Markov chains, based not on input statements but on a huge corpus of recursively generated hypothetical statements.

So where a simple Markov chain based on all the text read by an LLM will often confuse, say, 'hypotension' and 'hypertension', as in the example somebody posted earlier, in the recursive set those confusions have been winnowed down to near zero. Even if you see the output as a Markov chain, it will only be sampling over the meaningful - or at least relatively meaningful - inputs.
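To make the contrast concrete, here is a minimal sketch of the "simple Markov chain" end of that spectrum: a word-level bigram model trained on a toy corpus invented for illustration (nothing an actual LLM uses), which picks the next word purely from adjacency counts, with no notion of meaning:

```python
import random
from collections import defaultdict

def train_bigram_chain(corpus):
    """Count, for each word, how often each other word follows it."""
    chain = defaultdict(lambda: defaultdict(int))
    words = corpus.split()
    for current, nxt in zip(words, words[1:]):
        chain[current][nxt] += 1
    return chain

def sample_next(chain, word):
    """Sample the next word with probability proportional to its count."""
    followers = chain[word]
    choices = list(followers)
    weights = [followers[w] for w in choices]
    return random.choices(choices, weights=weights, k=1)[0]

# Toy corpus: adjacency counts are all this model ever sees, so after
# "with" it favors "hypertension" 2:1 over "hypotension" regardless of
# what the rest of the sentence is actually about.
corpus = (
    "patients with hypertension need treatment "
    "patients with hypertension need monitoring "
    "patients with hypotension need fluids"
)
chain = train_bigram_chain(corpus)
print(dict(chain["with"]))  # counts: hypertension 2, hypotension 1
```

A chain like this will emit the majority continuation most of the time no matter the surrounding context; the point above is that an LLM's learned correlations winnow such confusions down far more aggressively than raw adjacency counts can.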
 
Last edited:
Upvote
0 (0 / 0)