Your doctor’s AI notetaker may be making things up, Ontario audit finds

themaxdaddy80 · 2026-05-14T13:31:01-0400

Aaaaa whaaaaaaaaaat....

The patient probably isn't prompting the LLM correctly. User error. The patients need training

/s

themaxdaddy80 · 2026-05-14T13:31:40-0400

Sorry. Double post

MacBrave · 2026-05-14T13:33:36-0400

I thought this was covered on a recent season 2 episode of The Pitt?

chateauarusi · 2026-05-14T13:37:13-0400

This is my shocked face.

sigmasirrus · 2026-05-14T13:38:15-0400

why can’t the doctors just do their job and write notes??

Tactical Finesse · 2026-05-14T13:38:37-0400

So who is on the hook when these AI tools go wrong, in a field like healthcare, where consequences are life or death? Particularly when the hallucinating tools are actually recommended by government orgs?

Arstotzka · 2026-05-14T13:39:55-0400

But that seemingly key “accuracy” metric was only responsible for about 4 percent of a vendor’s overall score, making it easy to meet the minimum threshold for approval even if an AI scribe scored a “zero” on the accuracy metric (a separate metric measuring “domestic presence in Ontario” was worth 30 percent of the overall scoring).

Accuracy: 4%
Domestic Presence in Ontario: 30%

It is refreshing to see priorities spelled out so honestly. Here's the table from the linked PDF, if anyone else is curious. Domestic presence was the highest-weighted criteria, beating out trivialities such as accuracy, security, formatting, usability, and privacy.

Screenshot 2026-05-14 at 10.38.00 AM.png

fe3a8b63 · 2026-05-14T13:40:17-0400

As someone in Ontario that regularly needs medical help, medications that aren't common, and whose conditions change whenever they feel like it, I have to see my doctor often. And recently they posted notices on the walls that state that "MOH (Ministry of Health) approved AI" is being used.

And its absolutely wrong.

They used Telus health to sync health records, which had issues but worked. Moving away from paper. And now my doc and everyone who works there are forced to use shit-tier AI. Which is blatantly wrong.

I'd argue its just over 70% wrong. Considering my immense 1100+ page medical record. Because some things like "stage fright" "performance anxiety" or "shy bladder" don't exist. Not in AI's world.

To save lives of Ontarians the entire board of CPSO and MOH needs to be fired ASAP. CPSO especially, they seek to punish doctors that help. Id give every person that makes up the CPSO board 2yrs 1mo in prison tbh. That way they'd be listed as a serious criminal offender. Because they denied healthcare for the lulz, and because they're completely incompetent.

Out in 2yrs, but with a criminal record. Slap on the wrist imho, considering how many are denied medical help. But a criminal record for the rest of their lives.

spacekobra · 2026-05-14T13:40:22-0400

sigmasirrus said:
why can’t the doctors just do their job and write notes??

Just so its clear, this is an audit of a simulated situation to make sure that the tools being advertised are up to snuff for Ontario doctors.

As to why doctors can't just write their charts. Ontario has a doctor shortage, doctors are overwhelmed and AI tools are promising to make their life easier. So, what do you expect? Buy my tool and make your note taking brainless sounds like a great offer.

roboninja73 · 2026-05-14T13:40:22-0400

sigmasirrus said:
why can’t the doctors just do their job and write notes??

That's a lot of salary for a notetaker. There may be no valid way around it, but trying to free up some of their time for more technical tasks that require their actual expertise seems like a valid endeavour.

Tactical Finesse · 2026-05-14T13:40:51-0400

sigmasirrus said:
why can’t the doctors just do their job and write notes??

Short answer...it is an opportunity cost in time that could be spent seeing/treating patients, which people (AKA prospective patients) complain about wait times to see doctors. Which, the trade off between hallucinating LLMs over longer patient wait times is...clearly problematic.

Sarty · 2026-05-14T13:40:58-0400

It isn't the number in this article that will attract the most attention, but they evaluated twenty vendors of this crap? LLMs weren't quite invented yesterday. How is the marketplace that differentiated, when there aren't really equivalents to production or shipping bottlenecks? Some tool ought to out-compete most of the field, shouldn't it?

I'm reminded of that saying in football--if you have two viable quarterbacks, you really have none. Same goes. If you have twenty approvable LLM medical scribe tools available, you really have none.

sporkinum · 2026-05-14T13:42:07-0400

Tactical Finesse said:
So who is on the hook when these AI tools go wrong, in a field like healthcare, where consequences are life or death? Particularly when the hallucinating tools are actually recommended by government orgs?

The Doctor, ultimately. I work in radiology, and we have been using speech to text for years. It's up to them to proofread. If it is wrong, and there is a lawsuit, they will be hung out to dry.

scortiusthecharioteer · 2026-05-14T13:42:20-0400

themaxdaddy80 said:
Aaaaa whaaaaaaaaaat....

The patient probably isn't prompting the LLM correctly. User error. The patients need training

/s

Problem Is Between Gurney And Chair. PIBGAC

stormcrash · 2026-05-14T13:45:59-0400

Why is AI involved at all rather than basic dictation software we already had?

Fenixgoon · 2026-05-14T13:46:43-0400

stormcrash said:
Why is AI involved at all rather than basic dictation software we already had?

Came here to ask this.

Tam-Lin · 2026-05-14T13:47:53-0400

sigmasirrus said:
why can’t the doctors just do their job and write notes??

They can and do; it’s why my wife, who is officially scheduled to work from 8 AM - 4 PM, routinely doesn’t get home until midnight. Because doctors get reimbursed for seeing patients, so their employers schedule them to see as many patients as possible, and don’t make any allowances for all of the ancillary work that has to be done around the patient encounters.

stormcrash · 2026-05-14T13:47:58-0400

Sarty said:
It isn't the number in this article that will attract the most attention, but they evaluated twenty vendors of this crap? LLMs weren't quite invented yesterday. How is the marketplace that differentiated, when there aren't really equivalents to production or shipping bottlenecks? Some tool ought to out-compete most of the field, shouldn't it?

I'm reminded of that saying in football--if you have two viable quarterbacks, you really have none. Same goes. If you have twenty approvable LLM medical scribe tools available, you really have none.

We're still in the "race to marketshare" stage of the bubble where venture capital and speculation are propping up more options than will be viable. Consolidation and retrenchment will come eventually though as the free money slows down and profit fails to materialize, that or mergers as bigger players in health or whatever industry look to snap up these products that really belong as a feature rather than as a standalone offering

einstein4pres · 2026-05-14T13:48:34-0400

sigmasirrus said:
why can’t the doctors just do their job and write notes??

In my experience, doctors are still responsible for the notes (whether self-written, AI-scribed, human-scribed, or dictated). The general idea is that these AI tools are sufficiently cheap and good that it frees up the doctor to spend less of their time writing notes and more of their time actually doctoring (either spending more time with each patient or seeing more patients).

Obviously, whether this is an actual value proposition will depend on the quality of the AI tool in question (for the given provider/provider's specialty).

I haven't done any analysis of the quality side of this, but I can tell you that such tools are quite popular with providers.

Tam-Lin · 2026-05-14T13:49:04-0400

Fenixgoon said:
Came here to ask this.

Because it was never that good either; the gold standard for transcription is a person, often disabled/otherwise home bound, who are usually amazingly fast and accurate, but, you know, cost money.

Tactical Finesse · 2026-05-14T13:49:31-0400

sporkinum said:
The Doctor, ultimately. I work in radiology, and we have been using speech to text for years. It's up to them to proofread. If it is wrong, and there is a lawsuit, they will be hung out to dry.

I wasn't sure if a sanctioning org, like actual high-level government, okaying this thing would change that or not.

Alethe · 2026-05-14T13:52:43-0400

sporkinum said:
The Doctor, ultimately. I work in radiology, and we have been using speech to text for years. It's up to them to proofread. If it is wrong, and there is a lawsuit, they will be hung out to dry.

As someone else said, moral and legal crumple zones for both their employers and the providers of these models. Despicable.

Fatesrider · 2026-05-14T13:53:47-0400

sigmasirrus said:
why can’t the doctors just do their job and write notes??

Rhetorical question, but the issue is pretty straightforward.

They can't fucking write. Source: Me, after >20 years in the medical field earning a PhD in hieroglyphic interpretation. My schooling came from my father, who should have been a physician given how horrible his handwriting was.

So they have to learn to type. That's a WIP for most of them. They're taught a lot of skills in medical school, but it SEEMS that typing, legibility and coherence aren't among them. MOST dictate their notes, and expect a human to interpret them correctly. Mostly, they do. But given how they're going AI on that to get rid of the humans, I suspect that's where it's happening.

BTW, dictating notes has been a thing for 40 years.

The issue with AI is that all medical records are (supposedly) sealed, so it has no real clue HOW doctors write notes. So I'd expect it to take the medical shorthand that's often used and plays with it. Abbreviations will throw it (prn, QD, QID, p.o., IV, IM, etc.) and the use of medicalese (formal medical anatomy & physiologiy along with tests etc,) isn't common out in the "normal world".

Another aspect having nothing to do with that is doctors don't have the TIME to do that. Specialists, yes, they might see a lot fewer patients. But a GP will see 30+ patients/day, and THEN have to write notes on all of them, with some being a lot more comprehensive than others.

I can see WHY they'd want to use AI. But AI, as it's typically trained, will fuck that up very badly. So this result is not only not surprising, it was predictable - for anyone who has both a tech and medical background that is.

maxoakland · 2026-05-14T13:53:57-0400

DUH. How many gigabytes of evidence that LLMs hallucinate do we need? I’m just freaking sick of this sh*t

Sajuuk · 2026-05-14T13:54:16-0400

sigmasirrus said:
why can’t the doctors just do their job and write notes??

Because they're already over scheduled and the cocaine and adderall only last so long.

JudgeMental · 2026-05-14T13:55:11-0400

Arstotzka said:
Accuracy: 4%
Domestic Presence in Ontario: 30%

It is refreshing to see priorities spelled out so honestly. Here's the table from the linked PDF, if anyone else is curious. Domestic presence was the highest-weighted criteria, beating out trivialities such as accuracy, security, formatting, usability, and privacy.
View attachment 135056

This is what I came to note. It's insane that "does it actually work" is the lowest metric. Then again, that matches my experience with just about any other legal or corporate entity, so I'm not actually surprised either.

IncreaseMather · 2026-05-14T13:55:26-0400

Tam-Lin said:
Because it was never that good either; the gold standard for transcription is a person, often disabled/otherwise home bound, who are usually amazingly fast and accurate, but, you know, cost money.

Physician here who for years relied on transcriptionists. They were phenomenal, excellent at their jobs and many helped catch mistakes/improved clarity of medical jargon. When my institution switched to dragon, I changed to typing my own notes out. It was faster and more accurate than dragon ever was (hates my southern drawl). And now my institution has rolled out similar AI to what this article is addressing. I plan to never use it.

gmyx · 2026-05-14T13:56:33-0400

I wonder if they tested / emulated a bilingual conversation - I don't see that in the document. Not just English/French but the many other languages that exists in the province. I know when I talk to my doc I routinely switch from English and French, sometimes mid-sentence. My experience with Teams is that it just shits the bed and makes shit up, more than normal.

graylshaped · 2026-05-14T14:00:28-0400

I would prefer not to rely on a doctor who uses his idiot nephew as a note taker, personally.

These results are indefensible for professionals to use.

1eardown · 2026-05-14T14:00:28-0400

[edit: deleted inaccurate stuff]
I notice errors in the CC [edit for clarity: closed captioning or subtitles] of almost every movie I watch, even in some cases flipping the meaning to the complete opposite. It is slightly concerning to think of that happening to the notes taken during my doctor visit. But as long as doctors are being held accountable for mistakes, I say let them use their professional judgment, just like they do for life-and-death medical decisions on a daily basis.

Doug DigDag · 2026-05-14T14:03:22-0400

The benefit of note-taking is only occasionally derived from reading the notes.

The main benefit of note-taking is writing the notes. That is, the act of writing it down, of converting your perceptions into words, identifying the most important features, even of just moving your fingers on the pen, all serve to broaden your memory and understanding of the events on which the notes are being taken. All that is extremely valuable.

But as none of those intangible things are being directly tracked on any accountant's spreadsheet, zero value is assigned to them by the people writing the checks. What is valued is instead just this: quantity of text. And cost-effectiveness, real or illusory, is king.

aggressive-trail · 2026-05-14T14:04:08-0400

stormcrash said:
Why is AI involved at all rather than basic dictation software we already had?

Both my parents are doctors. You’d be surprised at how bad those are as well. And how bad humans are too.

A good study would compare different methodologies so we can determine what the optimal outcome ought to be.

graylshaped · 2026-05-14T14:04:38-0400

sigmasirrus said:
why can’t the doctors just do their job and write notes??

Many cannot read their own handwriting?

Anadromous · 2026-05-14T14:04:57-0400

I have used two different scribe platforms, with most of my experience being with Heidi. For me, it means I can look at the client while we are talking about things.
(Client not patient in my case, I'm a veterinarian).
Being able to look at the client and maintain that one-to-one connection is really helpful.

I can say the things that I am finding when I am doing an examination, and they are recorded in real time rather than forgotten when writing notes later.

And most important, there's a verbatim transcript stored with each consult, timestamped. So when the client claims that you said "x" or did not say "y", you have a nice verbatim record that you indeed did or didn't do the things that are being claimed.

I read every note for completeness and accuracy, and every discharge statement/summary for completeness. Still takes me far less time than it used to with traditional note taking. As with everything, it is important to use the tool, not let the tool use you.

graylshaped · 2026-05-14T14:05:44-0400

spacekobra said:
Buy my tool and make your note taking brainless sounds like a great offer.

The old adage "If it sounds too good to be true..." smiles, smugly.

graylshaped · 2026-05-14T14:07:34-0400

Tactical Finesse said:
So who is on the hook when these AI tools go wrong, in a field like healthcare, where consequences are life or death? Particularly when the hallucinating tools are actually recommended by government orgs?

Prediction: Malpractice carriers will begin to exclude coverage for errors attributable to unproven models.

CosmicCaribou · 2026-05-14T14:07:58-0400

IncreaseMather said:
Physician here who for years relied on transcriptionists. They were phenomenal, excellent at their jobs and many helped catch mistakes/improved clarity of medical jargon. When my institution switched to dragon, I changed to typing my own notes out. It was faster and more accurate than dragon ever was (hates my southern drawl). And now my institution has rolled out similar AI to what this article is addressing. I plan to never use it.

Props for this!

Tam-Lin · 2026-05-14T14:09:37-0400

JudgeMental said:
This is what I came to note. It's insane that "does it actually work" is the lowest metric. Then again, that matches my experience with just about any other legal or corporate entity, so I'm not actually surprised either.

These days, I’m not sure I’d agree. Digital sovereignty is a serious issue. Let’s say you do have an amazingly accurate solution, but it’s supplied by a company in a different country, maybe even a competitor. Or maybe a country you thought was friendly, but then the e populace elects a completely unfit person to lead the government. How confident are you that you’ll be able to rely on that solution?

Your doctor’s AI notetaker may be making things up, Ontario audit finds

Smack-Fu Master, in training

Smack-Fu Master, in training

Ars Scholae Palatinae

Ars Praetorian

Ars Scholae Palatinae

Wise, Aged Ars Veteran

Ars Scholae Palatinae

Ars Centurion

Refiner of the Quarter

Smack-Fu Master, in training

Wise, Aged Ars Veteran

Ars Tribunus Angusticlavius

Ars Tribunus Militum

Ars Centurion

Ars Legatus Legionis

Ars Praetorian

Ars Scholae Palatinae

Ars Legatus Legionis

Seniorius Lurkius

Ars Scholae Palatinae

Wise, Aged Ars Veteran

Ars Centurion

Ars Legatus Legionis

Ars Scholae Palatinae

Ars Legatus Legionis

Ars Centurion

Smack-Fu Master, in training

Ars Centurion

Ars Legatus Legionis

Seniorius Lurkius

Smack-Fu Master, in training

Wise, Aged Ars Veteran

Ars Legatus Legionis

Ars Scholae Palatinae

Ars Legatus Legionis

Ars Legatus Legionis

Smack-Fu Master, in training

Ars Scholae Palatinae