Deloitte will refund Australian government for AI hallucination-filled report

JBinFla

Wise, Aged Ars Veteran
143
Perhaps Lisa Crawford has a case for defamation or slander for having these false papers attributed to her. One way to stop the nonsense is to make it hurt. As it stands, they are partially refunding the money, but clearly all they did was engineer a few AI prompts to get the report. Make them refund it all, make them pay for defamation, and send a message that this crap isn’t okay.

Same for the lawyers who submit briefs to the court with fake legal citations.
 
Upvote
332 (333 / -1)
Earlier this year, Deloitte declared it would start using generative AI for its reports as a way of enhancing the value provided to its clients. I don't remember if they said it in a specific report or not, but I recall seeing it.

The citation issue continues to trip people up across the spectrum, from lawyers to business analysts. It's striking how many supposedly smart people do not understand the limits of the tools they insist will deliver such amazing value.
 
Upvote
283 (283 / 0)

Geeklaw

Ars Centurion
270
Subscriptor
Deloitte and the DEWR buried that explanation in an updated version of the original report published Friday "to address a small number of corrections to references and footnotes," according to the DEWR website.
How many times do we have to see the fastest-to-the-bottom crowd make these mistakes before companies realize that AI isn't a magic wand?

Anyone want to place a bet on:
  1. How long it takes to find a hallucination in the updated version,
  2. When another Deloitte client will discover hallucinations in the work product they received, or
  3. When Deloitte issues a statement blaming a low-level employee who's been disciplined for the errors?
 
Upvote
199 (199 / 0)
Post content hidden for low score.

GlockenspielHero

Ars Scholae Palatinae
693
Subscriptor
I mean, I guess just use ChatGPT instead of Deloitte; that seems to be the result of their consultancy anyway. The results will be similarly nonsensical, but a lot cheaper.

This. I absolutely cannot understand why a consultancy, whose entire business model is "pay us large sums of money for our experts' advice," would rely on an LLM for even as much as grammar advice.

If your expert is ChatGPT, why do I pay you? I can write prompts myself. This is an incredibly fast way to sink your entire business model. If I were McKinsey or one of the others, I'd be out there advertising "We know what we're doing; we don't need AI to do it poorly."
 
Upvote
369 (369 / 0)

struwell

Seniorius Lurkius
46
I want to be fair about this, because I generally think ChatGPT is a useful tool for lit searches and summaries of papers (as with any summary, some nuance is lost). However, once I asked it for sources on a certain topic and it responded with hallucinated papers. My first clue that something wasn't quite right was when one of the papers (of which I was not previously aware) listed me as the first author...
 
Upvote
-10 (49 / -59)

Nihilus

Ars Scholae Palatinae
980
This. I absolutely cannot understand why a consultancy, whose entire business model is "pay us large sums of money for our experts' advice," would rely on an LLM for even as much as grammar advice.

If your expert is ChatGPT, why do I pay you? I can write prompts myself. This is an incredibly fast way to sink your entire business model. If I were McKinsey or one of the others, I'd be out there advertising "We know what we're doing; we don't need AI to do it poorly."
It does worry me too that apparently nobody is sanity-checking these reports before their general publication. If 10+ citations outright did not exist, then Deloitte's analysis must have been taken at face value with no serious review, and we are presumably using this to justify government policy?!

Even ignoring the generative AI aspect, that strikes me as extremely concerning.
 
Upvote
187 (188 / -1)

Frodo Douchebaggins

Ars Legatus Legionis
12,063
Subscriptor
1759769510691.png

As true as it's ever been.


Fun story involving non-Deloitte consultants: I was vehemently opposed to a project being forced through at my last company, not because I thought the goal was bad (it was actually a good goal, if the implementation were handled by competent people), but because the consulting company and integrator for some of the stuff was setting off all sorts of alarm bells in my talks with them. One of my last major acts of resistance was putting together a slideshow, since execs love those, with a succinct list of issues they were going to encounter if they took this approach, plus the upfront and predicted long-term effects of those issues.

The astute reader will guess that yes, they did proceed with reckless abandon, shortly after I departed that company. One of my coworkers kept a copy of the presentation, and as they ran into each issue, he annotated it with the date it happened and some little screenshots from Slack and emails of people freaking out. When it was all done, he realized it was 12 slides long, printed it as a calendar, and sent it to me :ROFLMAO:
 
Last edited:
Upvote
329 (330 / -1)

DerHabbo

Ars Tribunus Militum
1,526
If an individual did this, they'd be in jail. I don't believe Australia has the concept of corporate personhood, but here in the US, if corporations are people, why can't we throw corporations in jail? Because there are many, many American companies that deserve decades in prison at this point (I'd give Deloitte maybe 4 years for this; they could be out in 2 years 3 months due to corporate prison overcrowding).
 
Upvote
122 (122 / 0)

RZetopan

Ars Tribunus Angusticlavius
7,947
Perhaps Lisa Crawford has a case for defamation or slander for having these false papers attributed to her. One way to stop the nonsense is to make it hurt. As it stands, they are partially refunding the money, but clearly all they did was engineer a few AI prompts to get the report. Make them refund it all, make them pay for defamation, and send a message that this crap isn’t okay.

Same for the lawyers who submit briefs to the court with fake legal citations.
I'm not familiar with the Australian legal system, but they should actually have to pay a penalty multiplier for fraud; in other words, pay more than they received for the fraudulent report. It also appears that they started with the conclusion they wanted and are claiming that their fraud does not alter that conclusion.
 
Upvote
101 (101 / 0)

wanderling

Wise, Aged Ars Veteran
118
Subscriptor
I want to be fair about this, because I generally think ChatGPT is a useful tool for lit searches and summaries of papers (as with any summary, some nuance is lost). However, once I asked it for sources on a certain topic and it responded with hallucinated papers. My first clue that something wasn't quite right was when one of the papers (of which I was not previously aware) listed me as the first author...
Genuine question - that makes me question the value of the summaries. How can we know the summaries are correct without reading the paper itself? Is there any research on not just lost nuance, but the hallucinations in AI summaries? I'd be interested in seeing it across approaches, such as NotebookLM and Kagi, which can pin to a set of sources, or requests to summarize a single paper across different models.

I occasionally use AI to summarize things, but I don't trust it past summarizing things where the ultimate goal is to point me to the actual authoritative source when I'm having a hard time finding it, so I can verify the summary. Do you trust the summaries you get? And if so, why?
 
Upvote
93 (94 / -1)

Unclebugs

Ars Praefectus
3,066
Subscriptor++
This has not been a good time for Deloitte: this mess with AI, plus the mess their former CEO is making in the WNBA. Anybody thinking about shorting Deloitte? For a professional services organization, neither using AI to do professional work nor the notorious tenure of its former CEO Cathy Engelbert is helping.
 
Upvote
29 (31 / -2)

Dumb Svengali

Ars Scholae Palatinae
650
The widespread adoption of AI is revealing how common motivated reasoning analysis is across every domain and industry. I would think fake citations would warrant a reexamination of the entire document, not just the quiet removal of those fake cites.

I mean, I remember doing that as an undergrad banging out meaningless term papers and thinking “gosh I’m going to get a great grade on this despite realizing halfway through that I’m wrong in my hypothesis and just ignoring those citations.”

But I suppose I shouldn’t be surprised. The entire purpose of those giant consulting firms is to send some 28 year olds who make $300,000/year to go suss out what shitty thing Big Boss wants to do, then write a report justifying why firing everyone, doing a very unpopular thing, or treading heavily on moral and/or legal boundaries is the goal.

Imaginary AI cites fit perfectly into that system when you understand what the real product is. Voila. The Big 4.
 
Upvote
121 (121 / 0)

RZetopan

Ars Tribunus Angusticlavius
7,947
Earlier this year, Deloitte declared it would start using generative AI for its reports as a way of enhancing the value provided to its clients. I don't remember if they said it in a specific report or not, but I recall seeing it.

The citation issue continues to trip people up across the spectrum, from lawyers to business analysts. It's striking how many supposedly smart people do not understand the limits of the tools they insist will deliver such amazing value.
"It is difficult to get a man to understand something, when his salary depends on his not understanding it." Attributed to Upton Sinclair, but unverified.
 
Upvote
77 (77 / 0)

Joel Bruner

Seniorius Lurkius
39
Subscriptor
The answer for folks wondering how and why a consultancy would use AI/LLMs is simple: banking off their name, maximizing profits, and passing the production savings on to their end-of-year bonuses.

IBM GTS (now Kyndryl) was the same racket for IT services, IMO: charge Cadillac prices for IT support, but then outsource, offshore, and find every way to cut corners and drive down actual delivery costs. I think if they had actually paid decent wages and nurtured talent that stayed and didn't turn over yearly, they could have had a good product. Instead they filled floors of offices in India for a client with runbook/script readers working from runbooks US workers "knowledge transferred" before being let go. They were in effect LLM precursors: they had no true reasoning, discernment, or real opinion on the actual best next step to resolve an issue. Most had never used Macs, yet were offering Mac support; it was ridiculous. And if the runbook didn't work? Run them around in circles until they gave up. The Delivery Manager would hear from the client and make up some reason why support was doing the best they could, etc.

No different in this case: correct and play down those pesky attribution errors because someone noticed, and keep making cheap sausage at Kobe beef prices.
 
Upvote
103 (104 / -1)
Post content hidden for low score.

terrydactyl

Ars Tribunus Angusticlavius
7,886
Subscriptor
How many times do we have to see the fastest-to-the-bottom crowd make these mistakes before companies realize that AI isn't a magic wand?

Anyone want to place a bet on:
  1. How long it takes to find a hallucination in the updated version,
  2. When another Deloitte client will discover hallucinations in the work product they received, or
  3. When Deloitte issues a statement blaming a low-level employee who's been disciplined for the errors?
4. Mistakes were made.
 
Upvote
42 (42 / 0)

struwell

Seniorius Lurkius
46
Genuine question - that makes me question the value of the summaries. How can we know the summaries are correct without reading the paper itself? Is there any research on not just lost nuance, but the hallucinations in AI summaries? I'd be interested in seeing it across approaches, such as NotebookLM and Kagi, which can pin to a set of sources, or requests to summarize a single paper across different models.

I occasionally use AI to summarize things, but I don't trust it past summarizing things where the ultimate goal is to point me to the actual authoritative source when I'm having a hard time finding it. Do you trust the summaries you get? And if so, why?
Yeah, I was skeptical. I baselined this by using ChatGPT to summarize papers I'd already read, or on topics I'm already an expert in. Usually the summary was a re-wording of the abstract, with maybe some additional context from the paper. I've found these summaries to be pretty good, but you have to check.

In the case I mentioned, the statement ChatGPT made about the topic is very probably correct, but the sources for the correct statement were made up.

This is a theme -- I've asked ChatGPT how to make calculations using a code that is popular in my field (and which I've used for about a decade), just as a curiosity. It is correct about what the code ought to be able to do, and it is even correct about how the code would do it, but it totally makes up the keywords you'd need to set in the input file to make this happen.

Quite bold of Deloitte to use it without checking. It's a trivial thing to do. If I were a consulting firm in the age of generative AI, I would want to be very clear on how I add value to AI -- here it is deeply unclear how Deloitte did.
 
Upvote
34 (42 / -8)

hillspuck

Ars Scholae Palatinae
2,179
Perhaps Lisa Crawford has a case for defamation or slander for having these false papers attributed to her. One way to stop the nonsense is to make it hurt. As it stands, they are partially refunding the money, but clearly all they did was engineer a few AI prompts to get the report. Make them refund it all, make them pay for defamation, and send a message that this crap isn’t okay.

Same for the lawyers who submit briefs to the court with fake legal citations.
While I am in favor of sticking it to people who use AI like this in any way possible, I think you'd find it hard to make the case that it's "defaming" her, unless the made-up papers are about her doing experiments on humans or something of that magnitude. While it's wrong, if someone cited a fake paper I wrote on an inconclusive drug trial, it'd be hard to show that it had somehow damaged my reputation.

Edit: Due to the number of downvotes, I have to wonder if I just put my point across poorly or if there's just a bunch of AI defenders downvoting. My point is that the law as written today in most countries requires defamation to somehow damage someone's reputation, and that's what I'm saying would be hard to prove in this case. I'm all for the AI companies being charged with fraud for knowing that their software is riddled with bad output.
 
Last edited:
Upvote
11 (33 / -22)

kaleberg

Ars Scholae Palatinae
1,258
Subscriptor
It looks like AI might destroy the consultancy industry. Why pay millions to a fancy consultant when one can ask an LLM to crank out an equally worthless report? Management pays consultants to justify decisions they have already made and provide a way to deflect blame when they go awry. It sounds a lot cheaper to fire up an LLM, get the nonsense one wants and have a dumb computer to blame.
 
Upvote
59 (60 / -1)

motytrah

Ars Tribunus Militum
2,960
Subscriptor++
This. I absolutely cannot understand why a consultancy, whose entire business model is "pay us large sums of money for our experts' advice," would rely on an LLM for even as much as grammar advice.

If your expert is ChatGPT, why do I pay you? I can write prompts myself. This is an incredibly fast way to sink your entire business model. If I were McKinsey or one of the others, I'd be out there advertising "We know what we're doing; we don't need AI to do it poorly."
These big firms have been doing this for ages. It's just that in the before times they'd have a room full of cheap college grads with limited experience do the work, with one person supervising multiple projects and directing it.

I think the biggest threat AI poses is to junior devs and college hires.
 
Upvote
58 (58 / 0)

norton_I

Ars Praefectus
5,813
Subscriptor++
Earlier this year, Deloitte declared it would start using generative AI for its reports as a way of enhancing the value provided to its clients. I don't remember if they said it in a specific report or not, but I recall seeing it.

The citation issue continues to trip people up across the spectrum, from lawyers to business analysts. It's striking how many supposedly smart people do not understand the limits of the tools they insist will deliver such amazing value.

It's striking and worrying, because a fake citation is easy to verify and fix. It would even be easy to add a "citation corrector" to an LLM pipeline that removed or replaced bogus citations with real ones. But the fake citations are a minor problem in themselves; they're mostly a canary for the other problems that are much harder to verify. If a report like this cited real research papers but misstated their results, it would be much harder to detect. Still possible, of course, but it requires a lot of work, which negates much of the benefit of hiring consultants to produce a research report, and it may require significant subject-matter expertise that the client might not have.
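To illustrate how cheap the "easy to verify" part really is: a minimal sketch, assuming the candidate titles come from some bibliographic search (e.g. querying the Crossref REST API; the `looks_real` helper, its threshold, and the sample titles are all illustrative, not any tool mentioned in the article). A claimed reference whose title has no close match among search results gets flagged as suspect:

```python
# Hypothetical citation sanity check: fuzzy-match a claimed reference title
# against candidate titles returned by a bibliographic search. A claimed
# title with no close match among the candidates is flagged as suspect.
import difflib

def looks_real(claimed_title, candidate_titles, threshold=0.85):
    """Return True if any candidate title closely matches the claimed title."""
    claimed = claimed_title.casefold().strip()
    return any(
        difflib.SequenceMatcher(None, claimed, cand.casefold().strip()).ratio() >= threshold
        for cand in candidate_titles
    )

# A case-insensitive exact match passes; a fabricated title with no
# counterpart in the search results does not.
print(looks_real("Deep Learning", ["Deep learning", "Shallow learning"]))  # True
print(looks_real("A Paper That Was Never Written", ["Deep learning"]))     # False
```

In practice fabricated citations often pair a plausible title with real author names, so one would also want to check authors and DOIs, but even a bare title check would catch references that simply do not exist.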
 
Upvote
68 (68 / 0)

DarthSlack

Ars Legatus Legionis
23,229
Subscriptor++
4. Mistakes were made.

5. Pants were lost.

Given that Deloitte's business model is to hire a whole bunch of kids right out of college, kinda supervise them a little bit, and charge boatloads of money for whatever flows out, this result is entirely unsurprising. And this is far from the first time that shoddy work has gotten them in trouble.
 
Upvote
61 (61 / 0)

SeanJW

Ars Legatus Legionis
11,893
Subscriptor++
I mean, I guess just use ChatGPT instead of Deloitte; that seems to be the result of their consultancy anyway. The results will be similarly nonsensical, but a lot cheaper.
That suggestion has already been made.

The bulk of the report isn’t actually wrong, and it is backed up by genuine evidence and reasonable recommendations. Of course, “your system is shit, designed to automatically punish people without recourse” isn’t exactly news, seeing as most of it was made more “efficient” by too many years under conservative governments. Help poor people? Why would they do that? They don’t vote for conservatives unless they’re pensioners.

Expensive report for what it says.
 
Upvote
-11 (6 / -17)
Genuine question - that makes me question the value of the summaries. How can we know the summaries are correct without reading the paper itself? Is there any research on not just lost nuance, but the hallucinations in AI summaries? I'd be interested in seeing it across approaches, such as NotebookLM and Kagi, which can pin to a set of sources, or requests to summarize a single paper across different models.

I occasionally use AI to summarize things, but I don't trust it past summarizing things where the ultimate goal is to point me to the actual authoritative source when I'm having a hard time finding it, so I can verify the summary. Do you trust the summaries you get? And if so, why?
There is no way I'm aware of to guarantee that an AI has summarized something perfectly without reading the documents yourself. ChatGPT openly acknowledges, if you ask it, that it is best at summarizing short, clear content. Since that's the content you least need summarized, it seems to be a use-case in search of a use. And because chatbots are not like humans and can't learn your personal preferences, there's no guarantee that a bot won't summarize 50 stories perfectly but fail on the 51st.

The only way I'm aware of to guard against this outcome other than just reading the documents yourself is to watch for small discrepancies that cut against your own understanding of the topic. If you know Factual Event A has occurred, but the summary either doesn't mention it or implies it has not, that can be a sign of inaccuracy.

The best use for AI summaries I can think of is to summarize your own work -- maybe for writing an executive summary or an email that needs to condense a complex, lengthy report into its most important takeaways. If you wrote the source, you'd definitionally be the person most equipped to summarize it.
 
Upvote
65 (65 / 0)

crepuscularbrolly

Ars Tribunus Militum
1,766
Subscriptor++
I want to be fair about this, because I generally think ChatGPT is a useful tool for lit searches and summaries of papers (as with any summary, some nuance is lost). However, once I asked it for sources on a certain topic and it responded with hallucinated papers. My first clue that something wasn't quite right was when one of the papers (of which I was not previously aware) listed me as the first author...
You're fooling yourself about the accuracy of the summaries.

ETA: how long does it take to read the abstracts and closing paragraphs?
 
Last edited:
Upvote
34 (35 / -1)