Deloitte will refund Australian government for AI hallucination-filled report

qchronod

Ars Praefectus
3,778
Subscriptor++
My question is why would Deloitte use an off-the-shelf LLM implementation, and not a fine-tuned, custom-trained model for their own consultancy needs? Too hard to do, takes too long, or don't they get the need for a Retrieval-Augmented Generation (RAG) database with the citation docs? Not sure how basic their workers' implementation was, but damn, they sound dumber than they already are.
Employees were typically not allowed to use 3rd-party LLMs (even blocking them on the internet); they were encouraged to use the internal customized version that is trained on internal documents. They did seem to be opening up a bit with allowing at least some people to access external generators through specific API implementations, but that also appeared to require you to set up your own agents, links to external information that you needed, and a bunch of other fiddly bits that most people aren't going to be experts on (especially after only taking a couple of 3-hour online training classes).
 
Upvote
3 (3 / 0)

qchronod

Ars Praefectus
3,778
Subscriptor++
This has not been a good time for Deloitte; this mess with AI and the mess their former CEO is making in the WNBA. Anybody thinking about shorting Deloitte? For a professional services organization, using AI to do professional work, or the notorious work of its former CEO Cathy Engelbert sure isn't helping.
Unfortunately they can't be shorted since they are privately owned, but the way they've acted in the last year or so definitely deserves it.

I have personal feelings about how the highest levels of management are run, and about things like essentially buying a promotion by obtaining increasing amounts of equity in the business.
 
Upvote
1 (1 / 0)

Faceless Man

Ars Legatus Legionis
11,621
Subscriptor++
But did they have big balls?
Well, in the specific instance mentioned, it might depend on which of their accusers you talked to.

For those not familiar (ie non-Australians), a "senior advisor" to a cabinet minister was accused of assault* by a colleague, and, while the criminal case was dismissed due to jury issues, they lost the defamation case they brought with a ruling that they most likely had assaulted their colleague. The same person, among other things, has also been credibly accused of the same thing by other people in their hometown.

* I'll let you work out what kind of "assault" I'm talking about. There's a reason I don't want to mention their name.
 
Upvote
1 (1 / 0)

graylshaped

Ars Legatus Legionis
67,928
Subscriptor++
Well, in the specific instance mentioned, it might depend on which of their accusers you talked to.

For those not familiar (ie non-Australians), a "senior advisor" to a cabinet minister was accused of assault* by a colleague, and, while the criminal case was dismissed due to jury issues, they lost the defamation case they brought with a ruling that they most likely had assaulted their colleague. The same person, among other things, has also been credibly accused of the same thing by other people in their hometown.

* I'll let you work out what kind of "assault" I'm talking about. There's a reason I don't want to mention their name.
Yeesh. My reference was to a punk Musk brought in to DOGE as one of his minions tossing the system rooms of the federal government, who famously used that nom de plume for his online persona.

Neither one should be considered as trusted servants, it sounds like.
 
Upvote
3 (3 / 0)

Faceless Man

Ars Legatus Legionis
11,621
Subscriptor++
Yeesh. My reference was to a punk Musk brought in to DOGE as one of his minions tossing the system rooms of the federal government, who famously used that nom de plume for his online persona.

Neither one should be considered as trusted servants, it sounds like.
I got that. But I didn't want to confuse the two circumstances. Although, I wouldn't be surprised if the last government had invaded the ACT Magistrates Court in response to their boy being charged.
 
Upvote
2 (2 / 0)
So the equivalent for a non-physical entity is banning them from operating in certain jurisdictions for a period of time.
I’d say that the equivalent of gaol would be either the government casting all the votes and getting all the dividends for the duration of the sentence (less prison wages), or alternatively getting the percentage of each class of share equivalent to the percentage of the average criminal’s remaining life that the sentence would represent (i.e. if the average criminal is 30 and their life expectancy is 80, it would be a 2% stake for each year of the sentence) and having some kind of golden share which means that rules about minimum or maximum percentage ownership of any particular class don’t apply to them.
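For anyone who wants to check the arithmetic, here is a minimal sketch of that stake calculation. The function name, the default age and life expectancy, and the cap at 100% are all assumptions added for illustration, not part of any real proposal.

```python
def stake_percent(sentence_years, age=30, life_expectancy=80):
    """Share of each class of stock handed over for a given sentence.

    The stake is the sentence expressed as a percentage of the average
    criminal's remaining life (life_expectancy - age).
    """
    remaining_years = life_expectancy - age
    if remaining_years <= 0:
        raise ValueError("life_expectancy must be greater than age")
    # Assumption: a sentence longer than the remaining life caps at 100%.
    return min(100.0, 100.0 * sentence_years / remaining_years)


print(stake_percent(1))  # one year out of 50 remaining -> 2.0
print(stake_percent(5))  # five years out of 50 remaining -> 10.0
```

With the numbers from the post, 50 remaining years gives 2% per year of sentence, so a five-year sentence would come to a 10% stake.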
 
Upvote
0 (0 / 0)

SeanJW

Ars Legatus Legionis
11,893
Subscriptor++
My question is why would Deloitte use an off-the-shelf LLM implementation, and not a fine-tuned, custom-trained model for their own consultancy needs? Too hard to do, takes too long, or don't they get the need for a Retrieval-Augmented Generation (RAG) database with the citation docs? Not sure how basic their workers' implementation was, but damn, they sound dumber than they already are.


Err....they do? That's a private instance, fed from their data. Did you read any of the links?
 
Upvote
2 (2 / 0)

SeanJW

Ars Legatus Legionis
11,893
Subscriptor++
As an Aussie, there is nothing more Australian Government than paying consultants for nothing and trying to automate penalties. They need the automated penalty revenue to pay for more consulting on the automated penalty systems.

The penalties are already automated. This is the report from the current government to say "the previous Coalition government did that to fuck over the unemployed", which is, well, "no shit sherlock". A few grants to a few of the academics cited in this report would have gotten a much more accurate and cleaner report (or hell, has already been produced by these academics in the first place) for a much lower cost. But that would require a public service that hasn't already been gutted of competency (by the same Coalition governments) to be able to put together the final report. Hence the need for consultancies to do the job a public service is supposed to be able to do in the first place.

Edit: For those not up with Australian Federal politics, our current government is a centre-right one; the previous party (the Coalition) was much more right wing, but it's made up of two parties: the Liberals, who are centre-right but more right than the current ALP (they used to be further right, but an electoral massacre got rid of some of the weirder offenders), and the Nationals, who are the loony nut-job utter right wing crazy country party and make up for the electoral loss of the Liberals' crazies by turning the crazy up to 11 in their party instead as a result. Neither is what you'd call a role model of social progress, but the ALP at least recognises the words when put together.

Edit 2: Oddly enough, one of the more progressive political leaders in decades was a leader of the Coalition who was always fairly socially progressive and actually understood things like science having meaning. So of course they threw him under the bus the moment that sort of thing could actually affect policies and replaced him with an onion-eating nutjob who threw all that sort of thing into a shredder. And one of the ALP's former leaders is a complete utter whacko who has lost lawsuits for homophobic slurs being defamation. They really don't stick to the party line as much as you'd think.

Edit 3: I miss Paul Keating. Not for his policies, they stank. But for his pithy way with words, like referring to the honorable opposition as a "conga line of suckholes".

Edit 4: The Coalition opposition currently has a (gasp) female leader. Not because they think it's a good idea really, more because she's the last one left standing who voters won't actively seek out to spit on. Well, most voters. A slim majority of them. Something like that.
 
Upvote
3 (3 / 0)
drawing a distinction between a corporation shielding an investor for legal liability beyond his or her investment, and the misbelief that the financial shield somehow magically protects its officers, agents, and trustees
The biggest problem with corporate personhood, apart from where it is abused to make the practical exercise of civil rights proportional to surplus wealth[1], is that it is typically an effective barrier to proceeds of crime laws: if a company profits from a crime and shareholders profit from that, whether via dividends or capital gains (or more exotic measures like borrowing against the value) then that profit for the shareholders should also be treated as proceeds of crime.

1. This is relatively simple to resolve in a manner that’s totally consistent with the pretexts for extending civil rights to corporations in America and elsewhere: treat all corporate exercises of rights as if they were done by the ultimate beneficial owners and subject to all the restrictions, caps, or prohibitions that apply to each of those owners.
 
Upvote
0 (1 / -1)
Employees were typically not allowed to use 3rd-party LLMs (even blocking them on the internet); they were encouraged to use the internal customized version that is trained on internal documents. They did seem to be opening up a bit with allowing at least some people to access external generators through specific API implementations, but that also appeared to require you to set up your own agents, links to external information that you needed, and a bunch of other fiddly bits that most people aren't going to be experts on (especially after only taking a couple of 3-hour online training classes).
Thanks for that info.
 
Upvote
1 (1 / 0)

Marlor_AU

Ars Tribunus Angusticlavius
7,711
Subscriptor
For those not up with Australian Federal politics, our current government is a centre-right one; the previous party (the Coalition) was much more right wing, but it's made up of two parties: the Liberals, who are centre-right but more right than the current ALP (they used to be further right, but an electoral massacre got rid of some of the weirder offenders), and the Nationals, who are the loony nut-job utter right wing crazy country party and make up for the electoral loss of the Liberals' crazies by turning the crazy up to 11 in their party instead as a result. Neither is what you'd call a role model of social progress, but the ALP at least recognises the words when put together.
How you see these parties all depends where you sit on the political spectrum.

I'm sure many people on the left outside Australia would love to see a "centre-right" party that supports universal healthcare, emissions reductions targets, multilateralism, increased industrial relations protections (right-to-disconnect laws, minimum standards for gig workers, criminalising wage theft, corporate manslaughter laws), better recognition for Indigenous people, subsidised childcare, and so on.

It's fair to say that Labor has hewn to the political centre, and isn't as well aligned with the mainstream "Left" as it used to be, but I think most progressive voters globally would still look on with envy and wish their centrist parties had even half of those policy positions.
 
Upvote
2 (2 / 0)

SeanJW

Ars Legatus Legionis
11,893
Subscriptor++
How you see these parties all depends where you sit on the political spectrum.

I'm sure many people on the left outside Australia would love to see a "centre-right" party that supports universal healthcare, emissions reductions targets, multilateralism, increased industrial relations protections (right-to-disconnect laws, minimum standards for gig workers, criminalising wage theft, corporate manslaughter laws), better recognition for Indigenous people, subsidised childcare, and so on.

It's fair to say that Labor has hewn to the political centre, and isn't as well aligned with the mainstream "Left" as it used to be, but I think most progressive voters globally would still look on with envy and wish their centrist parties had even half of those policy positions.

Unfortunately, they're still fairly well in the pocket of the fossil fuel industry. We're still a major exporter after all and they're certainly not wanting that to change. And the Coalition has burnt to the ground our unis and other 3rd and 4th place export industries (because hot beds of liberal free thinkers and all that). They're also just as repressive as far as free speech and similar. They're certainly not the Nationals or Clive Palmer, but Latham was their leader too... The difference between them and the Coalition is not as significant as all that, but they certainly put on a better show of it.

On the bright side, they've finally realised News Corp is a spent force in media in Australia - they may own newspapers and TV, but nobody is actually using those for information. They hit the internet for better or worse.

Edit: I worked for many years at a uni btw, so I have to laugh about "free thinkers" at a uni. Not because they don't - they certainly do. But there's a very good reason I grabbed the domain "uselesscockmuppets.com" while I was working there...
 
Upvote
1 (1 / 0)

graylshaped

Ars Legatus Legionis
67,928
Subscriptor++
The biggest problem with corporate personhood, apart from where it is abused to make the practical exercise of civil rights proportional to surplus wealth[1], is that it is typically an effective barrier to proceeds of crime laws: if a company profits from a crime and shareholders profit from that, whether via dividends or capital gains (or more exotic measures like borrowing against the value) then that profit for the shareholders should also be treated as proceeds of crime.

1. This is relatively simple to resolve in a manner that’s totally consistent with the pretexts for extending civil rights to corporations in America and elsewhere: treat all corporate exercises of rights as if they were done by the ultimate beneficial owners and subject to all the restrictions, caps, or prohibitions that apply to each of those owners.
"Relatively simple."

/rollseyes
 
Upvote
1 (1 / 0)

Shoraine

Wise, Aged Ars Veteran
103
"It is difficult to get a man to understand something, when his salary depends on his not understanding it." Attributed to Upton Sinclair, but unverified.
It is verifiable, though (even if it is the man attributing it to himself). Source: Oakland Tribune (Volume 121, #164, 11 Dec 1934, page 19, column 3, bottom half).
 
Upvote
0 (0 / 0)
Deloitte is one of the highest regarded consulting firms in the world, and they got caught using AI which produced important documents with many hallucinations. Similarly highly regarded engineering organizations (e.g. Argonne National Laboratory) are posting ads about how they are using AI to design, manufacture, build, and operate nuclear power plants. And the politicians are pushing the NRC to use AI to review new designs.

The fad of thinking AI is acceptable to design, build, or operate safety critical equipment, systems, or facilities is now out of control. I hold all of the AI fan boys responsible for stoking this fad, and when (not if) the inevitable "interesting event" occurs, and people die, I would argue that all of the fan boys should be held accountable for the mess that will be created.

We have good records of everyone who has been stoking the fad - maybe someone could program an AI to sort thru all of their words, and apportion fault accordingly.
 
Upvote
1 (1 / 0)
The bigger joke is that people actually do read these reports.
Some people, like safety analysts and regulators, MUST read these reports, and they depend on them being correct and complete in order to justify a finding that a facility is "safe". If there is no independent verification and validation of every fact in these reports, then the reports are totally worthless. Less than worthless, even, because their very existence will be used by other people and AIs to produce new reports and make bad decisions.
 
Upvote
1 (1 / 0)

Tofystedeth

Ars Tribunus Angusticlavius
6,407
Subscriptor++
How about you start calling them ERRORS, the word hallucinations is being used to dodge the fact that LLMs are hugely unreliable by design.
I will never understand the stance that using the word hallucinate doesn't imply unreliability.
I get the idea that it implies a level of cognition that they simply do not possess, but they're still just being wildly wrong.
Hell, hallucinate sounds way worse than error to me. Everyone makes errors, due to inattention, or poor analysis of the data you have and whatnot. Hallucination implies that it just said stuff that doesn't even exist in reality which is way worse.
 
Upvote
2 (2 / 0)

graylshaped

Ars Legatus Legionis
67,928
Subscriptor++
Oh, and Arthur Andersen died after it was caught fabricating information for Enron. Deloitte should suffer the same fate.
I haven't seen any information that the bogus report Deloitte gave the Australian government was used to commit investor fraud, though.

Yet?
 
Upvote
0 (0 / 0)

Faceless Man

Ars Legatus Legionis
11,621
Subscriptor++
Oh, and Arthur Andersen died after it was caught fabricating information for Enron. Deloitte should suffer the same fate.
I know someone who used to work for Andersen Consulting. When they were spun off into a separate company, we never really expected that it would be the part that would still be going 10 years later.
 
Upvote
1 (1 / 0)

Random_stranger

Ars Praefectus
5,293
Subscriptor
This sounds like a conspiracy rant. Pichai worked for McKinsey not Deloitte, but for a short period of time, and to my understanding as an entry level consultant. Not a partner or anyone making strategic decisions for the firm. I haven't heard that Nadella worked for either, if so it would have been in the 90s.

In any case the idea that either would be acting to repay favors from such a job decades ago seems ludicrous. You can just blame them for their actions directly.

Dang.. I got Deloitte and McKinsey mixed up. My bad.
 
Upvote
0 (0 / 0)

azazel1024

Ars Legatus Legionis
15,077
Subscriptor
Genuine question - that makes me question the value of the summaries. How can we know the summaries are correct without reading the paper itself? Is there any research on not just lost nuance, but the hallucinations in AI summaries? I'd be interested in seeing it across approaches, such as NotebookLM and Kagi, which can pin to a set of sources, or requests to summarize a single paper across different models.

I occasionally use AI to summarize things, but I don't trust it past summarizing things where the ultimate goal is to point me to the actual authoritative source when I'm having a hard time finding it, so I can verify the summary. Do you trust the summaries you get? And if so, why?
That is generally what I use it for, the rare times I use it.

Give me a summary that will help me find the actual source citations, so I can go tease that out of the original data/summary/paper, because I am having a hard time figuring out where to look or what to do next.

Well, and on rare times I use it to rewrite something I've written to sound better and then re-read what it spit out to me. Like "here is a paragraph, make this sound more professional and use less profanity".
 
Upvote
0 (0 / 0)

azazel1024

Ars Legatus Legionis
15,077
Subscriptor
There is no way I'm aware of to guarantee that an AI has summarized something perfectly without reading the documents yourself. ChatGPT openly acknowledges, if you ask it, that it is best at summarizing short, clear content. Since that's the content you least need summarized, it seems to be a use-case in search of a use. And because chatbots are not like humans and can't learn your personal preferences, there's no guarantee that a bot won't summarize 50 stories perfectly but fail on the 51st.

The only way I'm aware of to guard against this outcome other than just reading the documents yourself is to watch for small discrepancies that cut against your own understanding of the topic. If you know Factual Event A has occurred, but the summary either doesn't mention it or implies it has not, that can be a sign of inaccuracy.

The best use for AI summaries I can think of is to summarize your own work -- maybe for writing an executive summary or an email that needs to condense a complex, lengthy report into its most important takeaways. If you wrote the source, you'd definitionally be the person most equipped to summarize it.
On your last, that is one of my main uses for it. Summarize something I've written for an executive summary or closing statement and then double check it makes sense based on what I wrote for a white paper or decision memo.
 
Upvote
0 (0 / 0)
How about you start calling them ERRORS, the word hallucinations is being used to dodge the fact that LLMs are hugely unreliable by design.

How does the word "hallucination" imply reliability?

"You can't trust ChatGPT, it hallucinates" scans as worse to me than "You can't trust ChatGPT, it makes errors."

Everyone periodically commits an error or makes a mistake. Literally everyone. The goal is to make sure those errors are as small as possible, but we all make them. Saying LLMs hallucinate captures their tendency to conflate, misrepresent, and provide unreliable data that may or may not have any basis in fact. It's different from just making a mistake.
 
Upvote
1 (1 / 0)
What puzzles me is why fields like consultancy and law don't seem more concerned about detecting nonsense citations before they turn in their work.

Regardless of what you do or don't know about LLMs, dodgy citations are perhaps the single most damning flavor of error in what is supposed to be a document of authoritative reasoning because they combine almost the same level of triviality in detection as typos or misspellings (they don't just leap off the page; but someone who knows nothing about the subject can run your bibliography through a search engine one line at a time quite easily); while, unlike fiddly detail proofreading, casting substantial doubt on the material they are included in.

Unless someone is just preening; citations go where it's important. You are invoking an authority, or referring to the results of a study, or incorporating by reference a paper to justify your line of thought. Perhaps you are passing quickly over a deeper subject that the reader may seek clarification on. In all of those cases it's critical that what you are citing actually exist; because if it doesn't all those implied purposes are just lies.

It just seems weird to see fields that (ideally among other things; but definitely in part) trade in prestige and credibility just sort of...not...bothering to sanity check such a mistake when they probably would be concerned if the document was littered with typos or one of the figures was inadvertently inserted upside down; despite those sorts of errors potentially having no relation to the paper's quality of analysis at all.
My best guess is that in some cases, the problem isn't too few people looking at the text, but too many.

Let's assume that Deloitte farmed this work out to a writer or three, either working directly for the analysts whose names get attached, or simply as part of an assignment pool if the final work isn't attributed to named authors. The bad citations get into the document because someone makes the mistake of trusting them -- that has to happen somewhere -- but the problem may be that once they are in the document, everyone assumes the errors are someone else's problem to find.

The lower-tier author thinks "The analyst whose name goes on it will flag any problems as part of his feedback. This is only the first draft."
The analyst above that person thinks: "I can trust my author team to use good citations. They know what they are doing. I don't need to check."

Then, the other people who are providing feedback for specific aspects of the story only check it for their own spheres of responsibility. Legal checks to make sure Deloitte isn't making any claims it can't stand behind. Layout and design might check for obvious typos, but they aren't concerned with factual validity. The people who are reviewing the drafts on the client side assume that Deloitte's references are good (at least initially) and focus on giving feedback on other aspects of the report.

But if Deloitte doesn't have a person whose dedicated job is checking every citation for existence and/or accuracy, that most fundamental task may not be owned by any person in particular. When 10-20 different people touch a document from inception to publication, they may collectively assume that the entire creation process is edited by someone. And since everyone assumes it's someone else's job to do that kind of verification, it doesn't get done.
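To make the "check every citation for existence" step concrete, here is a minimal sketch of what an owned verification pass could look like. Everything in it is hypothetical: the function names, the sample bibliography, and the hand-checked `known_sources` set, which only stands in for whatever real lookup (library catalogue, search engine, case-law database) a reviewer would actually use.

```python
from typing import Callable, Iterable, List


def unverified_citations(
    citations: Iterable[str], exists: Callable[[str], bool]
) -> List[str]:
    """Return every citation that the supplied lookup could not confirm."""
    return [c for c in citations if not exists(c)]


# Assumption: a small hand-checked set stands in for a real lookup service.
known_sources = {"real paper a", "real book b"}


def in_known_sources(citation: str) -> bool:
    # Normalise case and whitespace before checking membership.
    return citation.strip().lower() in known_sources


bibliography = ["Real Paper A", "Made-Up Study C", "Real Book B"]
flagged = unverified_citations(bibliography, in_known_sources)
print(flagged)  # ['Made-Up Study C']
```

The point of the sketch is less the lookup than the ownership: one step, run by one named person, that produces an explicit list of citations nobody has confirmed yet.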
 
Upvote
3 (3 / 0)
My best guess is that in some cases, the problem isn't too few people looking at the text, but too many.

Let's assume that Deloitte farmed this work out to a writer or three, either working directly for the analysts whose names get attached, or simply as part of an assignment pool if the final work isn't attributed to named authors. The bad citations get into the document because someone makes the mistake of trusting them -- that has to happen somewhere -- but the problem may be that once they are in the document, everyone assumes the errors are someone else's problem to find.

The lower-tier author thinks "The analyst whose name goes on it will flag any problems as part of his feedback. This is only the first draft."
The analyst above that person thinks: "I can trust my author team to use good citations. They know what they are doing. I don't need to check."

Then, the other people who are providing feedback for specific aspects of the story only check it for their own spheres of responsibility. Legal checks to make sure Deloitte isn't making any claims it can't stand behind. Layout and design might check for obvious typos, but they aren't concerned with factual validity. The people who are reviewing the drafts on the client side assume that Deloitte's references are good (at least initially) and focus on giving feedback on other aspects of the report.

But if Deloitte doesn't have a person whose dedicated job is checking every citation for existence and/or accuracy, that most fundamental task may not be owned by any person in particular. When 10-20 different people touch a document from inception to publication, they may collectively assume that the entire creation process is edited by someone. And since everyone assumes it's someone else's job to do that kind of verification, it doesn't get done.
You know, this sort of situation exists in engineering, too. Not just in writing reports, but in building stuff. The failure of the Kansas City Hyatt walkway was exactly due to something like this, where the engineers designed a structure that was competent, but then the fabricators said that it would be too hard/expensive to build that way, so they suggested an alternative, which was approved by someone who did not understand what was happening. It was ultimately signed off by a friend Professional Engineer who thought that everyone else had reviewed it properly, but it had not.
 
Upvote
0 (0 / 0)