Editor’s Note: Retraction of article containing fabricated quotations

Status
Not open for further replies.

AI_Skeptic

Wise, Aged Ars Veteran
179
It is a fundamental part of a journalist’s responsibility to verify their quotations. I believe that the journalist should often go further and ask if the quoted person has a further comment.
I feel what Benj did was much worse than what you're letting on. The misquote was from a website, not from an interview. There is absolutely no reason why Benj could not copy the quote from the AI system, paste it into a "find" box on his web browser, and confirm the quote exists. That takes 30 seconds to do. Benj did not do that. Benj has 20 years of journalism experience (per his website) and I am stunned he failed to perform such a simple task.

What other basic journalism tasks is Benj failing to perform?
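The 30-second check described above can even be scripted. A minimal sketch, with illustrative names and sample HTML (the `contains_quote` helper is hypothetical, not any real verification tool):

```python
import re

def contains_quote(html: str, quote: str) -> bool:
    """Return True if `quote` appears verbatim in the page's visible text.

    Tags are stripped crudely and whitespace is normalized on both sides,
    so a line-wrapped quote still matches. A sanity check only, not a
    substitute for reading the source in context.
    """
    text = re.sub(r"<[^>]+>", " ", html)  # crude tag strip
    norm = lambda s: re.sub(r"\s+", " ", s).strip()
    return norm(quote) in norm(text)

# Example: a fabricated quote fails the check even when it sounds plausible.
page = "<p>We believe verification is\n  a core duty of journalism.</p>"
print(contains_quote(page, "verification is a core duty"))  # True
print(contains_quote(page, "verification is optional"))     # False
```

The same thing, minus the script, is exactly what pasting the quote into a browser's find box does.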
 
Upvote
50 (55 / -5)
The only criticism I have for Ars itself is pulling the article and comments down. I shouldn’t have to go sleuthing to find the original article as it was published.

I don't mind the sleuthing that I did. On balance, in this new age where information and misinformation is hoovered up instantly, and rebroadcast widely and loudly, it makes sense to withdraw rather than perpetuate.
 
Upvote
4 (9 / -5)

r0twhylr

Ars Praefectus
3,357
Subscriptor++
I've enjoyed Kyle Orland's work over the years but I won't trust it going forward, and honestly will be moving on from Ars as well. This is not the first issue with integrity they've had, and at some point as a reader you have to admit there is something broken with the culture regardless of how much you enjoy the content.
Expecting integrity and transparency is reasonable. Expecting perfection is not.

Where will you move on to? Do you expect that other sites and sources are somehow immune? A number of high-profile publications have had problems with fabricated material over the years, and the last couple decades of our descent into soundbite-fed infotainment instead of journalism has turned formerly respected news platforms into garbage. Ars at least fights to maintain their integrity.

I'm going to wait and see what happens. I'm hoping for exceptional transparency. I'm withholding judgement on whether Benj should stay or not - that's not my call and I'm glad - but I am expecting a more detailed response from Ken on what will be done to prevent this sort of thing in the future. Given that people are fallible, given that deadline pressure will always exist in journalism, given that with or without AI, source fabrication is a real threat to journalism, what can be put in place to prevent a recurrence?

At this point I have no plans to cancel my subscription.
 
Upvote
17 (29 / -12)

graylshaped

Ars Legatus Legionis
67,692
Subscriptor++
Errors in judgement should be forgiven. Errors in morals should not be forgiven. I do not believe that Benj made an error in judgement, but instead he made an error in morals.

1. He (SHOULD!) know, as an AI expert, that AI systems are probability generators, and cannot think or make decisions. Because it's a probability generator, by definition, it cannot extract quotes from websites. If such a tool were created, that would be worthy of an article itself.

2. He is a journalist of over 20 years. The first thing journalists are taught is to always verify the quotes. The quotes, in this case, come from a website, which could easily be verified using the "Find" command on any web browser. He failed to engage in that very simple task.

3. Because he used AI generated quotes, he put his employer at extreme legal risk. Thankfully the quote generated by AI wasn't harmful, and the person wronged took it in stride. However, there is no reason at all why he didn't verify the quote. But could it happen again? Would an employer want to take that risk?

Clearly, he is not a new journalist, and he is not new to AI. Either he's incompetent as an AI expert, and if so, he should be dismissed from the AI beat - or incompetent as a journalist, in which case he should be dismissed from working for Ars.

Why do you believe this is an error in judgement?
Decision making requires judgment. It's a silly hair to split.
 
Upvote
8 (11 / -3)
Well there are also the other villagers saying "I'm not sure forging quotes in a journalistic piece is such a big deal".

This is an interesting take - the quotes were made up bullshit, but were "truthy enough" that it doesn't matter to you that they were fabricated.

Please, someone here explain how at least 21 Ars readers can think it is okay for a journalist to misrepresent a key quotation. To my thinking, once is enough for a true journalist to post that kind of lie.

Does intent matter in this situation? There are lots of accusations of "forging," "fabricating," "lying," and so on throughout these comments, and every single one of those requires intent to deceive. To wit, from the macOS dictionary:

fabricate | ˈfabrəˌkāt | verb [with object] 1 invent (something) in order to deceive

forge | fôrj | verb [with object] 3 produce a copy or imitation of (a document, signature, banknote, or work of art) for the purpose of deception

lie 2 | lī | noun an intentionally false statement

I don't believe it's likely that Mr. Edwards intended to quote the subject of his article incorrectly, therefore I don't think those adjectives accurately convey the situation.

So back to intent - does it matter? My initial reaction is that it should matter, but I'm open to other thoughts on the matter.
 
Upvote
-15 (21 / -36)

niftykev

Ars Scholae Palatinae
730
Not that it matters, but the soccer analogy about yellow/red cards isn't quite on the mark, except in the sense of asking whether this offense is bad enough to go straight to red.

No, this thing is more like being involved in professional sports and gambling on professional sports, especially on games you're actually playing. Bringing the game into disrepute. It's still not a perfect analogy, but his actions have tarnished his name as well as Ars' name. In the sporting world, that tends to result in long suspensions up through lifetime bans.
 
Upvote
7 (15 / -8)

Sarty

Ars Tribunus Angusticlavius
7,816
I don't believe it's likely that Mr. Edwards intended to quote the subject of his article incorrectly, therefore I don't think those adjectives accurately convey the situation.

So back to intent - does it matter? My initial reaction is that it should matter, but I'm open to other thoughts on the matter.
I think "reckless disregard for the truth" is the applicable standard here. Putative AI expert asks the slopbox to pull quotes from a single blog post, an act which would take that expert maybe five minutes of human time, and he just rolls with it and publishes it to the world.

If "did not know and had no reasonable way of knowing" is a 0, and "I lied to them intentionally, maliciously, and with a smile on my face" is a 10, Mr. Edwards' apparent behavior isn't a 10. But it's closer to a 7 than a 3.
 
Upvote
54 (62 / -8)

AI_Skeptic

Wise, Aged Ars Veteran
179
So back to intent - does it matter? My initial reaction is that it should matter, but I'm open to other thoughts on the matter.
I think it depends on how new a journalist is to the job. For a new journalist, I believe intent should matter, and a new journalist should be taught best practices - which includes not using AI tools to extract information from websites and then quoting what the AI generates, because these tools are just word probability generators - and to reinforce that sources need to be triple-checked to minimize liability.

Benj, according to his website, has over 20 years of experience in Journalism as a reporter, and is an expert in AI. This tells me the following:

1. He failed to perform basic due diligence in confirming that a quote from a website is valid (which could be done with a copy/find/paste in the web browser of his choice).
2. He does not understand that GenAI tools are probability generators and cannot accurately extract text from websites (a massive failure to understand how GenAI operates!)

As I understand it, a reporter should always verify the source three times, just to make sure the source isn't being misquoted. This minimizes the risk of liability.

If he can't do a 30 second search of a website, that shows a massive failure in willingness to verify AI information.
 
Upvote
46 (51 / -5)

Niles Gazic

Ars Praetorian
405
Subscriptor++
And a note to our current administration in DC - this is what transparency looks like.

Why in the hell do people even bother offering suggestions like this? If we were talking about some newbie politician, then maybe.

But it is blindingly obvious that the Trump administration doesn't give a single flying fuck about ethics, humility, integrity, morality, the rule of law, transparency, or even just common decency. The more brutal and criminal and sadistic and stupid some act or policy is, the more they love it.
 
Upvote
21 (31 / -10)

Robin-3

Ars Scholae Palatinae
1,127
Subscriptor
It's not like Benj is shy about his AI usage as a tool to write better articles.


View: https://youtu.be/1nEph7-Viyc?t=255


In his interview with Ed Zitron, he is very candid, and he admits that GenAI is a good tool to help him fight against COVID brain fog. This is not a new excuse from him; the interview is from 4 months ago.

This is... enlightening, but in a gut-sinking way.

So, someone who should and does know better is, simultaneously, trying to stay on top of things and gainfully employed in a rapidly-evolving field despite struggling with long-term health (including mental health & focus ability) drawbacks from COVID.

And that person is, like so many of us, being bombarded on all sides with the message that AI is so! helpful! and so! capable! and will make your life so! much! easier! And, unfortunately, it is and does.... most of the time. But it also screws up, rarely but inevitably (and at a frequency that's still statistically significant if you use it as a day-to-day tool). And AI has no self-awareness, no consciousness, so it's both confidently and plausibly wrong when it screws up.

We, as humans, really aren't wired to deal with that. Just like we aren't wired to sit with hands poised just above the steering wheel of a "self-driving" car and attention on the road despite the car driving itself, we aren't wired to rigorously check every single plausible "fact" a tool presents to us, when the last several facts have been fine and these ones sound fine too.

That's why this is such a bad idea.

And I have a lot of sympathy for someone with long-term health issues that impact the ability to focus and function; I have long-term health issues that impact my ability to focus and function. But if you want to be trusted to do your job, you have to be willing and able to say "sorry, I'm not up to it today. I need to clock out and put in for sick time." (Or do your job's equivalent of busywork for the rest of the day, and ask someone else to cover if you have a thinking-about-stuff deadline. Or whatever.)

It sucks. (Long COVID sucks, as do lots of chronic illnesses. The way illness and semi-disability are treated in the work world, especially in the U.S., are awful too.) And it isn't fair. But it also isn't fair to your employer (and their customers, and your colleagues) to choose to do a subpar job because you don't want to admit you aren't up for it. You'll screw something up someday.

And this was a hell of a screwup. At the risk of being melodramatic, this was a betrayal of trust.
 
Upvote
77 (80 / -3)
Does intent matter in this situation? There are lots of accusations of "forging," "fabricating," "lying," and so on throughout these comments, and every single one of those requires intent to deceive. To wit, from the macOS dictionary:



I don't believe it's likely that Mr. Edwards intended to quote the subject of his article incorrectly, therefore I don't think those adjectives accurately convey the situation.

So back to intent - does it matter? My initial reaction is that it should matter, but I'm open to other thoughts on the matter.
There's another definition of "fabricate" which is entirely devoid of intent. And he used a piece of software which fabricated quotes in that sense of the verb.

But you are right to focus on the word "intent."

Then he submitted them as his own work with the strong implication that he retrieved them from the blog post himself and had verified they were correct. That latter part seems to me to fall on the wrong side of the "intent" test, although it probably also can't be properly called "fabrication" or "forgery." Perhaps by not disclosing the source and presenting it as his own work it could be a "lie of omission" though.

The AI fabricated. The AI forged. The writer intended to take the credit for what it produced. With credit comes responsibility.
 
Upvote
48 (48 / 0)
Having read all comments to date (currently up to page 29) I want to offer a revised take now that I've had time to digest many points of view.

The main concern I have at this point isn't about the article in question. It's how Ars chooses to respond to a foreseeable situation: an article is published with inaccurate information. I personally am not fussed whether or not AI is used to write the article. Since it's against Ars policy, that is going to be a problem for Mr Edwards. But from my perspective, if the article were factually correct then Ars would likely not need to say anything publicly if an internal policy were found to be violated.

Inaccuracies happen for a variety of reasons. Just because an LLM got in the mix this go around shouldn't change how Ars acknowledges the error, corrects it, and communicates about it. I don't think they took down the article because of the factual errors. Those could easily have been dealt with by simply correcting the quotes. I believe Ars nuked the article because it was highly embarrassing to them that their AI beat writer made such a fundamental error in usage of AI tools.

So either Ars does not have a standard process for dealing with errors in published articles or wasn't willing to follow it for this article. The first seems unlikely. So why aren't they following normal procedures for correcting errors? Because they correctly realized that due to the nature of the error this could blow up badly. So they panicked and pulled the article, then issued an intentionally vague retraction statement. And now of course they're losing subs not just over the bad article but because of the perception that they are more interested in covering up the error than addressing it.

Now the sad truth is that the way Ars is handling it is the corporate way. You acknowledge the error as vaguely as possible, give some room for the outrage, make understanding noises, accept the loss of some subs, then wait for it to go away. This minimizes immediate revenue loss. However it is not the morally correct thing to do and over time it slowly erodes a community that is built on shared trust.

Many have requested a thorough post-mortem from Ars and an explanation of how they will amend their current review processes to minimize the chance of a recurrence. I'd also like better insight into Ars correction/retraction process so when such issues arise again, we can clearly understand where Ars staff are in the process and what we as readers should expect to be made known publicly and when.
 
Upvote
64 (64 / 0)

graylshaped

Ars Legatus Legionis
67,692
Subscriptor++
My take on Clawbots is simple. It's humans pulling the strings.
Take the governor off a self-driving lawnmower, unleash it in Aunt Martha's garden, and watch the fun!

Current "AI" is a tool that does what humans designed it to do, regardless of the twists and turns the random elements in its programming allow it to take. The mistakes are purpose built.
 
Upvote
11 (12 / -1)

etr

Ars Scholae Palatinae
1,074
Lots of thoughts...

Were I to guess, the "process" in this case probably has the writer as a single point of failure. I would not call that ideal, but (1) I'd be surprised if it was particularly unusual, especially in the current media environment, and (2) even places with "good process" probably run into this at times. For the second point, take a journalist quoting a confidential informant. Unless there is absolutely always someone else there/on the phone (something a confidential informant would likely want to avoid), you have to take the journalist's word on any quotes they collect.

I would encourage a more in-depth follow-up once folks are back from holiday.

Probably the first thing it would be nice to hear is an overview/rationale of how you make the call to take a problem piece down versus leaving it up. Honestly, I can imagine arguments for taking either approach in a given case. I would offer that I don't think you have to convince everyone of the right answer in any given case, but I think some folks might feel a bit better if they heard some rationale that seems solid.

From the retraction:
We have reviewed recent work and have not identified additional issues. At this time, this appears to be an isolated incident.

These two sentences do a lot of heavy lifting, and I do not mean that in a facetious manner. Here's my mental model of how that worked:

  • Mobilize a team to work over a holiday weekend.
  • Select a healthy sample of articles from the writer, with a heavy weight on the weekend ones.
  • For each article, compile a list of all facts and quotes.
  • Verify each fact. For quotes from public sources, verify those.
  • For interview-based quotes, contact the quoted parties to assess the veracity of the quotes in the article.
  • If you complete this for a healthy sample of your healthy sample and do not find issues, you can probably conclude that the issue was not pervasive.

If you make it here, you can probably reasonably conclude that there is not a pervasive problem in the writer's work, and dismiss folks for the rest of the holiday weekend--to come back to the rest of the writer's corpus when the holiday weekend (and ideally, some comp time) is over.

There's every possibility I'm wrong on some of the "what" here. However it went, I think a lot of folks would like to know how something like this gets handled. That's arguably of extra importance if there is a single-point-of-failure vulnerability here (which seems likely) as well.

I think the "why" matters, too. It makes sense that the first priority in the crisis response would be triaging how much bad material is making it into articles and starting to pull down problems if they are found in any number. If a good sample does not suggest a serious problem, though, I can see letting people have their weekend and taking more time to thoroughly vet things during business hours.

As of this writing, the writer does not seem to have been dismissed. That feels right for now. I think it's prudent to review the situation carefully, and I think it's appropriate to have that happen primarily during work hours (since there does not appear to need to be a flood of retractions), as well as recovery if illness is involved. I also think it is prudent to get some "trust but verify" information from the writer involved, and it is reasonable that said writer be paid for the time that takes. You can't readily do that if you dismiss them hastily.

In terms of consequences, I think the first thing I would like to hear is that any work involving that writer is on hold for the duration of the investigation. (This is a good time to say what shouldn't need to be said.)

At the conclusion of the investigation...it's tricky. On one hand, I'm not out for metaphorical blood, but on the other I do not have a quick thought on what, short of a dismissal, comes down firmly enough to reinforce editorial standards. (Editorial standards aside, I'm sure a lot of other folks already went through the wringer of the weekend digging for more potential problems, and more still will do the same when they get back, many of them through no fault of their own. It's tough to go light when there is so much impact on other employees.) I suspect I'm not alone in those feelings. If you come up with a response that is short of dismissal and explain it, I would hear you out and suspect others would, as well.

PS: This is at least the second time I've seen Aurich jump on something like this, and as rocky as they are, I suspect they are far better for his jumping in. He's one of the great folks you have, and one of the keenest disappointments in such messes is how it sidelines such great work.
 
Upvote
26 (26 / 0)

Resistance

Wise, Aged Ars Veteran
418
I don't care about Jim Salter. He's not the one involved here. Edwards is and he doesn't deserve to get fired for it.
While Edwards may not deserve it (I agree with you that he probably doesn't deserve it), Ars employees, readers, subscribers, and sources deserve a publication and team they can trust. It's hard to have that trust when you keep someone on your team that violated it.
 
Upvote
28 (31 / -3)

jkratz36

Smack-Fu Master, in training
10
While Edwards may not deserve it (I agree with you that he probably doesn't deserve it), Ars employees, readers, subscribers, and sources deserve a publication and team they can trust. It's hard to have that trust when you keep someone on your team that violated it.
I absolutely agree with this but that is, again, a management problem. They took the article down instead of having the guts to keep it up with corrections and an explanation.
 
Upvote
12 (19 / -7)

Mradyfist

Ars Centurion
271
Subscriptor
The fact that the quotes that ended up in the article were fully generated, and therefore verifiably fabricated, is actually a blessing in disguise here - if the quote extraction tool had worked as expected, that would still be AI-generated content and against Ars' stated policy.

Whenever an article is quoting from another source, especially one like a blog which is publicly available, the whole job of the author is to understand the context that the quote is in so that the reader can assume that the meaning of the quote carries over, once it's stripped of that context. Doing so is what makes it not just plagiarism - you're checking that it represents the speaker, attributing it, etc, all the things that make it transformative work. Deciding where and what to quote is part of writing the article, and if you offload that to an AI then it's writing the article for at least that part of it.

In this case, Benji shouldn't have been using AI to extract quotes, because that's literally his job. That's what I pay a subscription fee for, I want to help provide a financial incentive for real people at Ars to reason about something like this, and help contextualize it for me. I'm fully capable of reading the blog post (and actually did, before Benji's article came out), and if I wanted an AI to choose some useful quotes to try to summarize it then I'd ask it to do so myself - I don't need someone to run the tools for me and didn't ask for it.

The thing that would make me happiest as an outcome has nothing to do with what happens to this writer though. I'd prefer it if the AI policy at Ars explicitly disallowed the type of "offloading thought" usage of AIs when writing articles, since in my mind that doesn't add any value to the final product that I couldn't have gotten by just slapping a blog into ChatGPT myself. Someone earlier suggested an AI usage disclosure, which also makes sense to me - I realize that saying "nobody ever so much as glanced at anything AI while writing an article!" is not a realistic statement, but the ways in which it was used (and preferably, which tools and models) is something that I'd like to know before the fact, rather than something discovered in a big messy scandal.
 
Upvote
62 (62 / 0)

etr

Ars Scholae Palatinae
1,074
Yeah, what does Jim Salter know about what it's like to be a writer for Ars Technica, anyway?

I don't care about Jim Salter. He's not the one involved here. Edwards is and he doesn't deserve to get fired for it.
Just to make sure we've left no one out, Thad Boyd was referring to the fact Jim Salter has written for Ars Technica and elsewhere (good stuff, too).

Even if one does not agree with his assessment, I would like to think his experience in the field of tech journalism warrants a hearing in this situation.

To be clear, this is not a dig at the ejection for a deviation from forum rules, just a plea not to dismiss an opinion on something very much in his wheelhouse lightly.
 
Upvote
53 (54 / -1)
IMO, as far as "why they took it down," if it's late on a Friday and there's no time to really check the whole article's provenance, legal might prefer you err on the side of caution and take it down and not leave the misattributed quotes up for 3 days. It's also possible the writer was not answering texts because they took NyQuil and passed out (had COVID before, it's brutal).

That's different than if you have a correction ready and you can just do the retraction and correction all at once and leave the article up.

Which means, with any luck, a new version of the article may get posted, possibly paired with an article about...this thread.
 
Upvote
22 (22 / 0)

AI_Skeptic

Wise, Aged Ars Veteran
179
So, someone who should and does know better is, simultaneously, trying to stay on top of things and gainfully employed in a rapidly-evolving field despite struggling with long-term health (including mental health & focus ability) drawbacks from COVID.
A concern I have is this. The article was written by Benj, with Kyle being a co-writer. Why didn't Benj tell Kyle that he was under the weather and ask him to do a quick check on his sources, to make sure everything was fine since he was using a new AI tool to extract quotes?

I don't see any reason why Kyle wouldn't be willing to serve as his editor, just to catch any simple mistakes.

So why didn't Benj ask Kyle to do that?
 
Upvote
45 (46 / -1)

Marcus Andreus

Ars Scholae Palatinae
888
Subscriptor
It is a fundamental part of a journalist’s responsibility to verify their quotations. [...]
That Edwards may have violated an Ars policy concerning AI is a distraction.
This is kinda where I've ended up. There are two wrongs here that got blended into one.

It's not just using an LLM, which is its own violation. I don't think it's exactly a distraction.

But also: Not verifying several quotes is itself a violation. If there were no chatbot involved, but a journalist asked their real, flesh-and-blood colleague Claude to pull some quotes for them for a story, we'd expect the journalist (and/or their editor) to verify the quotes. Especially if the journalist knows their real, flesh-and-blood colleague Claude is prone to confabulation every now and then.
 
Upvote
47 (47 / 0)

counterpoint

Smack-Fu Master, in training
65
Subscriptor++
I think the claimed chain of events is a perfect example of the lure and risk of LLMs and LLM-based tools, a sort of high-risk-high-reward game that can mess with people's heads.

1. Someone creates a Claude Code skill or tool that is intended to extract text from webpages verbatim. It's possible that, in testing, this tool may even have very high accuracy whenever it can access the page properly. When the tool doesn't work, it's supposed to give an error to reduce the risk of hallucination/confabulation (ideally, either accurate extraction or nothing). And, because the tool's intention is to be a quote-extracting tool, you might even think to confirm its work (because the user's mindset at the time is "I am using a tool to accurately extract quotes, and I need to confirm it did it properly.")
2. The tool fails to work because the site in question is blocking LLM scraping. The temptation (followed here) is to "paste the error into an LLM and let it tell you why." (Some people may consider this foolish, but this is exceedingly common advice/practice in LLM circles.) The intention here may just be to figure out how you can get the initial tool to work, not necessarily asking it to do the job.
3. But then the LLM, in its response, addresses not only the error but the "intent" of the request (to extract the quotes), because its training reinforces that it should try to please the user. We don't know, but I'm guessing the reply was an expanded version of something like: "The error you were getting is because the site in question is blocking LLM scraping when using this particular tool. However, here are some quotes from the article you mentioned: x, y, z."

So it's a kind of mental lure. Your initial mindset may be focusing on safety and accuracy, but by the time the LLM at the end "helpfully" and surprisingly gives you the answer you were looking for to begin with, your brain slips up. It's done what you wanted to do to begin with "serendipitously," so you forget about the "high-accuracy tool" (and the need to double-check it) and continue forward. It's very much the "remaining in control at all times while your car is in autopilot" situation @Robin-3 alluded to in a comment above. Of course you know the importance of double-checking everything, the risk of hallucination, and so on, but the more you use it (and the better it seems to be getting), the more likely it lures you into this false sense of confidence, especially in a moment of weakness. There are no "tells" in the text it gives you, beyond the need to double-check 100% of the time.

Maybe the answer indeed is just to not play. No matter how big the cylinder gets, you never know when the one bullet will hit you, even if you know better. And I suppose someone in this particular role feels like it's their job to play with all the tools, but temporarily forgot the cylinder wasn't entirely empty. There have been several Ars articles about lawyers and such who think of LLMs as the latest version of Google search and are simply ignorant of all the risk. But even the people who write up these kinds of cautionary tales can find themselves caught, even if they think they know better, by making a single mistake. (Worse, let's say the tool had worked fine this time, and indeed worked perfectly 20 times, and then one time it fails, silently, invisibly, until...)

Anyway, like many, I work at a company that is pushing us to use LLM technology more and I'm pressured to keep up with it. I do think it's fascinating technology in many ways, but I struggle because of this risk always at the back of my mind: the worst thing is to start trusting it, even subconsciously, because that's when it gets you. I've never really felt this way about other technology until now. It's hard to remain in proximity, trying to understand/appreciate what it can do when seemingly used well, but also remain 100% vigilant forever. Stories like this always stick in my mind.

Sometimes the apparent purpose in making a mistake may be to act as a lesson to others; obviously it really sucks if that happens to you and it sucks that it happened here. But I do think this story is a useful reminder to everyone that, for as far as LLM technology has seemingly evolved these last years, this fundamental risk is always at the root of it all (since it's at the center of how LLMs work), even if you're normally informed, well-intentioned, and cautious. If the original tool had itself given confabulated quotes, it might well have been verified and caught, but it's the combination of failure modes that tends to get you.
 
Upvote
31 (33 / -2)

acefsw

Ars Tribunus Militum
2,916
Subscriptor++
It's not plausible to me. This is not their first rodeo. This isn't even unprecedented in the last 5 years.

I've been loath to dig up examples because I don't want to be accused of rehashing old points of contention and some of the topics are really sore spots, but this isn't the first article they've nuked (comment thread and all) and their non-response burned a bunch of goodwill and caused a small exodus of readers, subscribers, and community regulars. Ken and Aurich have had to step in then, too. I was one of those calling for a public accounting of the process in the aftermath and certain changes to the way Ars handles things as a matter of course, and we never got it. The changes that were made (which likely directly resulted in Aurich being able to lock things down and get an emergency editorial action so quickly this time) didn't extend to public disclosure about what went wrong and what measures were going to be taken to prevent the issue from happening again.

The "vibrant" discussion you're reading now might even be tame in comparison to what went on just a few years ago when that shit went down. But for many in the community, including dozens I'm still in touch with after they left, the wound was left to fester. Compounding the original mistakes was a sense that the incident was being papered over in hopes that it would just die quietly in the neglect and everyone would move past it. Well, a lot of people did: they moved right off of Ars because of how it was handled.

So I'm not going to cut slack if the whole thing ends here. I've been there before and it's awful. Worse is the implication that after last time, management didn't change its approach. It would be tragic if the same thing happened again when it doesn't have to. That would tell me everything about what the leadership of Ars really thinks of its readership, let alone the active community. That's why I'm hoping the powers that be at Ars have learned from before and will make different choices that are more responsive to the needs of their readers this time; I have to say that some recent smaller dustups have not left me 100% confident. There's a lot on the line as I see it. At least, there's a lot on the line for me.



This. If a YouTube video contains a bunch of DALL-E graphics, it's slop even if an actual human is in the video doing narration or something.
This became an issue with dubbed animation too. People didn't like the AI-generated audio. The audio was slop even if the video elements attached to it weren't. Therefore the product was slop.


Something like this is sort-of tenable with the "indie web" of small personal websites that still exist, e.g. Neocities.
Sadly the human affiliate content mill crap made it almost impossible on the Internet at large even before press-button slop generators were available.
These days I'm finding myself doubting general information pages that show up in specific searches unless they predate 2023 or thereabouts. Everything after that I treat as "probably slop." Like, if I see a cool plant growing by the road and want to know if I can find a spot for it on my property? A search might turn up hundreds of pages from the last couple of years on exactly that topic, even that specific plant... which itself seems suspicious.
Agree 100%. The last time they pulled this shit, I was going to unsubscribe and turned off auto-renew, but it renewed anyway. Maybe I screwed it up, so I let it go at the time.

I really don't have high expectations here and won't be as lazy about pulling my support this time.
 
Upvote
17 (17 / 0)

niftykev

Ars Scholae Palatinae
730
One thing that almost everyone complaining that Ars took down the article seems to miss, or at least not take into account, is that in Benj's Bluesky mea culpa, he said he asked his boss to pull the piece because he was too sick to fix it.

Maybe that's revisionism on his part, but if he's saying he asked for it to be pulled, maybe don't put all the blame on Ars for pulling it?
 
Upvote
27 (28 / -1)

graylshaped

Ars Legatus Legionis
67,692
Subscriptor++
I think it depends on how new a journalist is to the job. For a new journalist, I believe intent should matter, and a new journalist should be taught best practices - which includes not using AI tools to extract information from websites and then quoting what the AI generates, because these tools are just word probability generators, and reinforcing that sources need to be triple-checked to minimize liability.

Benj, according to his website, has over 20 years of experience in journalism as a reporter and is an expert in AI. This tells me the following:

1. He failed to perform basic due diligence in confirming that a quote from a website is valid (which could be done using a copy/find/paste in the web browser of his choice).
2. He does not understand that GenAI tools are probability generators and cannot reliably extract text from websites (a massive failure to understand how GenAI operates!).

As I understand it, a reporter should always verify the source three times, just to make sure the source isn't being misquoted. This minimizes the risk of liability.

If he can't do a 30-second search of a website, that shows a massive unwillingness to verify AI output.
The post prior to yours called it "reckless disregard"; without trying to get into legal standards of negligence, I'd lean towards a casual disregard for basic journalism principles. The guy seemed to be responsive to interaction on this--and, not incidentally--said in a follow-up note that Ars was not one of the sites that had reached out to him. The heck with a web search. Why not send him a note and request for comment?
 
Upvote
15 (15 / 0)
Lots of thoughts...

Were I to guess, the "process" in this case probably has the writer as a single point of failure. I would not call that ideal, but (1) I'd be surprised if it was particularly unusual, especially in the current media environment, and (2) even places with "good process" probably run into this at times. For the second point, take a journalist quoting a confidential informant. Unless there is absolutely always someone else there/on the phone (something a confidential informant would likely want to avoid), you have to take the journalist's word on any quotes they collect.

I would encourage a more in-depth follow-up once folks are back from holiday.

Probably the first thing it would be nice to hear is an overview/rationale of how you make the call to take a problem piece down versus leaving it up. Honestly, I can imagine arguments for taking either approach in a given case. I would offer that I don't think you have to convince everyone of the right answer in any given case, but I think some folks might feel a bit better if they heard some rationale that seems solid.

From the retraction:


These two sentences do a lot of heavy lifting, and I do not mean that in a facetious manner. Here's my mental model of how that worked:

  • Mobilize a team to work over a holiday weekend.
  • Select a healthy sample of articles from the writer, with a heavy weight on the weekend ones.
  • For each article, compile a list of all facts and quotes.
  • Verify each fact. For quotes from public sources, verify those.
  • For interview-based quotes, contact the quoted parties to assess the veracity of the quotes in the article.
  • If you complete this for a healthy sample of your healthy sample and do not find issues, you can probably conclude that the problem was not pervasive.

If you make it here, you can reasonably dismiss folks for the rest of the holiday weekend and come back to the rest of the writer's corpus when the weekend (and ideally, some comp time) is over.

There's every possibility I'm wrong on some of the "what" here. However it went, I think a lot of folks would like to know how something like this gets handled. That's arguably of extra importance if there is a single point of failure here (which seems likely).

I think the "why" matters, too. It makes sense that the first priority in the crisis response would be triaging how much bad material is making it into articles, and pulling down problems if they are found in any number. If a good sample does not suggest a serious problem, though, I can see letting people have their weekend and taking more time to thoroughly vet things during business hours.

As of this writing, the writer does not seem to have been dismissed. That feels right for now. I think it's prudent to review the situation carefully, and appropriate for that review to happen primarily during work hours (since there does not appear to need to be a flood of retractions), with room for recovery if illness is involved. I also think it is prudent to get some "trust but verify" information from the writer involved, and it is reasonable that said writer be paid for the time that takes. You can't readily do that if you dismiss them hastily.

In terms of consequences, I think the first thing I would like to hear is that any work involving that writer is on hold for the duration of the investigation. (This is a good time to say what shouldn't need to be said.)

At the conclusion of the investigation...it's tricky. On one hand, I'm not out for metaphorical blood, but on the other I do not have a quick thought on what, short of a dismissal, comes down firmly enough to reinforce editorial standards. (Editorial standards aside, I'm sure a lot of other folks already went through the wringer of the weekend digging for more potential problems, and more still will do the same when they get back, many of them through no fault of their own. It's tough to go light when there is so much impact on other employees.) I suspect I'm not alone in those feelings. If you come up with a response that is short of dismissal and explain it, I would hear you out and suspect others would, as well.

PS: This is at least the second time I've seen Aurich jump on something like this, and as rocky as these situations are, I suspect they go far better for his jumping in. He's one of the great folks you have, and one of the keenest disappointments in such messes is how they sideline such great work.

That's a big wall of speculation based off very little actual information provided to the public readership.

I'd like to think there's ongoing work to evaluate and rectify the situation, and that readers like us will eventually get more details on what's been done to address the issue and prevent future problems, but as of now nobody without inside knowledge has any idea how this whole thing will be wrapped up.
 
Upvote
3 (3 / 0)

Resistance

Wise, Aged Ars Veteran
418
One thing that almost everyone complaining that Ars took down the article seems to miss, or at least not take into account, is that in Benj's Bluesky mea culpa, he said he asked his boss to pull the piece because he was too sick to fix it.

Maybe that's revisionism on his part, but if he's saying he asked for it to be pulled, maybe don't put all the blame on Ars for pulling it?
My employee did bad thing X; when I found out and confronted them, they asked me to do thing Y, so I did thing Y. I am not fully responsible for doing thing Y because my employee asked me nicely to do it.

When phrased that way does it still sound reasonable to you?
 
Upvote
7 (15 / -8)