I will have to disagree with that. Even competent people occasionally make mistakes. It is human to err, and I want more humans at Ars, not fewer.
A zero-fault mindset does not foster a culture of transparency and honesty.
Help readers understand what happened with a follow-up. But don't throw an otherwise competent person under the bus because they made one (admittedly huge) cockup. Your eagerness for blood will not make the world - or journalism as a whole - a better place. In a world high on LLMs this could have happened to anyone.
Hey, anyone can make a mistake once.
I do want to see Ken or some other senior person at Ars post a follow-up about what will change with processes and procedures to make sure there is adequate review and fact checking before publishing.
But, if the author made a mistake, and learns, and doesn't make that mistake again, well, everyone is human. Give people a little grace, unless they definitely don't deserve it. I think in this case, they likely do, if it wasn't intentional.
I think we need to give Ars time to investigate and figure out what happened. If someone blatantly ignored Ars policies, maybe they should be fired. But, since we don't know the exact details of how AI text ended up in the article, I think we should let the Ars senior leadership investigate and figure out what the most appropriate response is. I think they will do the right thing.
That is not what I said, or implied. Your statement is that Benji is the most trustworthy author on this issue (AI). He is the person who has admitted to using AI tools improperly, and against policy. Both are violations of trust on this issue. The word you have used is quite literally the opposite of what is accurate. I have not advocated any course of action towards him, or you, but rather pointed out the extreme inaccuracy of what you have said. This entire event demonstrates that inaccuracy and I felt it worth calling out.
As a software developer, I now routinely see contractor analysts give me code that they haven't read and don't understand, to run in my environment. Do I run their code without at least reading it first? Of course not. I make them wait while I read their code. That's what being responsible looks like.

As a long-time reader and subscriber I find this extremely disappointing. While I understand the insidious appeal of chat bots, I find it difficult to understand how some writers on this site are falling under their spell given the existential threat your profession and this site face from them.
Never, ever do this again.
I expect a follow-up post mortem of this issue with the gravitas that this situation merits. I am not calling for Benj's head; I believe he is as much a victim of this delusion-inducing poison as Scott Shambaugh, whose handling of the situation I find admirable and inspirational. I am begging you to realize that a mania has set in that is rushing us all towards a precipice, and to understand your part in it.
If we as human readers cannot trust you as human writers, then this site is pointless. Consider the wider implications of this! Game out what happens to society as this rot creeps in!
On a related note, as a software developer, I do not greatly respect or appreciate your series of articles evangelizing these chat bots for the purpose of creating software. They read, as do all articles written by people spending excessive time with chat bots, like the words of people caught up in a manic episode who think they have become god's gift to painting or stock trading or what have you. I realize that it is your job, but you MUST retain journalistic credibility and independence, and bear in mind your audience! By the end of your articles on chat bots I tend to think, somewhat pettily, that as a profession at least as much under threat as mine, it is deeply short-sighted of you to risk being seen as an enthusiastic cheerleader of the denigration of the human value that WE provide.
And if you as journalists have been caught up in this mad rush to the cliff, imagine the mental state of those people developing the chat bots, when the entire world is slavering at their feet and showering them with unimaginable riches. Do you wonder that the employees responsible for the ethical deployment of these tools are fleeing, in some cases to abandon technology entirely?
Or the mental state of soldiers in the so-called Department of War, who are demanding that Anthropic remove any remaining restrictions in Claude so that they can put chat bots to use “for all legal purposes”?
Or those of the lawyers who are responsible for defining the framework of those legal purposes?
Or of the politicians who are pushing for the deployment of these tools across state, foreign, fiscal, and military policy?
Splash some cold water in your collective faces, shake off the fever, and get a grip. These aren’t “tools”; when was the last time a hammer or a typewriter caused psychosis?
Your job is to look deeper than how shiny the latest shiny is - for god's sake that is already covered breathlessly by half the planet - and to dive into the real story here: that we as a species are rapidly spiraling into a (very foreseen!) period of existential threat for sentient civilizations. Now is your time to prove your worth; get to work.
(I put this comment inline on the post, replying here for visibility since I think it's helpful info.)

This is obvious trolling, and they're not the only one doing it. Do we really want every article for the next N days to be filled with comments consisting entirely of "so is this made up too?"?
I have noticed that the forum doesn't seem to indicate temporary bans very well, so it's possible they just got a timeout rather than a permanent ban...
The double byline on the story makes me think Benj probably was online while sick, which is not something I would say is inherently wrong, and saw the legitimately interesting emerging story. So then he asked a colleague to help write it.

IMO, Ars needs to take a deep look at itself. BIG "if" here, because I don't know what the actual scenario is, but it's plausible that the below is similar to what might have happened:
If Benj felt so pressured to push out an article quickly, despite being sick, then the fault is not just on Benj for writing it, nor just the editors who didn't review it, but also on Ars' management/ownership. Only in the US (among supposed first-world countries) would such a scenario be even remotely normalized, and even then, it would plainly be the wrong way to do things. A culture where an employee doesn't feel they can call off when sick means that the employees are not secure, they're not safe, and they feel they have to choose their job over their health. It doesn't matter if you have FMLA days left or sick days accrued if you feel that you'll be punished in some way for using those "benefits," or that you have to work through sickness to "keep up" at work.
Now, there are many other things that could have happened. Hell, Benj could even be making up the illness after the fact. But there is something concerning in this entire chain of events that suggests something is deeply wrong.
I probably didn't make myself clear, but what I was trying to say is that I don't accept the AI summary at face value but click the link it provides to the source(s) it cribbed from. I don't know if Google provides those because I rarely use Google anymore.

Be careful with using AI summaries for scientific topics. Within my subject area, the AI summary frequently presents some entirely inconsequential paper from a vanishingly obscure journal instead of the most important / best studies. These summaries also make up facts by combining sentence fragments from their “sources” in a stochastically plausible manner.
Which is not to say that SEO didn’t already do something pretty similar back in the Age of Search. I’m guessing the search companies weighted the models for their own, terrible, SEO algorithms.
If anything you’re better off asking ChatGPT. There’s a (slightly) better chance that an actual human might have reviewed the outputs at some point.
Well, except for the fact that Benj Edwards seems to be quite caught up in the thin air of the AI atmosphere. So technically he is doing the same thing.

I also call nonsense on “being sick made me seek out a new tool and use it uncritically.” People rarely seek out new and inventive complications to their daily workflow when they’re feverish. They just do the same things they always do, but with more mistakes (mistakes such as getting caught inventing quotes).
One can be imperfect and still have standards. I don't plan to change my subscription at this time, but I'm not a fan of everyone's reporting at Ars, and this incident definitely isn't helping to redeem anyone. I think readers are justified in feeling betrayed, based on what information we have been presented with.

I hope all the people dropping their subscriptions over this one incident (which is still playing out) are living lives of absolute, infallible perfection.
Agree. I think the chain of events goes a long way towards transparency. Took a bit to piece it together, but it appears it boiled down to conflation of source material in Scott's post (as he quoted snippets of the bot's posts). Doubly so as this is a great learning lesson at the heart of how AI slop is and will impact how we get information - and clues as to spotting it and what to do about it collectively.

Folks. Deleting the story is at best like hitting a double when it's a homer that is needed. I'll cite the policy over at the NYT: the updated story is appended with a quote of the incorrect text, exactly as it was originally published, along with the corrected text. Here, there is no posting of a direct link to the now-deleted story; Ars merely mentions archive.com. Several commenters here show how they found the original story by less-than-direct sleuthing.
I think a great part of the consternation comes from a feeling of "Et tu, Brute?", when in our case Brutus (Benj) was supposed to be very well aware of the golden hammer he was using as a surgeon's scalpel.

Why so much focus on the AI aspect? Checking you quoted someone correctly is an age-old requirement for journalism. Ars quoted from an unreliable source. The fact that source was AI is a side-show to the fact that someone failed to do journalism properly.
If I had written a forum post where I claimed I heard Shambaugh say [whatever], Ars should still either be confident they checked it enough to use it as a quote from Shambaugh or attribute the quote as coming second-hand through me. The question of which tool was over-relied upon is not the primary issue here.
The double byline on the story makes me think Benj probably was online while sick, which is not something I would say is inherently wrong, and saw the legitimately interesting emerging story. So then he asked a colleague to help write it.
It's somewhat analogous to the time, many years ago when I still committed acts of journalism, when one of my colleagues had a broken arm and couldn't type with two hands. But honestly, writing is the easy part. Knowing what to write about is the real work.
So for a few weeks he'd identify a story then use his good hand to cut and paste from sources and send them to me. I'd write the articles and send them back for his thumbs-up. A couple times when he wanted direct quotes he made phone calls and sent me the recordings with in and out times. I can't remember now how we bylined those stories. It would have been fair to split them but I wasn't too hung up on getting credit so I might have accepted being a ghost. I can't remember now.
Anyway, my point is that while, if the covid story is true, it would have been perfectly fine for Benj to not do anything with the story, I get that with the web at our fingertips the dividing line between what we browse out of personal interest and what we browse for work is nearly nonexistent. So identifying a story worth covering while sick isn't a huge red flag for me. But FFS, once a colleague is roped in to do the actual writing, give them all the sources and have them write it. If you're too sick to write then you're too sick to write.
The bold portion of your comment is where my mind is at right now. I've been around here for a bit and I've never really felt like I needed to check other sources to verify what I'm reading here was factual. Now I'm questioning that.

Reading the article and the OG author's comments...it sounds like a mess.
I'm more worried about what this all implies. That AI can now write hit pieces on you and people can lazily report that and how can you verify ANY of that? If the person the hit piece was on didn't comment, how would you know it was true or not? How can we see through AI's "seemingly factual" crap? Even the references you're going to check, how can you verify that?
Fair point.

Newspapers can’t delete articles. Online sites have the illusion that they can delete articles.
No, you were clear. My concern is that the linked sources surfaced by the AI are also frequently of low quality as well as being poorly summarised.

I probably didn't make myself clear, but what I was trying to say is that I don't accept the AI summary at face value but click the link it provides to the source(s) it cribbed from. I don't know if Google provides those because I rarely use Google anymore.
It's very similar to how I use Wikipedia for research; I read, or skim, the article and then follow the links to sources for the bits that interest me both to get more information and to confirm that what I just read is an accurate representation of it. If it's anything remotely important I don't just believe AI or Wikipedians without checking their sources.
Effing thank you. I’m an AI researcher (among many other things) and this is the way I explain LLMs to non-science folks. They are just really good at guessing which chunk of text comes next.

It's not really uncanny either when you consider it is an auto complete (despite what AI stans would have you believe) trained on the entire internet with responses meticulously corrected/graded by underpaid global majority workers who often have PTSD now. ALL text that it generates is technically hallucinations and the model has no relationship to facts. [emphasis mine]
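For what it's worth, the "guessing which chunk of text comes next" idea can be shown with a deliberately toy word-level sketch (real LLMs use neural networks over subword tokens, not frequency tables, but the objective has the same shape):

```python
from collections import Counter, defaultdict

# Toy "autocomplete": count which word follows which in a tiny corpus,
# then always emit the most frequent successor. Note there is no notion
# of truth anywhere in here -- only frequency of co-occurrence.
corpus = "the cat sat on the mat the cat ate the fish".split()

successors = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    successors[prev][nxt] += 1

def predict_next(word):
    # Most common continuation seen in training data.
    return successors[word].most_common(1)[0][0]

print(predict_next("the"))  # -> "cat" ("cat" follows "the" 2 of 4 times)
```

The point of the toy: the model produces fluent-looking continuations purely from statistical patterns, which is exactly why every output is "technically a hallucination" that merely happens to be correct most of the time.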
When someone steals from the company account, in a sense, the whole company failed. But one guy was the thief.

This is a collective failure, and no one person is fully responsible.
This is hilarious and sad. Mr. Edwards' continued employment would prove beyond a shadow of a doubt that Ars Technica is absolutely willing to tolerate this misconduct.

Of all people to make this same mistake again, it is less likely to be this author. His reporting will be viewed with a critical eye for years, and he knows it. He'll be on his best behavior because Ars has proven it won't tolerate the black eye, evidenced by the retraction and acknowledgement.
Okay, I've been staying out of it but your last paragraph made me think about how that describes exactly how I managed teams too. But where you lose me is that there's a big difference between doing a dumb and blatantly subverting stated policies and procedures.

That would be bad. I still don't think that's a fair analogy.
To stay with the car analogy: he made a huge scratch on the front door. It's ugly and totally unprofessional. But he then apologized profusely, took full responsibility, and told me he will do whatever he can to fix it.
So, to be clear: nobody got permanently hurt, and he wants to fix it.
As I find it rare for people to own their mistakes, and as there is no pattern of this being a systematic problem, I am leaning heavily towards forgiveness.
You can call me a shitty leader if you like, but I treat my team the same way. The result: they're HIGHLY productive, NEVER afraid to own up to their mistakes, and we have a healthy culture where discussing f*ckups and fixes is never toxic or dangerous. That goes for me too: I am secure enough in my leadership position to share my own f*ckups with them on equal terms. In leadership terms, this is called "psychological safety," and it's the means to make high-performers stick with their job for a very, very long time.
Damn, you're correct about that. I got a 30 day ban from a thread for doing just that. Quite rightly, too.

Even bolding key points of quotes in the forums gets a textual slap from Ars with threats of bans for “manipulating” quotes. We should keep that in mind when we observe the response to this.
No, Ars made a public spectacle of the retraction. You'd be lucky if Gizmodo even acknowledged it.

This is hilarious and sad. Mr. Edwards' continued employment would prove beyond a shadow of a doubt that Ars Technica is absolutely willing to tolerate this misconduct.
He can reestablish his credibility with readers invested enough to give him the generous benefit of doubt, you apparently among them. I am not, and I've never been all that impressed with his articles to begin with, because he always struck me as insufficiently skeptical of a field that merits a very critical and analytical eye.

Funny. I feel the exact opposite. He's probably the most trustworthy writer on this issue for years to come ...
oh shit. you're absolutely correct. that was a typo. it's a single 9950x3d and dual 3090. that was an honest mistake. It's in a Gigabyte Aorus XTreme AI TOP motherboard which we can all agree is a single CPU board as well.

You do not have and will not have a "dual amd 9950x3d" system because there are no motherboards that support that configuration.
I don't think this is a public spectacle at all. It has garnered 20 pages of enthusiastic comment, but it was a pretty standard, responsible retraction while they're doing their due diligence on what happened and why. I'm reasonably satisfied with that. But I will need to hear more.

No, Ars made a public spectacle of the retraction. You'd be lucky if Gizmodo even acknowledged it.
Do Ars even have editors like that for their long term/senior writers? Serious question.

The AI is a red herring. Benji fabricated a quote. That's at minimum a fireable offense. 20 years ago, he would have been canceled from the writing community.
The people defending Ars and Benji are why journalism is in the toilet.
And the editor? Their first job is to verify sources.
The claim being made is with regard to the intent to break company policy, and "lying about it until caught".

the writer himself admitted to using ai to fabricate false quotes.