You know how sometimes one person on a mortgage ends up suing the other person on that mortgage? That's an outcome of trust being violated. Taking joint responsibility for the mortgage means that if the other person bails on you, yes, you are still responsible for making payments, but they can and should be held culpable for the financial and reputational damage they have done to you by violating your trust.

Maybe I'm naive, but I feel a joint byline is like jointly signing a mortgage. Both are willingly responsible, and if things go sideways, it's on both parties. We give a lot of responsibility to journalists in exchange for higher expectations.
I wanted to highlight this part of your post because I agree with it 100%. We can't overlook this as an opportunity. Not just for a single Ars writer, not just for Ars Technica itself, but it needs to be examined and lessons drawn for all of journalism at the very least.

Ironically, this latest article by Benj, the comments related to it, and the eventual resolution will teach us more about the actual current state of AI in publishing than the last 100 articles that fawned over AI (look, Conan next to the boob tube, neat!).
This episode should be a learning experience. The end result should be improvement.
Which is the author's job, and which they failed to do.
Based on personal experience, I can imagine a "best case" scenario like this:

- work while sick (bad idea)
- can't focus eyes; try out new verbatim-quote tool (not a policy violation in itself)
- tool doesn't work; rabbit hole--why not? Ask ChatGPT.
- fever-related confusion about which window is which; pull what I thought were parts of the actual blog, not ChatGPT. (Know better than to use ChatGPT for actual writing, but confused.)

No clue whether this is even close to true, but because I can see it happening (and it's consistent with my mini mental model of the writer), I need to wait for the postmortem from the people who know. Ars has enough trust in the bank with me that I believe that will come.
He literally said, “… I decided to try an experimental Claude Code-based AI to help me extract relevant verbatim source material.”
He doesn’t seem to understand that LLMs are so unreliable that expecting them to provide “verbatim” quotes is incredibly stupid. He may as well have been asking for fabricated quotes, because that’s what LLMs do.
Obviously there are cases where AI tools can be very useful for a journalist (just as they can be useful tools for developers), as long as you always verify the results.
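As a hedged illustration of what "always verify the results" could mean in practice: since the source documents are plain text, any quote an AI tool surfaces can be checked deterministically against the original before it's used. A minimal sketch (the function name and whitespace-normalization rule here are my own, not any tool Ars actually uses):

```python
import re

def quote_is_verbatim(candidate: str, source_text: str) -> bool:
    """Return True only if the candidate quote appears word-for-word
    in the source document (differences in whitespace ignored)."""
    def normalize(s: str) -> str:
        # Collapse whitespace runs so line wrapping can't cause false negatives.
        return re.sub(r"\s+", " ", s).strip()
    return normalize(candidate) in normalize(source_text)

source = "The quick brown fox\njumps over the lazy dog."
print(quote_is_verbatim("quick brown fox jumps", source))  # True: genuine quote
print(quote_is_verbatim("quick red fox jumps", source))    # False: fabricated
```

The point is that verifying a "verbatim" quote is a cheap string check, not a judgment call; anything that fails it should never reach an editor.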
Benj's literal job at Ars has been to familiarize himself with the blossoming AI trend and to investigate the technologies, companies, and products flooding the Web and meatspace alike.

I also call nonsense on "being sick made me seek out a new tool and use it uncritically." People rarely seek out new and inventive complications to their daily workflow when they're feverish. They just do the same things they always do, but with more mistakes (mistakes such as getting caught inventing quotes).
I understand the desire not to continue distributing bad information. However, when the retraction simply points to a page that has essentially been deleted, it's not at all useful for helping the reader understand what happened.
The comment locks I expect and think are good practice. In the absence of information, speculation turns to toxic rumor-mongering and assuming the worst.

I expect a higher standard of: transparency; responsiveness; editorial oversight (in the article production process); and, in general, integrity from Ars. Specifically:

- Memory-holing the original article
- Effectively deleting said article's comments
- Not referencing the original article in the editor's statement (i.e., in this article)
- The locking of comments in the Ars forum post about this…incident
I think you're drawing the critical distinction here. I manage people and I have overseen contractors. An error of omission is an "oh fuck, I ran out of time to submit that deliverable/didn't get that email/forgot to call into that meeting because I was in the zone." I have a lot of tolerance for errors like that, because if there's too much work or too few people or not enough time or all of those together, yeah, shit falls through cracks or has to get triaged. Mistakes happen. Work on improving it, but I'm not going to rip anybody a new asshole unless it becomes a trend.

Okay, I've been staying out of it, but your last paragraph made me think about how that describes exactly how I managed teams too. But where you lose me is that there's a big difference between doing a dumb and blatantly subverting stated policies and procedures.
Forget all the bad car analogies, let's look at this like it was code. Effectively what happened is a developer asked an LLM to write a new function for them on their local machine and then they pushed it directly to Production without running it through Dev, Staging, or QA review. That kind of thing is more than an oopsie.
I say that as someone who once crashed a big ecommerce site by pushing out a small change that required rebuilding the cache of every single URL during the peak load for the year. It was my error and I owned it, but I didn't make that change during a code freeze period without first making my case for why it was necessary and getting approval from everyone else responsible for site stability. It was my fuck up even though in the end it turned out the documentation we all relied on was incorrect and that's why the site crashed. I was forgiven but if I'd just shoved that out all on my own without following procedures, looping in others, and getting the change approved then I damn well should have been fired.
(For those keeping score, yes I actually have been both a professional writer and a manager of front-end web dev. And other things too. "Specialization is for insects" pretty much defines my career.)
Not that your whole quote wasn't relevant and important, but I cut it down to say, yeah. This. Same. This cannot be the end of the conversation. They don't need to shitcan Benj to make a point, either, but I do expect a "this is why this failure happened, this is what we're doing to avoid it, we apologize."

So I'm willing to give Ars some space only so long as I can believe they're working to do the right thing, and that the job isn't done yet. We'll have to see if this incident plays out differently. If this "notice" ends up being the final word, then I'll have to make hard choices.
Not being snarky, answering honestly: an excuse is something that makes an action okay, once you understand the excuse. An explanation gives you the why, but that why isn't sufficient to excuse the bad action taken.

Out of curiosity, do you see a difference between an excuse and an explanation? Considering that Benji in his posts did accept his responsibility for the situation?
These sites are an alternate way to access their traditional newspapers, though. Ars is not in that camp. I'm not saying one way or the other is better. I think the EU's right to be forgotten law is just dumb, both practically and ethically.

Fair point, but do online news publications really delete articles after publication, like the NYT or WashPost?
May I suggest a qualifier? An excuse is something that makes something seem okay, but may or may not represent reasonable justification for the choice made.

Not being snarky, answering honestly: an excuse is something that makes an action okay, once you understand the excuse. An explanation gives you the why, but that why isn't sufficient to excuse the bad action taken.
For example, "I was drunk when I wrecked my car" is an explanation, but it's not an excuse.
"I had COVID brain fog when I used ChatGPT as a primary source" is, similarly, an explanation--but not an acceptable excuse. Particularly not from a reporter who we're supposed to be trusting to objectively analyze the technology in the first place. That analysis clearly wasn't objective enough, and here we are.
I do have sympathy for Benj feeling enormous pressure to get that piece out, and do it that day. I've got personal experience of that pressure, and it's very real. That's also the Writers' Guild's job to address, and it's still not sufficient mitigation to excuse pumping ChatGPT slop into an article.
Again, I wish Benj well, I don't think this is or even necessarily should be a career ending mistake. But it's definitely not the kind of mistake you get no serious consequences from. And on Ars' side of the equation, there has to be a realization of what message the readers and subscribers AND authors take from this only getting a slap on the wrist.
There's no getting out of this without sending a message. Another commenter earlier pointedly said they would unsubscribe if Benj doesn't keep his job... Which should just make clear that fence sitting isn't going to work. Ars needs to decide what message it's going to send, and then send that message clearly.
And I sincerely hope that clear, unambiguous message is "this is absolutely not acceptable behavior, best of luck at your next job." I'm perfectly fine with the old "we'd like your resignation letter by $date" dodge.
I say none of this out of a spirit of vindictiveness. I say it because this site and this community is important to me.
It's been important to me for more than half my life now, and this is a watershed moment that will unavoidably shape what Ars is. And I do not want it to be the kind of place where you're wondering just how much slop went into an article, and you have to wonder that, because you've seen management tolerate it even when it's this dead-to-rights obvious.
I don't know the answer. But speaking as a professional editor, I can say that verifying sources is something we really don't want to do or have time to do. We really need to depend on our writers to provide us with good sources, correctly quoted and properly cited.

Does Ars even have editors like that for their long-term/senior writers? Serious question.
It wouldn't surprise me in the least if that experimental tool turns out to be an internal tool being pushed by Condé Nast.

He literally said, "… I decided to try an experimental Claude Code-based AI to help me extract relevant verbatim source material."
He doesn’t seem to understand that LLMs are so unreliable that expecting them to provide “verbatim” quotes is incredibly stupid. He may as well have been asking for fabricated quotes, because that’s what LLMs do.
And this coming from a supposed “senior” AI reporter. He has a recorded interview where he talks about using AI chat bots to help him write (or rather, assemble) articles. Nobody should trust anything he assembles from the slop trough of AI output.
You have to respect the nominative accuracy of Mouth Breathing Troglodyte, though.

When you're trying to create a good straw man, you need to know where to draw the line.
Forget all the bad car analogies, let's look at this like it was code. Effectively what happened is a developer asked an LLM to write a new function for them on their local machine and then they pushed it directly to Production without running it through Dev, Staging, or QA review. That kind of thing is more than an oopsie.
I made a similar mistake once. I sent a deliverable directly to a client without internal review, and there were problems, including using the wrong version of a letterhead. The explanation was, I was under a tight deadline and rushed through and didn't think about it and assumed it was ready to roll, because missing the deadline was more important than the internal review (it was not). The excuse was, I was a brand new dad and was on a week of about two hours of sleep a night and I was, literally, not in my right mind. I sent written apologies to the client and immediately went on a week of unpaid leave to get rested and support my wife so we could get to a place where the kid wasn't waking up 8-10 times a night screaming like a circular saw cutting sheet metal. Thank fuck it was a sufficient excuse.

Not being snarky, answering honestly: an excuse is something that makes an action okay, once you understand the excuse. An explanation gives you the why, but that why isn't sufficient to excuse the bad action taken.
For example, "I was drunk when I wrecked my car" is an explanation, but it's not an excuse.
"I had COVID brain fog when I used ChatGPT as a primary source" is, similarly, an explanation--but not an acceptable excuse. Particularly not from a reporter who we're supposed to be trusting to objectively analyze the technology in the first place. That analysis clearly wasn't objective enough, and here we are.
I do have sympathy for Benj feeling enormous pressure to get that piece out, and do it that day. I've got personal experience of that pressure, and it's very real. That's also the Writers' Guild's job to address, and it's still not sufficient mitigation to excuse pumping ChatGPT slop into an article.
I don't even necessarily request that. I would not be disappointed if that was the outcome, mind you. It would be reasonable and appropriate. But even if he stays, and the message is something specific about stricter editorial oversight and more reach-back support for writers if they can't make a deadline, I'm okay with that. Personally, that is. But it can't just be a retraction and a "sorry, our bad."

Again, I wish Benj well, I don't think this is or even necessarily should be a career ending mistake. But it's definitely not the kind of mistake you get no serious consequences from. And on Ars' side of the equation, there has to be a realization of what message the readers and subscribers AND authors take from this only getting a slap on the wrist.
There's no getting out of this without sending a message. Another commenter earlier pointedly said they would unsubscribe if Benj doesn't keep his job... Which should just make clear that fence sitting isn't going to work. Ars needs to decide what message it's going to send, and then send that message clearly.
And I sincerely hope that clear, unambiguous message is "this is absolutely not acceptable behavior, best of luck at your next job." I'm perfectly fine with the old "we'd like your resignation letter by $date" dodge.
I cannot possibly say any of this better than you just did, and I cosign it. Because, same. This site and this community is important to me. But it will become less so, much less so, if this is implicitly tolerable to its editors.

I say none of this out of a spirit of vindictiveness. I say it because this site and this community is important to me.
It's been important to me for more than half my life now, and this is a watershed moment that will unavoidably shape what Ars is. And I do not want it to be the kind of place where you're wondering just how much slop went into an article, and you have to wonder that, because you've seen management tolerate it even when it's this dead-to-rights obvious.
I might be splitting semantic hairs, but I think there's a big difference between making an excuse and having an excuse. We're in a linguistic period where the word "excuse" is more often used in the former sense of trying to deflect blame, but we do still use it in the latter sense of agreeing there are mitigating circumstances.

May I suggest a qualifier? An excuse is something that makes something seem okay, but may or may not represent reasonable justification for the choice made.
Fine, you want to stay with the car analogies?

That would be bad. I still don't think that's a fair analogy.
To stay with the car analogy: he made a huge scratch on the front door.
I think a lot of people don't understand how much of an outlier The New Yorker's editorial process is. I consider myself lucky that I ever had a job where what I wrote was looked over by a copy editor. A fact checker would have been a total fantasy.

I don't know the answer. But speaking as a professional editor, I can say that verifying sources is something we really don't want to do or have time to do. We really need to depend on our writers to provide us with good sources, correctly quoted and properly cited.
Of course, "verifying sources" can be more, or less, rigorous; and sometimes rigorous verification of sources, especially in book manuscripts citing other books, isn't realistically possible given time and budget constraints.
However, if sources are to be verified by an editor, then the kind of editor normally tasked with such verification would be a copyeditor. (And most copyeditors I know are already overworked.)
Edit: spelling
Yes, I think that is a meaningful difference, and the line I was trying to suggest exists.

I might be splitting semantic hairs, but I think there's a big difference between making an excuse and having an excuse. We're in a linguistic period where the word "excuse" is more often used in the former sense of trying to deflect blame, but we do still use it in the latter sense of agreeing there are mitigating circumstances.
Every place where I had the power to push to prod, I could have easily done it without going through any intermediary steps. Not the best practice, but this was consistent from small non-profits to one of the largest market cap companies in the world.

The problem with this analogy is that apparently the system in place allowed the code to go straight to production without anyone in Dev, Staging, or QA reviewing it. Yeah, the bad code is a problem, but so is the system that allowed it into production without more than a cursory glance.
I don't know about you, but if my QA lead found me running a process that allowed pushing straight to production, I'd be smothered in honey and staked out on an anthill.
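To make the missing control concrete, here's a toy sketch of the kind of gate the code analogy implies. The stage names and sign-off model are purely illustrative, not anything Ars or any real pipeline uses:

```python
# Hypothetical pipeline stages; real systems enforce this in CI/branch
# protection rather than in application code.
REQUIRED_STAGES = ("dev", "staging", "qa")

def may_deploy_to_production(signed_off: set) -> bool:
    """Allow a production deploy only if every earlier stage signed off."""
    return all(stage in signed_off for stage in REQUIRED_STAGES)

print(may_deploy_to_production({"dev"}))                   # False: review skipped
print(may_deploy_to_production({"dev", "staging", "qa"}))  # True: all gates passed
```

The editorial parallel is the same shape: the question isn't just why one person skipped the gates, but why the gates were skippable at all.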
I would like the retraction/apology to go further for this exact reason. Ars publishes with bylines for a reason. Ars presumably knows exactly how this happened, but we still don't. If one or both of the authors made the judgement error, then an apology from them to the audience (I would hope they've already apologized to the subject) would go a long way toward helping us trust Ars still.

Two different writers were listed on the byline. Did one use AI without the other's knowledge? Seems like a big lapse in judgement happened somewhere.
This is an interesting take, when delivered by a Lurkus with no subscriptor tag of their own.

I hope all the people dropping their subscriptions over this one incident (which is still playing out) are living lives of absolute, infallible perfection.
Thirded.

I cannot possibly say any of this better than you just did, and I cosign it. Because, same. This site and this community is important to me.
The people defending Ars and Benji are why journalism is in the toilet.
Pretty sure on mobile on my Android Chrome I can long-press and get the tooltip. Just a heads up. IDK if maybe it's different on Safari.

(I put this comment inline on the post, replying here for visibility since I think it's helpful info)
You can mouse over the ban icon for a tooltip giving you more info, though that doesn't work on mobile. But the way to tell the difference between temp and permanent: temp = grey slash circle, permanent = red.
I will be moderating low effort trolling, and if it's from low post accounts who only showed up to do it I'm not going to be lenient. Being critical is fine, filling the comments with noise isn't.
But you knew better.

Every place where I had the power to push to prod, I could have easily done it without going through any intermediary steps. Not the best practice, but this was consistent from small non-profits to one of the largest market cap companies in the world.
I agree with your general sentiment around keeping incorrect text up, but I think it gets thorny with fabricated quotes. Future AIs will inevitably slurp that up, ignore the context that they were fabricated, and then confidently assert that they were actual quotes.
Hmm, doesn't seem to work on Safari. I just tried it on Jim's eject icon; it just tries to select it instead of showing the tooltip.

Pretty sure on mobile on my Android Chrome I can long-press and get the tooltip. Just a heads up. IDK if maybe it's different on Safari.
Perhaps the lurker has been debating whether this bloodthirsty community is one he wants to pay to join?

This is an interesting take, when delivered by a Lurkus with no subscriptor tag of their own.
While I agree fully with the vast majority of your great post, I strongly, strongly disagree with this part. First, because a good community can and generally does stick to higher standards, and as we see in this thread has been pretty measured. Simply assuming everyone will behave badly is frankly directly insulting. Some of the most controversial issues of the day have gotten (and are getting) discussed on the Ars forums, issues where the stakes and emotions go far, far higher than this. And second to that point, if anyone individually goes over the line, that's what moderation is for. Right? Moderators can just use the mod yellow box and say "Hi everyone, remember this is ongoing, please keep things cool" and eject the repeatedly misbehaving. And that's what happens on the Soap Box.

The comment locks I expect and think are good practice. In the absence of information, speculation turns to toxic rumor-mongering and assuming the worst.
Yes, but I think we can say that, objectively speaking, regardless of personal individual aspects around the author himself, there was a clear systemic failure here on the part of Ars itself too. For those parts it doesn't actually matter at all whether someone was malicious/grossly negligent or made a perfectly reasonable mistake or whatever. A few particular systemic things I'd like to see addressed:

So I'm willing to give Ars some space only so long as I can believe they're working to do the right thing, and that the job isn't done yet. We'll have to see if this incident plays out differently. If this "notice" ends up being the final word, then I'll have to make hard choices.
Name two sites you consider trustworthy. A good number of your posts represent backseat copy-editing--help us gauge your lofty standard.

I will make clear to those in my limited sphere of influence that Ars Technica is not trustworthy.
The thing I think a lot of people are missing is that in order to understand these AI tools, what they can do, and their limitations, they have to be used. And ideally used with real-world use cases.
I think it's entirely expected that an AI reporter would be using these AI tools as much as possible. Obviously something went wrong this time, but you're not likely to be a good AI reporter if you never use the tools you cover.
What went wrong was twofold:

The thing I think a lot of people are missing is that in order to understand these AI tools, what they can do, and their limitations, they have to be used. And ideally used with real-world use cases.
I think it's entirely expected that an AI reporter would be using these AI tools as much as possible. Obviously something went wrong this time, but you're not likely to be a good AI reporter if you never use the tools you cover.