Preprint server arXiv will ban submitters of AI-generated hallucinations

Chaster Mief

Ars Centurion
279
Subscriptor
I would imagine the highest penalty would be on the primary authors of the paper? In that context, what validation does the site perform to confirm that submitters are who they claim to be? This is mainly to reduce the risk of impersonation and defamation.
Whether they validate or not, there is the appeal process as mentioned. I get the concern though. I'm always so surprised to hear about how backstabby the scientific fields can be.
 
Upvote
13 (13 / 0)

Aurich

Director of Many Things
41,247
Ars Staff
Here's an OCR-converted version of the text from that Bluesky screenshot for convenience:

Attention @arxiv authors: Our Code of Conduct states that by signing your name as an author of a paper, each author takes full responsibility for all its contents, irrespective of how the contents were generated.
If generative AI tools generate inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content, and that output is included in scientific works, it is the responsibility of the author(s).
We have recently clarified our penalties for this. If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper.
The penalty is a 1-year ban from arXiv followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue.
Examples of incontrovertible evidence: hallucinated references, meta-comments from the LLM ("here is a 200 word summary; would you like me to make any changes?"; "the data in this table is illustrative, fill it in with the real numbers from your experiments") end/

(I wasn't about to dig up my X login to capture the original, but should be accurate.)
 
Upvote
44 (44 / 0)
The more I read about AI controversies, the more obvious it becomes that it is being used to skip steps. Accelerating your work using tools effectively is good, but skipping steps is just incomplete work. However the skipping is performed or whatever label is slapped on the result, skipping steps is just incomplete work.
"You're not done. Get back to us when you're finished."
 
Upvote
21 (21 / 0)
"If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can't trust anything in the paper."

Perfect.
 
Upvote
39 (39 / 0)

MilanKraft

Ars Tribunus Angusticlavius
6,929
Good. Time for more organizations to take a hard line against slop. Next up is instant disbarment for lawyers that get caught submitting fake citations and AI generated hallucinations.
100% this. There is no room whatever for lazy asses in the legal profession having LLMs do their casework for them. Too much is at stake in too many cases to F around with this nonsense.

If firms want their junior staffers or legal assistants to have a leg up in non-critical, non-court-facing research tasks or something like this, or perhaps something trained only on internal policy documents to fill in forms or other minutiae, great. Casework, no shot. LLM output should never get near a courtroom.
 
Upvote
11 (11 / 0)

dmsilev

Ars Tribunus Angusticlavius
7,359
Subscriptor
Primary authors? Nope. Every last author whose name appears should be banned as well.
Incredibly easy to abuse. I just have to create a burner account and then submit an LLM-generated paper where I list person-I-hate as a coauthor. At the very least, I cause them to waste time going through the appeals process.
 
Upvote
-9 (7 / -16)

justsomebytes

Wise, Aged Ars Veteran
201
Subscriptor
Incredibly easy to abuse. I just have to create a burner account and then submit an LLM-generated paper where I list person-I-hate as a coauthor. At the very least, I cause them to waste time going through the appeals process.
Isn't it just as abusive the other way? Having a whole list of people responsible for checking a paper who either didn't or were part of making a bad paper, and then while the primary author is banned they submit their own bad papers.
 
Upvote
-9 (0 / -9)
Incredibly easy to abuse. I just have to create a burner account and then submit an LLM-generated paper where I list person-I-hate as a coauthor. At the very least, I cause them to waste time going through the appeals process.
There is an appeal process. I would certainly hope that it includes checking both authorship and the data appearing in the paper in question.
 
Upvote
8 (9 / -1)
I would imagine the highest penalty would be on the primary authors of the paper? In that context, what validation does the site perform to confirm that submitters are who they claim to be? This is mainly to reduce the risk of impersonation and defamation.
Literally the first sentence of the message says that all authors bear full responsibility 🤦‍♂️
 
Upvote
22 (23 / -1)

AstroTom

Smack-Fu Master, in training
20
Incredibly easy to abuse. I just have to create a burner account and then submit an LLM-generated paper where I list person-I-hate as a coauthor. At the very least, I cause them to waste time going through the appeals process.
Yes. Does arXiv have any check that the co-authors listed were actually involved in the paper? If not, this policy would provide an easy way for someone to screw up their professional competitors.
Not to mention for any cranks with an active arXiv account to cause problems for anyone who rebutted their work.
 
Upvote
-8 (1 / -9)

crmarvin42

Ars Praefectus
3,171
Subscriptor
I am so glad to read this, and am going to forward this article to all the journals I am connected with in my field (we do not use pre-print sites like arXiv, but I would like to see similar penalties for AI slop).

I remember my prof giving our lab 2 long literature review papers to read before our lab meeting. Turns out, one of them plagiarized the other. Since they were so long, only one of us actually read them, and noticed the plagiarism (me). It wasn't subtle. More than 80% of the two papers were identical, down to punctuation and typos. The penalty there was a ban on submissions by the journal for the PhD student who'd done the plagiarism, a retraction of his PhD by the college, and a ban on submissions by his advisor as well, for not catching the plagiarism (if we are being generous and assuming he didn't know).

AI can make mistakes as fast as it can generate content. The only solution to that is to make sure, through steep and harsh penalties, that people using AI are held to account for failing to catch those mistakes. That means lawyers should risk disbarment, doctors should risk losing their medical licenses, and scientists should risk losing their right to publish (publish or perish, after all).

My only concern with the consequences they outlined above is that all authors bear the risk equally. I get why that is what they settled on, but it risks a lot of folks catching strays. I work with various labs to run research, often outside of my own field of expertise. I contribute enough intellectually to deserve having my name on the paper, but am by no means an expert in everything in the paper, and I usually only contribute to the writing of the sections related to my expertise. In collaborative projects that cross disciplines, there are a lot of folks ill equipped to catch malfeasance on the part of a co-author. The compromise might be that the sword of Damocles falls primarily on the submitting author (and their advisor if a student or post-doc submits the paper), and the potential for a lesser penalty for co-authors. But I can see how that could be abused as well, so it's not like it is a clean solution either.
 
Upvote
14 (15 / -1)
Yes. Does arXiv have any check that the co-authors listed were actually involved in the paper? If not, this policy would provide an easy way for someone to screw up their professional competitors.
Not to mention for any cranks with an active arXiv account to cause problems for anyone who rebutted their work.
Pretty sure that would rise to the level of fraud and not only destroy the fraudster’s career but risk them being sued into a well deserved oblivion. Possibly jail time.
 
Upvote
-1 (2 / -3)

ranthog

Ars Legatus Legionis
15,360
My only concern with the consequences they outlined above is that all authors bear the risk equally. I get why that is what they settled on, but it risks a lot of folks catching strays. I work with various labs to run research, often outside of my own field of expertise. I contribute enough intellectually to deserve having my name on the paper, but am by no means an expert in everything in the paper, and I usually only contribute to the writing of the sections related to my expertise. In collaborative projects that cross disciplines, there are a lot of folks ill equipped to catch malfeasance on the part of a co-author. The compromise might be that the sword of Damocles falls primarily on the submitting author (and their advisor if a student or post-doc submits the paper), and the potential for a lesser penalty for co-authors. But I can see how that could be abused as well, so it's not like it is a clean solution either.
One hopes that the appeal process provides for more nuance in how this is applied. Especially if this sort of thing helps to prompt institutions to properly punish such behavior.
 
Upvote
2 (2 / 0)

AstroTom

Smack-Fu Master, in training
20
Pretty sure that would rise to the level of fraud and not only destroy the fraudster’s career but risk them being sued into a well deserved oblivion. Possibly jail time.
While it might be easy to get banned from professional societies, beyond that might require some lawyers and courtrooms. Not sure if there are protocols in place for that.
 
Upvote
-1 (0 / -1)
One obvious question that arises when these problems are found in publications is why nobody caught them sooner. Now, we can at least know that someone is trying to.

Time and interest, same as in FOSS. It takes time to go through a research paper, resources to check the references, and a level of knowledge at least on par with that of those submitting the paper. On top of that, someone in the audience has to have at least a passing interest in the paper's topic before they'll go through all that. shrugs

In the legal field, it's usually understood that another legal expert, whether lawyer or judge, is going to check your work, yet we all know there have been and still are unscrupulous lawyers trying to get away with BSing the courts and opponents, with amusing consequences for the public, but less than amused judges and clients. In much of the rest of academia, that's not a given. Note that many papers get retracted only years later, after they've already damaged both the field and the general public's trust in The Process of science.

That's the same reason why bugs can exist in FOSS for decades. Even if people are looking for them across the entire breadth of FOSSdom, the likelihood any individual project will receive enough attention to find those bugs is still related to the popularity of the project, and even that's no guarantee.

Some of us are glad that *LMs are becoming useful enough to allow less-skilled individuals to contribute to fixing that problem. Before, you had to be highly skilled at reading someone else's code and holding a logic construct in your head to reason about it (even if you wrote the flow chart out). Now, you can get one of the newer LMs to help you out, spreading out the resources from burdening the maintainer(s) into those that may have different points of view (and questions) to help guide those searches. You still have to know enough to ask the correct questions and guide the *LM in its search and evaluate the responses, but you no longer need to have many years of experience, just the money to feed the coin slot. Wow... the Ferengi knew what was coming:
[Image: Star Trek: Deep Space Nine still]
 
Upvote
2 (3 / -1)

clewis

Ars Tribunus Militum
1,831
Subscriptor++
<snip>

I remember my prof giving our lab 2 long literature review papers to read before our lab meeting. Turns out, one of them plagiarized the other. Since they were so long, only one of us actually read them, and noticed the plagiarism (me). It wasn't subtle. More than 80% of the two papers were identical, down to punctuation and typos. The penalty there was a ban on submissions by the journal for the PhD student who'd done the plagiarism, a retraction of his PhD by the college, and a ban on submissions by his advisor as well, for not catching the plagiarism (if we are being generous and assuming he didn't know).

<snip>
How did you decide who was the plagiarizer? I imagine the guilty party would deny the accusation, especially with the penalties that high.
 
Upvote
0 (0 / 0)

clewis

Ars Tribunus Militum
1,831
Subscriptor++
<snip>

That's the same reason why bugs can exist in FOSS for decades. Even if people are looking for them across the entire breadth of FOSSdom, the likelihood any individual project will receive enough attention to find those bugs is still related to the popularity of the project, and even that's no guarantee.

Some of us are glad that *LMs are becoming useful enough to allow less-skilled individuals to contribute to fixing that problem. Before, you had to be highly skilled at reading someone else's code and holding a logic construct in your head to reason about it (even if you wrote the flow chart out). Now, you can get one of the newer LMs to help you out, spreading out the resources from burdening the maintainer(s) into those that may have different points of view (and questions) to help guide those searches. You still have to know enough to ask the correct questions and guide the *LM in its search and evaluate the responses, but you no longer need to have many years of experience, just the money to feed the coin slot. Wow... the Ferengi knew what was coming:

<large image removed>
No, you still need those years of experience to verify it's actually a bug, and not something the AI made up.
And then you need those years of experience to figure out if it's exploitable.
And then you need those years of experience to evaluate if the AI's fix is actually a fix.

Do not flood FOSS maintainers with AI generated bugs if you can't do the above.
 
Upvote
6 (6 / 0)
It's a surprisingly humane form of collective punishment, which stems from the fact that arXiv isn't a journal, and doesn't have to play by journal rules and norms. It emphasizes a social contract that shouldn't have been a surprise (in principle) to anybody submitting a paper. The shame, professional recrimination and possible resulting legal liability mean that principal investigators in particular will have to do a better job of mentoring and overseeing their teams.
 
Upvote
3 (3 / 0)

Fatesrider

Ars Legatus Legionis
25,294
Subscriptor
I would imagine the highest penalty would be on the primary authors of the paper? In that context, what validation does the site perform to confirm that submitters are who they claim to be? This is mainly to reduce the risk of impersonation and defamation.
By my read of the rules Aurich posted, ALL AUTHORS who sign off on it are going to be penalized, since the wording is "author(s)" not "author", and the implication is that anyone who puts their name on it will be penalized.

I don't think it matters if they ban fake authors, as long as they ban the lazy fuckwits who don't properly science their submissions.
 
Upvote
0 (2 / -2)
No, you still need those years of experience to verify it's actually a bug, and not something the AI made up.
And then you need those years of experience to figure out if it's exploitable.
And then you need those years of experience to evaluate if the AI's fix is actually a fix.

Do not flood FOSS maintainers with AI generated bugs if you can't do the above.
You're way behind the times. Even if you think you're up to date as of 2 weeks ago, you're now behind the times. That USED to be true; it no longer is. If people have access to Mythos Preview they have the reasoning they need. They just need to be able to create the adversarial harness that directs the inquiry after the initial questioning. And I never said NO skill level, I said enough to evaluate the results. That's not the same as being able to check the entire logic flow of a million line code base, and don't even TRY to tell me it is.

Also, even the curl maintainers acknowledge that the most recent LLMs other than Mythos have improved in a near quantum leap in the capability to find genuine, useful bug reports. The game has changed, and it's no longer the same as it was even 2 months ago.
 
Upvote
-6 (2 / -8)

ranthog

Ars Legatus Legionis
15,360
No, you still need those years of experience to verify it's actually a bug, and not something the AI made up.
And then you need those years of experience to figure out if it's exploitable.
And then you need those years of experience to evaluate if the AI's fix is actually a fix.

Do not flood FOSS maintainers with AI generated bugs if you can't do the above.
The disrespect for expertise in this society is fucking wild.
 
Upvote
9 (9 / 0)

OldEnuf2kno

Smack-Fu Master, in training
58
Subscriptor
Depending on the paper, our department policy can be alphabetical or lead/contribution sorted.
The academic library I worked at (which did not require its librarians to publish or perish) used lead/contribution as its sort. As the research data librarian I worked with all the departments, schools, colleges, and admin units. Everyone did their publishing authorship sort differently. And there were always outliers who did it different than everyone else in their field or department.
 
Upvote
1 (1 / 0)

fenncruz

Ars Tribunus Militum
1,788
Incredibly easy to abuse. I just have to create a burner account and then submit an LLM-generated paper where I list person-I-hate as a coauthor. At the very least, I cause them to waste time going through the appeals process.
You can't just make a burner account. New accounts either need to sign up with their university email or get someone who has already published multiple times on arXiv to endorse them.

It's not foolproof but it raises the bar.
 
Upvote
4 (4 / 0)