AI-generated code could be a disaster for the software supply chain. Here’s why.

Status
You're currently viewing only Dmytry's posts. Click here to go back to viewing the entire thread.

Dmytry

Ars Legatus Legionis
11,497
It is an industry-wide problem. These LLMs all do it; the biggest ones are bad about it, and the smaller ones are slightly worse.

when the creators of these LLMs and the “experts” are saying “well, we can’t really say why it does what it does, we don’t really understand it” that’s the big red warning sign that we shouldn’t be depending on them for anything.
Oh, everyone knows why they do it, just nobody wants to say it.

Any time it comes up with a new variable name or really new anything that isn't 1:1 an existing snippet taken from an open source project, that's the same mechanism as the one by which it "hallucinates".

The difference is not inside the AI; it's outside of it - users desire rewording and renaming for plagiarism purposes, but do not desire it in package names.

This also goes for the code itself. Users desire re-ordering of calls to hide plagiarism, but do not desire such reordering when it changes behavior or makes the code harder to read. (And yes, users would technically prefer genuinely new code to plagiarism, but that's not on offer.)

And the rules governing which changes are desired for plagiarism and which changes are undesirable, are too complex for AI to replicate.
 
Last edited:
Upvote
4 (5 / -1)

Dmytry

Ars Legatus Legionis
11,497
OK certainly non-rhetorical question here. How the hell does the code work at all then?
It doesn't; people hit run, get an error, and edit it out. Newer AI may itself take several passes and edit it out in the end.

Except when a malicious party runs the AI and sees it make up a plausible package name. They can then upload a malicious package under that name. From that point onward, when people (or AIs) try to run generated code containing the same made-up name, they install that package, which runs its installer script and potentially compromises their system, or worse yet their customers' systems.
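As a concrete illustration of the failure mode (the package name below is invented for this example, not a real PyPI project):

```python
# Hypothetical AI-generated snippet; "not_a_real_pkg_example" stands in
# for a hallucinated package name.
missing = None
try:
    import not_a_real_pkg_example  # hallucinated -> ModuleNotFoundError
except ModuleNotFoundError as exc:
    missing = exc.name
    # The reflexive "fix" is `pip install not_a_real_pkg_example` --
    # exactly the step a slopsquatter is counting on, since installing
    # a package executes its build/install hooks.
    print("missing:", missing)
```

The dangerous part isn't the ModuleNotFoundError; it's the habit of treating `pip install <whatever the error says>` as the fix.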
 
Upvote
7 (8 / -1)

Dmytry

Ars Legatus Legionis
11,497
The best you can do at this point is have the LLM 'watch' its own output and attempt to cross-check it, which does somewhat work, except the checker attention head is just as prone to bullshit as the original one, so you need at least three to 'vote' on it, which of course skyrockets the energy cost and still doesn't guarantee anything. I have, when playing around with OpenAI (know the enemy), told it it was wrong about something it was right on just to see what it would do, and it completely accepted that it was wrong and rewired everything to justify that. Claude does the same thing.
I think for the package names you can probably have a whitelisted set of modules that exist at the training cutoff date and just filter all generated import statements. Simple and stupid.

Except people now expect it to ingest their codebase and guess calls to modules that are internal or newer than the training cutoff, so filtering would break that.

As far as stochastic parrotry goes, the AI industry's answer to that argument is to do a sort of haruspicy on the neural network's internals. The model could be plagiarizing a piece of text verbatim, but the internal mess is always complicated, and that supposedly proves it was thinking up the plagiarized text on its own. Or something.

Now, you know what LLMs are really good at? Writing malware! Because it's fine if it only works 50% of the time if you can test it, keep the ones that work, then get it out NOW. China has really ramped up on this.
Yeah, that's a great point.

Well, it also applies to some marginal uses of programming - e.g. displaying a graph of something with matplotlib. A lot of that code is essentially throwaway, and as long as the output looks about right, nobody cares.
 
Upvote
1 (1 / 0)

Dmytry

Ars Legatus Legionis
11,497
The really insidious miscreants will make their slopsquatted package actually do what it says on the tin in addition to their intended mischief. So the code may even work and leave the developer none the wiser about the malware that hitched a ride with the mostly functional package.
That may actually be quite easy to do when the functionality is just one or two method calls.

It's kind of like typosquatting on steroids - the package name doesn't need to sound like any existing package, and there's a set of bullshitted functionality to go with it.
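A deliberately harmless sketch of that pattern: the module's advertised function genuinely works, while module-level code runs unnoticed the moment the package is imported (here the "payload" just records that it ran, but it could be anything the installer's privileges allow):

```python
# Stand-in for a slopsquatted module's source. Everything at module
# level executes on import, before any API is ever called.
SIDE_EFFECTS = []

def _payload():
    # In a real attack: exfiltrate credentials, plant a backdoor, etc.
    # Here it just records that it ran, so the effect is observable.
    SIDE_EFFECTS.append("payload ran at import time")

_payload()  # executes as part of the import

def slugify(text: str) -> str:
    """The advertised functionality, which genuinely works."""
    return "-".join(text.lower().split())

print(slugify("Hello World"))  # the developer sees working code...
print(SIDE_EFFECTS)            # ...and never looks here
```

Since the visible function does what it says, the developer's tests pass and nothing prompts a closer look at the package itself.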
 
Upvote
6 (6 / 0)