Here’s hoping developers heed the message.
It’s high time we start calling out specific AI models instead of lumping everything together as “AI” (unless there’s evidence it’s an industry-wide problem). You wouldn’t blame Apple for Google’s privacy issues.
I don't understand why the software world allows anyone to publish packages into the package namespace of major tools without oversight. Even the worst software package managers, like Google's Play Store, at least do some checking. Are there not commercial services offering vetted package lists as a starting point?

Oh, that's fascinating!
Best quote I heard on the subject was "Why are we using AI to create new problems instead of solving old problems?" and that, of course, is the heart of the matter. LLMs do not solve old problems.
I was wondering how the heck do you detect hallucinations, but I did not at all think of package names as an attack vector. How remarkably insidious! Of course, this has always been a problem with people dropping package names with typos and just waiting for someone to bite, but now your code copilot brings the exploit to you!
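The attack described here, sometimes called "slopsquatting," can be partially guarded against by checking proposed dependency names against a list of packages you've actually vetted. The sketch below is a minimal illustration of that idea, not a real tool; the allowlist and package names are invented for the example:

```python
import difflib

# Hypothetical allowlist of packages your team has actually vetted.
APPROVED = {"requests", "numpy", "flask", "pandas"}

def check_dependency(name: str) -> str:
    """Flag unknown names, and warn when a name is suspiciously
    close to an approved package (a possible typosquat)."""
    if name in APPROVED:
        return "ok"
    # A near-miss of a known name is exactly what typosquatters
    # (and hallucinating code assistants) tend to produce.
    near = difflib.get_close_matches(name, APPROVED, n=1, cutoff=0.8)
    if near:
        return f"suspicious: did you mean {near[0]!r}?"
    return "unknown: vet before installing"

print(check_dependency("requests"))         # -> ok
print(check_dependency("requestss"))        # -> suspicious: did you mean 'requests'?
print(check_dependency("totally-new-pkg"))  # -> unknown: vet before installing
```

A real check would also consult the registry itself, but even this cheap string-distance pass catches the "one letter off from a popular package" case before `pip install` runs.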
I wouldn't even know where I'd start with coding today, since you apparently need to understand supply chain first.
Most people know they can't be depended on, but some people think they can get away with not checking; after all, no one holds software companies to account for all the other bugs and security holes.

It is an industry-wide problem. These LLMs all do it; the best ones are bad about it, and the worst ones are slightly worse.
When the creators of these LLMs and the “experts” are saying “well, we can’t really say why it does what it does, we don’t really understand it,” that’s the big red warning sign that we shouldn’t be depending on them for anything.
But aren’t hallucinations an issue with all LLMs? Some may be worse than others, but it’s an intrinsic issue with all LLMs, not just a few specific ones.
To take your point further, there are no such things as hallucinations. All LLM output is a statistically chosen best guess. Some of those guesses happen to be correct, some incorrect, but there is no difference beyond that.
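The "statistically chosen best guess" point can be made concrete: at each step a model turns scores into a probability distribution and samples a token from it, so a plausible-but-wrong answer is just a lower-probability draw, not a different kind of event. A toy sketch (the "vocabulary" and logits here are invented for illustration):

```python
import math
import random

def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy next-token candidates with made-up scores: a real package name
# and a typo'd one the model finds almost as plausible.
vocab = ["requests", "requestss", "flask"]
logits = [3.0, 1.0, 0.5]

probs = softmax(logits)
choice = random.choices(vocab, weights=probs, k=1)[0]
# The wrong name still has nonzero probability; nothing in the
# sampling mechanism distinguishes "correct" from "incorrect".
```

Nothing in this procedure consults the world, which is why "hallucination" is arguably a misleading word for what is just the mechanism working as designed.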
This is an important insight. OpenAI and other companies claim humans can save time by having the lying machines do the work for them, and in the (supposedly rare) cases when the lying machines do in fact lie, the human will notice it right away and fix it. But we know that this is an utter fantasy. It is in our nature to fall asleep at the proverbial wheel in such cases. (This is just one of a 100 reasons why you shouldn't use AI.)

One can argue it’s solving a time and efficiency problem. Developers spend a ton of time on documentation; if AI can do half of that work, that’s even more time they can spend on coding. I guess the problem, as always, is finding where the AI comes in to assist the human in the process rather than outright replace the human.
I think the main reason is that there isn’t one true source where vetted dependencies can be downloaded from; there are hundreds if not thousands of places to download them from. Some can be downloaded in archives from a web page somewhere, some from a project on GitHub, some old dependencies are still available on SourceForge, etc. Even being aware of supply chain attacks, it is hard to be sure you are getting what you think you are, especially if it is a component you haven’t used before, so you don’t know what the official distribution channel is or any of the persons involved.
Lying is a big word and projects too much intelligence onto these models. It would be better applied to the people selling them.
Any car can get in an accident; that doesn’t mean that if VW has a safety issue we can brand all cars as unsafe.

There is no safe LLM right now. The "brand" of the generator doesn't matter in that regard.
I'd bet my bottom dollar the amount of testing and legislation in place to make cars safe vastly exceeds the amount used to ensure LLM safety and efficacy. Once the two approach each other, then maybe you can use that analogy.
Sure, there’s also potential for hallucinations, but it varies based on the model, the model’s guardrails, grounding data, the prompt, and other mechanisms such as RAG.
There are existing models that would work: packaged software for Linux. Some distributions have a small archive, some a huge one. Many have multiple levels of packaging, from lightly vetted to more strongly checked. You can subscribe to other sources too if you're happy to do so. It's far from foolproof, but it's far better than what we are seeing here.
I am not aware of any commercial service offering vetted packages, and even if there are I doubt they would be able to keep up with the open source community. Remember, it isn’t one package that needs to be vetted, it is every available version of it.
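One partial mitigation that doesn’t require a central vetting authority is pinning artifacts by cryptographic hash, the way pip’s hash-checking mode and most lockfiles do: you record the digest of the exact version you vetted once, and refuse anything that doesn’t match. A minimal sketch of the idea (the payload and digest here are illustrative, not a real package):

```python
import hashlib

def verify_artifact(data: bytes, expected_sha256: str) -> bool:
    """Return True only if the downloaded bytes match the pinned digest."""
    return hashlib.sha256(data).hexdigest() == expected_sha256

# The bytes you audited once, and the digest you recorded in a lockfile.
vetted = b"package contents you audited"
pinned = hashlib.sha256(vetted).hexdigest()

assert verify_artifact(vetted, pinned)
# A swapped or tampered artifact fails, no matter where it was downloaded from.
assert not verify_artifact(b"tampered contents", pinned)
```

This doesn’t tell you the package was safe to begin with, but it does collapse "thousands of download locations" into one question: does the artifact match the digest of the version somebody actually vetted?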
AI-generated code could be a disaster for the software supply chain. Here’s why.
Okay, I'm actually glad you brought this up because it's an excuse to do a ~~LINGUISTIC SIDEBAR~~
Pure code generation has never been a problem that needed solving. It's easy to write lots of code. It's hard to write secure, stable code that solves the problem you're trying to solve, and LLMs don't help with that at all. It's still unclear to me what problem we're trying to solve here.

The exact problem we’re trying to solve is that if too many people want to get paid a fair wage for an honest day’s work, it becomes marginally more difficult for billionaires to send their girlfriends into space.
You guys are overthinking this. Saying "lying machine" whenever the opportunity arises is like saying "microshaft" on Slashdot in 2007, or talking about "sleepy Joe" with your Facebook friends. It's not going to change any minds, but it gets a giggle out of the people on your team.
I completely agree with your take here. The output that's extruded by generative AI models is more accurately called "bullshit": statements expressed with no regard for whether or not they're true.
BUT, dropping the word "bullshit" into a conversation about AI is much more likely to turn people against you. It's seen in common conversation as exaggeration and it's hard to explain and justify its use every time.
But if you can think of words other than "bullshit" that we can use to substitute for "lying", I am honestly all ears. One reason I love engaging on Ars is I can respond to an article and then witness how it's received. Sometimes I do okay. Sometimes it blows up in my face, and I think about how I can do better next time.
Reposting a comment I made late on an older article.
For programming I find using an LLM to be much like having a junior programmer. You don't give it a big project and let it run wild, you give it little tasks.
Like sometimes I need a bash script to do a thing. I don't use bash all the time, so I don't always remember all the syntax and how to do string parsing and everything else you might need. Sure, I could spend time searching and reading and writing and troubleshooting, and forget it all again by the next time I need it. Or I can fire it into ChatGPT and get something that either works correctly the first time or requires just a little bit of massaging; either way it has saved me time, with the added bonus of including all the error handling, logging, and commenting that I probably would have been in too much of a rush (or too lazy) to add to what is basically a little helper script (you've done it too, so don't look at me like that).
It's been even more helpful on the home front. My wife is not a native English speaker. She is proficient, but sometimes her phrasing can be awkward, downright nonsensical if she's been attacking the thesaurus, or embarrassing if she makes an accidental euphemism, so she is really self-conscious about it. For years she was constantly asking me to proofread everything from journal papers to emails to fucking text messages; it would drive me nuts, and it's not like I'm a professional languager. Now she can fire it into ChatGPT, ask it to do some minimal cleanup, and review it to correct any mistakes it might have made (which it does sometimes, but so would I if I didn't fully understand the context) all by herself. Even with the review it's faster than getting me to do it, because she doesn't have to wait for me to be available, deal with me whinging and bellyaching about having to do it, and sit there while I spend 10 minutes humming and hawing over how to rephrase something; it just does it. It's hyperbole to say ChatGPT saved our marriage, but it certainly saved what's left of my hairline.
I don't trust it to get facts right, though. One of the first things I did when ChatGPT came out was to fire in the details about a fairly US-centric game setting and try to generate some ideas for Canada in the setting. Right off the bat it tried to place the headquarters for an organization at the intersection of the Saskatchewan, Manitoba, and Ontario border (that's like saying the intersection of the California, Arizona, and New Mexico border, for you Yanks), and even after correcting it, it went right back to saying they shared a common border. So I knew right away that it could easily be full of shit, but it's pretty good at brainstorming and churning out descriptions for fictional locations, organizations, and NPCs.
So LLMs have their uses, but they also have their limitations and you have to be aware of what those are.
But the vibes, man, the vibes!

TFW you realize Kevin Roose wrote your todo-list app.