"The vast majority of Codex is built by Codex," OpenAI told us about its new AI coding agent.
See full article...
> OpenAI has a long history of "grand exaggeration" when it comes to AI's supposed capabilities and achievements. Could it be that they are pumping up their coming IPO?
> Can we see an independent review of these capabilities??

Great questions! The kinds of questions journalists are paid to ask. I wonder why the journalist writing so many of these AI articles never thinks to ask these types of questions, even when the comment section is full of suggestions for this very thing.
> I use them every day. GitHub Copilot (Claude, ChatGPT, etc.). If you're getting AI slop then you need to change what you're doing. I'm looking like a superstar with fairly typical human-in-the-middle stuff. Python scripts, Java, JavaScript, etc. Just fine. Also great for Terraform and DevOps stuff.
> Hell, I threw it at a bunch of COBOL and it did really well.
> AI slop is a problem, but it's one addressable by understanding prompting techniques, context compression, and session management.

I'm going to get a good grade in asking the computer to make something for me, something that is both normal to want and possible to achieve.
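"Context compression" and "session management" are named above but not explained. A minimal, hypothetical sketch of one common interpretation — keep recent turns verbatim, collapse older ones into a summary line so the session fits a token budget — might look like this. Every name here (`compress_context`, `rough_tokens`, the 4-chars-per-token estimate) is invented for illustration, not any particular tool's API:

```python
# Minimal sketch of "context compression" for a chat session:
# keep recent turns verbatim, collapse older ones into a summary,
# and stay under a rough token budget. Hypothetical illustration only.

def rough_tokens(text: str) -> int:
    """Crude token estimate: roughly 4 characters per token."""
    return max(1, len(text) // 4)

def compress_context(turns: list[str], budget: int, keep_recent: int = 2) -> list[str]:
    """Return a message list that fits the budget.

    Turns older than the last `keep_recent` are replaced by a single
    summary line (here just truncation; a real tool might call an LLM
    to produce the summary).
    """
    total = sum(rough_tokens(t) for t in turns)
    if total <= budget:
        return turns

    old, recent = turns[:-keep_recent], turns[-keep_recent:]
    # Stand-in for a real summarizer: first sentence of each old turn.
    summary = "Summary of earlier discussion: " + " ".join(
        t.split(".")[0] + "." for t in old
    )
    return [summary] + recent

session = [
    "We are writing a Python CLI. It parses CSV files.",
    "Add a --delimiter flag. Default should be a comma.",
    "Now add unit tests. Use pytest conventions.",
    "Refactor the parser into its own module.",
]
compressed = compress_context(session, budget=25)
print(len(compressed), compressed[0])
```

The design choice is the usual trade-off: the model keeps full fidelity on recent turns while older context degrades gracefully into a summary instead of being silently dropped.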
> Pop the bubble.

More like lancing a carbuncle at this point.
> OpenAI’s approach treats Codex as what Bayes called “a junior developer” that the company hopes will graduate into a senior developer over time. “If you were onboarding a junior developer, how would you onboard them? You give them a Slack account, you give them a Linear account,” Bayes said. “It’s not just this tool that you go to in the terminal, but it’s something that comes to you as well and sits within your team.”

In my teams, the “junior developer” sat in on all meetings, including those with stakeholders. A lot was written down, but a lot was conveyed that was not. The team communicated both in writing and verbally.
> This is a wild statement. Corporations employ armies of workers for what basically boils down to communication and operations.

I deliberately included some implausible explanations to reinforce the plausibility of the other explanations.
> It's interesting that despite there being countless different jobs that require composing text, the one job that appears to have most widely adopted LLMs to increase productivity is developers. Can't be a coincidence. Possible explanations:
> - Developers are the knowledge workers that are the most adaptable to new ways of doing things.
> - Developers are the most inefficient knowledge workers.
> - Something about development makes LLMs particularly useful and effective.
> - LLMs don't actually provide enough advantages to justify the incredibly wide adoption they have seen and developers are particularly vulnerable to believing that a given technology improves output when it doesn't actually do so.
> - Developers aren't using LLMs as widely as it seems and instead they are just the loudest about it.

No, it is the number of engineering managers and CTOs telling devs to use them. Then we do, because a mandate is a mandate.
> Bro. If you want a softball interview podcast, do that. If you want to do journalism, this ain't it. Even just a verbatim transcript would be of more worth.

Alas, this has been a problem at Ars for years, at least. They really don't like pushing or contradicting their interviewees, or independently confirming things they say. Once you realize that, it's hard to stop seeing it. I like Ars - I've been here a long time now - but I really wish they'd commit more to doing research.
> Great questions! The kinds of questions journalists are paid to ask. I wonder why the journalist writing so many of these AI articles never thinks to ask these types of questions. Even when the comment section is full of suggestions for this very thing.

It's not just AI, though. I'm a neuroscience nerd, which is where I first noticed the lack of investigation or skepticism. Their reporting on things I knew something about always just seems to be repeating what one source said about a topic. They rarely go to an independent source to try and fact-check.
And why doesn't Ars hire a better journalist since their current writer is failing at basic journalism? They could easily find someone like you right in the comment section if they wanted to! Why are they ignoring great options for hard-hitting journalism?
Guess we'll never know!
> All humanity is not gonna open an IDE or even know what a terminal is.

Man, I use this thing: I type commands into it, and the computer follows the commands. Some might call it a chat interface; some might call it a terminal. When I want to make full programs, I open up an editor that natively understands what I type into it, like it's integrated the things needed to develop the programs. What are you going to call your alternative?
> OpenAI has a long history of "grand exaggeration" when it comes to AI's supposed capabilities and achievements. Could it be that they are pumping up their coming IPO?

I do not believe anything that comes out of OpenAI.
> Or developers make LLMs, so LLMs are uniquely tailored (biased) for developers. If chemists, biologists, or physicists knew enough programming to make an LLM, it might be wildly different from what software developers have made. LLMs are a product of their creators.

Yeah, the AI would likely be substantially different; experts in those fields often have safety and ethics training.
> shipping != improvement

Bra, we just started to force push to main, and we started shipping new code in record time!
> Bra, we just started to force push to main, and we started shipping new code in record time!

Now these points of data make a beautiful line.
> Seriously. I'm getting tired of Ars Technica articles that could've been a press release. For some reason, they're always about AI too.
> I came to Ars because it had in-depth journalism about tech. The writers were knowledgeable and not easily swayed by market speak.
> Maybe Ars isn't the place to find that kind of journalism anymore?

It isn't. Their parent corporation made a devil's deal with OpenAI, and its reporting on the subject has never been the same since.
> Seriously. I'm getting tired of Ars Technica articles that could've been a press release. For some reason, they're always about AI too.
> I came to Ars because it had in-depth journalism about tech. The writers were knowledgeable and not easily swayed by market speak.
> Maybe Ars isn't the place to find that kind of journalism anymore?

But the commentariat is second to none. I saw the article headline and thought, "Ohhhh... interesting! Let's read the comments." Sometimes the comments make me think that the article is also worth reading. The comments are always worth the visit.
> OpenAI has a long history of "grand exaggeration" when it comes to AI's supposed capabilities and achievements. Could it be that they are pumping up their coming IPO?
> Can we see an independent review of these capabilities??

Sam Altman's previous eye-scanning venture literally partnered with criminals. If you want the sources, I am happy to PM you.
This is definitely what's happening at some companies:

> As a side note, I'd love to know how much of the increased use of AI is from employees being told in no uncertain terms "management expects everyone to integrate AI into their daily tasks ASAP," whether that integration makes any sense or not.
> It's interesting that despite there being countless different jobs that require composing text, the one job that appears to have most widely adopted LLMs to increase productivity is developers. Can't be a coincidence. Possible explanations:
> - Developers are the knowledge workers that are the most adaptable to new ways of doing things.
> - Developers are the most inefficient knowledge workers.
> - Something about development makes LLMs particularly useful and effective.
> - LLMs don't actually provide enough advantages to justify the incredibly wide adoption they have seen and developers are particularly vulnerable to believing that a given technology improves output when it doesn't actually do so.
> - Developers aren't using LLMs as widely as it seems and instead they are just the loudest about it.

Developers are the most expensive text workers.
> This should be utterly unsurprising to anyone familiar with build tools history. In an effort to eat their own dogfood, build tools have traditionally been used to build themselves whenever possible. Clang/LLVM builds Clang/LLVM. It is the natural next step for an AI code-development project to author and build itself. If it's producing poor-quality code, that is clearly a bug in the code base / weights itself, and we can fix that and then hypothetically see the improvement not only on the problem in question but also across the rest of the code base. It is the sensible way to move forward.
> When the time comes we would also expect to see robots built not in factories but by other robots from a bucket of spare parts. A factory can only deliver the throughput that it was specced for and is forever limited to that until you build another. Robots building robots can grow exponentially and are basically limited only by the logistics of delivering parts to an ever-expanding body of robots. (Insert comical descriptions of rabbits with an unlimited supply of food reproducing faster than the speed of sound.) While a factory can bootstrap this process, ultimately the repeated doubling will dwarf its capacity into irrelevance. The early winner in the robot arms race will be the one who makes robot-assembling robots with the shortest generational time and cheapest parts list.
> See also 3D printers.

I mean, that's the dream/hype/PR bull. Where is the evidence that anything like that is happening or possible? Still, if it does happen, we know it's going to be paper clips.
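The doubling argument in the robots-building-robots comment is easy to make concrete with a toy calculation. All numbers below are invented for illustration, and the model assumes each robot builds exactly one copy per generation (i.e., the population doubles):

```python
# Toy comparison: a factory with fixed throughput vs. a self-replicating
# population that doubles each generation. All numbers are invented.

FACTORY_PER_GEN = 1000   # robots the factory builds per generation
SEED_ROBOTS = 2          # self-replicating robots the factory bootstraps

def first_generation_replicators_win(factory_per_gen: int, seed: int) -> int:
    """Generation at which the self-replicating population first exceeds
    the factory's cumulative output, assuming each robot builds one copy
    per generation (the population doubles every generation)."""
    population = seed
    factory_total = 0
    gen = 0
    while population <= factory_total or gen == 0:
        gen += 1
        factory_total += factory_per_gen
        population *= 2
    return gen

print(first_generation_replicators_win(FACTORY_PER_GEN, SEED_ROBOTS))
```

With these made-up numbers, two seed robots overtake a 1,000-robots-per-generation factory after 13 doublings, which is the comment's point: linear output loses to exponential growth after a small, almost constant number of generations, regardless of how big the factory is.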
> Alas, this has been a problem at ars for years, at least. They really don't like pushing or contradicting their interviewees, or independently confirming things they say. Once you realize that it's hard to stop seeing it. I like ars - I've been here a long time now - but I really wish they'd commit more to doing research.
> It's not just AI though. I'm a neuroscience nerd, which is where I first noticed the lack of investigation or skepticism. Their reporting on things I knew something about always just seems to be repeating what one source said about a topic. They rarely go to an independent source to try and fact check.

To be fair, this is true of a lot of journalism, even at places like the NY Times. Anytime they report on something that I actually know something about, I'm amazed at how wrong the reporting is. But it's especially bad when it comes to reporting on "AI" (a term that is not meaningful unless it is in quotes). Pretty much all journalism is just regurgitating the "AI" hype.
> It isn't. Their parent corporation made a devil's deal with OpenAI, and its reporting on the subject has never been the same since.

The reporting at Ars on "AI" has been pure uncritical hype since always. It long predates Condé Nast's deal with OpenAI.
Link: Condé Nast Announces Partnership with OpenAI
We've reached the NFTs of NFTs phase of the circlejerk.
> It's interesting that despite there being countless different jobs that require composing text, the one job that appears to have most widely adopted LLMs to increase productivity is developers. Can't be a coincidence. Possible explanations:
> - Developers are the knowledge workers that are the most adaptable to new ways of doing things.
> - Developers are the most inefficient knowledge workers.
> - Something about development makes LLMs particularly useful and effective.
> - LLMs don't actually provide enough advantages to justify the incredibly wide adoption they have seen and developers are particularly vulnerable to believing that a given technology improves output when it doesn't actually do so.
> - Developers aren't using LLMs as widely as it seems and instead they are just the loudest about it.

Unfortunately, LLMs are also used a lot in HR, Finance, and other critical domains. And with far less control than in development.
> Developers are the most expensive text workers.
> On code you can easily check if it is "correct" - at least if it compiles.
> Nasty errors in a text are more difficult to spot and handle.

Compiling isn't a valid test; that's what unit tests, integration tests, load testing, etc. are for.