OpenAI spills technical details about how its AI coding agent works

AreWeThereYeti

Ars Praefectus
4,507
Subscriptor
Yeah, I write system code (virtual machines), and the amount of "every design decision depends upon every other design decision" is huge, and is why system programming is much harder than application programming.

What I would like to see is a system programmer's evaluation of how this stuff works on real system code. Show me how these systems do at writing an efficient language interpreter. Or compiler. Or garbage collector. Or an operating system.
 
Upvote
8 (8 / 0)

MattGertz

Ars Praetorian
476
Subscriptor
Counterpoint:

I work in a tech support role for a NAS company. I can recall at least two recent cases (i.e. within the last month or two) where we wound up telling a user they were on their own and we wouldn't help them because the LLM they relied on to fix their problem had completely fucked over their system instead.

[...]
Truth. Buried in my thesis above is the point that you have to still understand deeply what you're doing (at least right now, and I'd guess for the next few years). In particular, I always scope out the architecture first, and I don't let the LLM tell me what it should be. If I discover that I've missed a feature, I sketch it out first via spec and architecture, and only then paste it into the chat window.

One of my teachers way-back-when described this sort of situation thusly: "For every cat that correctly knocks over the jar to get at the cream, there's another that will get its head stuck in the jar." Ultimately, LLMs are just a tool, and no one's going to get ahead by trying to drive nails in with a hammer's grip.
 
Upvote
6 (6 / 0)

MattGertz

Ars Praetorian
476
Subscriptor
Good to know we'll no longer need to pay for MS Office. That and Windows licenses have been a significant cost for us. But soon we will be able to replace it with an occasional fee paid to Anthropic and get our own bespoke Office. Or just piggyback on someone else who has done that already.
You read me correctly. That is indeed the direction things are heading, IMO. The only rails are that the writer and the reader still need to be able to exchange information, and so standards will need to be in place without drifting. (Which, in turn, makes me wonder if it's the standards that fossilize in the future rather than the software.)
 
Upvote
3 (3 / 0)

MattGertz

Ars Praetorian
476
Subscriptor
No need to tell pattern-matching models about existing solutions. They already pattern-match them easily from your description. It would be interesting to see how much code was lifted or lightly adapted from existing open-source solutions, how good/functional those alleged unit tests actually are, and finally how maintainable the code is...

ETA: Finally, does your under-12-hours figure include the total amount of time from start to finish, including code verification and understanding?
To the first point: I don't know, and I will probably never know (until I'm contacted by lawyers, I suppose). As I said, I was careful to describe the problem bottom-up rather than say "copy X's functionality," and the product I was mimicking was not open source and so, in theory, would not be available to the model. But, for all I know, some equally frustrated person may have written it up in OSS before me. However, there's really only one true way to write/read EXIF data per framework, defined by the APIs, so at least that portion of the code would look the same regardless.

Yeah, I wasn't making the 12 hours up. I probably spent two of those hours writing the spec and architecture (which frankly wasn't too complicated), and probably an hour testing, so let's say nine hours wrestling with the LLM. ("Let's move on to stage 5, and don't forget to...") Three hours a day over the course of four days, and I was done.
 
Upvote
5 (5 / 0)

MattGertz

Ars Praetorian
476
Subscriptor
Good grief. I hope you don’t work for the Office team or the pot/kettle error might result in the formation of a small black hole.
Hah! Touché; I must admit that I chuckled at that one. No, my work these days is all on new stuff around cloud technology (Azure), focusing on performance and reliability.
 
Upvote
5 (5 / 0)
I tried to get some help with InstallShield MSI + PowerShell development with both Gemini and ChatGPT late last year but apparently neither one has enough training data to work reliably.

With InstallShield, they mixed together the UI and features from 2000-2025, giving answers pointing to features that no longer exist in that form.

With PowerShell they hallucinated code that had no chance of working when run as an MSI custom action, or of running properly at the stage of the setup where it was needed.

With MSI they did somewhat better but still made suggestions that could never work.

I haven't tried Opus 4.5, maybe it has better training data but I suspect not since most InstallShield work is done for a company's use and there is no reason to publish and document the code somewhere it can be scraped for training.
One thing I've learned is that you need to explicitly TELL the model what versions of dependencies and libraries you're using. Just yesterday I had Codex fixing a runtime warning in my ESP32 C++ code that was bugging me, and it tried to use what I already knew was a deprecated method, adcAttachPin(), to declare a GPIO pin as analog, when it should have been using pinMode(). I corrected it, and it was a simple fix, but it would have thrown a NEW error for using a deprecated method. I'd run across this before and knew to correct it, but still, even the best models make dumb mistakes when you're not explicit enough about some details. And often they make the same mistake even when you do...

There's no replacement, yet, for a squishy meatbag to verify what the robot is doing.
 
Upvote
5 (5 / 0)

MattGertz

Ars Praetorian
476
Subscriptor
Very interesting.

My question to you would be... You have multiple decades of experience. I am also getting there (where has the time gone?), but as a sysadmin. I am a lousy amateur programmer, doing mostly ESP32 and retro on my old UNIX workstations.

Yes, it helps me in the same way. Even if I go write something for NeXTSTEP 3.3 or so, I can dump in the documents and it can help me make sense of limitations, syntax issues, etc., and I can get stuff done.

Of course, I'd never use any of my code in production, but it's been fun.

Now I, and especially you, have the skillset to architect, guide the LLM, catch any gross screw ups and, since you wrote the specification and you are also reviewing the code, it is probably a bit easier for you to catch something odd.

Now imagine a scenario where all the old bears are retired (or dead), and the person writing the specification is not the one responsible for reviewing and implementing the code, and the younger generations of IT professionals grew up in an environment where they don't really learn anything intimately, but use LLMs as shortcuts.

You got something done with TypeScript, but I don't think you can claim knowing TypeScript. What happens when many professionals don't really know and understand the tools they are using to get the job done?

My girlfriend teaches German and English and she says it is getting increasingly difficult to teach foreign languages because kids are not truly understanding the grammar structures and tooling of their own language, because a lot of assignments are being completed with help from LLMs, meaning that the kids don't understand it.

And I remember having serious difficulties learning Slavic languages because we just don't inflect certain cases such as the vocative and accusative. Funnily enough, I remember struggling to understand how they work in my native language, because we don't inflect such cases, and learning the two Slavic languages I have studied made me better at understanding my native language.

Now what would have been of my learning of languages if I had taken shortcuts during my formative years just for the sake of getting the assignment done?

What will happen to the house of cards of code that powers modern life when its maintainers do not understand the tools used to maintain it?
"Now imagine a scenario where all the old bears are retired (or dead)" -- yes, this is a real issue and it does scare me (unless of course I assume I'll be dead then). That situation (brain-drain, I mean, not my death) is already manifested in a way -- tech companies are laying off so many people that critical brain-drain has already happened. In the past, younger engineers who've seen my comments in older code have reached out to be as much as a decade after I wrote it to get more clarification on it, and I don't see that need diminishing. LLM is actually pretty good at understanding code, even better than writing it, but its context window is small enough that it can't put it in context with the big picture, at least for a few more years.

No, I can't claim to "know" TypeScript, but that isn't really a blocker even without LLMs. As heretical as some might view it, most mainstream coding languages are decently fungible, and skilled programmers can pick them up pretty easily in a short amount of time because the concepts are nearly the same. Engineers do this all the time as the need occurs, or simply if they are curious about a new language and want to deep-dive into it. If you know C# and JS, picking up TypeScript would be pretty trivial, and I could probably be 80% effective with it in a week or two. Again, that's not a boast -- any good, experienced coder could do that. After several decades, I can code at an expert level in a dozen languages and at an intermediate level in several more -- that's just the job. If I had to go in and fix that TypeScript, it would happen, and it would be fine. Of course, each language has its quirks that need to be learned (Rust's memory handling, for example, or how async is managed in C#), but the gist of an algorithm is going to be the same.

As you say, that knowledge will go away as the current generations of coders retire. But I also remember people being worried decades ago because higher-level languages that abstracted away from assembly code meant that no one could actually debug assembly language. Which is true today for most coders (though, again, they could pick it up at need -- it's still a similar code flow), but very few people need to know how to do this now on a daily basis. Presumably, the same concern was voiced when compilers and assemblers took over from machine language. In the emerging world, the "code" (as humans see it) will all likely be prose metadata that describes a problem and architecture in natural language, except for the few who, like the folks who still write compilers and assemblers, need to keep an eye on the lower-level details.
 
Upvote
6 (6 / 0)

motales

Ars Centurion
336
Subscriptor
Can you link to this version of Whisper that runs on your laptop?

I searched and most Whisper appears to be powered by OpenAI so is running in the cloud not your laptop -- such as https://whisperai.com/, https://www.whispertranscribe.com/
MacWhisper.

You have a choice of models. You have to download them. It's easy from the preferences.

I tried Hear and it works well, but I don't know how to make it recognize different speakers. That might be due to the model I set up. Hear is command-line; MacWhisper is not. It can be set up with cloud models, but I prefer to keep my stuff local. You can also use Apple's built-in speech recognition, which is surprisingly good. I'm using WhisperKit Large v3 Turbo.

It's not perfect. Descript (which I think does use the cloud) let me edit text and have the voice change accordingly, which is terrific for class lectures and podcasts.
 
Upvote
2 (2 / 0)

Resistance

Wise, Aged Ars Veteran
417
Most of the loss accounting is due to charging buildouts for future expansions as 'per user' costs. I.e., they would buy hardware for 10x the future users and then cost-allocate it to the current 1x users.

Per-user compute has dropped - the heavy users were early adopters; casuals are where most of the growth is.

Flat-rate cost means flat per-user revenue.

Not really.

This is the first real issue you've identified. No good answer, although it is quite possible the senior developers are going to be eliminated as well.

No idea on this.

The availability of raw compute is a huge moat currently. There are a few companies with absurd amounts of compute, and the compute for the next few years is already bought out.

Debt is absurdly cheap, and every AI company is massively oversubscribed (i.e., there is more cash available than they are willing to sell shares for in each of their rounds).

There isn't 'loads of profit' for the same reason Amazon didn't show profit for many years - massive spending on growth.

Every company with GPUs rapidly depreciates them, then revises their lifetime. Their actual lifetime is probably 10-15 years (P100s from 2016 are being sold off, but that is because there aren't enough data centers, so it is better to sell them and install more profitable chips you can charge more for, not that they wouldn't continue being profitable).

Competition drives up costs, so they'd really like to discourage competition.

Here is what it costs to serve 1 million tokens from a tier-1 model - about $0.25. A heavy user is 30 million tokens a month. That comes to $7.50 in costs per heavy user - and they will usually be paying a minimum of $20. Light users are $0.02 per month. Medium users are about $0.25 per month. Note, though, that many pro users will hit the query limit for the $20 pro account long before they use that many tokens, so many heavy users will have the $200 pro accounts.
  • Very few numbers are being released, and those that are (1) don't look good and (2) aren't clear enough for you to claim what you're claiming.
  • Everything I've seen shows an increase of per user compute for premium subscribers.
  • There is no flat rate cost for providing the services, use varies with time and user.
  • Yes there is debate about it, look in this thread for some handy examples. I will concede that the usefulness when cost is considered has likely improved.
  • A year or two ago I would have laughed long and hard (metaphorically) at the idea that senior developers would eventually be fully replaced. Now I will laugh medium and medium. I see no point in debating this, but I do believe that both eventualities need to be prepared for, and that there is a risk that people will be replaced and their replacements will turn out to be insufficient to do the job long term.
  • It only takes a minimum of 2-3 companies with no moat (besides capacity) between them to bankrupt all of them by competing on price.
  • How much is the interest on the money required to build a gigawatt data center?
  • Okay, instead of loads of profit, how about "quality financial information that shows that providing LLM services isn't an incredibly risky financial decision".
  • I've read people saying that H100s are non-functional after ~2-5 years of continuous use.
  • Can you please provide citations for your cost of 1 million tokens and 30 million token use? Most of the numbers I've seen are external estimates, the rest are defined unclearly, none of the publicly traded providers are telling their shareholders this information.
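For concreteness, the per-user cost arithmetic in the quoted post can be sketched as below. The $0.25-per-million-token serving cost and the 30-million-token "heavy user" are the quoted post's claims, not verified figures, and the light/medium token volumes are back-solved from its dollar figures:

```python
# Worked version of the quoted post's per-user serving-cost arithmetic.
# All input figures are the quoted post's claims, not independently sourced.
COST_PER_MILLION_TOKENS = 0.25  # claimed tier-1 model serving cost, USD

def monthly_cost(tokens_per_month: int) -> float:
    """Serving cost in USD for a user consuming this many tokens per month."""
    return tokens_per_month / 1_000_000 * COST_PER_MILLION_TOKENS

heavy = monthly_cost(30_000_000)  # claimed heavy user -> $7.50
medium = monthly_cost(1_000_000)  # ~ the quoted $0.25/month
light = monthly_cost(80_000)      # ~ the quoted $0.02/month

# Against a $20/month subscription, the claimed heavy user still leaves margin:
print(f"heavy ${heavy:.2f}, medium ${medium:.2f}, light ${light:.2f}")
```

The dispute in the reply above is precisely about the inputs to this calculation, not the arithmetic itself.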
 
Upvote
5 (5 / 0)

jdale

Ars Legatus Legionis
18,261
Subscriptor
Good to know we'll no longer need to pay for MS Office. That and Windows licenses have been a significant cost for us. But soon we will be able to replace it with an occasional fee paid to Anthropic and get our own bespoke Office. Or just piggyback on someone else who has done that already.
I don't see this happening. The spec for an application like Word or Excel is very large before you even start the session. I think you could easily create apps that handle a subset of tasks (e.g., editing rich text is trivial), but how's your workflow going to go if you need to create a new app every time you hit the limits of the old one?

Plus, every time you create a new app, while no doubt you will end up fixing some bugs of the previous version, since you're starting from scratch you'll just create a new set of unknown bugs. Office is certainly far from bug-free but I know how to work with it. The AI isn't going to do comprehensive QA.

As for "piggy back on someone else who has done that already", you could do that already. Why haven't you? Probably because the UX is different from what people know and you can't trust the compatibility when handling the same file types. Neither of those problems are going to be solved by creating a constellation of unsupported bespoke apps.
 
Upvote
3 (3 / 0)

ChrisSD

Ars Tribunus Angusticlavius
6,168
You read me correctly. That is indeed the direction things are heading, IMO. The only rails are that the writer and the reader still need to be able to exchange information, and so standards will need to be in place without drifting. (Which, in turn, makes me wonder if it's the standards that fossilize in the future rather than the software.)
Indeed, Microsoft is dead as a company. Their cloud service is the only thing left for them (beyond being an AI reseller) but without the Microsoft products that depend on it (and the "Microsoft shops" that are all-in on Microsoft) they'll have significantly reduced market share.

Actually their cloud services will suffer even more greatly if anyone can use AI to become a cloud service provider (or even skip the middle man entirely). The increased competition alone will drive down prices.
 
Last edited:
Upvote
0 (0 / 0)

gkorper

Wise, Aged Ars Veteran
190
Subscriptor++
And you checked that line by line to make sure it was correct as would be necessary for LLM output?

I don't see how you can ensure it is accurate without either doing the conversion of the numbers another way yourself to cross-check, or verifying each entry line by line, either of which would take far longer than just converting it yourself.

For a really low-tech example, I'm reasonably confident that I could copy and paste 72 pages into a tidy table in an hour or two, with no need to fully cross-check afterward, so it would probably be faster. And there are definitely ways to automate parts of that, even on a document-by-document basis.
You have clearly not tried to copy and paste pages of table data from a PDF. You will lose all the field boundaries and recreating those could take days and you will probably make mistakes. And god forbid you have any multi-line cells to recombine. I am not the original poster but I have had to solve this problem for dozens of 100+ page PDFs and, with the proper prompts*, LLMs are incredibly accurate (<1 error per 10 pages on text and 0 for numbers). Not accurate enough as a primary source for my needs but good enough to be used to diff against Python extracted data to find variances. Then on top of that each field is validated using pattern matching (years are 4 digits, names don't start with numbers, etc).

*Example prompt snippet:
Markdown (GitHub flavored):
### Transcription Fidelity (CRITICAL)
- **DO NOT fix or correct perceived transcription errors** in the source PDF
- Your job is faithful transcription, NOT interpretation or correction
- If a name looks like a typo (e.g., "Smtih" instead of "Smith"), transcribe the typo as written
- The source PDF may have legitimate historical spelling variants (Patsey/Patsy, Elisabeth/Elizabeth)
- If a name is spelled "Patsey" in the PDF, transcribe it as "Patsey" (not "Patsy")
- Trust what you see in THIS row, even if a similar name is spelled differently elsewhere
- Each row is independent - do NOT cross-reference with other rows to "fix" inconsistencies
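The field-level validation described above (pattern checks applied after diffing the LLM transcription against the Python-extracted data) can be sketched roughly like this; the field names and the two rules are illustrative stand-ins, not the poster's actual schema or code:

```python
import re

# Illustrative post-extraction validators in the spirit described above:
# simple per-field pattern checks applied to every transcribed row.
VALIDATORS = {
    "year": lambda v: bool(re.fullmatch(r"\d{4}", v)),  # years are 4 digits
    "name": lambda v: bool(v) and not v[0].isdigit(),   # names don't start with numbers
}

def validate_row(row: dict) -> list:
    """Return the names of fields in this row that fail their pattern check."""
    return [field for field, ok in VALIDATORS.items()
            if field in row and not ok(row[field])]

print(validate_row({"year": "1842", "name": "Patsey"}))    # -> []
print(validate_row({"year": "184", "name": "3rd Smith"}))  # -> ['year', 'name']
```

Checks like these catch a different error class than diffing two extractions: a value both methods got wrong in the same way can still fail a structural rule.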
 
Upvote
4 (4 / 0)

MattGertz

Ars Praetorian
476
Subscriptor
Indeed, Microsoft is dead as a company. Their cloud service is the only thing left for them (beyond being an AI reseller) but without the Microsoft products that depend on it (and the "Microsoft shops" that are all-in on Microsoft) they'll have significantly reduced market share.

Actually their cloud services will suffer even more greatly if anyone can use AI to become a cloud service provider (or even skip the middle man entirely). The increased competition alone will drive down prices.
That might be taking the point too far. It's definitely always been true for software/hardware companies that the rule is "change or die," given that the field itself has no logical stopping point (unlike, say, paper towels or WD-40 or whatever). The ground is littered with the corpses of companies that didn't understand that principle (Wang, DEC, etc.), and many others were changed beyond all recognition by not taking it seriously enough (IBM).

Not too long ago, Windows was the top dog at MSFT, and all of us here circled it like planets in orbit -- but that's not true now by any means (and I have some funny stories about that transformation that I'll share someday). Office has been ubiquitous and important, but has only tangential impact on, say, Visual Studio or Bing (or Xbox, though admittedly that's an odd one to bring up these days). Currently Azure is what Wall Street watches -- but to be clear, Azure's tie-in to other Microsoft products isn't locked in by any means, and its features are first and foremost focused on availability, performance, and stability. AI will definitely disrupt it somehow, but Azure's fate is in general not tied to other 1P products. And as fast as AI is moving, it will still take years to fully move the tech sphere. AI costs a lot, and folks will be relying on existing tech for some time.

Of the big five, Microsoft, Meta, Google, Apple, and Amazon are all trying to surf the current AI wave. Right now, I'd give pretty good marks to Google (who was able to leverage a lock on information retrieval, regardless of how it's been presented in recent years) and Microsoft (who got out in front first and has outstanding in-roads globally to leverage for it), and decent credit to Apple for knowing when to dial back on their own efforts and align with others while still producing high-quality hardware. Amazon has the leading Cloud service and ultimate extension into the market, so they'll be safe for many years until they can rationalize their current "list everything vaguely connected with the request" model with the "just give me an answer" flow that LLMs drive. Meta is in a very awkward place, though, with Facebook users aging out, and I suspect they'd need a massive transformation to keep up.

Which is a lengthy way to say that no one's dead yet, but they will be if they don't adapt.
 
Upvote
4 (4 / 0)

Shiunbird

Ars Scholae Palatinae
728
As you say, that knowledge will go away as the current generations of coders retire. But I also remember people being worried decades ago because higher-level languages that abstracted away from assembly code meant that no one could actually debug assembly language. Which is true today for most coders (though, again, they could pick it up at need -- it's still a similar code flow), but very few people need to know how to do this now on a daily basis. Presumably, the same concern was voiced when compilers and assemblers took over from machine language. In the emerging world, the "code" (as humans see it) will all likely be prose metadata that describes a problem and architecture in natural language, except for the few who, like the folks who still write compilers and assemblers, need to keep an eye on the lower-level details.
I think the concern is slightly different. When we moved from machine language to compilers and assemblers, you still had to learn how to code and, to some extent, understand how the underlying structures work, so you can adapt and adopt existing knowledge, notions and concepts to different problems, coding in a different language or even across domains.

Example: my boss is a brilliant programmer but doesn't know all that much about infrastructure. We've had tons of discussions about why "storage performance can't be the problem because we are in the cloud" or "our workloads in Kubernetes should use exactly the same amount of memory as they do when running as an App Service in Azure" - even in the face of evidence to the contrary he couldn't accept my points. Now here is a person who at least knows well what he is doing, and we eventually dropped the Kubernetes idea as it was not cost-effective.

Now imagine a critical mass of future IT professionals that LLM'd their way through their studies who will not have the synapses in their brains resonating when the answer is not directly obvious and relatable to what they can directly see when faced with a problem.

Heck - even when I play with my legacy C and my old workstations, more often than not, LLMs don't help at all. They loop continuously, offering solutions that are invalid in the flavour of Obj-C that runs on my NeXTSTEP 3.3 workstation. They also fail when I try to use them to write MAX-optimized code for PA-RISC or to use HP's OpenGL optimizations on HP-UX. There's no way out for me other than reading the manuals.

Now imagine 30, 40, 50 years down the line. COBOL code still running? And what about the folks who, as you mentioned, should be the ones writing compilers and assemblers but also LLM'd their way through graduation?

I don't know... I am probably not as optimistic as you sound (maybe I am getting your tone wrong)
 
Upvote
3 (3 / 0)

ChrisSD

Ars Tribunus Angusticlavius
6,168
That might be taking the point too far. It's definitely always been true for software/hardware companies that the rule is "change or die," given that the field itself has no logical stopping point (unlike, say, paper towels or WD-40 or whatever). The ground is littered with the corpses of companies that didn't understand that principle (Wang, DEC, etc.), and many others were changed beyond all recognition by not taking it seriously enough (IBM).

Not too long ago, Windows was the top dog at MSFT, and all of us here circled it like planets in orbit -- but that's not true now by any means (and I have some funny stories about that transformation that I'll share someday). Office has been ubiquitous and important, but has only tangential impact on, say, Visual Studio or Bing (or Xbox, though admittedly that's an odd one to bring up these days). Currently Azure is what Wall Street watches -- but to be clear, Azure's tie-in to other Microsoft products isn't locked in by any means, and its features are first and foremost focused on availability, performance, and stability. AI will definitely disrupt it somehow, but Azure's fate is in general not tied to other 1P products. And as fast as AI is moving, it will still take years to fully move the tech sphere. AI costs a lot, and folks will be relying on existing tech for some time.

Of the big five, Microsoft, Meta, Google, Apple, and Amazon are all trying to surf the current AI wave. Right now, I'd give pretty good marks to Google (who was able to leverage a lock on information retrieval, regardless of how it's been presented in recent years) and Microsoft (who got out in front first and has outstanding in-roads globally to leverage for it), and decent credit to Apple for knowing when to dial back on their own efforts and align with others while still producing high-quality hardware. Amazon has the leading Cloud service and ultimate extension into the market, so they'll be safe for many years until they can rationalize their current "list everything vaguely connected with the request" model with the "just give me an answer" flow that LLMs drive. Meta is in a very awkward place, though, with Facebook users aging out, and I suspect they'd need a massive transformation to keep up.

Which is a lengthy way to say that no one's dead yet, but they will be if they don't adapt.
The name "Microsoft" might continue but as they do not have an AI of their own there's no way they'll remain "big tech". Being a front-end for other people's AI is not nothing but it is a very different role in the tech landscape than it's used to (and makes them entirely dependent on actual AI companies). And it will likely be one among very many.
 
Upvote
-1 (0 / -1)
Changing the available tools, switching models, or modifying the sandbox configuration mid-conversation can all invalidate the cache and hurt performance.
Really useful tip, ty! It is also interesting that the poor context use is partially a choice of convenience xD

I will say, I don't like the auto /compact. If I have let my context get to 30%, it is because we are really in the zone and it has touched or written a lot of the code I want to change. I know I am burning my credits, but I swear, if you /compact on me in the middle of this, Codex will never solve this bug.
 
Upvote
0 (0 / 0)

clb2c4e

Wise, Aged Ars Veteran
145
You have clearly not tried to copy and paste pages of table data from a PDF. You will lose all the field boundaries and recreating those could take days and you will probably make mistakes. And god forbid you have any multi-line cells to recombine. I am not the original poster but I have had to solve this problem for dozens of 100+ page PDFs and, with the proper prompts*, LLMs are incredibly accurate (<1 error per 10 pages on text and 0 for numbers). Not accurate enough as a primary source for my needs but good enough to be used to diff against Python extracted data to find variances. Then on top of that each field is validated using pattern matching (years are 4 digits, names don't start with numbers, etc).

*Example prompt snippet:
Markdown (GitHub flavored):
### Transcription Fidelity (CRITICAL)
- **DO NOT fix or correct perceived transcription errors** in the source PDF
- Your job is faithful transcription, NOT interpretation or correction
- If a name looks like a typo (e.g., "Smtih" instead of "Smith"), transcribe the typo as written
- The source PDF may have legitimate historical spelling variants (Patsey/Patsy, Elisabeth/Elizabeth)
- If a name is spelled "Patsey" in the PDF, transcribe it as "Patsey" (not "Patsy")
- Trust what you see in THIS row, even if a similar name is spelled differently elsewhere
- Each row is independent - do NOT cross-reference with other rows to "fix" inconsistencies
Sure I have, and I cannot code, but transcribing did not take me days, though yes, it's a pain. Like you said, though, one error per 10 pages is too much.

But cool prompts. Once there is a secondary check method, this is a no-brainer.
 
Upvote
0 (0 / 0)

Hydrargyrum

Ars Praefectus
4,040
Subscriptor
Very interesting.

My question to you would be... You have multiple decades of experience. I am also getting there (where has the time gone?), but as a sysadmin. I am a lousy amateur programmer, doing mostly ESP32 and retro on my old UNIX workstations.

Yes, it helps me in the same way. Even if I go write something for NeXTSTEP 3.3 or so, I can dump in the documents and it can help me make sense of limitations, syntax issues, etc., and I can get stuff done.

Of course, I'd never use any of my code in production, but it's been fun.

Now I, and especially you, have the skillset to architect, guide the LLM, catch any gross screw ups and, since you wrote the specification and you are also reviewing the code, it is probably a bit easier for you to catch something odd.

Now imagine a scenario where all the old bears are retired (or dead), and the person writing the specification is not the one responsible for reviewing and implementing the code, and the younger generations of IT professionals grew up in an environment where they don't really learn anything intimately, but use LLMs as shortcuts.

You got something done with TypeScript, but I don't think you can claim knowing TypeScript. What happens when many professionals don't really know and understand the tools they are using to get the job done?

My girlfriend teaches German and English and she says it is getting increasingly difficult to teach foreign languages because kids are not truly understanding the grammar structures and tooling of their own language, because a lot of assignments are being completed with help from LLMs, meaning that the kids don't understand it.

And I remember having serious difficulties learning Slavic languages because we just don't inflect certain cases such as the vocative and accusative. Funnily enough, I remember struggling to understand how they work in my native language, because we don't inflect such cases, and learning the two Slavic languages I have studied made me better at understanding my native language.

Now what would have been of my learning of languages if I had taken shortcuts during my formative years just for the sake of getting the assignment done?

What will happen to the house of cards of code that powers modern life when its maintainers do not understand the tools used to maintain it?
This is a reasonable point, but also kind of overlooks that the entire history of computing and software tooling is basically iterations of this.

What percentage of .NET application developers these days have any real understanding of IL bytecode and how, exactly, the garbage collector actually works? Let alone a good understanding of x64 and ARM instruction sets and memory protection models? These things become the concern of compiler developers, while the application engineers work to the abstraction they actually see. It is already the case that no one individual understands the entire stack of computing technology. This is just piling another layer on the top. People who have the skills to understand the code that the LLMs generate are going to become analogous to the specialists who can debug problems in the compilers and other low-level tools.

EDIT: Urgh, sorry for repeating similar points to other people who responded to you. I really must form the habit of reading the whole conversation before writing my own reply.
 
Upvote
-1 (0 / -1)

Shiunbird

Ars Scholae Palatinae
728
EDIT: Urgh, sorry for repeating similar points to other people who responded to you. I really must form the habit of reading the whole conversation before writing my own reply.
Don't worry. =)

I am not sure I am making my point clear. My main worry is that we reach a critical mass of people who LLM'd through their education rather than actually learning, so the whole scaffolding collapses. I have no problems with LLMs and other technology giving people more power over their computing experience and allowing them to do more complex tasks without seeking a professional. My concern is with the quality of the professionals of the future.
 
Upvote
2 (2 / 0)