> A lot of people on this thread think it follows that if an LLM can't write perfect production code 100% of the time, that means they're completely useless. But that's a glaring logical fallacy.

Not sure "people in this thread" are entertaining that belief. It's too black-and-white.
> Not sure "people in this thread" are entertaining that belief. It's too black-and-white.

I don't know what can be done about this. Many people overestimate the abilities of AIs. The abilities are also oversold to us by the companies developing said AIs, and the companies investing in said AIs.
I hope we can agree on this, though: LLMs are definitely very easy to abuse because they give the illusion of "someone who knows their shit". And we, humans, are too easy to trust the machine that wrote the really good-seeming text on the screen.
Not really. You ask an LLM to search for something and it can instantly run many searches on many combinations of phrases that you may not have thought of, collate all the results and present them to you in a nice way. This is pretty great.
- You're using it instead of a search engine: not a good use at all, because you're literally expecting to see what other people wrote on the Web (you're searching for info), but instead you're given an out-of-date probabilistic regurgitation of the Web. Searching with LLMs is just weird...
> Actually I was just using an LLM to search for something yesterday and saw that it was doing web searches on terms in foreign languages, which is something I definitely wouldn't have done (or thought to do) myself. Pretty great.

Huh... That's not a bad idea at all. Regular search engines should provide this feature; it's very basic and doesn't need the LLM overkill.
> Not really. You ask an LLM to search for something and it can instantly run many searches on many combinations of phrases that you may not have thought of, collate all the results and present them to you in a nice way. This is pretty great.

Aren't regular search engines already doing this? They always transform the query you enter and, at the very least, they search for synonyms and usual related phrases as well.
> Aren't regular search engines already doing this? They always transform the query you enter and, at the very least, they search for synonyms and usual related phrases as well.

Of course search engines do a lot of processing on whatever you've entered in order to give you good search results.
Of course, they can't match the rephrasing functionality of an LLM, but in this case, even basic rephrasing is an improvement, and should already be available. I'd be surprised if it doesn't happen in Google or DDG.
> Huh... That's not a bad idea at all. Regular search engines should provide this feature, it's very basic and doesn't need the LLM overkill.

I'm not sure we're talking about the same thing here.
Just checked in DuckDuckGo settings and there's nothing about it. Yes, you can set one specific region for results, but not even a specific language.
I remember seeing multilingual results popping up for searches in Google ~20 years (!!) ago, but nothing afterwards (??).
WAIT, just checked Google Search Settings. You can search in multiple languages at once, but you have to manually choose the languages first. Only after setting your preferred languages in the Search Settings will you see multilingual results. This is very interesting.
I am a doctor with lots of hobbyist enthusiasm. My programming was typically done in Stata for data analysis. Additionally, I used to study code written by others to understand how it was working for our research studies. However, I understood the basics of web development, concepts of databases, ORMs, routing, proxying, setting up Wordpress blogs, etc. Lots of good ideas, no support to implement them.
Now, with the help of LLMs and coding agents, I feel so empowered. I have developed a fairly complex system for managing eye/fundus images, done a custom fork of OpenDataKit Central and the Collect apps that implements short-term logins, guided development of a medical research publication repository and a school eye survey platform, set up a mail server, and so on. I personally subscribe to three frontier models and have a GLM plan as well. I have used Codex, Qwen, Gemini, KiloCode, Opencode and recently AntiGravity. The last 6 months have seen all of this action.
So programmers may be taking more time, but semi-programmers like me are able to massively enhance productivity. Frankly, a bad programmer who would have written really bad code earlier will now actually be writing better code.
I continue to be amazed by the strides being taken in this space every week. FYI, my personal blog is at epidemiology dot tech and github handle is drguptavivek. Check fundus_image_xtract repo and Collect and Central repos in GitHub for some of the work done using the LLMs by a medical faculty.
Cheers
Vivek
Now, the agents are leaps and bounds better and I can let them make changes, write tests, build and run the tests, and self-correct without intervention. I just come in afterwards and approve or reject or suggest improvements to the changes. It's easily an order of magnitude improvement in the AI's velocity. You need a new study.
> I am retired (5 years) and despite programming everyday in my retirement, I won't be using AI for this... no interest as I am an old skool grey beard.

LOL, why in the world would anybody hire you if you're giving them that response?
LLMs are invaluable, e.g.:
- As a replacement for documentation of programming languages, APIs
- For anything you used to look up on Stack Overflow
- To give you ideas for what might be causing (and how to fix) any particular bug or compiler/linker error
- To help explain or give a quick overview of some unfamiliar code
- To translate some code between programming languages (e.g. if you find a function for something you want to accomplish but it's written in a language that you're not very familiar with)
Frankly, if you're a developer who's not using LLMs for anything these days, I question whether you can be considered a professional developer. Or even a good one.
If I interviewed a developer who said that he refused to read any documentation because he already knew what he's doing, I wouldn't hire that person either.
> I am retired (5 years) and despite programming everyday in my retirement, I won't be using AI for this... no interest as I am an old skool grey beard.

There are a few ways to tackle this. Some agent-focused IDEs will have them essentially 'train' on your codebase to extract styles and patterns that they continue to emulate when generating new code.
Maybe my situation is different but I spent the last 25 years of my career working on my own commercial software (used in mission critical deployments in large govt/corp orgs). While the application had a web based front end (in plain old HTML 4), the critical code was in 'C' for Unix and Linux platforms.
I would have program modules with 10,000+ lines of code.
Whenever I read of devs talking about using AI, it tends to reference creating a small function or algorithm (i.e. short snippets of code).
So my question is: how would you prompt/use AI to generate 10,000 lines of mission critical code (the total code base might be > 2 million lines of code)? How do you get it to understand what code needs to be written when the product's use case is unique or not well understood (i.e. example code from its training doesn't exist)?
If you generate 1000 lines of new code in a new module, do you need to retrain it on these lines so that it understands how to write the next 1000s in the same module?
Can you get it to test iteratively as you layer new logic into the currently incomplete module (this was standard practice for me... test each new part as I went, which might include creating tables and data as part of the sub-code test)?
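For what it's worth, the layer-then-test loop you describe is exactly the shape agents can run unattended: keep a few asserts next to each new piece of logic and compile-and-run after every increment. A minimal sketch in C, with entirely made-up names (`clamp` just stands in for whatever logic was layered in last):

```c
#include <assert.h>

/* Hypothetical example: clamp() stands in for "the logic just layered in".
   The point is the habit, not the function itself. */
static int clamp(int v, int lo, int hi)
{
    if (v < lo) return lo;
    if (v > hi) return hi;
    return v;
}

/* The per-layer check: run after each increment so a bad edit fails
   immediately, before the next layer is written on top of it. */
static void test_clamp(void)
{
    assert(clamp(5, 0, 10) == 5);   /* in range: unchanged */
    assert(clamp(-3, 0, 10) == 0);  /* below range: floor  */
    assert(clamp(42, 0, 10) == 10); /* above range: ceiling */
}
```

An agent harness can run exactly this kind of check itself: generate, compile, run the asserts, and retry on failure.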
How do I ensure code consistency (say, in the structure of variable names, favoured 'C' constructs, etc.)?
Can you use it to create the test plans, create test databases and data inside those databases to test the product end to end such that all edge cases are covered, and to design and test the code for robustness, scalability and recoverability?
In the case of robustness, how do I prompt it to write code that validates the results of every action performed (especially database result sets)? Being mission critical means a higher standard than, say, a shopping cart or a social media message.
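One way to make "validate every action" concrete in a prompt is to show the model the shape you expect. A minimal sketch in C, using file I/O as a stand-in for database calls (the function and names are illustrative, not from any real codebase):

```c
#include <stdio.h>

/* Illustrative only: load_record() treats every call's result as something
   that can fail -- the discipline asked about above. File I/O stands in
   for database operations here. */
static int load_record(const char *path, char *buf, size_t buflen)
{
    if (path == NULL || buf == NULL || buflen < 2)
        return -1;                    /* validate inputs before acting */

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;                    /* open failed: stop, don't plough on */

    size_t n = fread(buf, 1, buflen - 1, fp);
    if (ferror(fp)) {                 /* check the read itself, not just n */
        fclose(fp);
        return -1;
    }
    buf[n] = '\0';

    if (fclose(fp) != 0)              /* even close can fail (e.g. on NFS) */
        return -1;

    return (int)n;                    /* success: bytes actually read */
}
```

In practice you would state the rule explicitly in the prompt ("every return value must be checked and handled") and reject diffs that don't follow it.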
Again, I am old skool, so maybe in today's world programs are decomposed into 1000s of small snippets that are glued together in sequence, and maybe AI works for this, but I grew up in a "top down" world (though obviously my code was modular... it wasn't a single big arsed slab of code).
Thanks,
Bluck
> By default, it won't necessarily focus on 'secure' code, but if you ask it to, it can do it.

You can ask it to do lots of things. That doesn't mean it can do them. Report after report has shown LLMs can't generate truly secure code, from GitHub's Octoverse report showing a 173% increase in access control issues, to CodeRabbit's own whitepaper showing that LLMs make 1.7x more mistakes across all aspects of the software engineering process, including security. This makes sense overall when you consider that LLMs don't actually understand anything about what they're writing; they're just making a guess based on weighting.
> Report after report has shown LLMs can't generate truly secure code.

What's the most recent report you're referring to?
That doesn't make LLMs bad tools. It just makes them tools. Understanding how those tools work is part of using them correctly.
> Okay, I'm willing to learn. In what way have I misread your post?

Having read your posts and witnessed your ridiculous behavior in this and similar threads: no, you are not willing to learn, and I have so little respect for you that I am not going to waste any time explaining. Read it again real slow, I guess.
> Having read your posts and witnessed your ridiculous behavior in this and similar threads: no, you are not willing to learn, and I have so little respect for you that I am not going to waste any time explaining. Read it again real slow, I guess.

I dunno, man. I've read the post like 5 times.
Same thing with code. You just have to realize you're not getting code, you're getting something code-shaped.
> As for me, I probably get the most use out of LLMs by dropping Excel formulas into CoPilot.

I'm not sure anyone should be talking authoritatively about either code or LLMs when their tools of choice are Excel and Copilot...
> I have been doing a bunch of sandbox experimenting with Claude Code and it's incredibly capable. It's also very likely to produce badly flawed and bugged results as the prompts or codebase get larger and/or less organized. I think the concept of this being powerful has been proven, but implementation is kind of a can of worms still.

I ran across the below-linked article recently, which sounds like a very solid approach to maximizing productivity from Claude Code.
1. Even if those LLMs can make "working" code, it's not really "working" code, because LLMs still don't understand access control and security concerns. GitHub recently highlighted this in their State of the Octoverse report, where access control issues went up 173%.
> You can ask it to do lots of things. That doesn't mean it can do them. Report after report has shown LLMs can't generate truly secure code.

I mean, you can't just say "LLMs don't understand access control and security concerns" and "LLMs can't generate truly secure code" and have what you're saying be axiomatically correct, without first narrowing your definitions. (Let's ignore the semantic minefield of using the word "understand" in the context of anything an LLM does, because LLMs don't and can't "understand" anything.)
> I'm currently leaning towards telling everyone I know not to use LLMs for anything you're not personally an expert in, because otherwise how do you judge the validity of the generated output?

That's a logical extreme. But I'd argue that it's needlessly extreme when applied universally.
> Of course you're getting actual code. It may not always be bug-free if you're trying to one-shot something complex. But, if you're refactoring or adding to a mature project, and if you provide clear instructions in your prompt (just as you would to a human programmer), a decent model will almost always produce excellent code.

Hey... I don't always do rocket science, but when I do, I do it in Excel. Or, sometimes, Matlab.
> I'm not sure anyone should be talking authoritatively about either code or LLMs, when their tools of choice are Excel and Copilot ...
We use Claude with API access.
Of course, it required A LOT of training on how to extract value from the tool.
> No, they're not, and it's just as easy to simply go look at the docs for the language. That's actually better since you don't know when that data was last scraped, and you might learn something new if the documentation has been updated.

If you can look at your situation objectively, the response you're quoting from seems like a good list of answers to draw from when interviewing.
I know how to code so I look up stuff on Stack Overflow maybe 1-2 times a day and those lookups take 3-5 min. There is no time saving by me asking the AI.
I know how to code so I know how to fix things and I know how to deal with errors.
Again I know how to code so I can simply read it and be done.
I know how to code and it all works basically the same. I also started in C/Java and since everything is based on C/C++ I've never had an issue porting things. The only language that reads different is assembly, otherwise it's all for/if/else, variables, and maybe classes if the language is OO. As I've been saying for many, many years for the most part it all fucks the same.
I never said that I don't read documentation. I have said that I do not want to become a copy-paste bot that will eventually be replaced. If you are letting the machine do your work and you're simply babysitting, you are on the replacement chain; it's just a matter of time.
> So I guess the repo is https://github.com/drguptavivek/fundus_img_xtract ?

Thanks for taking time out and visiting the
My immediate thought on that is "Good lord, I hope you ran it through the correct review processes". It appears to process and store patient information; I don't know where you are, but pretty much anywhere in the world, there's a ton of regulations about that, and I'm not at all sure you can rely on LLM-generated code meeting them. It also boasts of "a sophisticated hybrid access control model combining both Role-Based and Attribute-Based Access Control", which is the kind of thing you really need a professional to review for security.
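For readers unfamiliar with the jargon: a hybrid RBAC/ABAC model just means a role gate combined with an attribute comparison. A toy sketch in C of what such a check looks like in miniature; the roles, fields, and function here are invented for illustration and are not taken from the repo:

```c
#include <stdbool.h>
#include <string.h>

/* Invented for illustration; NOT the repo's actual model. */
typedef struct {
    const char *role;     /* RBAC input: who the user is          */
    const char *site_id;  /* ABAC input: an attribute of the user */
} User;

typedef struct {
    const char *site_id;  /* an attribute of the resource */
} ImageRecord;

static bool can_view(const User *u, const ImageRecord *img)
{
    /* Role-based gate: only admins and clinicians may view at all. */
    if (strcmp(u->role, "admin") != 0 && strcmp(u->role, "clinician") != 0)
        return false;

    /* Attribute-based gate: admins see everything; clinicians only
       records belonging to their own site. */
    if (strcmp(u->role, "admin") == 0)
        return true;
    return strcmp(u->site_id, img->site_id) == 0;
}
```

Code of exactly this shape is where an unreviewed LLM mistake, a flipped comparison or a missing gate, turns into a reportable breach, which is why professional review matters here.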
> Maybe by LLMs not being able to make "working" code, you meant that nothing an LLM puts out should be treated as production-ready (or even runnable at all) without first putting human eyes on it. Or that vibe-coding an entire application all at once without a proper spec and without taking stock of the security side of things is insane. If either of those is in fact the case, we're totally in agreement!

That is the overall point I'm making. I use LLMs myself when I code, but not without oversight. I think they have a particular value when it comes to large documentation sets/codebases and sifting through them. I just have other reasons (mostly environmental/power grid/privacy related) for not necessarily wanting them to exist/keep expanding.
I think 100% that the GitHub report only truly shows how blindly people trust the output. I also think that it's a reminder that the default output of an LLM is not necessarily secure or feature complete.

Edited to add: I want to clarify that I'm not at all doubting the GitHub data, and I absolutely believe their State of the Octoverse report. But at the risk of sounding like I'm making a Steve Jobs-esque "You're holding it wrong!" accusation, one must not blame the bow for the missed shot. One must blame the archer. (Especially if the archer is standing next to the bow, trying to get it to shoot itself with only vague shouted instructions.)
> I might also recommend that your team elevates its planning documentation game. This was always good engineering practice, but it's also extremely useful context for LLMs. At the very least, make sure you have detailed descriptions of the expectations in your stories or task breakdowns. And for anything that's going to take a couple of weeks or more of implementation work, a document describing the planned changes. Start solving some of the high-level problems and revealing the unknown unknowns earlier in the process. Like I said, this is valuable - LLM or no - but LLMs are particularly clueless about these sorts of challenges, so writing these things down well really helps them. Context, as described in the article, is a scarce resource for LLMs, and good context is extremely valuable to them. For me, it's often the difference between an agent's output being a useless waste of time and being functionally what I wanted.

So the same issue we've always had with computers: they will do exactly what you tell them to do. But you have to convert your thoughts into their language. In the AI case, it's a higher-level language that looks like natural language; the output is code, not machine code/assembly/intermediate tokens; and there's a randomness element thrown in to make the process more difficult to predict.
> You're using it instead of a search engine: not a good use at all, because you're literally expecting to see what other people wrote on the Web (you're searching for info), but instead you're given an out-of-date probabilistic regurgitation of the Web. Searching with LLMs is just weird...

This is one area I've found AIs to be somewhat helpful. At times. Specifically, the AI summary at the head of DuckDuckGo results when searching for how to use an API I'm not familiar with has saved me time over sifting through the list of search result links, when the API looks similar to other, unrelated APIs on different platforms. The DDG Search assistant seems to keep the context of which platform/library I'm looking for better than the classic search engine does.
> So the same issue we've always had with computers: they will do exactly what you tell them to do. But you have to convert your thoughts into their language.

This is not a problem limited to computers. As I said, it's just following good planning and documentation practices that were already established but, let's be honest, most folks don't really follow.
> This is not a problem limited to computers. As I said, it's just following good planning and documentation practices that were already established but, let's be honest, most folks don't really follow.

A big reason good planning and documentation processes aren't followed is that humans are "good enough" at dealing with sparse information: either filling in something reasonable (experienced people), asking for clarification (junior people), or catching when less reasonable things are filled in (processes that expect humans to create errors). AI code generators sound like they mainly plow ahead producing garbage in the face of insufficient info, which means more precise specifications are needed than for humans, which equates to the old problem of machines doing exactly what they're told. I'm not saying humans wouldn't work better with precise specs, but that gap in the "good enough" aspect is still similar to every other major breakthrough in programming. At least it's sufficiently similar to remark on.
But historically it wasn't at all about talking to a computer. It was about aligning the development team and the stakeholders.
> I am retired (5 years) and despite programming everyday in my retirement, I won't be using AI for this... no interest as I am an old skool grey beard.

I used AI to help prepare refactoring edits in a fairly complicated library that's about 10,000 lines of compile-time C++ template magic to dynamically construct an adapter class based on reflection over the arguments for a supplied C++ function. Which sounds a lot like the sort of complicated module you are describing.
> That is the overall point I'm making. I use LLMs myself when I code, but not without oversight.

This is how everyone gets hacked with some stupid Chinese/Russian virus: people installing some backdoor from some library that doesn't exist, but the AI thinks it does.
> Why on Earth would you write code that doesn't pass your own code review tool? But happens all the time.

I think you're asking two different "whys" here: a technical "why," and then a much more heartfelt "why the fuck would they do this" kind of why.
> I dunno, man. I've read the post like 5 times.

I'm not willing to write off SraCet, so I'll take a shot at this swirling pit of pedantic disagreement.
You wrote "It will be your job, developers." and yet you're claiming that you somehow weren't warning of lost jobs.
Maybe somebody else on the thread can chime in and explain how I'm misunderstanding your post, but I think it's far more likely that everybody else thinks you're confused and wrong too.