Here’s what’s really going on inside an LLM’s neural network

JoHBE

Ars Praefectus
4,136
Subscriptor++
Nothing, your assessment is correct I think. I’m just saying that the (rephrased in your terms) question “why does this input cause the output to decode to a statement that we interpret to be false, when that almost-identical input decodes to a statement that we interpret to be true?” is, as you say, un-disentangleable with our current tools, whereas “why is it the case that there exist inputs that result in outputs that decode to statements we interpret to be false?” is very straightforward to answer. The general case of “why does it happen at all?” is well-characterised; the specific case of “why this time and not last time?” is not.
I think that question is about as interesting and insightful as wondering why a particular coin toss turned out heads or tails. The answer will always be fundamentally about statistics, and not about any exotic mysterious quality.
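To put toy numbers on the coin-toss point (a plain-Python sketch with invented weights, nothing from a real model):

Code:
import random

# The "same" biased coin, tossed in two runs of ten -- identical weights,
# different outcomes. The weights are invented for illustration.
random.seed(1)
print(random.choices(["heads", "tails"], weights=[0.7, 0.3], k=10))
random.seed(2)
print(random.choices(["heads", "tails"], weights=[0.7, 0.3], k=10))
# "Why heads on toss 4 this time and not last time?" has no deeper answer
# than the weights plus that particular draw.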
 
Upvote
-4 (5 / -9)
A few bruised thumbs cast no shadow on the hammer. But two bombs spoiled nuclear weapons for everyone now.

Nuclear weapons have been in continuous strategic use and development since they were invented, right up to the present day, though they haven't been used offensively in the field since the first two. They certainly didn't go away, and I don't see how "convincing people not to use them unless you really, really have to" counts as "spoiling them for everyone..."
 
Upvote
12 (12 / 0)
"Why do they lie generally?" is, as you say, a pretty dull question. "Why did it lie this time?" is poorly-understood at best. I think those two questions are often conflated, which I agree is confusing.

This conflation has led people who know nothing about AI to say that nobody knows how AI works. They then try to explain AI to others with misinformation, believing that their guess can't be worse than anyone else's.
 
Upvote
10 (12 / -2)

Defenestrar

Senator
15,623
Subscriptor++
I'm only a neurobiologist with some Python knowledge, but the following quote from Sci-Fi writer Charles Stross feels plausible enough to me: "What we're getting, instead, is self-optimizing tools that defy human comprehension but are not, in fact, any more like our kind of intelligence than a Boeing 737 is like a seagull."
That's a good point. A 737 will always perform within its design space, but a seagull can pull a Jonathan Livingston and learn to be as fast as thought itself.
 
Upvote
3 (3 / 0)

Pishaw

Ars Scholae Palatinae
1,040
This conflation has led people who know nothing about AI to say that nobody knows how AI works. They then try to explain AI to others with misinformation, believing that their guess can't be worse than anyone else's.
Or, the machine thought its story was as good as or better than the truth. And the truth is fluid, right? Didn't someone say there are 'alternate realities'?

If AI is going to learn to be more 'human', they will learn that from us. And as we are just so fucking shitty, they will learn to be shitty also.

Is there another way to look at this? If so, I don't see it.
 
Upvote
-5 (0 / -5)

nononsense

Ars Tribunus Militum
2,484
Subscriptor++
I think that question is about as interesting and insightful as wondering why a particular coin toss turned out heads or tails. The answer will always be fundamentally about statistics, and not about any exotic mysterious quality.
I would be interested to know what you refer to as a 'mysterious' quality. Magic? For me, everything is fundamentally about statistics. Everything is explainable if we have enough information and enough time to understand. You seem to be implying that human consciousness has a layer of mystery or magic that a machine could never possess.

Humans are just intelligent apes. There is no mysterious quality that makes us what we are. No one knows if machines can duplicate or surpass our intelligence or even what that will look like.

"Oh, you think you're conscious? Lol, that's cute." -the first AGI converstation
 
Upvote
12 (13 / -1)

monogoto

Wise, Aged Ars Veteran
109
I'm just a layman, but it keeps puzzling me why the question "why they often confabulate information" is considered relevant or interesting. It's just a matter of complicated statistics, what else could it be? What am I missing here? The network follows exactly the same process each time - whether the output ends up lining up with something that we can determine to be true (via external means) or whether it happens to end up lining up with something that we can determine to be false or nonsensical. It's not like the latter cases are caused by some bug or malfunction. Because at no point is there any process or capability invoked that goes beyond statistical relationships. There's no "truth" module, no "double-check" phase, no "how important is this" assessment, no way to suspend the statistics and employ some other approach that would be more suitable at some point.
No scientific method, at least in the current batch.
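A minimal sketch of the per-token loop the quoted post is describing (the "model" and its probabilities here are invented stand-ins, not anything real):

Code:
import random

def toy_model(context):
    # Stand-in for a real forward pass: returns made-up next-token
    # probabilities. A real LLM computes these from learned weights.
    return {"Canberra": 0.6, "Sydney": 0.4}

def decode_step(context):
    probs = toy_model(context)
    # One weighted draw over the candidates -- statistics, nothing else.
    token = random.choices(list(probs), weights=list(probs.values()))[0]
    # Note what is NOT here: no truth module, no double-check phase,
    # no importance assessment. The same code runs whether the finished
    # sentence turns out true, false, or nonsense.
    return token

print(decode_step("The capital of Australia is"))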
 
Upvote
0 (0 / 0)

zogus

Ars Tribunus Angusticlavius
7,181
Subscriptor
How difficult could it be to deceive us? Half of us believe a giant lizard lives in Loch Ness and the Earth is flat.
Well, you can't fix stupid. As any properly educated person knows, Earth is round and The Man doesn't want you to know that there is a giant lizard in Loch Ness.
 
Upvote
0 (0 / 0)

SeeUnknown

Ars Praetorian
582
Subscriptor
Most of neurobiology looks like this, too: "Find the relevant location for $Behavior, artificially tune its activity up or down, watch what happens".
I find it strangely amusing that we've arrived at a conceptually similar procedure for artificial neural nets.

See for example this neat study, which pinpointed the neurons that control how pregnant female mice build nests for their pups. After finding the neurons, they made them artificially more excitable (by making them sensitive to light) or less excitable (by knocking in an engineered receptor for a specific chemical), and then saw that nests were more or less elaborate. >5 years of work, building on 120 years of neuroscience, neuroanatomy and behaviour studies.

Graphical abstract: [image attachment]
I think this pairing of biology and compute is going to pull the covers off how our minds actually work.
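The parallel really is direct. Here's a toy sketch of the artificial-side version of that experiment (one invented hidden unit with made-up numbers; not Anthropic's actual method):

Code:
# One hidden "nest-building" unit whose activity we artificially tune up
# or down, then we watch the behaviour change -- the software analogue of
# exciting or silencing a neuron. All numbers are invented.

def forward(stimulus, clamp=None):
    nest_drive = max(0.0, 0.8 * stimulus - 0.2)  # ReLU hidden unit
    if clamp is not None:
        nest_drive = clamp                       # the intervention step
    return 2.0 * nest_drive                      # "nest elaborateness"

print(forward(1.0))             # baseline behaviour
print(forward(1.0, clamp=2.0))  # tuned up: more elaborate nests
print(forward(1.0, clamp=0.0))  # silenced: no nest building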
 
Upvote
4 (4 / 0)

ljw1004

Wise, Aged Ars Veteran
121
I don't have an inner monologue, and I don't think visually. Instead, as far as I can tell from introspection, I think in concepts and the network of relationships between them. (it's hard for me to translate my thoughts into words, and I don't easily come up with pictures).

The concept-map picture from Anthropic looks AMAZINGLY similar to my subjective impression of how my mind works.
 
Upvote
10 (11 / -1)
While we all appreciate the fact that China is most likely well behind on the technology, it was amusing to hear them say they were waiting to release their AI until its responses were "appropriately socialist."

I guess this is one way they would do that.
I'm pretty sure they are trying to eliminate any trace of Winnie the Pooh from generated results and failing. Good luck doing that this way too.
 
Upvote
1 (2 / -1)

nivedita

Ars Tribunus Militum
2,256
Subscriptor
Sorry, I still don't get it.

It simply ALWAYS does its "thing" correctly, and it is OUR brain/intelligence/ability to evaluate the output that introduces concepts like "lie", "truth", "accurate", "utter nonsense", "ALMOST there". Without US as an external interpreter of what comes out of it, it is totally helpless and aimless, and those terms have no meaning.

WHAT exactly is there in the programming/finetuning/tweaking/fundamentals that would be expected to somehow go beyond opaque and un-disentangleably complicated statistical relationships? Each and every output is nothing more than a gamble, hoping that some obscure numbers end up in your favor.
Without an external world to test what you think or say against, there isn’t anything in what you think either. If you look through the previous articles on LLM confabulation, it seems that LLMs do have some notion of how likely whatever they’re saying is to be true. They can be tuned between saying only things that they are confident about and being more lax. One extreme means they don’t say very much beyond repeating well-known facts; you can’t have much of a conversation with such a setting. The other extreme will always make something up, even if it has no data about whatever it is you’re talking about.

Also, “complicated statistics” can be very interesting. The real world is pretty much complicated statistics of how tiny particles interact with each other.
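That tuning knob can be caricatured in a few lines (toy probabilities; real systems use temperature and sampling cut-offs rather than a literal threshold, but the trade-off is the same):

Code:
def answer(probs, confidence_floor):
    # Commit to the best candidate only if it clears the floor.
    best = max(probs, key=probs.get)
    return best if probs[best] >= confidence_floor else "I don't know."

# Invented probabilities for some obscure question.
probs = {"1879": 0.35, "1882": 0.33, "1885": 0.32}

print(answer(probs, 0.9))  # strict end: abstains, can't say much
print(answer(probs, 0.0))  # lax end: always commits, i.e. confabulates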
 
Upvote
8 (8 / 0)

ambivalent

Smack-Fu Master, in training
96
I don't have an inner monologue, and I don't think visually. Instead, as far as I can tell from introspection, I think in concepts and the network of relationships between them. (it's hard for me to translate my thoughts into words, and I don't easily come up with pictures).

The concept-map picture from Anthropic looks AMAZINGLY similar to my subjective impression of how my mind works.
The lack of inner monologue is interesting, although I can't say I think much differently. I'd certainly agree that I've also run into the problem that trying to communicate concepts based on complex interrelationships between large corpuses of experience, each of which also has its own previous judgements of worth, is... not easy. However, I do find it easy to visualise certain things and have excellent spatial reasoning/awareness. Could you try an experiment? Consider responding to this post; don't just type it out, just consider it. How are you going to respond? Now, are you in fact speaking out your response in your mind? If so, would you consider that a mental "draft" that you wouldn't normally perform? If not, do you have any idea what your response will be before you start typing it?
 
Upvote
5 (5 / 0)

rr6013

Ars Scholae Palatinae
678
This is what I always think after I read a comment from a computer person along the lines of "this is nothing like a human brain, this is just an extremely complicated network of connections reacting to input by searching for patterns in that network." Before I decide whether or not LLMs can ever become human-like, I'll need to hear the opinion of someone who's an expert in computers AND neuroscience.
By specification, a big enough LLM will categorically define the human: by human type, by human features, and by patterns of word association. Today’s LLMs can’t. But there’s a tomorrow only one scale-up away from defining humanity. Then, beyond human comprehension: the superLLM.

DeepMind’s AlphaFold teaches us that narrow intelligence can be superhuman.
 
Upvote
-6 (1 / -7)

rr6013

Ars Scholae Palatinae
678
We had some suspicions something like this might be possible after exploring vector steering, where you could push a model by adding particular vectors at particular layers to, say, change the mood, or always bring up King George III, or whatever you like. I imagine that this method is somewhat similar, if rather more advanced.

However, this article is missing the most bemusing part of this project, where Anthropic taught an AI to conduct proper Maoist self-criticism.
INDEED, the author omits much, yet admits that the LLM's conduct was taught behavior: the principles and practices of self-censorship.

HUGE flyover of the heads of those not reading critically. Gatekeepers and policy, procedural, and enforcement agents of every stripe and color now have an automated conformity measure within grasp, and not just a breath analyzer either.

ONE LLM, psychologically trained and categorically psychiatrist-defined, would be able to bubble-sort any individual's public breadcrumbs, back-traced online, into a complete profile of pseudo-facts, characteristics, nuances and pathologies for any number of rated uses, measures and...
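For reference, the vector steering described in the quote above can be sketched in a few lines of PyTorch (toy layer sizes; the steering vector here is random, where a real one would be extracted from activation differences between prompts):

Code:
import torch

# Toy two-layer "model"; in practice you would hook a transformer block.
layer1 = torch.nn.Linear(8, 8)
layer2 = torch.nn.Linear(8, 4)

# Stand-in steering vector. A real one might be the mean activation
# difference between on-topic prompts and neutral prompts.
steer = torch.randn(8)

def hook(module, inputs, output):
    # Push the layer's output along the steering direction.
    return output + 3.0 * steer

handle = layer1.register_forward_hook(hook)
x = torch.randn(1, 8)
print(layer2(layer1(x)))   # steered output
handle.remove()
print(layer2(layer1(x)))   # baseline output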
 
Upvote
-12 (0 / -12)

Lil' ol' me

Ars Scholae Palatinae
690
Subscriptor
I'm only a neurobiologist with some Python knowledge, but the following quote from Sci-Fi writer Charles Stross feels plausible enough to me: "What we're getting, instead, is self-optimizing tools that defy human comprehension but are not, in fact, any more like our kind of intelligence than a Boeing 737 is like a seagull." (check out the entire keynote, it's amazingly prescient for being 6 years old).
AI (or any technology) can only approximate what humans currently understand about a topic (in this case, the mind).

Compare it to food. When all we knew about food was vitamins, Tang® seemed like a good idea because it provided vitamin C. But we know now that food is more than just vitamins (an actual orange has fiber, antioxidants, human-digestible sugars, and embodies nutrients & microbes that help our gut flora digest it), and ultra-processed foods are bad for us. And we still can't make an orange: there are still lots of things yet to be discovered about it.

Fake orange juice is just an approximation of an orange, the way AI is just a (very crude, but getting better) approximation of thought, sentence composition, or story writing. We can only recreate the parts we currently understand.

Also, consider that things like commercial AI models aren't pure science, but rather commercialized science, so there is always some bias introduced (like Tang®, which was just cheap to produce & market, making it very profitable).
 
Upvote
2 (3 / -1)

isartchance

Smack-Fu Master, in training
59
Clamping values seems able to weaponise safe agents and, vice versa, to tame artificial beasts.

Humans might hope to monitor, recognise, and limit exploits of the emerging conflicts between
  • clamped attractor constellations and
  • regulated galaxies of values learned from the initial corpus of media,
augmented by simulated experiences and tuned by RLHF and the like for instruction-following, conversation, and conflict-resolution alignment specs.
 
Upvote
-1 (0 / -1)

Sadre

Ars Scholae Palatinae
1,009
Subscriptor
Nuclear weapons have been in continuous strategic use and development since they were invented, right up to the present day, though they haven't been used offensively in the field since the first two. They certainly didn't go away, and I don't see how "convincing people not to use them unless you really, really have to" counts as "spoiling them for everyone..."

The history of technology doesn't work by transcribing human deliberations, so you have to kind of get a general framework to explain non-human logical steps in development or non-development, use or non-use. Which I did.

If you think I didn't mean by "two" exactly the offensive use, I can't help you.

Nuclear weapons have been spoiled for use. They even are spoiled for testing. The technological-human process which I outlined hopefully catches aberrant technology in time. As it has with nuclear explosions over civilian populations.


Don't be dense. Or have at it. This is not the Algonquin Round Table. More like a soup.
 
Upvote
-7 (0 / -7)
When analyzing an LLM, it's trivial to see which specific artificial neurons are activated in response to any particular query. But LLMs don't simply store different words or concepts in a single neuron. Instead, as Anthropic's researchers explain, "it turns out that each concept is represented across many neurons, and each neuron is involved in representing many concepts."
Once upon a time the whole of the Ars readership was keen on all things tech, until one day most of the news sites got gobbled up by giant corporations and expanded their audiences to a trove of non-tech-oriented folk, who became likely the majority of the readership. Thus prompting Ars in most instances to water down many of their articles so that a more common reader could better understand WTF the tech was.

This article seems to omit this regarding "artificial neurons". Even the reference links to other articles do not clarify it.

In short, they are not neurons and do not function like neurons. They are not hardware, and they are not dedicated transistors or processors or other hardware focused solely on the LLM crunching.

'Artificial neurons' is a really bad naming convention for software functions (segments) written for processing the data dumped into LLMs. That's it. A fancy phrase for "software". Because somehow "artificial neurons" sounds cooler than "LLM software functions"?

Here's the basics: AWS Q/A

Here's a detailed breakdown: Artificial neurons
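For what it's worth, the "just software" point can be made concrete: a single artificial neuron is a few lines of arithmetic (a minimal sketch, with invented inputs and weights):

Code:
import math

def artificial_neuron(inputs, weights, bias):
    # The whole thing: a weighted sum pushed through a nonlinearity.
    s = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-s))  # sigmoid activation

print(artificial_neuron([0.5, -1.2, 3.0], [0.8, 0.1, -0.4], bias=0.2))

The interesting claim in the quoted article is about what millions of these units end up representing together, not about the units themselves.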
 
Upvote
-8 (3 / -11)
Nuclear weapons have been spoiled for use. They even are spoiled for testing. The technological-human process which I outlined hopefully catches aberrant technology in time. As it has with nuclear explosions over civilian populations.
Right up until Putin gets either pissed enough or insane enough to push a button - then your logic will break down when he attempts to sterilize Ukraine or toss a few at the US or EU.

But you go ahead and hold onto the notion that Putin (who kills political opponents, or anyone else he can who talks bad about how great of a leader he is) or any other radical leader will always stop short of using nukes again because of their humanity.
 
Upvote
3 (3 / 0)

OrangeCream

Ars Legatus Legionis
56,669
Once upon a time the whole of the Ars readership was keen on all things tech, until one day most of the news sites got gobbled up by giant corporations and expanded their audiences to a trove of non-tech-oriented folk, who became likely the majority of the readership. Thus prompting Ars in most instances to water down many of their articles so that a more common reader could better understand WTF the tech was.

This article seems to omit this regarding "artificial neurons". Even the reference links to other articles do not clarify it.

In short, they are not neurons and do not function like neurons. They are not hardware, and they are not dedicated transistors or processors or other hardware focused solely on the LLM crunching.

'Artificial neurons' is a really bad naming convention for software functions (segments) written for processing the data dumped into LLMs. That's it. A fancy phrase for "software". Because somehow "artificial neurons" sounds cooler than "LLM software functions"?

Here's the basics: AWS Q/A

Here's a detailed breakdown: Artificial neurons
But the quote you use very specifically says artificial neurons; it doesn't omit it.

Search the article; here's what you get for the first two hits for neuron:
the model's millions of artificial neurons
specific artificial neurons


So yes, while the next five times neuron appears they omit 'artificial', I think it's pretty kosher to try to simplify the text by dropping 'artificial'.

And... in all my classwork and reference material we call them neural nets and neurons. It's a given that they're artificial.
 
Upvote
2 (2 / 0)

nivedita

Ars Tribunus Militum
2,256
Subscriptor
Once upon a time the whole of the Ars readership was keen on all things tech, until one day most of the news sites got gobbled up by giant corporations and expanded their audiences to a trove of non-tech-oriented folk, who became likely the majority of the readership. Thus prompting Ars in most instances to water down many of their articles so that a more common reader could better understand WTF the tech was.

This article seems to omit this regarding "artificial neurons". Even the reference links to other articles do not clarify it.

In short, they are not neurons and do not function like neurons. They are not hardware, and they are not dedicated transistors or processors or other hardware focused solely on the LLM crunching.

'Artificial neurons' is a really bad naming convention for software functions (segments) written for processing the data dumped into LLMs. That's it. A fancy phrase for "software". Because somehow "artificial neurons" sounds cooler than "LLM software functions"?

Here's the basics: AWS Q/A

Here's a detailed breakdown: Artificial neurons
You understood what an LLM is but not what an artificial neuron is?
 
Upvote
3 (3 / 0)

Pecisk

Ars Scholae Palatinae
947
I'm just a layman, but it keeps puzzling me why the question "why they often confabulate information" is considered relevant or interesting. It's just a matter of complicated statistics, what else could it be? What am I missing here? The network follows exactly the same process each time - whether the output ends up lining up with something that we can determine to be true (via external means) or whether it happens to end up lining up with something that we can determine to be false or nonsensical. It's not like the latter cases are caused by some bug or malfunction. Because at no point is there any process or capability invoked that goes beyond statistical relationships. There's no "truth" module, no "double-check" phase, no "how important is this" assessment, no way to suspend the statistics and employ some other approach that would be more suitable at some point.
It is like using a clever physics hack to make things levitate. Cool hack, bro, but it is not flying, so why do you care about it?
 
Upvote
-2 (0 / -2)

Pecisk

Ars Scholae Palatinae
947
Nuclear weapons have been in continuous strategic use and development since they were invented, right up to the present day, though they haven't been used offensively in the field since the first two. They certainly didn't go away, and I don't see how "convincing people not to use them unless you really, really have to" counts as "spoiling them for everyone..."
"Strategic" is huge stretch to psyops with very expensive and likely to fail end of the days (sorta) weapons.
It is more like russia trying to keep its facade up. Amount of actually usable nukes is decreasing every year. They are expensive and quite pointless.
 
Upvote
-3 (0 / -3)

Pecisk

Ars Scholae Palatinae
947
Right up until Putin gets either pissed enough or insane enough to push a button - then your logic will break down when he attempts to sterilize Ukraine or toss a few at the US or EU.

But you go ahead and hold onto the notion that Putin (who kills political opponents, or anyone else he can who talks bad about how great of a leader he is) or any other radical leader will always stop short of using nukes again because of their humanity.
No, they will stop because they are sociopaths, and strangely enough that has kept us safe so far. Putin is not crazy. He played madman theory; it did not pan out. He was clever enough not to keep the theoretical escalation up. Which means he kinda wants to die of natural causes, not of radiation or of starvation in a bunker with a collapsed entrance.
Don't get me wrong, the risks of humanity blowing itself up are still there. I just don't believe rulers intend to do so right now. Risks from societal collapse, a refugee mega-crisis, a real shooting war over resources because global warming is here because an LLM was a kinda cool and very expensive way to make a computer sound like a sexy actress... now, THAT is not a risk. That's gonna happen.
 
Upvote
-2 (0 / -2)
The lack of inner monologue is interesting, although I can't say I think much differently. I'd certainly agree that I've also run into the problem that trying to communicate concepts based on complex interrelationships between large corpuses of experience, each of which also has its own previous judgements of worth, is... not easy. However, I do find it easy to visualise certain things and have excellent spatial reasoning/awareness. Could you try an experiment? Consider responding to this post; don't just type it out, just consider it. How are you going to respond? Now, are you in fact speaking out your response in your mind? If so, would you consider that a mental "draft" that you wouldn't normally perform? If not, do you have any idea what your response will be before you start typing it?
That was an LLM responding.
 
Upvote
0 (0 / 0)
Clamping values seems able to weaponise safe agents and, vice versa, to tame artificial beasts.
Yup. And Anthropic isn't the first to do model editing similar to this. For example:

https://github.com/zjunlp/EasyEdit
With the right tweaks that can turn a benign local model into a malicious one. It can also be used to detoxify a language model or mitigate bias.

Given LLaMA was leaked on 4chan, I can guess what the majority of edits are going to be. Precisely nobody is going to pay somebody capable of mitigating bias for commercial use. And if they do, 4chan will screech like they did over the Gemini image generator and the diverse Nazis, and the company will do a 180. It's very hard to do this and not have side effects.
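For the curious, the simplest flavour of a direct weight edit is a rank-one update on a single matrix (a toy sketch with made-up sizes; real toolkits like the one linked above are far more careful about not breaking everything else):

Code:
import torch

W = torch.randn(8, 8)   # some weight matrix inside the model
k = torch.randn(8)      # "key" direction that encodes the fact to edit
k = k / k.norm()
v_new = torch.randn(8)  # desired new response to that key

# Rank-one edit: W_edited @ k equals v_new exactly, while directions
# orthogonal to k are left untouched.
W_edited = W + torch.outer(v_new - W @ k, k)

print((W @ k - v_new).norm())         # before: far from the target
print((W_edited @ k - v_new).norm())  # after: ~0 for this key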
 
Upvote
1 (1 / 0)

Sadre

Ars Scholae Palatinae
1,009
Subscriptor
Right up until Putin gets either pissed enough or insane enough to push a button - then your logic will break down when he attempts to sterilize Ukraine or toss a few at the US or EU.

But you go ahead and hold onto the notion that Putin (who kills political opponents, or anyone else he can who talks bad about how great of a leader he is) or any other radical leader will always stop short of using nukes again because of their humanity.

I said hopefully. Do I have to put it in a special font for you to recognize a conditional? If hope holds. And it will be considered aberrant. Hopefully.

You can't break conditional logic if you do it properly. But you can misread it. As you prove.
 
Upvote
-1 (0 / -1)
You got it. The most meaningful difference between a hot dog detector neural net that I can make on my personal computer and ChatGPT4 is the absolute bonkers value of n in their n-dimensional space, and the absurd cost of the hardware needed to calculate and store that n-dimensional space.
It seems unlikely. Does a hot dog detector have a context window?

A detector is trivial; there is real architecture behind ChatGPT, significant enough for OpenAI to stop being open.
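The structural difference, in toy form (placeholder functions only, nothing real):

Code:
# A detector maps one fixed-size input to one verdict and is done;
# an LLM step consumes everything generated so far.

def hot_dog_detector(image):
    return sum(image) > 0.5        # fixed input -> single label, no memory

def llm_step(context):
    return context[-1]             # placeholder: depends on the whole context

context = ["the", "cat"]
for _ in range(3):
    context.append(llm_step(context))  # the context grows as it generates
print(context)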
 
Upvote
0 (0 / 0)
With most computer programs—even complex ones—you can meticulously trace through the code and memory usage to figure out why that program generates any specific behavior or output.
Please don't compare these to programs. You need to compare these to other machine learning models, as most if not all statistical models have these explanations baked in.
Talking about programs makes people believe this is a program; it's not.
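To make the "explanations baked in" point concrete (toy data, ordinary least squares done by hand):

Code:
# A classical statistical model explains itself: fit y = a*x + b and the
# two fitted numbers ARE the explanation of every prediction it makes.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
b = my - a * mx
print(f"y = {a:.2f}*x + {b:.2f}")
# An LLM has billions of entangled parameters instead of two, which is
# exactly why "trace through and explain" stops working.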
 
Upvote
1 (2 / -1)