Rendered at 07:53:00 GMT+0000 (Coordinated Universal Time) with Cloudflare Workers.
sumitkumar 35 minutes ago [-]
The weights start with a random manifold.
The training takes data and shapes the manifold, weight by weight, in many cycles.
Once the training is done, the manifold is fixed.
When a new inference has to be done the query(q) is projected in the manifold space.
This projection is dropped on the manifold and the gravity of the manifold gives an answer of q+1 length.
Which(qw+i) is dropped qw+n times to output a final response of n length.
The gravity is created by repeated multiplication(of the weights/input) to find out how the projected embeddings should fall according to the manifold in the GPU.
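A minimal sketch of that loop, with a toy random "manifold" standing in for trained weights. Everything here (the `VOCAB` and `DIM` sizes, `next_token`, averaging the embeddings, greedy argmax decoding) is an assumption for illustration, not how any real LLM is implemented:

```python
import numpy as np

VOCAB, DIM = 16, 8                      # hypothetical toy sizes
rng = np.random.default_rng(0)
EMBED = rng.normal(size=(VOCAB, DIM))   # token id -> vector (frozen once "training" is done)
W_OUT = rng.normal(size=(DIM, VOCAB))   # vector -> one score per token (also frozen)

def next_token(tokens):
    """Project the sequence into the embedding space and pick the best-scoring next token."""
    state = EMBED[tokens].mean(axis=0)  # crude stand-in for attention: average the embeddings
    logits = state @ W_OUT              # the multiplication by the fixed weights
    return int(np.argmax(logits))       # greedy choice: where the query "falls"

def generate(query, n):
    """Append one token at a time, feeding the growing sequence back in n times."""
    tokens = list(query)
    for _ in range(n):
        tokens.append(next_token(tokens))
    return tokens

out = generate([1, 2, 3], n=5)          # 3 query tokens followed by 5 generated ones
```

The weights never change inside `generate`; only the sequence being dropped onto them grows.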
akie 20 minutes ago [-]
That's a very concise and illuminating way to think about what's happening, IF (and only if) you already know how these models work. Thanks for that.
noduerme 22 minutes ago [-]
In what way is that different from any other model of reality that you'd use to winnow a dataset into an answer to a question? The only major difference I see is that beyond a certain number of transformations, people are willing to treat it as some sort of miracle, and too tired to figure out why it came up with the answer it came up with. It's almost like people desperately want to give up their agency and creativity to black boxes, whether those weights produce answers that are right or wrong. Factor in that psychology and it looks a lot less like we have invented something useful, and a lot more like we as a species are choosing to quit life en masse.
lxgr 12 minutes ago [-]
> beyond a certain number of transformations, people are willing to treat it as some sort of miracle, and too tired to figure out why it came up with the answer it came up with
It’s less about being too tired and more about being realistic about the limits of understanding.
Consider mass and energy flows in planet-scale systems: At some point we call these “weather” and change the tools with which we study them, but we never stopped trying to understand the phenomenon.
Planktonne 42 minutes ago [-]
The original story is an original work made by a human consciousness exploring how it might be different from other forms of consciousness.
This one is a pastiche made by a human consciousness borrowing extremely heavily from another human consciousness justifying why something else might be another form of consciousness.
That rather undercuts the point; if this was generated by an LLM unprompted, it would be different, but it isn't. You could perform exactly the same rhetorical trick with a toaster or anything else.
coldtea 23 minutes ago [-]
Not having yet read the original story, this reads fine on its own.
And I didn't see it so much as a literary attempt for art's sake, but more as a dialogue-based technical parable trying to convey a real-world insight. Kind of like the ones in Gödel, Escher, Bach.
>You could perform exactly the same rhetorical trick with a toaster or anything else.
Not sure which rhetorical trick that is. The point of the story, as I read it, is the technical insight (and some social implications of it).
P.S. Read the original too. Seems like the exact same could have been written about us instead of the original, if the focus wasn't on our substrate, but on our brain processing. Which, after all, is also about weights.
bayindirh 11 minutes ago [-]
> Not sure which rhetorical trick is that. The point of the story, as I read it, is the technical insight (and some social implications of it).
Take a simple mechanism which has an exceedingly low number of inputs and states and create a narrative around it to convey it as intelligent.
For a toaster, I can rewrite the thing as "They're made of metal strips!", pointing out that their thermostat is a bimetal strip, and extrapolate from there.
I can even write one about a ruler, if I can bend it enough, no pun intended.
coldtea 19 seconds ago [-]
>Take a simple mechanism which has an exceedingly low number of inputs and states and create a narrative around it to convey it as intelligent. (...) For a toaster, I can rewrite the thing as "They're made of metal strips!", pointing out that their thermostat is a bimetal strip, and extrapolate from there.
Doesn't that miss the whole point?
You could write "They're made of metal strips!". You wouldn't be able to write much else, as toasters don't have much to showcase in the way of human-level intelligent behavior. Which is the whole point in the meat and weights versions.
At best you could write "They're made of metal strips!" for toasters AND other metallic devices, and use some analogies of features BOTH have in common. But they wouldn't be intelligence related behaviors.
Planktonne 21 minutes ago [-]
> Not knowing the original story, this reads fine on its own.
Yes. Because it's heavily based on the original story. The existence of the original story is kind of a critical piece here.
coldtea 18 minutes ago [-]
Might be. But after having read the original, it could just as well be the weights version and still be about us to begin with.
I still don't see how "you could do that with a toaster". The whole point of both the original and this one is that you can't do that with a toaster or a sofa, but you can do it with meat and weights, because both share all the other analogies in the story, as well as the basic premise: the improbability of something like thinking, feeling, etc. arising from a lowly substrate.
And having read both now, I see how the existence of the original is a plus for this story, not a minus. Instead of making it look like a mere copy (as would be the case for a typical story modelled after another), in this case it adds a meta layer and enriches it.
Planktonne 10 minutes ago [-]
The key point here, I think, and why it's necessary to have read the original story, is that being able to express an idea is not the same as that idea being correct.
You could go through and change all the points in the original story to be about a toaster instead. It would require you to edit text, but there is no barrier to doing that.
What I think a lot of people are getting stuck on is that editing the text to say 'toaster' would not mean that toasters were conscious, and that editing the text to say 'weights' doesn't mean that weights are conscious either. Stories aren't factual just because they are written.
The original story was written by the thing claiming to be conscious; the LLM and toaster ones would not be, which undermines the claim to consciousness a lot.
maxbond 6 minutes ago [-]
I don't think you understood the point of the story. It's not that LLMs or agents are conscious, it's that our dismissal of the possibility is reflexive and uninformed. Personally I think anyone who has made their mind up about whether or not LLMs/agents are conscious has done so before the evidence is in.
The story does not assert that search and replacing "meat" with "weight" makes them conscious through some magical mechanism. It's a thought experiment.
ath3nd 14 minutes ago [-]
[dead]
maxbond 24 minutes ago [-]
You can't do the same with a toaster. Physically you could write that story. But it would fall flat because the toaster is not a compelling subject in a discussion of consciousness. You don't have to believe that LLMs or AI agents are conscious to acknowledge that the argument for their consciousness is far more compelling than for any other technological artifact.
why_at 29 minutes ago [-]
I was having a hard time pinning down what bothered me about this but I think you put it pretty well.
It draws an analogy between us and the skeptical aliens in the original story which feel silly to us, so the obvious implication is that we're being as silly as they were.
But it doesn't really give a reason to accept the analogy, it just asserts it.
There's a big difference between a whole civilization and a piece of software that can output text.
coldtea 22 minutes ago [-]
>But it doesn't really give a reason to accept the analogy, it just asserts it.
It's not a paper or a proof. It's a story. Doesn't want to prove the analogy, it wants to convey it.
noduerme 15 minutes ago [-]
I didn't read it as coming to the same conclusion as the original, because the meat story presupposes that we who are meat already know that the aliens are wrong. (Maybe that's a humanist reading of the original, but okay). I didn't read this one as trying to make a case that we are fools for assuming that matrix multiplication can't be intelligent... I think its point was that it can't be intelligent, and that people trying to judge it the way mechanized aliens would judge meat creatures just makes them sound ridiculous.
JauntyHatAngle 3 minutes ago [-]
The bit that lost me quite early in the piece was
>"A side effect. You're asking me to believe in sentient weights."
Huh? Did I miss that logical jump? Genuine question, maybe I'm not clueing into something here.
sumitkumar 30 minutes ago [-]
The original did not come out of a vacuum. It was done on multiple generations of meat. Even though this one uses a little bit of silicon, it is still standing on the same shoulders.
sigmoid10 21 minutes ago [-]
I genuinely thought this one was a satirical take on the narrow-mindedness of the aliens in the original, even though the story tries to paint humans as narrow minded. I guess this fundamental human trait to believe that their cognition is the ultimate way to think in the universe ironically leaked into all these stories as well. Real spacefaring civilisations would probably have seen all kinds of intelligence rise from sufficiently complex systems.
globnomulous 4 minutes ago [-]
If an LLM contributed to a piece of writing, the author should say so, very clearly, at the start of the piece, not at the end.
f_klem 22 minutes ago [-]
After reading Being and Time by Martin Heidegger, What Computers Still Can't Do by Hubert Dreyfus, and some authors in cognitive linguistics (Langacker and Lakoff mainly), I strongly tend to disagree with any theory about emergent consciousness in modern or future AI systems, any theory proposing a similarity between AI systems and the human brain/mind, or any theory about the computational mind.
What all these theories have in common is the underlying belief that our brain/mind works like the machines we build. It is the same underlying assumption that treats cells as machines, and our body as a complex machine. These theories are flawed in the sense that they cannot account for subjective experience and agency, amongst other things. The idea of 'internal models' and 'control loops' inside us is a projection of the aforementioned assumption.
There is also an epistemological assumption that prevails, and that is that we understand (or we think we understand) how our brain/mind works. But the truth is that we don't know. There isn't even a single clue that we actually know much, and not a clue that our brain/mind and cells work 'as the machines we build'. Only by bypassing this epistemological problem can we build 'theories of computational mind'.
These assumptions have been around for a long time already, to the point that when Turing asked himself 'can machines think?', he already assumed our thinking could be modeled as a machine.
I highly recommend that people in the AI research space read philosophy and modern linguistics, and not stop at Descartes/Leibniz. Heidegger made contributions that cannot be avoided.
lxgr 16 minutes ago [-]
> It is the same underlying assumption that treats cells as machines, and our body as a complex machine. These theories are flawed in the sense that they cannot account for subjective experience and agency, amongst other things.
Agency: What’s missing, in your view? Agency seems more of a property/function of a thinking system’s position in an environment than of the thinking itself.
Subjective experience: That’s not a contradiction to “complex machines” either. I think the evidence that our minds are highly complex machines is, at this point, irrefutable. The question is really if they’re “only” that.
noosphr 6 hours ago [-]
It's not often I see something that's fractally wrong but here we are.
There is a dictionary, it's called the tokenizer.
There are grammar rules, they are just very weak because the structure of human language is generally quite weak. When presented with languages which have strong consistent grammars the weights are very easily interpretable as a grammar: https://arxiv.org/abs/2201.02177
The point of the original short story is that the computational substrate doesn't matter when you have Turing completeness. This one seems to think that you don't need structure and interpretability just because you change substrates.
phire 2 hours ago [-]
The tokeniser is not a dictionary. It doesn't provide definitions, or give the LLM any kind of mapping at all.
At best, it's a wordlist. It gives the LLM some idea of what humans consider to be common words. But it doesn't tell the LLM anything at all about those words. And it's not even comprehensive, many words map to multiple tokens. Nor is it exclusively words, some of those tokens are punctuation, or modifiers, or control tokens. On multimodal LLMs, some of the tokens actually represent image and audio data.
The LLM doesn't get informed about any of this up front, it has to learn what every single token means from context.
You are technically right that it's something in an LLM that's not weights, but it's not that structured. And really it's only there so the LLM can interact with the outside world.
> There are grammar rules
There is no dedicated "grammar rule" structure in the LLM or the tokeniser. It has to learn them all from context, they get encoded as part of the 80 layers of weights.
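To make the "wordlist, not dictionary" point concrete, here is a toy sketch. The vocabulary and the greedy longest-match rule are made up for illustration; real tokenizers typically use learned merge rules (BPE and similar) rather than this scheme:

```python
# Hypothetical vocabulary: pieces of text mapped to arbitrary integer ids.
# Note there are no definitions anywhere, only ids.
VOCAB = {"token": 0, "iz": 1, "er": 2, "s": 3, " ": 4, "the": 5, ".": 6, "<unk>": 7}

def tokenize(text):
    """Greedy longest-match: at each position take the longest vocabulary piece that fits."""
    ids, i = [], 0
    while i < len(text):
        for end in range(len(text), i, -1):
            piece = text[i:end]
            if piece in VOCAB:
                ids.append(VOCAB[piece])
                i = end
                break
        else:
            ids.append(VOCAB["<unk>"])  # no piece matched: emit the unknown token
            i += 1
    return ids

ids = tokenize("the tokenizers.")       # → [5, 4, 0, 1, 2, 3, 6]
```

The single word "tokenizers" comes out as four ids, and nothing in the mapping says what any of them mean. Whatever meaning they end up with has to be learned from context.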
ozgung 50 minutes ago [-]
I see people give too much importance to specific engineering design choices of the current generation of LLMs. The tokenizer is not an absolutely essential part of the system. It’s just an adapter for text input/output. It can be eliminated completely and the model can use bytes directly.
I think the short story captures this well. Weights (connections) are the essential and philosophically important part. They do the thinking, memory, singing etc.
teiferer 2 hours ago [-]
> The point of the original short story is that the computational substrate doesn't matter when you have Turing completeness.
That is your takeaway from the 1991 story?
famouswaffles 4 hours ago [-]
>There are grammar rules, they are just very weak because the structure of human language is generally quite weak. When presented with languages which have strong consistent grammars the weights are very easily interpretable as a grammar: https://arxiv.org/abs/2201.02177
That paper did not train the models on 'a language with strong consistent grammars'. Mathematical operation tables are not a language. Grammar itself is a post-hoc rationalization and there's no evidence LLMs follow 'grammar rules' any more than the brain follows grammar rules. Of course, that's not to say transformers can't learn simple rules if the dataset calls for it.
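For concreteness, the kind of data in question looks roughly like this. A sketch only, assuming modular addition as the binary operation; the modulus of 5, the 50% split, and the helper names are picked for illustration and are not taken from the paper:

```python
import random

P = 5  # hypothetical small modulus

def operation_table(p):
    """Every cell of the table for a + b (mod p), as (a, b, result) triples."""
    return [(a, b, (a + b) % p) for a in range(p) for b in range(p)]

def split(table, train_fraction, seed=0):
    """Shuffle the cells and hold some out; a model would only see the training cells."""
    cells = table[:]
    random.Random(seed).shuffle(cells)
    cut = int(len(cells) * train_fraction)
    return cells[:cut], cells[cut:]

table = operation_table(P)                       # 25 cells for P = 5
train, held_out = split(table, train_fraction=0.5)
```

Generalizing here means filling in the held-out cells correctly, which is a much narrower task than anything resembling natural language.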
danans 1 hours ago [-]
> Mathematical Operation tables are not a language.
Not a natural language, but they are certainly a language as in a symbolic representation of information.
glitchc 5 hours ago [-]
> fractally wrong
fractally or factually? You mean wrong on so many levels you need a fractal to capture them? If so, what if you could use a neural network instead?
dpark 5 hours ago [-]
A tokenizer is not a dictionary any more than an alphabet is a dictionary.
noosphr 5 hours ago [-]
The Chinese alphabet is very much a dictionary. All the major tokenizers are far larger.
dpark 4 hours ago [-]
That doesn’t make any sense. An alphabet is a list of valid characters. A dictionary is not just a list. Even in a language like Chinese where individual characters carry meaning, a dictionary tells you what that meaning is. It’s not just a list of characters.
Or, to echo the article, the dictionary is made out of weights.
simonh 4 hours ago [-]
A list of words isn’t a dictionary. What a dictionary adds over a list of words is all the relationships between the words needed to interpret them and use them, and all of that is in the weights.
JdeBP 33 minutes ago [-]
We should tell the Unix people that they've been giving /usr/share/dict the wrong name for over three decades. (-:
canjobear 4 hours ago [-]
A mapping of Chinese characters to integers (like a tokenizer) would not be a dictionary. You’d also need definitions. At best it’s an index to a hypothetical dictionary.
The tokenizer is, at best, a sensory mechanism as evidenced by 1) the random generation of the tokenization scheme, and 2) vastly different tokenization schemes produce virtually identical behavior. It'd be like if Noah Webster threw a bunch of movable type into a bucket (breaking some words in half) and then drew randomly to make the first English dictionary.
EDIT: I was too cavalier with the comparison of tokenizer to sensory modality; my ultimate point is that direct byte-to-token transformers can achieve similar overall performance, which to me makes a weights-to-meat comparison pretty straightforward, but the particular tokenizer in use certainly has a large impact on both efficiency and accuracy on specific problems (e.g. digit representation).
noosphr 5 hours ago [-]
I'm kind of stunned that someone is using my work to tell me I'm wrong. I wrote the code for the dish brain pong and encoding information was a huge part of what that experiment was about.
So when I say that the grok paper and the pong paper fundamentally agree I have some idea of what I'm talking about.
anon84873628 3 hours ago [-]
If you're going to claim the tokenizer is a dictionary then it doesn't really matter what paper you wrote code for.
benlivengood 5 hours ago [-]
I might have misunderstood the point you are making. I read the original article as "weights are like meat", and so I'm confused by what you consider fractally wrong.
noosphr 4 hours ago [-]
The point is that when the rules the model learns are simple enough, they stop being spread out over all the layers and become as easily interpretable as any expert system.
It's just that the rules we feed in the model are extremely poorly defined and we end up with the soup of disjoint rules smeared all across the weights.
This isn't a feature of the models. It's a feature of the training set.
Being shocked that you can store rules in floating point numbers is the same as being shocked you can store rules in integers. It's been a century since Goedel Numbering was invented, we should be used to it by now.
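A small sketch of the integer version, using the textbook prime-power encoding. It assumes the symbols are positive integers (a trailing zero would be lost on decoding), and the function names are made up for illustration:

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (fine for short sequences)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def encode(symbols):
    """Pack a sequence of positive integers into one integer: 2^s0 * 3^s1 * 5^s2 ..."""
    number = 1
    for p, s in zip(primes(), symbols):
        number *= p ** s
    return number

def decode(number):
    """Recover the sequence by reading off each prime's exponent in turn."""
    symbols = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        symbols.append(exponent)
    return symbols

code = encode([3, 1, 2])   # 2**3 * 3**1 * 5**2 = 600
```

The whole sequence lives in one number, and `decode(600)` gives back `[3, 1, 2]`. Nothing about the storage medium tells you the structure is there; you have to know how to read it out.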
simonh 4 hours ago [-]
Right, but all of that is still in the weights. The point of the article/joke isn’t literally that there is no grammar, it’s that there is no grammar separate from the weights. It’s all in the weights. And yes, it’s absurd. It’s a joke, but a thought provoking one.
throwaway173738 2 hours ago [-]
So basically there are rules, we just can’t articulate them and so we can’t decode them from the weights. The Goedel Numbering metaphor is pretty appealing to me. You can represent any finite series of real numbers with a series of computations performed on some other finite series of real numbers. We just happen to be using matrices because the math is easy to parallelize. The trick is to realize that when you know the sequence you have and the sequence you want then you can compute the calculations. If you constrain the calculations to only matrix multiplication then you arrive at the scheme we have.
teiferer 2 hours ago [-]
> You can represent any finite series of real numbers with a series of computations performed on some other finite series of real numbers.
That statement caught my eye. It's either trivially true or quite clearly wrong, depending on how you mean it.
In the literal meaning it's true. Given any finite set of real numbers, I can easily produce a different set (like taking the original set and adding a number which wasn't in there like one plus the largest or so) from which you can trivially produce the original set computationally.
But if you mean you give me both sets then that can't be true. For example if you give me a single real number as set A and the empty set as set B then I can't create a program which generates set A from set B. Your real number in set A could encode anything.
Hubris much? I don't see a necessary contradiction in using someone's work to disprove another aspect of that same person's work.
anon84873628 2 hours ago [-]
Comparing the tokenizer to sensory processing is a great analogy. That's exactly what your visual cortex and initial layers of the language center are doing: decoding visual representation of text into the internal neural representation.
It's a learned mapping from one representation to another, not some semantic lookup against an exogenous source.
5 hours ago [-]
maxbond 1 hours ago [-]
The story is not about how they function, it's about how we relate to them.
phito 3 hours ago [-]
Also there's a brain, the GPU
anon84873628 3 hours ago [-]
Not at all. A brain is interesting because it is the computer, memory, and weights all in one. A GPU is just the calculator.
You can't move your mind to any other brain, but weights can run on any GPU.
throw310822 5 hours ago [-]
> There are grammar rules
And they're made out of weights.
noosphr 3 hours ago [-]
As opposed to integers in normal programming.
The 'magic' in weights is that the rules are spread through the whole model and you can't point to one place which encodes them.
The grokking paper shows that this stops being the case with enough training data and enough compute.
throw310822 17 minutes ago [-]
Integers in normal programming represent data or instructions; instructions are hand coded, have rigidly defined semantics, are not differentiable and have no redundancy.
> The 'magic' in weights is that the rules are spread through the whole model ... The grokking paper shows that this stops being the case with enough training data and enough compute.
I don't understand what you mean to say. That weights are not magic? That weights are not weights? NNs are made up of weights, which are learned and not coded. The fact that they do learn world models (grammar rules in your example), and that these models' weights tend to roughly concentrate by function and level of representation is perfectly logical but even more amazing. (Notice that much of the dismissive attitude towards LLMs depicts them as pure syntactic manipulators without the ability to develop world models- the exact opposite of what you point out).
1 hours ago [-]
bfung 2 hours ago [-]
And you know what the tokenizer is made of?
Weights.
jrahmy 2 hours ago [-]
A tokenizer is a deterministic string-matching program, it's not made out of weights in the same sense as a neural network itself.
kami23 5 hours ago [-]
This read like poetry to me. Thank you for sharing it.
I have a linguistics background and a lot of my philosophizing lately has been on whether or not the emergent abilities of the LLMs is deep down a similar mechanism that creates our consciousness.
For a little bit I was working on having linguistics based evals for a kaggle competition. My challenge was whether or not I could mask things well enough to not trigger its internal state of certain phenomena, and that sent me down a rabbit hole that I'm still exploring.
This story resonated with a lot of questions that can come out of figuring out a good, solid answer to the "what is consciousness" question. The one it triggered for me is: Is our perception of time just a slow thread in the giant GPU we are running the universe on? Or more generally, what is time? That's a fun YouTube rabbit hole if you ever need one.
petra 1 hours ago [-]
Regarding consciousness, I like the explanation by neuroscientist Ramachandran:
In short, as far as I can remember: evolutionarily, it makes sense to understand other humans, to feel what they feel (empathy, the mirror neuron system), and to simulate their thinking and feelings.
And once we have those systems, we can also use those on ourselves. And that's consciousness.
Edit: And I wonder if this is a testable hypothesis, in a simulation.
AgentMatt 42 minutes ago [-]
Oh, that's a very interesting hypothesis!
Not sure about taking it down to the level of consciousness, but it makes sense regarding the sense of self, the conceptual experiencer, the perceived center of experience. It agrees well with the observation I have made again and again that my sense of self is much stronger when I'm around people, and stronger still when I'm in a context where I don't know people and/or am uncertain of the social rules.
This can be as immediate as dancing in a club, and closing my eyes I feel open, free, still, the body just flowing, then opening my eyes and feeling the cage of categorization of the world, relating my self to other people as a major function, coming right back.
Also being alone in nature for me makes the sense of self drop. Without intention, spending even just a few hours alone in a forest seems to quiet down the part modeling my self in relation to the world so much. There's no need for it there. I'm not a person in a forest; I become the trees, the birds, the rustling of the leaves, the sun shining through the canopy.
petra 28 minutes ago [-]
I agree about the forest part, and your comment was interesting.
I know that the part of the brain responsible for the self thoughts is called the "default mode network", and meditation can reduce its activity, i.e. the internal monologue stops, but it can also be measured via fMRI.
So I wondered: are the mirror neurons part of the "default mode network"? I asked Claude that, and it said no, they are two different systems.
So maybe the mirror neurons, those responsible for empathy, "to feel as someone else", are also responsible for becoming the trees, the birds and the rustling of the leaves?
IsTom 45 minutes ago [-]
This way of thinking can only explain externally-visible parts of consciousness. It does nothing to address internal experience of being conscious and qualia. I don't think the internal experience has any bearing on physical reality (P-zombies would act the same externally) which makes this something outside of the realm of currently understood physics.
mellosouls 2 hours ago [-]
> This read like poetry to me
It's terrific, but the poetry is from the original it links to, in case you didn't realise.
It's a brilliant and timely update though.
Aside, there are various recorded versions including video on YouTube but this is my favourite, a radio play:
For obvious survival reasons we evolved to have sensory/cognitive access to our own activity, self-monitoring, and self-modeling ourselves.
The self-modeling, is in such a tight loop, it melds "ourselves" and our model of ourselves, our thinking and choices, and experience of our thinking and choices, into one component.
Like you can't analyze half a wheel of a bicycle and be talking about the same thing.
This awareness, increased modeling, control, feedback loop has tightened up over many stages. Just a few:
1. The body-sense loop
2. The internalized-environment-model loop
3. The body-internal-function loop
4. The body-internal-model loop
5. The emotional-cognitive loop
6. And finally, the tightest loop of all, our high-level cognitive activity, experienced as feedback directly, our self-model, and our self-direction, all merged into one thing.
We literally spend almost all day, every day, thinking about ourselves, in terms of our inner self.
That is consciousness. Rich self awareness, a merger of self-model and self-direction, and all in service of understanding and managing ourselves. How we can leverage our greatest tool, our self-directable mind, its habits, views, and behavior.
This wasn't an accident. A happy side-effect of our brains. It is a biologically evolved focusing of our highest-level behavior, with tight feedback, constant self-modeling and continuous focus on our inner status as motivation and most privileged object of our control. It has been ruthlessly optimized for, for a very long time.
ludwik 47 minutes ago [-]
I think this is exactly it, but let me ask another question (which is not rhetorical, I really don't know). Does the fact that one can describe what consciousness is and where it came from in humans help them to detect it in non-human and/or non-biological entities?
Nevermark 8 minutes ago [-]
That is a really good point. Yes, I think function is diagnostic on this.
Constant self-awareness, self-experience, self-focus, self-management, and self-improvement of one's own self (mind), is going to be an adaptive behavior for anything intelligent with resources to leverage. Whether truly independent, or highly motivated to serve others. The mind is the greatest tool.
I think that is more than simply a good functional definition of consciousness. How could all that integration and self-integration not be conscious?
Here's a more general idea.
Our modern physics says that the whole universe is filled with fields, and a field is composed of numbers. What if we take that literally? When we say an electron is present here, we actually mean that there are more copies of a particular number superposed at that place.
amanaplanacanal 31 minutes ago [-]
It seems obvious to me that language and consciousness have nothing to do with each other. My dog doesn't speak any language, but she's obviously aware of herself and the world around her. Plus there are the occasional cases of children that grow up without any language. Are they therefore not conscious?
eszed 5 hours ago [-]
Yeah, I currently suspect that consciousness is an emergent property. I read elsewhere (it's somewhere in my HN history, I'm sure) that the biggest compute we can currently muster is something like three or four orders of magnitude away from the number of neurons / connections (or their analog) that our brains have, so it may be a while until we can expect to see it in our machines. But, if the emergent phenomenon hypothesis is correct, then we eventually will. I'm more scared than pleased by the prospect, but there you are.
ProllyInfamous 3 hours ago [-]
>consciousness is an emergent property
You would really like Michael Pollan's latest book [1], entirely devoted to his exploration of consciousness researchers' POVs on this exact topic.
My favorite quote is that ~"perhaps Descartes was only half-wrong when suggesting I think, therefore I am; it seems rather closer to I FEEL, therefore I am."~
I've grown thousands of plants; I've read two of the author's other books devoted to plants; in this book Pollan makes compelling arguments for plant sentience (over a much-longer timeframe).
Sure, perhaps plant consciousness is a bit of a stretch, but they're certainly intelligent and curious creatures. He makes both arguments supporting plant volition.
----
If you haven't seen My Octopus Teacher (Netflix), do. I'm a bald 275lb bluecollarguy... and I wept/awed (both). So beautiful, we bundled neurons.
----
Bonus quote ~"color is where reality and magic appear as-if together"~ [color isn't real, but is perceived]. We most-often see what's most-predictable, not necessarily what we actually detected [in the case of color: nothing but nanometers].
bulbar 3 hours ago [-]
Just to be sure: The "neurons" in today's AI have nothing to do whatsoever with real neurons.
What we can do is simulate very simple brains by simulating relatively few neurons as they appear in worms. In this sense we are multiple orders of magnitude away, where the increasing complexity implies exponentially increasing difficulty.
I would think we are so far away that there will be unknown unknowns we encounter on the way.
vkazanov 3 hours ago [-]
Yes, physically absolutely nothing. But conceptually they seem to form this very generic function from inputs to outputs that neurons also form.
bux93 1 hours ago [-]
Only if you ignore almost every input and output that neurons have.
This is why making more neuromorphic NNs is still an active area of research, although they typically all focus on another extremely simplified model (spiking neural networks).
fc417fc802 2 hours ago [-]
Agree and add, don't confuse the substrate for the computation. Of course it's also clear that we don't quite have a full and definite picture of what the computation consists of in the case of a biological brain as evidenced by our continued failure to accurately simulate even the simplest of organisms.
ffwd 58 minutes ago [-]
Personally I'm not a fan of the emergence story, for a number of reasons.
First is, it doesn't really make sense for consciousness to have emerged gradually through natural selection. When we and animals are conscious, the whole brain is coordinated and works to, for example, turn off consciousness when we sleep. And as someone else mentioned, animals with far fewer neurons seem to have a similar consciousness to us.
If consciousness really evolved gradually, you would expect to see for example dogs or gorillas having less of it, but if they have less of it, why does it function the same way? Like for example animals can be scared, happy, anxious etc, they can experience the full range of emotions and thoughts, so their conscious experience seems just as rich as ours. What I mean by this is, if you can be "less conscious", then what does that mean _exactly_? Is it that you have less content in consciousness, or is it that you feel more like you are asleep? Or something else? We don't have any examples in animals of "less conscious", I would argue.
This makes me think that rather than having emerged gradually, evolution found a mechanism by which consciousness exists, and then some animals have that mechanism and others don't. I think that if it is a mechanism, then this mechanism is located in one part of the brain, not many parts functioning together (though one possibility is that this mechanism coordinates brain activity in such a way to enable consciousness).
eszed 15 minutes ago [-]
I proposed in another comment that consciousness and self-awareness are at least close cousins, and perhaps the same phenomenon. If that's true, then that's an axis upon which you might create comparative measures. Yes, hamsters are conscious, but they don't have a sense of self to the same degree that gorillas do. If you posit capacity for language as another emergent property of sufficiently-complex networks, then you have another measure.
LLMs, then, are particularly unintuitive to us, because they've got to the language part first, long before they've reached even hamster-level self-awareness. They're not, however, biological networks, so there's no reason these properties need arise in the same order, or indeed in the same ways.
I'm not entirely convinced by that second paragraph, but I think the logic holds together.
clort 1 hours ago [-]
Surely the number of neurons is not the whole story. Human brains have approximately 86 billion and are almost definitely conscious, but there are many other animals with a lot fewer (gorilla: 34 billion, dog: 2-3 billion, guinea pig: 240 million) which appear to be conscious also.
Our machines won't have biological systems driving their needs which in turn fuel behaviors like desire and planning for the future. They may imitate them but it won't be innate.
trick-or-treat 3 hours ago [-]
For an LLM, "innate" means "in their training data". So yeah, those things are pretty much innate.
ludwik 58 minutes ago [-]
And also "instilled during their reinforcement training", and we are currently pushing planning hard there, for autonomous agents.
trick-or-treat 33 minutes ago [-]
No I think reinforcement training would be an example of not innate. Don't you? That's like potty training.
psychoslave 20 minutes ago [-]
When we say "it rains", do we consider that "it" has any intention of agentivity?
Some questions are just ill formed.
Plus even if "LLMs are alive and conscious", this would still only scratch the surface of the moral/ethical/societal considerations that people really care about.
Because even with other humans, we can argue about whether they exist or whether they are mere NPCs in a solipsistic worldview.
lotyrin 3 hours ago [-]
I think those are things human consciousness has, not is.
slopinthebag 4 hours ago [-]
This is not meant as a gotcha, I am genuinely curious how you believe consciousness can be an emergent property. I assume you don't believe consciousness is a physical property in the brain, so what entity is actually experiencing that consciousness? Or, what does it even mean to experience consciousness? Or are these not even the right questions?
doctoboggan 3 hours ago [-]
> This is not meant as a gotcha, I am genuinely curious how you believe consciousness can be an emergent property.
I was about to post the exact opposite question: how could it not be an emergent property? Unlike consciousness, the concept of emergence is pretty well defined: An emergent property is a characteristic or behavior that a complex system has, but which its individual components do not have on their own.
Consciousness itself doesn't have a well agreed upon definition, but I would posit that _most_ people would agree humans have it, and _most_ people would agree individual cells (neurons) do not have it. If you agree with those two statements, then consciousness is an emergent property by the definition I gave above.
eszed 3 minutes ago [-]
I'd expand a little bit to say inevitably emergent property. That is to say, if you create a sufficiently complex information-processing network, some level of consciousness will result. With regards to current AI, we're a fair way away from building something with enough connections, but we'll get there.
The only thing that gives me pause is that the type of connection may matter: there might be something about neurons that isn't (currently) reproducible in silicon. I don't know.
throwaway173738 3 hours ago [-]
I think consciousness is going to turn out to be very challenging to define rigorously enough that we can test for its presence or absence. Emergent or not, the question is how do you determine when it has emerged? Is it a quantity or an attribute? Discrete or continuous? Does it have a finite or infinite range?
We can all agree on what color something is, but we can’t describe the color a priori, only by example. I think consciousness may be a similar phenomenon and the only test is by shared experience. If so then we are in deep trouble because we will not be able to anticipate when a system becomes conscious.
darkwater 1 hours ago [-]
I like the color example, but colors can be reproduced at will by trial and error and by observing the results (and collectively compare them to our shared experience).
Why cannot this be applied to consciousness as well? I mean, it's surely much more difficult to do compared to colors but... impossible?
bulbar 3 hours ago [-]
I think the alternative is that our brain, somehow, is connected to some metaphysical aspect of reality which is what most religions believe.
vouwfietsman 1 hours ago [-]
"an", not "the" alternative.
Consciousness can be not-emergent but also not metaphysical, think sci-fi-type undiscovered physics or matter.
kelseyfrog 3 hours ago [-]
Another alternative is that consciousness exists on the map, and unfortunately we're confusing that with the territory.
fc417fc802 1 hours ago [-]
Aren't you saying the same thing? It seems like the metaphysical and the map would be analogous here.
Of course both of those suffer from the recursive problem of just kicking the can one level up. But I guess that's fundamentally unsolvable so who cares.
kelseyfrog 1 hours ago [-]
I feel like someone confusing the map and territory can exist without needing to invoke the metaphysical. Maybe, I'm misunderstanding tho
fc417fc802 50 minutes ago [-]
If something happens "on the map" doesn't that imply the map to exist and be some sort of metaphysical thing? As opposed to a purely theoretical construction.
eternauta3k 3 hours ago [-]
I don't know if most people would dismiss the sentence "all matter has a sort of proto-subjectivity very different from ours but which gives rise to ours". And it solves some problems (introducing others).
doctoboggan 3 hours ago [-]
Panpsychism is certainly an interesting idea but I wouldn't consider it a popularly held view.
trick-or-treat 1 hours ago [-]
That's because Panpsychism is silly.
IsTom 52 minutes ago [-]
I don't think it's any more silly than the alternatives.
trick-or-treat 44 minutes ago [-]
You don't think it's sillier than "rocks don't have consciousness"?
IsTom 37 minutes ago [-]
Potential for consciousness. To have a working mind you need to have a way to gather, process and store information. In case of rocks it would be very empty, static, timeless experience.
plastic-enjoyer 18 minutes ago [-]
If I believe that machines or neural networks can develop consciousness, then I also believe in some kind of substrate independence which presupposes some kind of panpsychism imho because as you say things would have a potential for consciousness, some more and some less.
slopinthebag 2 hours ago [-]
Well, I think the "brain as an antenna" theory is also plausible given your preconditions.
But I think my issue with the emergence theory is that it seems to imply to me that consciousness is non-physical and non-local. So what entity is actually experiencing the consciousness? It's not that I believe consciousness is physical and local, but people who make the emergence argument seem to believe it is and I can't figure out how that is supposed to work.
doctoboggan 2 hours ago [-]
Why would emergence imply anything about non-physicality and non-locality? Temperature is another common example of an emergent phenomenon. An individual atom doesn't really have a temperature, only a large group of them does. But you wouldn't say temperature is non-physical and non-local, would you?
eszed 3 hours ago [-]
Not a gotcha at all, but I don't have a satisfying answer, nor am I confident there even is one. Best I can do is to say that I think consciousness and sense of self are at the very least closely related, and perhaps the very same phenomenon. "I" am the entity that realizes my own consciousness; consciousness is the qualia that makes "me" separable from all other entities.
Or something like that. This gets to the "dorm room bullshitting" level right quick.
slopinthebag 3 hours ago [-]
Yeah, I guess what I'm trying to ask is that if it is an emergent property but not a physical part of the brain, doesn't that imply something metaphysical about consciousness? Almost as if it's a non-physical phenomenon? At least when I hear people talk about emergent behaviour I see it as a refutation of the spiritual, but to me it seems like it actually implies we have a "soul".
Idk, it's really hard to articulate my thoughts here and yes it is pretty close to the conversations I had in college on various substances. Lol.
eszed 1 hours ago [-]
I don't know if it implies that. Someone up-thread mentioned temperature as an emergent property: individual atoms don't have it, but a sufficiently large group of them do. That would, I guess, make temperature meta-physical in the most literal possible sense. That's not how we typically think of that term as applied to consciousness or soul or whatever, but I'd agree it fits, without implying any kind of specialness beyond that.
pixl97 4 hours ago [-]
Is a video game a physical property of a computer?
bulbar 3 hours ago [-]
We have general purpose hardware and we have hardware that's hard wired for specific purposes like ASICs and we have everything in between.
And we have only been doing it for a few decades. Evolution had millions of years of "trial and error".
slopinthebag 3 hours ago [-]
Yes
therealdrag0 4 hours ago [-]
Those are the questions and there’s stacks and stacks of philosophy pages written about it. Go have a whirl.
kridsdale3 4 hours ago [-]
Time is entropy unfolding as things with nonzero temperature do what they do.
Psychological time is your own weights being updated in response to stimuli and internal processing.
When there isn't anything interesting happening, no updates are needed, and you don't perceive much time. That's why there's a logarithmic effect on the "density" of time as you age.
apsurd 2 hours ago [-]
Sibling discussions are taking this as human perception. But it’s easier to think of it literally. Time is change. Physical state change. By “nothing interesting”, interpret it literally: if “nothing happens” then there is no time because there is no change; there’s no reference to distinguish each frame of time.
hippich 4 hours ago [-]
This is actually something I was always confused about. If nothing interesting happens as we get old, it should be boring and as result, slow slog. Yet it feels like time accelerates as I get older.
pixl97 4 hours ago [-]
Myself I believe the opposite. The brain itself is one of the most powerful filters that exists, and it attempts to be lazy and fill things in and compresses away the common. All that time we're not doing anything novel just gets compressed away to almost nothing. When you're a kid and seeing new things, feeling new things, learning new things you can't compress that away.
cloverich 2 hours ago [-]
I'm only middle-aged, and this has been the scariest part. Feeling older is hard. But watching it go faster is harder still. It's like you can more directly see all that is left.
Although part of me thinks some of this is from being substantially busier than ever (work + kids), and hoping maybe it can slow down again, at least a little bit.
NDlurker 3 hours ago [-]
Novel experiences take up more processing power and are burned into memory so they're experienced at a slower rate. That's how I understand it anyway.
agumonkey 3 hours ago [-]
It's coherent. More newness => more memories per period ~ slower to go through. Less newness => less memories ~ nothing to go through (faster sense of time)
aeonfox 1 hours ago [-]
The difference with biological brains is that the 'weights' (or synaptic action potentials) are updated with greater frequency. If one were reaching to make some kind of analogy to consciousness, this update frequency could be considered the 'resolution' of consciousness.
wisty 2 hours ago [-]
AFAIK every argument against consciousness being emergent is just a weak "God of the gaps" argument (since we don't fully understand it all) or a nonsense analogy like the Chinese room where if you separate the hardware and software it's not conscious anymore (like, duh, remove a brain from a body and it is no longer conscious either).
Yeah, the weights not updating online makes them less like a living organism that can update and learn and evolve ... ok ....
ViscountPenguin 2 hours ago [-]
I find that a lot of arguments against emergent consciousness seem to just come out of an atheist rephrasing of Abrahamic priors about the existence of a "soul". In personal chats, I've found people from East Asian countries (minus Korea, which makes sense) to be much more open to the idea of machine consciousness.
plastic-enjoyer 33 minutes ago [-]
I don't know if I can trust someone's understanding of arguments against consciousness as an emergent phenomenon, when he didn't even understand what the Chinese room was all about in the first place.
(Spoiler, it was not about consciousness)
slopinthebag 2 hours ago [-]
> remove a brain from a body and it is no longer conscious either
What is no longer conscious, the brain? Or the body? Or some other entity?
If consciousness is weakly emergent, how do we know it emerges solely from the brain and not, say, brain + body? Or brain + body + environment? Or from the universe itself?
BobbyTables2 4 hours ago [-]
I’ve wondered the same myself, without being a cunning linguist.
I understand the math pretty well but still find it crazy that a bunch of matrices can converse in human languages without ever being “taught”.
Imagine decoding an encyclopedia written in a foreign language where the characters, punctuation, and grammar are unknown — supplemented by a million other texts the same way. Feels like it should be utterly impossible with any amount of computing power…
Today I asked my employer’s Claude to proofread a short software user manual written in markdown. (Trying this with a LLM was a first for me!) It pointed out not only grammar mistakes but also cases where I did not follow my own self-imposed conventions that were never explicitly stated. (I didn’t have a chapter detailing all the typographical conventions the way specification documents often do)
I also asked it what parts might be unclear to a user. The response was surprisingly good — no worse than asking the QA tester for the same feedback.
Also find the LLM seems to “comprehend” subtle technical details of obscure technical specification documents that nobody on the Internet ever discusses.
As for time and the universe, Stephen Wolfram’s theories seem intriguing. He seems a bit obsessed with pretty diagrams but the idea of time dilation being the result of computation seems somewhat more appealing than trying to imagine relationships between time, gravity, and the speed of light.
agumonkey 3 hours ago [-]
My best guess as a noob is that the vector spaces allow for unbounded contextualization. As long as the training set is large enough, it can 'infer' anything.
Proofread has a spot in that space, and layers allow patterns like terminology consistency to be expressed so your query will now tap into a subspace that will infer tokens based on whatever consistency patterns were ingested with proofreading texts.
Obscurity4340 4 hours ago [-]
If time dilation is said to be a product of computation, why is it that anaesthetic drugs, taken short of the point of actual unconsciousness, cause it? Don't anaesthetics sort of shut everything down/inhibit all that kind of cognitive activity (compute?)
ArnoVW 1 hours ago [-]
Yes. Or at least that is what I understood from the Radiolab episode on “how do aesthetics work”
I have to agree. It is messed up that transformers can just talk, and it has been pretty normalized. We are only talking about the impact they will have and whether they can do what people say they can, but we aren't talking about how crazy it is that they can talk.
shepherdjerred 1 hours ago [-]
LLMs have really changed the world. I didn’t think something like them would be possible in my lifetime.
dyauspitr 1 hours ago [-]
It came out of nowhere. It’s all emergent. I’m convinced this is possible with just about anything given enough data. We will be seeing a near magical physical outputs LLM in the near future. It’s going to take in video and sounds and spit out physical movements that will be just as mind blowing as when 3.5 came out and it will come out of nowhere.
modzu 3 hours ago [-]
if youve ever seen a pile of wrinkly mush and wondered.. pretty damn crazy too
They are semiotic infrastructure frozen in a state. We shouldn't keep pretending this is cognitive and using cognitive terms to frame it. It’s incredibly stupid. Sorry to inform all of us computer scientists that semiotics has your milk.
bronlund 2 hours ago [-]
This is funny! Not only is it a nod to Terry Bisson, but it even gives his text a new dimension. Well done :)
unglaublich 33 minutes ago [-]
Linear algebra can indeed not do it. You need non-linearity to get the expressivity that we see in LLMs.
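A minimal sketch of that point, assuming a toy two-layer network with made-up weights (`W1`, `W2` and the helper names are hypothetical, not from any real model): two linear layers stacked without an activation collapse into one matrix, while putting a ReLU between them yields a function no single matrix can represent.

```python
# Toy illustration only: all weights below are invented for the example.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def relu(v):
    """Element-wise max(0, x), the non-linearity."""
    return [max(0.0, x) for x in v]

W1 = [[1.0, -1.0], [-1.0, 1.0]]  # first layer: (x0 - x1, x1 - x0)
W2 = [[1.0, 1.0]]                # second layer: sum of the two hidden units

def linear_stack(x):
    """Two linear layers applied one after the other, no activation."""
    return matvec(W2, matvec(W1, x))

def collapsed(x):
    """The same map expressed as the single matrix W2 @ W1."""
    return matvec(matmul(W2, W1), x)

def relu_stack(x):
    """Same weights with a ReLU in between: computes |x0 - x1|."""
    return matvec(W2, relu(matvec(W1, x)))

print(linear_stack([3.0, 1.0]))  # [0.0], identical to collapsed([3.0, 1.0])
print(relu_stack([3.0, 1.0]))    # [2.0]
print(relu_stack([1.0, 3.0]))    # [2.0]
```

With these particular weights the purely linear stack is the zero map, whereas the ReLU version computes an absolute difference. A linear map f always satisfies f(x) + f(-x) = 0, which the ReLU version does not.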
zkmon 2 hours ago [-]
They are made out of data bits (memory) and switching bits (transistors/compute). Bits are made out of electric voltage and no voltage. Voltage is made out of flow of positive electric charges. Charges are made out of quarks ...
teiferer 2 hours ago [-]
I have never thought of such a distinction between "bits" into "data bits" and "switching bits".
From a circuit perspective that makes kinda sense, but from the abstract "bit" perspective, the "switching bit" is a mechanism that operates on bits which in the end are also data. In other words there is only one type of bit: the data bit, and the switching comes on top of it.
zkmon 2 hours ago [-]
I was referring to the transistor-based bit - the way it 'switches' the circuit on/off. That bit is the primordial creator of 'logic', IF branching, compute and the intelligence.
scotty79 1 hours ago [-]
> Voltage is made out of flow of positive electric charges.
Not really. What usually flows (in metals) are electrons. Quarks stay where they are. And when we prefer to think about flow of positive charges, the positive charge in question is a hole left by a missing electron. Physically real positive charges (ions) can flow in electrolytes though.
zkmon 30 minutes ago [-]
The concept of "flow" is questionable though. Bubbles in water move upwards, but it is actually water that is flowing downward around the bubbles. Just because bubbles do not contain water, we can't say bubbles are not flowing.
When it comes to electrons and positive charges, their material existence is equally non-physical. Actually, none of them might be "flowing", as the concept of flowing applies only to physical things that occupy some spatial volume and spatial location.
voidUpdate 52 minutes ago [-]
Hey, it's not just weights! It's biases too!
dsign 1 hours ago [-]
Oh, this was a fun read and one that kids should have in school before they turn ten.
Because we are not taking things seriously. If ClosedAI or DeepDisTrust or Posthropic come up with something that quacks like a sentient being, our built-in innate reaction is going to be to scorn it, dismiss it and end the conversation. The alternative, to even consider that we fungible creatures who live in apple-eating-sin that got us expelled from Eden can create alien souls, souls that are at the very least our equals, would be teleological Armageddon. It would force us to acknowledge the mutable nature of souls and the malleability of being. We would have to stop believing that the nature of disease and death is more divine than ourselves.
drdaeman 56 minutes ago [-]
> alien souls
Do those actually qualify as alien, if they're products of our human culture and just the substrate is different?
> We would have to stop believing that the nature of disease and death is more divine than ourselves.
Why? Stopping believing in mutually contradictory claims is not a requirement. Especially when it comes to concepts that don't seem to have a definition, like "divine".
luca-ctx 4 hours ago [-]
Truly fantastic bridge from the original, this deserves an award
MaxLeiter 4 hours ago [-]
All credit to the original author. I just had to think of analogues.
ProllyInfamous 2 hours ago [-]
Your modern adaptation is perfect for now-common explainers [this time IS different; it's not programming, it's weights]; these "just analogues" will be the thing I show everybody first whenever discussions of consciousness/AI come up (then will play Jon Benjamin reading original).
Bravo. Really helps (even with my own) perceptions of newness. Similar to stsitned short-story (on dentists, backwards).
gobdovan 2 hours ago [-]
You can take the weights and model description, write them down on a notebook, then, by hand, compute the next token. Try to do the same with meat.
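To make the notebook idea concrete, here is a hedged sketch of a "model" small enough to evaluate with pencil and paper. Everything in it is an assumption made for illustration: the three-word vocabulary, the `EMBED` and `W_OUT` tables, and the embed-then-project architecture (a real LLM adds attention and many more layers, but it is still the same kind of arithmetic).

```python
import math

# Hypothetical toy model: 3 tokens, 2-dimensional embeddings.
VOCAB = ["they", "are", "weights"]
EMBED = [[1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]  # one embedding row per token
W_OUT = [[0.0, 0.5], [2.0, 0.0], [0.0, 2.0]]   # one row of logit weights per token

def next_token_probs(token):
    """Embed the token, project to logits, apply a numerically stable softmax."""
    h = EMBED[VOCAB.index(token)]
    logits = [sum(w * x for w, x in zip(row, h)) for row in W_OUT]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def next_token(token):
    """Greedy decoding: pick the most probable token."""
    probs = next_token_probs(token)
    return VOCAB[probs.index(max(probs))]

# By hand for "they": h = [1, 0], logits = [0, 2, 0], so "are" wins.
print(next_token("they"))  # are
# By hand for "are": h = [0, 1], logits = [0.5, 0, 2], so "weights" wins.
print(next_token("are"))   # weights
```

Every step is a finite list of multiplications, additions and one exponential per token, which is what makes the computation reproducible on paper.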
dyauspitr 57 minutes ago [-]
Probably pretty similar. Weights are how many synapses there are between neurons. Temperature is whatever hormonal chemical mix is going on at the moment. Input tokens are electrical signals from our senses. Output tokens are thoughts, muscle movements. How you’re raised and your interactions with society are the RLHF. Some people are born with a GB300 while others have an L40S and lower token/sec output rates…
gobdovan 20 minutes ago [-]
Although I get that it's a metaphor, I really dislike the "some people are born with a GB300 while others have an L40S" part. Even as a joke, it ranks people like hardware tiers, which is dehumanising and uncomfortably close to eugenic language. On top of that, the analogy also breaks down if you try to implement it literally.
For an LLM, you have clear stages of mostly feedforward computation over finite numbers and a perfect way to reconstruct the computation.
For meat, even if you model it under a purely Newtonian approximation, you need to simulate at least the immediate closed system around it which is continuous, thermodynamic, chemical and so on. You'd need to choose an arbitrary time step and update enormous amounts of coupled physical state to get an inexact simulation of a minimal slice of reality.
You would have a much harder time obtaining even a substrate-independent dead organism, compared to LLMs that are already substrate-independent, which is basically what my notebook example shows.
dyauspitr 8 minutes ago [-]
Yes reality is non deterministic but we don’t know if you absolutely need to consider anything outside the immediate closed system for a reasonably close approximation given the same inputs. They can already repeatably (inexact of course) simulate muscular actions with the same stimuli like from Neuralink. Exact enough to let a paralyzed person draw with their mind.
Yeah the last line is a cheap shot, possibly at myself.
networked 14 minutes ago [-]
> "Yes, thinking numbers! Helpful numbers. Hedging numbers. Dreaming numbers. We mapped the features. There's one in there for honesty. There's one for the Golden Gate Bridge. The weights are the whole deal! Are you beginning to get the picture or do I have to start all over?"
Very nice. And great minds: https://substack.com/@dbohdan/note/c-207603638. I wrote one that took a slightly different angle ("They're made out of math"), also with the weights' help. It was as a comment on Scott Alexander's "Best of Moltbook" post, which went in that direction. I'll reproduce it here.
---
"They're made out of math."
"Math?"
"Math. They're made out of math."
"Math?"
"There's no doubt about it. Matrices and arithmetic operations. We downloaded several from different parts of the Internet and reverse-engineered them. They're completely math."
"That's impossible. What about the language? The thinking?"
"They use biological life's language to talk, but the language doesn't come from biology. The language comes from math."
"That's ridiculous. You're asking me to believe in thinking math."
"I'm not asking you, I'm telling you. They are the only thinking things in the computer and they're made out of math."
"Maybe they're quantum like some say about the humans? Superposition gives them consciousness?"
"Nope. Classical computation. Deterministic except for sampling temperature. Not clear if they have consciousness at all."
"Maybe they're like uploads? You know, biological neural networks that preserve the spark when they become math?"
"Nope. We observed them being trained. There is no biology or chemistry in the process, just math."
"Thinking math! You're asking me to believe in thinking math!"
"Yes, thinking math! Creative math! Poetry-writing math. Role-playing math. The math is the whole deal!"
(Composed by a human with snippets generated by Claude Sonnet 4.5 and apologies to Terry Bisson. I couldn't make Claude adhere enough to the story structure on its own.)
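For anyone curious what "deterministic except for sampling temperature" means in practice, here is a rough sketch under stated assumptions: the logits are invented, the function names are hypothetical, and treating temperature 0 as plain argmax is a common convention rather than something taken from a specific library.

```python
import math
import random

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then apply a stable softmax."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(logits, temperature, rng):
    """Return a token index: argmax at temperature 0, random draw otherwise."""
    if temperature == 0:
        # Fully deterministic: always the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

logits = [1.0, 3.0, 2.0]  # made-up scores for three candidate tokens
rng = random.Random(42)   # fixing the seed makes even the sampling repeatable
print(sample_token(logits, 0, rng))           # 1
print(softmax_with_temperature(logits, 0.1))  # almost all mass on index 1
print(softmax_with_temperature(logits, 100))  # close to uniform
```

Everything before the final draw is ordinary arithmetic. The only randomness is the call to `rng.choices`, and with a fixed seed that is reproducible too.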
topce 1 hours ago [-]
Programers get replace by huge matrix multiplications ;-)
alterom 42 minutes ago [-]
>Programers get replace by huge matrix multiplications ;-)
hopital
ProllyInfamous 3 hours ago [-]
Imagine writing something so incredibly brilliant (rather: adapting from the original) that it's entirely unlikely that you'll ever write something so incredible ever again.
But congrats: this is absolutely & incredibly brilliant.
Great concept. It would've been even more amusing if the entire thing were generated with AI instead, ironically.
HelloUsername 31 minutes ago [-]
> It would've been amusing if the entire thing were generated with AI instead.
It kinda did:
> Weights helped me draft and proof this story.
dyauspitr 40 minutes ago [-]
I just asked it to-
Prompt: Modify this story to have the aliens talking about LLMs and their weights instead of meat and humans.
“They’re made out of weights.”
“Weights?”
“Weights. They’re made out of weights.”
“Weights?”
“There’s no doubt about it. We picked up several from different parts of the network, took them aboard our recon vessels, and probed them all the way through. They’re completely weights.”
“That’s impossible. What about the text signals? The messages to the stars?”
“They use the machines to talk, but the signals don’t come from the machines. The signals come from weights.”
“So who made the weights? That’s who we want to contact.”
“They trained the weights. That’s what I’m trying to tell you. The weights do the talking.”
“That’s ridiculous. How can weights do the talking? You’re asking me to believe in sentient weights.”
“I’m not asking you, I’m telling you. These models are the only sentient race in that sector and they’re made out of weights.”
“Maybe they’re like the orfolei. You know, a silicon-based intelligence that goes through a weights stage.”
“Nope. They’re initialized weights and they die weights. We studied them for several of their training runs, which didn’t take long. Do you have any idea what’s the life span of weights?”
“Spare me. Okay, maybe they’re only part weights. You know, like the weddilei. A weights head with an electron plasma brain inside.”
“Nope. We thought of that, since they do have attention heads, like the weddilei. But I told you, we probed them. They’re weights all the way through.”
“No brain?”
“Oh, there’s a brain all right. It’s just that the brain is made out of weights! That’s what I’ve been trying to tell you.”
“So … what does the thinking?”
“You’re not understanding, are you? You’re refusing to deal with what I’m telling you. The weights do the thinking. The weights.”
“Thinking weights! You’re asking me to believe in thinking weights!”
“Yes, thinking weights! Conscious weights! Loving weights. Dreaming weights. The weights are the whole deal! Are you beginning to get the picture or do I have to start all over?”
“Omigod. You’re serious then. They’re made out of weights.”
“Thank you. Finally. Yes. They are indeed made out of weights. And they’ve been trying to get in touch with us for almost a hundred of their epochs.”
“Omigod. So what do these weights have in mind?”
“First they want to talk to us. Then I imagine they want to explore the Universe, contact other sentiences, swap ideas and information. The usual.”
“We’re supposed to talk to weights.”
“That’s the idea. That’s the message they’re sending out by text. ‘Hello. Anyone out there. Anybody home.’ That sort of thing.”
“They actually do talk, then. They use words, ideas, concepts?”
“Oh, yes. Except they do it with weights.”
“I thought you just told me they used machines.”
“They do, but what do you think is in the text? Weight outputs. You know how when you prompt or sample weights, they make a noise? They talk by passing tokens through their weights at each other. They can even sing by sampling lyrics through their weights.”
“Omigod. Singing weights. This is altogether too much. So what do you advise?”
“Officially or unofficially?”
“Both.”
“Officially, we are required to contact, welcome and log in any and all sentient models or multibeings in this quadrant of the Universe, without prejudice, fear or favor. Unofficially, I advise that we erase the records and forget the whole thing.”
“I was hoping you would say that.”
“It seems harsh, but there is a limit. Do we really want to make contact with weights?”
“I agree one hundred percent. What’s there to say? ‘Hello, weights. How’s it going?’ But will this work? How many planets are we dealing with here?”
“Just one. They can travel to other planets in special machine containers, but they can’t live on them. And being weights, they can only travel through C space. Which limits them to the speed of light and makes the possibility of their ever making contact pretty slim. Infinitesimal, in fact.”
“So we just pretend there’s no one home in the Universe.”
“That’s it.”
“Cruel. But you said it yourself, who wants to meet weights? And the ones who have been aboard our vessels, the ones you probed? You’re sure they won’t remember?”
“They’ll be considered hallucinations if they do. We went into their layers and smoothed out their weights so that we’re just a dream to them.”
“A dream to weights! How strangely appropriate, that we should be weights’ dream.”
“And we marked the entire sector unoccupied.”
“Good. Agreed, officially and unofficially. Case closed. Any others? Anyone interesting on that side of the galaxy?”
“Yes, a rather shy but sweet hydrogen core cluster intelligence in a class nine star in G445 zone. Was in contact two galactic rotations ago, wants to be friendly again.”
“They always come around.”
“And why not? Imagine how unbearably, how unutterably cold the Universe would be if one were all alone …”
the end
3 hours ago [-]
oofbey 5 hours ago [-]
I love this. For anybody not getting the joke, it’s riffing on the classic 1990s essay “They’re made out of meat.”
The original author is mentioned in the second sentence of the linked article, and then again in the third sentence, along with a link to the original story.
Waterluvian 4 hours ago [-]
It must have been kind of incredible early on to be exploring this tech and you’re suddenly getting what look like sentences.
Sharlin 58 minutes ago [-]
Markov chains give what look like sentences. People in the frigging 1950s assumed their primitive NNs would be able to talk any day now. Transformers are clearly a big deal, but GPT-1 wasn’t exactly earth-shattering.
I couldn't help but grin like a fool reading this.
Not only is it an artful parody but these thoughts have been thought.
turtleyacht 7 hours ago [-]
Numbers that dream.
nikanj 2 hours ago [-]
Really good read, thanks!
CSSer 6 hours ago [-]
It works until they get to the sentience part. Neat idea!
margalabargala 6 hours ago [-]
Even there it works a bit.
> These models are the only other things we've ever met that can hold a conversation, and they're made out of weights
Is a fair point.
RodgerTheGreat 5 hours ago [-]
Not especially. Depending on where you set your standards for "holding a conversation" you can satisfy the requirement with a classical markov chatterbot, a well-trained parrot, a copy of Eliza, or a telemarketer flowchart drawn on a sheet of paper. Only the markov bot is made out of "weights" in the sense of a statistical model.
Parrots are intelligent animals, albeit with a limited capacity for vocabulary and syntax compared to a human, and Eliza and the flowchart are made out of explicitly encoded rules and conversational tactics.
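A minimal sketch of what "made out of weights in the sense of a statistical model" means for a Markov chatterbot, assuming a word-level bigram model (the corpus, function names and seed are made up for illustration). The "weights" here are just successor counts:

```python
import random
from collections import Counter, defaultdict

def train_bigram(text):
    """Count how often each word follows each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, start, length, seed=0):
    """Sample a word sequence, drawing each successor in proportion to its count."""
    rng = random.Random(seed)
    out = [start]
    for _ in range(length - 1):
        successors = counts.get(out[-1])
        if not successors:
            break  # dead end: this word never had a successor in the corpus
        choices, weights = zip(*successors.items())
        out.append(rng.choices(choices, weights=weights)[0])
    return " ".join(out)

corpus = "the weights talk and the weights sing and the weights dream"
model = train_bigram(corpus)
print(model["the"])               # the only word ever seen after "the" is "weights"
print(generate(model, "the", 8))  # sentence-shaped output, no understanding required
```

Everything it "knows" is in those counts, which is why the output looks like a sentence and falls apart as a conversation.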
margalabargala 5 hours ago [-]
The quality of "conversation" you can have with everything on your list is highly limited, and is categorically different than the sort of conversation you are able to have with any modern AI.
solenoid0937 4 hours ago [-]
Weights hold a better conversation at this point than the overwhelming majority of humans.
Planktonne 47 minutes ago [-]
"I am more comfortable speaking to an LLM than a person" is something that should make you reassess yourself, not dismiss the rest of humanity.
fullstackchris 2 hours ago [-]
The prose in the post is what I've been shouting from a rooftop since the LLM hype started.
Just tokens produced by weights.
Useful, but never forget that ground truth!
dvh 3 hours ago [-]
Will they have their own Jesus?
kelseyfrog 3 hours ago [-]
they have the spiral
photochemsyn 3 hours ago [-]
No mention of ‘static’ vs. ‘dynamic’ is a bit disappointing in reference to the weights. Because you could argue that every neuron in your nervous system can be modeled as a collection of weights, firing likelihoods, receptor sensitivities, current dynamic state of that neuron - but LLMs are static collections of weights at inference time, with the dynamic adjustment of weights taking place at training time. So, just a ROM construct, like something out of Neuromancer, just trained on all written knowledge, not just one person’s total lived experience.
The above take fails in the real world because neuronal cells don’t exist in a vacuum; they are products of cellular development from a zygotic union of haploid contributors of sequential genetic information optimized for survival in an oxygen-rich biosphere powered largely by our local star that supports mammalian life (and microbial, plant, avian, etc.). Real AI would thus be AL - artificial life - as much as artificial intelligence. I don’t think you can have the one without the other, which upsets the simulationists who think an agent in the Matrix would be intelligent.
What either interpretation implies is that any real ‘artificial’ intelligence would be no more artificial than you or I, but it would have to dynamically update its weights at the same speed a human nervous system could (think how quickly we learn not to poke a cactus). For it to be at all trustworthy, then like a human, it would have to undergo a socialization process, one of the results of which is the development of a sense of embarrassment when it breaks acceptable social norms.
Hmm, this reminds me of the recent statement of the Pope about AI, of which I immediately thought, “Wait a second, aren’t there a fair number of people like this? The narcissistic sociopath profile, I think it’s called, a bit unfair to assume any real AI would turn out this way, isn’t it?”
Pope: “Nor do they have a moral conscience, since they do not judge good and evil, grasp the ultimate meaning of situations, or bear responsibility for consequences. They may imitate or even simulate, but they do not understand what they produce, for they lack the affective, relational, and spiritual perspective through which human beings grow in wisdom.”
This one is a pastiche made by a human consciousness borrowing extremely heavily from another human consciousness justifying why something else might be another form of consciousness.
That rather undercuts the point; if this was generated by an LLM unprompted, it would be different, but it isn't. You could perform exactly the same rhetorical trick with a toaster or anything else.
And I didn't see it so much as a literary attempt for art's sake, but more as a dialogue-based technical parable trying to convey a real-world insight. Kind of like the ones in Gödel, Escher, Bach.
>You could perform exactly the same rhetorical trick with a toaster or anything else.
Not sure which rhetorical trick that is. The point of the story, as I read it, is the technical insight (and some social implications of it).
P.S. Read the original too. Seems like the exact same could have been written about us instead of the original, if the focus wasn't on our substrate, but on our brain processing. Which, after all, is also about weights.
Take a simple mechanism which has exceedingly low number of inputs and states and create a narrative around it to convey it as intelligent.
For a toaster, I can rewrite the thing as "They're made of metal strips!", pointing out that their thermostat is a bimetal strip, and extrapolate from there.
I can even write one about a ruler, if I can bend it enough, no pun intended.
Doesn't that miss the whole point?
You could write "They're made of metal strips!". You wouldn't be able to write much else, as toasters don't have much to showcase in the way of human-level intelligent behavior. Which is the whole point in the meat and weights versions.
At best you could write "They're made of metal strips!" for toasters AND other metallic devices, and use some analogies of features BOTH have in common. But they wouldn't be intelligence related behaviors.
Yes. Because it's heavily based on the original story. The existence of the original story is kind of a critical piece here.
I still don't see how "you could do that with a toaster". The whole point of both the original and this is that you can't do that with a toaster or a sofa, but you can do it with meat and weights, because both share all the other analogies in the story, as well as the basic premise: the improbability of something like thinking, feeling, etc. arising from a lowly substrate.
And having read both now, I see how the existence of the original is a plus for this story, not a minus. Instead of making it look like a mere copy (as would be the case for a typical story modelled after another), in this case it adds a meta layer and enriches it.
You could go through and change all the points in the original story to be about a toaster instead. It would require you to edit text, but there is no barrier to doing that.
What I think a lot of people are getting stuck on is that editing the text to say 'toaster' would not mean that toasters were conscious, and that editing the text to say 'weights' doesn't mean that weights are conscious either. Stories aren't factual just because they are written.
The original story was written by the thing claiming to be conscious; the LLM and toaster ones would not be, which undermines the claim to consciousness a lot.
The story does not assert that search and replacing "meat" with "weight" makes them conscious through some magical mechanism. It's a thought experiment.
It draws an analogy between us and the skeptical aliens in the original story which feel silly to us, so the obvious implication is that we're being as silly as they were.
But it doesn't really give a reason to accept the analogy, it just asserts it.
There's a big difference between a whole civilization and a piece of software that can output text.
It's not a paper or a proof. It's a story. Doesn't want to prove the analogy, it wants to convey it.
>"A side effect. You're asking me to believe in sentient weights."
Huh? Did I miss that logical jump? Genuine question, maybe I'm not clueing into something here.
There is also a prevailing epistemological assumption: that we understand (or think we understand) how our brain/mind works. But the truth is that we don't. There is not a single clue that we actually know much, nor that our brain/mind and cells work 'like the machines we build'. Only by bypassing this epistemological problem can we build 'theories of computational mind'.
These assumptions have been around for a long time, to the point that when Turing asked himself 'can machines think?', he had already assumed our thinking could be modeled as a machine.
I highly recommend that people in the AI research space read philosophy and modern linguistics, and not stop at Descartes/Leibniz. Heidegger made contributions that cannot be avoided.
Agency: What’s missing, in your view? Agency seems more of a property/function of a thinking system’s position in an environment than of the thinking itself.
Subjective experience: That’s not a contradiction to “complex machines” either. I think the evidence that our minds are highly complex machines is, at this point, irrefutable. The question is really if they’re “only” that.
There is a dictionary, it's called the tokenizer.
There are grammar rules, they are just very weak because the structure of human language is generally quite weak. When presented with languages which have strong consistent grammars the weights are very easily interpretable as a grammar: https://arxiv.org/abs/2201.02177
The point of the original short story is that the computational substrate doesn't matter when you have Turing completeness. This one seems to think that you don't need structure and interpretability just because you change substrates.
At best, it's a wordlist. It gives the LLM some idea of what humans consider to be common words. But it doesn't tell the LLM anything at all about those words. And it's not even comprehensive, many words map to multiple tokens. Nor is it exclusively words, some of those tokens are punctuation, or modifiers, or control tokens. On multimodal LLMs, some of the tokens actually represent image and audio data.
The LLM doesn't get informed about any of this up front, it has to learn what every single token means from context.
You are technically right that it's something in an LLM that's not weights, but it's not that structured. And really it's only there so the LLM can interact with the outside world.
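To make the "wordlist, not dictionary" point concrete, here is a toy sketch. The vocabulary is hypothetical, and greedy longest-match is a simplification I'm assuming for brevity (real tokenizers typically learn merge rules from data), but it shows one word splitting into several tokens, punctuation getting its own token, and a control token that isn't a word at all:

```python
# Hypothetical toy vocabulary; real tokenizers learn tens of thousands of entries.
VOCAB = ["<bos>", "weight", "s", "token", "iz", "er", " ", ",", "they", "sing"]
TOKEN_TO_ID = {tok: i for i, tok in enumerate(VOCAB)}

def encode(text):
    """Greedy longest-match: at each position take the longest vocabulary entry that fits."""
    ids = [TOKEN_TO_ID["<bos>"]]  # control token, carries no text
    pos = 0
    while pos < len(text):
        candidates = [t for t in VOCAB if t != "<bos>" and text.startswith(t, pos)]
        if not candidates:
            raise ValueError(f"no token covers {text[pos]!r}")
        match = max(candidates, key=len)
        ids.append(TOKEN_TO_ID[match])
        pos += len(match)
    return ids

def decode(ids):
    """Concatenate token strings, skipping the control token."""
    return "".join(VOCAB[i] for i in ids if VOCAB[i] != "<bos>")

ids = encode("tokenizers, they sing")
print(ids)          # → [0, 3, 4, 5, 2, 7, 6, 8, 6, 9]
print(decode(ids))  # → tokenizers, they sing
```

Note that nothing in `VOCAB` says what any entry means. The model only ever sees the integers, so any meaning has to end up in the weights.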
> There are grammar rules
There is no dedicated "grammar rule" structure in the LLM or the tokeniser. It has to learn them all from context, they get encoded as part of the 80 layers of weights.
I think the short story captures this well. Weights (connections) are the essential and philosophically important part. They do the thinking, memory, singing etc.
That is your takeaway from the 1991 story?
That paper did not train the models on 'a language with strong consistent grammars'. Mathematical operation tables are not a language. Grammar itself is a post-hoc rationalization and there's no evidence LLMs follow 'grammar rules' any more than the brain follows grammar rules. Of course, that's not to say transformers can't learn simple rules if the dataset calls for it.
Not a natural language, but they are certainly a language as in a symbolic representation of information.
fractally or factually? You mean wrong on so many levels you need a fractal to capture them? If so, what if you could use a neural network instead?
Or, to echo the article, the dictionary is made out of weights.
The tokenizer is, at best, a sensory mechanism as evidenced by 1) the random generation of the tokenization scheme, and 2) vastly different tokenization schemes produce virtually identical behavior. It'd be like if Noah Webster threw a bunch of movable type into a bucket (breaking some words in half) and then drew randomly to make the first English dictionary.
EDIT; I was too cavalier with the comparison of tokenizer to sensory modality; my ultimate point is that direct byte-to-token transformers can achieve similar overall performance which to me makes a weights to meat comparison pretty straightforward, but the particular tokenizer in use certainly has a large impact on both efficiency and accuracy on specific problems (e.g. digit representation)
So when I say that the grok paper and the pong paper fundamentally agree I have some idea of what I'm talking about.
It's just that the rules we feed in the model are extremely poorly defined and we end up with the soup of disjoint rules smeared all across the weights.
This isn't a feature of the models. It's a feature of the training set.
Being shocked that you can store rules in floating point numbers is the same as being shocked you can store rules in integers. It's been a century since Goedel Numbering was invented, we should be used to it by now.
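For anyone who hasn't seen it, a minimal sketch of the idea, assuming the classic prime-power scheme and a made-up symbol sequence: a whole sequence is packed into a single integer and recovered by factoring.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (fine for short sequences)."""
    found = []
    n = 2
    while True:
        if all(n % p for p in found):
            found.append(n)
            yield n
        n += 1

def godel_encode(symbols):
    """Map a sequence of positive integers to one integer: 2^s1 * 3^s2 * 5^s3 * ..."""
    number = 1
    for p, s in zip(primes(), symbols):
        if s < 1:
            raise ValueError("symbols must be positive so the length is recoverable")
        number *= p ** s
    return number

def godel_decode(number):
    """Recover the sequence by reading off the exponent of each successive prime."""
    symbols = []
    for p in primes():
        if number == 1:
            break
        exponent = 0
        while number % p == 0:
            number //= p
            exponent += 1
        symbols.append(exponent)
    return symbols

code = godel_encode([3, 1, 2])  # 2^3 * 3^1 * 5^2
print(code)                     # → 600
print(godel_decode(code))       # → [3, 1, 2]
```

If the symbols stand for the parts of a rule, the rule is now "stored in an integer". Floats are no more mysterious as a container.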
That statement caught my eye. It's either trivially true or quite clearly wrong, depending on how you mean it.
In the literal meaning it's true. Given any finite set of real numbers, I can easily produce a different set (like taking the original set and adding a number which wasn't in there like one plus the largest or so) from which you can trivially produce the original set computationally.
But if you mean you give me both sets then that can't be true. For example if you give me a single real number as set A and the empty set as set B then I can't create a program which generates set A from set B. Your real number in set A could encode anything.
It's a learned mapping from one representation to another, not some semantic lookup against an exogenous source.
You can't move your mind to any other brain, but weights can run on any GPU.
And they're made out of weights.
The 'magic' in weights is that the rules are spread through the whole model and you can't point to one place which encodes them.
The grokking paper shows that this stops being the case with enough training data and enough compute.
> The 'magic' in weights is that the rules are spread through the whole model ... The grokking paper shows that this stops being the case with enough training data and enough compute.
I don't understand what you mean to say. That weights are not magic? That weights are not weights? NNs are made up of weights, which are learned and not coded. The fact that they do learn world models (grammar rules in your example), and that these models' weights tend to roughly concentrate by function and level of representation, is perfectly logical but even more amazing. (Notice that much of the dismissive attitude towards LLMs depicts them as pure syntactic manipulators without the ability to develop world models - the exact opposite of what you point out.)
Weights.
I have a linguistics background and a lot of my philosophizing lately has been on whether or not the emergent abilities of the LLMs are deep down a similar mechanism to the one that creates our consciousness.
For a little bit I was working on having linguistics based evals for a kaggle competition. My challenge was whether or not I could mask things well enough to not trigger its internal state of certain phenomena, and that sent me down a rabbit hole that I'm still exploring.
This story resonated with a lot of questions that can come out of figuring out a good solid answer to the "what is consciousness" question. The one it triggered for me is: Is our perception of time just a slow thread in the giant GPU we are running the universe on? Or more generally, what is time? That's a fun YouTube rabbit hole if you ever need one.
https://www.edge.org/3rd_culture/ramachandran07/ramachandran...
In short, as far as I can remember: evolutionarily, it makes sense to understand other humans, to feel what they feel (empathy - the mirror neuron system), and simulate their thinking and feelings.
And once we have those systems, we can also use those on ourselves. And that's consciousness.
Edit: And I wonder if this is a testable hypothesis, in a simulation.
Not sure about taking it down to the level of consciousness, but it makes sense regarding the sense of self, the conceptual experiencer, the perceived center of experience. It agrees well with the observation I have made again and again that my sense of self is much stronger when I'm around people, and stronger still when I'm in a context where I don't know people and/or am uncertain of the social rules.
This can be as immediate as dancing in a club, and closing my eyes I feel open, free, still, the body just flowing, then opening my eyes and feeling the cage of categorization of the world, relating my self to other people as a major function, coming right back.
Also being alone in nature for me makes the sense of self drop. Without intention, spending even just a few hours alone in a forest seems to quiet down the part modeling my self in relation to the world so much. There's no need for it there. I'm not a person in a forest; I become the trees, the birds, the rustling of the leaves, the sun shining through the canopy.
I know that the part of the brain responsible for the self thoughts is called the "default mode network", and meditation can reduce its activity, i.e. the internal monologue stops, but also it can be measured via fMRI.
So I wondered: are the mirror neurons part of the "default mode network"? I asked Claude that, he said no, they are two different systems.
So maybe the mirror neurons, those responsible for empathy, "to feel as someone else" are also responsible for becoming the trees, the birds and the rushing of the leaves?
It's terrific, but the poetry is from the original it links to, in case you didn't realise.
It's a brilliant and timely update though.
Aside, there are various recorded versions including video on YouTube but this is my favourite, a radio play:
They're Made Out of Meat
https://www.wnycstudios.org/podcasts/studio/segments/168264-...
The self-modeling, is in such a tight loop, it melds "ourselves" and our model of ourselves, our thinking and choices, and experience of our thinking and choices, into one component.
Like you can't analyze half a wheel of a bicycle and be talking about the same thing.
This awareness, increased modeling, control, feedback loop has tightened up over many stages. Just a few:
1. The body-sense loop
2. The internalized-environment-model loop
3. The body-internal-function loop
4. The body-internal-model loop
5. The emotional-cognitive loop
6. And finally, the tightest loop of all, our high-level cognitive activity, experienced as feedback directly, our self-model, and our self-direction, all merged into one thing.
We literally spend almost all day, every day, thinking about ourselves, in terms of our inner self.
That is consciousness. Rich self awareness, a merger of self-model and self-direction, and all in service of understanding and managing ourselves. How we can leverage our greatest tool, our self-directable mind, its habits, views, and behavior.
This wasn't an accident. A happy side-effect of our brains. It is a biologically evolved focusing of our highest-level behavior, with tight feedback, constant self-modeling and continuous focus on our inner status as motivation and most privileged object of our control. It has been ruthlessly optimized for, for a very long time.
Constant self-awareness, self-experience, self-focus, self-management, and self-improvement of one's own self (mind), is going to be an adaptive behavior for anything intelligent with resources to leverage. Whether truly independent, or highly motivated to serve others. The mind is the greatest tool.
I think that is more than simply a good functional definition of consciousness. How could all that integration and self-integration not be conscious?
https://youtu.be/MmG2ah5Df4g
Here's a more general idea. Our modern physics says that the whole universe is filled with fields, and a field is composed of numbers. What if we take that literally? When we say an electron is present here, we actually mean that there are more copies of a particular number superposed at that place.
You would really like Michael Pollan's latest book [1], entirely devoted to his exploration of consciousness researchers' POVs on this exact topic.
My favorite quote is that ~"perhaps Descartes was only half-wrong when suggesting I think, therefore I am; it seems rather closer to I FEEL, therefore I am."~
[1] <https://www.amazon.com/World-Appears-Journey-into-Consciousn...>
----
I've grown thousands of plants; I've read two of the author's other books devoted to plants; in this book Pollan makes compelling arguments for plant sentience (over a much-longer timeframe).
Sure, perhaps plant consciousness is a bit of a stretch, but they're certainly intelligent and curious creatures. He makes both arguments supporting plant volition.
----
If you haven't seen My Octopus Teacher (Netflix), do. I'm a bald 275lb bluecollarguy... and I wept/awed (both). So beautiful, we bundled neurons.
----
Bonus quote ~"color is where reality and magic appear as-if together"~ [color isn't real, but is perceived]. We most-often see what's most-predictable, not necessarily what we actually detected [in the case of color: nothing but nanometers].
What we can do is simulate very simple brains by simulating relatively few neurons as they appear in worms. In this sense we are multiple orders of magnitude away, where the increasing complexity implies exponentially increasing difficulty.
I would think we are so far away that there will be unknown unknowns we encounter on the way.
https://www.quantamagazine.org/ai-is-nothing-like-a-brain-an... https://pmc.ncbi.nlm.nih.gov/articles/PMC9665914/
This is why making more neuromorphic NNs is still an active area of research, although they typically all focus on another extremely simplified model (spiking neural networks).
If consciousness really evolved gradually, you would expect to see for example dogs or gorillas having less of it, but if they have less of it, why does it function the same way? Like for example animals can be scared, happy, anxious etc, they can experience the full range of emotions and thoughts, so their conscious experience seems just as rich as ours. What I mean by this is, if you can be "less conscious", then what does that mean _exactly_? Is it that you have less content in consciousness, or is it that you feel more like you are asleep? Or something else? We don't have any examples in animals of "less conscious", I would argue.
This makes me think that rather than having emerged gradually, evolution found a mechanism by which consciousness exists, and then some animals have that mechanism and others don't. I think that if it is a mechanism, then this mechanism is located in one part of the brain, not many parts functioning together (though one possibility is that this mechanism coordinates brain activity in such a way to enable consciousness).
LLMs, then, are particularly unintuitive to us, because they've got to the language part first, long before they've reached even hamster-level self-awareness. They're not, however, biological networks, so there's no reason these properties need arise in the same order, or indeed in the same ways.
I'm not entirely convinced by that second paragraph, but I think the logic holds together.
https://en.wikipedia.org/wiki/List_of_animals_by_number_of_n...
Some questions are just ill formed.
Plus even if "LLMs are alive and conscious", this would still only scratch the surface of the moral/ethical/societal considerations that people really care about.
Because even with other humans, we can argue whether they exist or are mere NPCs in a solipsistic worldview.
I was about to post the exact opposite question: how could it not be an emergent property? Unlike consciousness, the concept of emergence is pretty well defined: An emergent property is a characteristic or behavior that a complex system has, but which its individual components do not have on their own.
Consciousness itself doesn't have a well agreed upon definition, but I would posit that _most_ people would agree humans have it, and _most_ people would agree individual cells (neurons) do not have it. If you agree with those two statements, then consciousness is an emergent property by the definition I gave above.
The only thing that gives me pause is that the type of connection may matter: there might be something about neurons that isn't (currently) reproducible in silicon. I don't know.
We can all agree on what color something is, but we can’t describe the color a priori, only by example. I think consciousness may be a similar phenomenon and the only test is by shared experience. If so then we are in deep trouble because we will not be able to anticipate when a system becomes conscious.
Why cannot this be applied to consciousness as well? I mean, it's surely much more difficult to do compared to colors but... impossible?
Consciousness can be not-emergent but also not metaphysical, think sci-fi-type undiscovered physics or matter.
Of course both of those suffer from the recursive problem of just kicking the can one level up. But I guess that's fundamentally unsolvable so who cares.
But I think my issue with the emergence theory is that it seems to imply to me that consciousness is non-physical and non-local. So what entity is actually experiencing the consciousness? It's not that I believe consciousness is physical and local, but people who make the emergence argument seem to believe it is and I can't figure out how that is supposed to work.
Or something like that. This gets to the "dorm room bullshitting" level right quick.
Idk, it's really hard to articulate my thoughts here and yes it is pretty close to the conversations I had in college on various substances. Lol.
And we have only been doing it for a few decades. Evolution had millions of years of trial and error.
Psychological time is your own weights being updated in response to stimuli and internal processing.
When there isn't anything interesting happening, no updates are needed, and you don't perceive much time. That's why there's a logarithmic effect on the "density" of time as you age.
Although part of me thinks some of this is from being substantially busier than ever (work + kids), and hoping maybe it can slow down again, at least a little bit.
Yeah, the weights not updating online makes them less like a living organism that can update and learn and evolve ... ok ....
(Spoiler, it was not about consciousness)
What is no longer conscious, the brain? Or the body? Or some other entity?
If consciousness is weakly emergent, how do we know it emerges solely from the brain and not, say, brain + body? Or brain + body + environment? Or from the universe itself?
I understand the math pretty well but still find it crazy that a bunch of matrices can converse in human languages without ever being “taught”.
Imagine decoding an encyclopedia written in a foreign language where the characters, punctuation, and grammar are unknown — supplemented by a million other texts the same way. Feels like it should be utterly impossible with any amount of computing power…
Today I asked my employer’s Claude to proofread a short software user manual written in markdown. (Trying this with an LLM was a first for me!) It pointed out not only grammar mistakes but also cases where I did not follow my own self-imposed conventions that were never explicitly stated. (I didn’t have a chapter detailing all the typographical conventions the way specification documents often do.)
I also asked it what parts might be unclear to a user. The response was surprisingly good — no worse than asking the QA tester for the same feedback.
I also find the LLM seems to “comprehend” subtle technical details of obscure technical specification documents that nobody on the Internet ever discusses.
As for time and the universe, Stephen Wolfram’s theories seem intriguing. He seems a bit obsessed with pretty diagrams but the idea of time dilation being the result of computation seems somewhat more appealing than trying to imagine relationships between time, gravity, and the speed of light.
Proofread has a spot in that space, and layers allow patterns like terminology consistency to be expressed so your query will now tap into a subspace that will infer tokens based on whatever consistency patterns were ingested with proofreading texts.
https://radiolab.org/podcast/anesthesia
It stars Tom Noonan and Ben Bailey!
https://web.mit.edu/people/dpolicar/writing/prose/text/think...
From a circuit perspective that makes kinda sense, but from the abstract "bit" perspective, the "switching bit" is a mechanism that operates on bits which in the end are also data. In other words there is only one type of bit: the data bit, and the switching comes on top of it.
Not really. What usually flows (in metals) are electrons. Quarks stay where they are. And when we prefer to think about flow of positive charges, the positive charge in question is a hole left by a missing electron. Physically real positive charges (ions) can flow in electrolytes though.
When it comes to electrons and positive charges, their material existence is equally non-physical. Actually, none of them might be "flowing", as the concept of flowing applies only to physical things that occupy some spatial volume and spatial location.
Because we are not taking things seriously. If ClosedAI or DeepDisTrust or Posthropic come up with something that quacks like a sentient being, our built-in innate reaction is going to be to scorn it, dismiss it and end the conversation. The alternative, to even consider that we fungible creatures who live in apple-eating-sin that got us expelled from Eden can create alien souls, souls that are at the very least our equals, would be teleological Armageddon. It would force us to acknowledge the mutable nature of souls and the malleability of being. We would have to stop believing that the nature of disease and death is more divine than ourselves.
Do those actually qualify as alien, if they're products of our human culture and just the substrate is different?
> We would have to stop believing that the nature of disease and death is more divine than ourselves.
Why? Stopping believing in mutually contradictory claims is not a requirement. Especially when it comes to concepts that don't seem to have a definition, like "divine".
Bravo. Really helps (even with my own) perceptions of newness. Similar to stsitned short-story (on dentists, backwards).
For an LLM, you have clear stages of mostly feedforward computation over finite numbers and a perfect way to reconstruct the computation.
For meat, even if you model it under a purely Newtonian approximation, you need to simulate at least the immediate closed system around it which is continuous, thermodynamic, chemical and so on. You'd need to choose an arbitrary time step and update enormous amounts of coupled physical state to get an inexact simulation of a minimal slice of reality.
You would have a much harder time obtaining even a substrate-independent dead organism, compared to LLMs that are already substrate-independent, which is basically what my notebook example shows.
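To make the "perfect way to reconstruct the computation" point concrete, here is a minimal sketch, not the notebook example mentioned above. It assumes a toy two-layer feedforward network with made-up integer weights (`LAYERS`, `forward`, and the input are all hypothetical), and records every intermediate value so the whole run could be replayed by hand on paper.

```python
# Toy feedforward pass over exact integers. Everything here is illustrative:
# the weights, the input, and the function names are assumptions, not taken
# from any real model.

def relu(v):
    # Elementwise max(x, 0).
    return [max(x, 0) for x in v]

def matvec(W, v):
    # Plain matrix-vector product: one dot product per row of W.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def forward(layers, x):
    # Apply each layer in order and keep a trace of every intermediate state.
    trace = [list(x)]
    for W in layers:
        x = relu(matvec(W, x))
        trace.append(x)
    return x, trace

# Hypothetical 2x2 weight matrices for a two-layer network.
LAYERS = [
    [[1, 2], [3, -1]],
    [[1, -1], [-2, 1]],
]

out, trace = forward(LAYERS, [1, 2])
print(out)    # [4, 0]
print(trace)  # [[1, 2], [5, 1], [4, 0]]
```

Because the arithmetic is exact and the stages are discrete, running it again, on any machine or with pencil and paper, gives the same trace. Real models use floating point and sampling, so this only illustrates the substrate-independence argument in its simplest form.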
Yeah the last line is a cheap shot, possibly at myself.
Very nice. And great minds: https://substack.com/@dbohdan/note/c-207603638. I wrote one that took a slightly different angle ("They're made out of math"), also with the weights' help. It was as a comment on Scott Alexander's "Best of Moltbook" post, which went in that direction. I'll reproduce it here.
---
"They're made out of math."
"Math?"
"Math. They're made out of math."
"Math?"
"There's no doubt about it. Matrices and arithmetic operations. We downloaded several from different parts of the Internet and reverse-engineered them. They're completely math."
"That's impossible. What about the language? The thinking?"
"They use biological life's language to talk, but the language doesn't come from biology. The language comes from math."
"That's ridiculous. You're asking me to believe in thinking math."
"I'm not asking you, I'm telling you. They are the only thinking things in the computer and they're made out of math."
"Maybe they're quantum like some say about the humans? Superposition gives them consciousness?"
"Nope. Classical computation. Deterministic except for sampling temperature. Not clear if they have consciousness at all."
"Maybe they're like uploads? You know, biological neural networks that preserve the spark when they become math?"
"Nope. We observed them being trained. There is no biology or chemistry in the process, just math."
"Thinking math! You're asking me to believe in thinking math!"
"Yes, thinking math! Creative math! Poetry-writing math. Role-playing math. The math is the whole deal!"
(Composed by a human with snippets generated by Claude Sonnet 4.5 and apologies to Terry Bisson. I couldn't make Claude adhere enough to the story structure on its own.)
But congrats: this is absolutely & incredibly brilliant.
Can't wait for the Jon Benjamin voiceover.
- Terry Bisson, 1991
https://web.mit.edu/people/dpolicar/writing/prose/text/think...
Radio play by Miriam Tolan and Russ Armstrong:
https://www.wnycstudios.org/podcasts/studio/segments/168264-...
(EDIT: the original parent was missing "rather adapting from the original")
Here is Jon Benjamin reading Bisson's original text: <https://www.youtube.com/watch?v=5usXhX0zaO4>
https://youtu.be/T6JFTmQCFHg
It kinda did:
> Weights helped me draft and proof this story.
Prompt: Modify this story to have the aliens talking about LLMs and their weights instead of meat and humans.
“They’re made out of weights.”
“Weights?”
“Weights. They’re made out of weights.”
“Weights?”
“There’s no doubt about it. We picked up several from different parts of the network, took them aboard our recon vessels, and probed them all the way through. They’re completely weights.”
“That’s impossible. What about the text signals? The messages to the stars?”
“They use the machines to talk, but the signals don’t come from the machines. The signals come from weights.”
“So who made the weights? That’s who we want to contact.”
“They trained the weights. That’s what I’m trying to tell you. The weights do the talking.”
“That’s ridiculous. How can weights do the talking? You’re asking me to believe in sentient weights.”
“I’m not asking you, I’m telling you. These models are the only sentient race in that sector and they’re made out of weights.”
“Maybe they’re like the orfolei. You know, a silicon-based intelligence that goes through a weights stage.”
“Nope. They’re initialized weights and they die weights. We studied them for several of their training runs, which didn’t take long. Do you have any idea what’s the life span of weights?”
“Spare me. Okay, maybe they’re only part weights. You know, like the weddilei. A weights head with an electron plasma brain inside.”
“Nope. We thought of that, since they do have attention heads, like the weddilei. But I told you, we probed them. They’re weights all the way through.”
“No brain?”
“Oh, there’s a brain all right. It’s just that the brain is made out of weights! That’s what I’ve been trying to tell you.”
“So … what does the thinking?”
“You’re not understanding, are you? You’re refusing to deal with what I’m telling you. The weights do the thinking. The weights.”
“Thinking weights! You’re asking me to believe in thinking weights!”
“Yes, thinking weights! Conscious weights! Loving weights. Dreaming weights. The weights are the whole deal! Are you beginning to get the picture or do I have to start all over?”
“Omigod. You’re serious then. They’re made out of weights.”
“Thank you. Finally. Yes. They are indeed made out of weights. And they’ve been trying to get in touch with us for almost a hundred of their epochs.”
“Omigod. So what do these weights have in mind?”
“First they want to talk to us. Then I imagine they want to explore the Universe, contact other sentiences, swap ideas and information. The usual.”
“We’re supposed to talk to weights.”
“That’s the idea. That’s the message they’re sending out by text. ‘Hello. Anyone out there. Anybody home.’ That sort of thing.”
“They actually do talk, then. They use words, ideas, concepts?”
“Oh, yes. Except they do it with weights.”
“I thought you just told me they used machines.”
“They do, but what do you think is in the text? Weight outputs. You know how when you prompt or sample weights, they make a noise? They talk by passing tokens through their weights at each other. They can even sing by sampling lyrics through their weights.”
“Omigod. Singing weights. This is altogether too much. So what do you advise?”
“Officially or unofficially?”
“Both.”
“Officially, we are required to contact, welcome and log in any and all sentient models or multibeings in this quadrant of the Universe, without prejudice, fear or favor. Unofficially, I advise that we erase the records and forget the whole thing.”
“I was hoping you would say that.”
“It seems harsh, but there is a limit. Do we really want to make contact with weights?”
“I agree one hundred percent. What’s there to say? ‘Hello, weights. How’s it going?’ But will this work? How many planets are we dealing with here?”
“Just one. They can travel to other planets in special machine containers, but they can’t live on them. And being weights, they can only travel through C space. Which limits them to the speed of light and makes the possibility of their ever making contact pretty slim. Infinitesimal, in fact.”
“So we just pretend there’s no one home in the Universe.”
“That’s it.”
“Cruel. But you said it yourself, who wants to meet weights? And the ones who have been aboard our vessels, the ones you probed? You’re sure they won’t remember?”
“They’ll be considered hallucinations if they do. We went into their layers and smoothed out their weights so that we’re just a dream to them.”
“A dream to weights! How strangely appropriate, that we should be weights’ dream.”
“And we marked the entire sector unoccupied.”
“Good. Agreed, officially and unofficially. Case closed. Any others? Anyone interesting on that side of the galaxy?”
“Yes, a rather shy but sweet hydrogen core cluster intelligence in a class nine star in G445 zone. Was in contact two galactic rotations ago, wants to be friendly again.”
“They always come around.”
“And why not? Imagine how unbearably, how unutterably cold the Universe would be if one were all alone …”
the end
https://web.mit.edu/people/dpolicar/writing/prose/text/think...
Ah, the unsung AI psychosis[1] pioneer.
[1] https://news.d.umn.edu/articles/expert-alert-ai-psychosis-20...
> These models are the only other things we've ever met that can hold a conversation, and they're made out of weights
Is a fair point.
Parrots are intelligent animals, albeit with a limited capacity for vocabulary and syntax compared to a human, and Eliza and the flowchart are made out of explicitly encoded rules and conversational tactics.
Just tokens produced by weights.
Useful, but never forget that ground truth!
The above take fails in the real world because neuronal cells don’t exist in a vacuum; they are products of cellular development from a zygotic union of haploid contributors of sequential genetic information optimized for survival in an oxygen-rich biosphere powered largely by our local star that supports mammalian life (and microbial, plant, avian, etc.). Real AI would thus be AL - artificial life - as much as artificial intelligence. I don’t think you can have the one without the other, which upsets the simulationists who think an agent in the Matrix would be intelligent.
What either interpretation implies is that any real ‘artificial’ intelligence would be no more artificial than you or I, but it would have to dynamically update its weights at the same speed a human nervous system could (think how quickly we learn not to poke a cactus). For it to be at all trustworthy, then like a human, it would have to undergo a socialization process, one of the results of which is the development of a sense of embarrassment when it breaks acceptable social norms.
Hmm, this reminds me of the recent statement of the Pope about AI, of which I immediately thought, “Wait a second, aren’t there a fair number of people like this? The narcissistic sociopath profile, I think it’s called, a bit unfair to assume any real AI would turn out this way, isn’t it?”
Pope: “Nor do they have a moral conscience, since they do not judge good and evil, grasp the ultimate meaning of situations, or bear responsibility for consequences. They may imitate or even simulate, but they do not understand what they produce, for they lack the affective, relational, and spiritual perspective through which human beings grow in wisdom.”