Anthropic Research Reveals How LLMs Perceive Text
Recent research by Anthropic uncovers how large language models (LLMs) build internal representations of text that resemble the way biological perception lets humans comprehend space.
Claude 3.5 Investigation Overview
The investigation studied Claude 3.5 Haiku’s ability to insert line breaks at a fixed line width, a task that requires the model to keep track of its current position as it generates text.
Andreas Volpini tweeted about the paper, drawing an analogy with chunking content for AI consumption. More broadly, his post works as a metaphor for how both models and writers navigate structure, finding equilibrium at the point where one segment ends and the next begins.
The research paper isn’t about reading content but about generating text and deciding where to place a line break so the text fits within a fixed line width.
The goal was to understand how an LLM internally tracks its position in the text, its word choices, and line-break boundaries while writing.
Investigating Line Break Decisions in LLMs
The research focused on how Claude 3.5 Haiku decides when to break lines, which requires monitoring the character count and fitting words within a defined width.
Rather than simply reading text, this task centers on generation, highlighting the model’s intricate internal tracking of position, word length, and line boundaries.
How the Model Learns Line Constraints
Claude 3.5 Haiku was tasked with generating text at a specific line length, deciding whether the next word fits on the current line or whether a line break is required.
This challenge demands counting, memory, and planning, which the researchers visualized with attribution graphs showing distinct components for tracking character counts, predicting words, and signaling line breaks.
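The task itself is easy to state in code. The sketch below is a plain greedy word-wrap routine, shown only to make the constraint concrete; it is not the model’s mechanism, which the paper shows is geometric rather than procedural.

```python
def wrap_text(words, width):
    """Greedily place words on lines no longer than `width` characters.

    This is the external behavior the model must reproduce: before
    emitting each word, check whether it (plus a separating space)
    still fits on the current line.
    """
    lines, current = [], ""
    for word in words:
        candidate = word if not current else current + " " + word
        if len(candidate) <= width:
            current = candidate          # the word fits; keep writing
        else:
            lines.append(current)        # it doesn't; break the line
            current = word
    if current:
        lines.append(current)
    return lines

wrapped = wrap_text("the quick brown fox jumps over the lazy dog".split(), 15)
# Every emitted line respects the width constraint.
assert all(len(line) <= 15 for line in wrapped)
```

An explicit program gets the character count for free; the interesting question the paper asks is how a transformer, which has no counter variable, tracks the same quantity internally.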
Fluid, Continuous Counting Instead of Stepwise Tracking
Remarkably, the model does not track the character count by tallying each character individually. Instead, it uses a smooth geometric structure: a continuously curving internal representation lets it follow its position fluidly, more like spatial awareness than rigid symbol counting.
According to the research paper:
“One essential feature of the representation of line character counts is that the “boundary head” twists the representation, enabling each count to pair with a count slightly larger, indicating that the boundary is close. That is, there is a linear map QK which slides the character count curve along itself. Such an action is not admitted by generic high-curvature embeddings of the circle or the interval like the ones in the physical model we constructed. But it is present in both the manifold we observe in Haiku and, as we now show, in the Fourier construction.”
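A toy version of this idea can be built with Fourier features. The snippet below is an illustration inspired by the paper’s Fourier construction, with made-up parameters (`MAX_COUNT`, `FREQS`): a character count is mapped to a point on a smooth low-dimensional curve, so nearby counts land at nearby points rather than being discrete tallies.

```python
import numpy as np

MAX_COUNT = 100   # assumed maximum count for this toy example
FREQS = [1, 2, 4] # a few frequencies spanning the count range

def encode_count(n):
    """Map an integer count to a point on a smooth low-dimensional curve
    via cos/sin pairs at several frequencies (a toy Fourier embedding)."""
    angle = 2 * np.pi * n / MAX_COUNT
    feats = []
    for f in FREQS:
        feats += [np.cos(f * angle), np.sin(f * angle)]
    return np.array(feats)

# Adjacent counts sit close together on the curve; distant counts are
# far apart, so position varies continuously rather than in jumps.
d_near = np.linalg.norm(encode_count(40) - encode_count(41))
d_far = np.linalg.norm(encode_count(40) - encode_count(70))
assert d_near < d_far
```

The useful property of such a curve, as the quote notes, is that a single linear map can slide it along itself, which is exactly what the boundary head exploits.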
Specialized Attention Head Detects Line Boundaries
Researchers identified a “boundary head,” a specialized attention component that detects when the line’s boundary is near.
This head “twists” the character-count representation, sliding the position curve along itself so that each count pairs with a slightly larger one, which signals how close the current line is to its edge.
How Boundary Detection Works Technically
Claude 3.5 Haiku predicts a line break by comparing two internal signals: the current character count and the maximum permitted line length.
Boundary attention heads dynamically align these signals, shifting the model’s focus as the counts approach each other and prompting it to anticipate a line break. Multiple heads with different offsets work together to estimate precisely how many characters remain.
The researchers explain:
“To detect an approaching line boundary, the model must compare two quantities: the current character count and the line width. We find attention heads whose QK matrix rotates one counting manifold to align it with the other at a specific offset, creating a large inner product when the difference of the counts falls within a target range. Multiple heads with different offsets work together to precisely estimate the characters remaining.”
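This rotate-then-compare mechanism can be demonstrated with the same toy Fourier embedding. In the hedged sketch below (all parameters are invented for illustration), a rotation stands in for the QK map: it shifts the line-width embedding back along the curve by a fixed `offset`, so its inner product with the current-count embedding peaks when the count is exactly `offset` characters short of the width.

```python
import numpy as np

MAX_COUNT = 100
FREQS = [1, 2, 4]

def encode(n):
    """Toy Fourier embedding of a count (cos/sin pairs per frequency)."""
    angle = 2 * np.pi * n / MAX_COUNT
    return np.array([g(f * angle) for f in FREQS for g in (np.cos, np.sin)])

def rotate(v, offset):
    """Rotate each (cos, sin) pair by the angle for `offset` counts —
    a toy stand-in for a QK map that slides the curve along itself."""
    out = np.empty_like(v)
    for i, f in enumerate(FREQS):
        theta = 2 * np.pi * f * offset / MAX_COUNT
        c, s = np.cos(theta), np.sin(theta)
        x, y = v[2 * i], v[2 * i + 1]
        out[2 * i] = c * x - s * y
        out[2 * i + 1] = s * x + c * y
    return out

# "Attention score": inner product of the current-count embedding with
# the line-width embedding rotated back by `offset`. It is largest when
# width - count == offset, i.e. the boundary is `offset` characters away.
width, offset = 80, 5
scores = {count: float(encode(count) @ rotate(encode(width), -offset))
          for count in range(width)}
best = max(scores, key=scores.get)
assert best == width - offset
```

Combining several such heads, each tuned to a different offset, gives a finer-grained estimate of the characters remaining, which matches the paper’s description of multiple heads cooperating.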
Final Decision: To Break or Not to Break the Line
Once the model has determined its position and estimated the length of the upcoming word, it activates internal features that judge whether the word falls within or beyond the limit.
Some features promote a new line, while others hold the line open when there is still enough space. This balancing act determines whether the model breaks the line or continues writing.
According to the research:
“The final step of the linebreak task is to combine the estimate of the line boundary with the prediction of the next word to determine whether the next word will fit on the line, or if the line should be broken.”
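As a plain decision rule, that final step is simple; the sketch below is an illustration of the logic, not the model’s actual circuit, which implements the comparison through the competing features described above.

```python
def should_break(current_count, line_width, next_word):
    """Return True if the predicted next word (plus a separating space,
    when the line is non-empty) would overflow the current line."""
    remaining = line_width - current_count
    needed = len(next_word) + (1 if current_count > 0 else 0)
    return needed > remaining

# 10 characters left on the line:
assert should_break(70, 80, "representation") is True   # needs 15, break
assert should_break(70, 80, "text") is False            # needs 5, keep going
```

The hard part for the model is not this comparison but producing its two inputs, the character count and the next word’s length, from distributed internal representations.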
Can Models Be Fooled by Visual Illusions?
In a follow-up study, researchers examined whether the model’s text “perception” could be tricked in a way analogous to human visual illusions. Inserting artificial tokens such as “@@” disrupted the model’s internal tracking of its position, much as optical illusions distort human perception.

The interruption changed the model’s attention pattern and affected its line-break predictions, showing that the model’s sense of position depends on learned context and patterns.
But the effect was narrowly specific: of the more than 180 character sequences tested, only a few disrupted the model’s boundary detection.
They explained:
“We find that it does modulate the predicted next token, disrupting the newline prediction! As predicted, the relevant heads get distracted: whereas with the original prompt, the heads attend from newline to newline, in the altered prompt, the heads also attend to the @@.”
Language Models Develop Visual-Like Perception Over Text
Anthropic’s research reveals that LLMs do more than process symbols: they build smooth geometric maps of text that resemble the sensory representations found in biological brains.
The researchers draw parallels between the early layers of language models, which perceive the input, and visual perception in animal brains.
They suggested:
“Although we sometimes describe the early layers of language models as responsible for “detokenizing” the input, it is perhaps more evocative to think of this as perception. The beginning of the model is really responsible for seeing the input, and much of the early circuitry is in service of sensing or perceiving the text similar to how early layers in vision models implement low level perception.”
Thereafter, they write:
“The geometric and algorithmic patterns we observe have suggestive parallels to perception in biological neural systems. …These features exhibit dilation—representing increasingly large character counts activating over increasingly large ranges—mirroring the dilation of number representations in biological brains. Moreover, the organization of the features on a low dimensional manifold is an instance of a common motif in biological cognition. While the analogies are not perfect, we suspect that there is still fruitful conceptual overlap from increased collaboration between neuroscience and interpretability.”
Demystifying the Magic Behind LLMs
The late Arthur C. Clarke famously stated:
“Any sufficiently advanced technology is indistinguishable from magic.”
This research deepens our understanding of how language models interpret and organize content, moving past the “black box” myth.
Although it may not immediately change SEO strategies, knowing that LLMs build structured, perception-like maps of text can inform better decisions about content formatting and structure.
Final Thought
The knowledge gained here turns magic into concrete understanding, helping digital marketers and SEO professionals alike.
Read the full research here: When Models Manipulate Manifolds: The Geometry of a Counting Task