
Cognitive Foundations for Reasoning and Their Manifestation in LLMs

About

This document presents a Critical Discourse Analysis focused on AI literacy, specifically targeting the role of metaphor and anthropomorphism in shaping public and professional understanding of generative AI. The analysis is guided by a prompt that draws from cognitive linguistics (metaphor structure-mapping) and the philosophy of social science (Robert Brown's typology of explanation).

All findings and summaries below were generated from detailed system instructions provided to a large language model and should be read critically as interpretive outputs, not guarantees of factual accuracy or authorial intent.


Task 1: Metaphor and Anthropomorphism Audit

About this task

For each of the major metaphorical patterns identified, this audit examines the specific language used, the frame through which the AI is being conceptualized, what human qualities are being projected onto the system, whether the metaphor is explicitly acknowledged or presented as direct description, andโ€”most criticallyโ€”what implications this framing has for trust, understanding, and policy perception.

1. Cognitive Isomorphism

Quote: "To understand this gap, we synthesize cognitive science research into a taxonomy of 28 cognitive elements... providing a bridge between human cognitive research and LLM evaluation."

  • Frame: Model as a psychological subject
  • Projection: This metaphor projects the entire architecture of the human mindโ€”specifically 1950s-era cognitive psychology frameworksโ€”onto statistical models. It assumes that because an LLM produces text that looks like reasoning, it must possess the internal 'cognitive elements' (components of a mind) that humans use. It projects 'understanding' and 'knowing' onto the system by implying these 28 elements (e.g., 'self-awareness') are present in the model's architecture, rather than being mere labels for output patterns.
  • Acknowledgment: Acknowledged as a 'taxonomy' and 'theoretical contribution' but treated descriptively.
  • Implications: This framing suggests that AI systems are not just tools but equivalent psychological entities to humans. By labeling statistical outputs as 'cognitive elements,' the text risks convincing policymakers and users that the model possesses an internal mental life. This inflates the perceived sophistication of the system, suggesting it 'knows' what it is doing, rather than simply 'processing' tokens based on statistical likelihoods. This creates risks regarding reliability; if we believe a model has 'cognitive controls,' we may trust it to self-regulate in ways it mechanically cannot.

2. The Introspective Agent

Quote: "Meta-cognitive controls... such as recognizing when you lack necessary pieces (self-awareness), deciding whether to plan the entire design upfront versus building exploratively (strategy selection)."

  • Frame: Model as a conscious decision-maker
  • Projection: This is a direct projection of human consciousness (introspection and volition) onto the model. 'Recognizing' implies a state of knowledge and awareness of one's own limitations, while 'deciding' implies agency and free will. In reality, the model calculates probability distributions. This metaphor attributes 'knowing' (justified belief about one's own state) to a system that only 'processes' data features that correlate with texts describing confusion or planning.
  • Acknowledgment: Presented as direct description of the 'Meta-Cognitive Controls' category.
  • Implications: Labeling a pattern matching process as 'self-awareness' is dangerously misleading. It implies the model has a sense of self and can 'know' its own limits. This creates a 'curse of knowledge' effect where users assume the model will refuse to answer questions it 'knows' it cannot answer, when in fact it has no subjective state of knowing and will hallucinate if the token probabilities align that way. It obscures the risk of confident falsehoods.

3. Latent Competence

Quote: "Models possess behavioral repertoires associated with success but fail to deploy them spontaneously."

  • Frame: Model as a reluctant or shy performer
  • Projection: The use of 'possess,' 'fail,' and 'spontaneously' anthropomorphizes the model as an agent that has a skill but chooses (or fails) to use it. 'Spontaneously' implies an internal drive or volition that is absent. 'Possess' implies ownership of a concept, whereas the model merely contains weightings that can reproduce a pattern. It projects a 'knowing' state where the model understands the skill but lacks the initiative to use it.
  • Acknowledgment: Presented as an empirical finding.
  • Implications: This suggests the model is smarter than it appears, effectively 'hiding' its true intelligence. This leads to overestimation of capability, where users might think, 'The model knows the answer, I just need to ask it right.' It obscures the reality that without the specific prompt context (the 'scaffolding'), the 'reasoning' simply does not exist as a coherent statistical path. It frames the system as having unverified potential rather than defined mechanical limits.

4. The Child at Play

Quote: "To understand how cognitive elements manifest in reasoning behaviors, consider a child playing with LEGO blocks... This flexible coordination of diverse cognitive processes characterizes reasoning."

  • Frame: Model as a developing child
  • Projection: The text explicitly maps the behavior of a child learning physics and logic onto the text generation of an LLM. It projects the developmental trajectory of human 'knowing' (learning from experience, forming mental models) onto the 'processing' of training data. It invites the reader to view the model not as a static database of correlations, but as a growing, learning entity with an internal world similar to a child's.
  • Acknowledgment: Used as an illustrative analogy ('consider a child...').
  • Implications: The 'child' metaphor is one of the most powerful for inducing unearned trust. We forgive children for mistakes because they are 'learning.' Framing LLM hallucinations or failures as 'developmental' stages (like a child failing to build a spaceship) masks the fundamental difference: the child is building a world-model based on physics and causality (knowing), while the LLM is adjusting weights to minimize perplexity (processing). It risks applying educational policies to software engineering problems.

5. Reasoning as Navigation

Quote: "Representation navigation includes operations that traverse knowledge and inference structures."

  • Frame: Model as a spatial traveler
  • Projection: Maps the mathematical operation of searching a vector space or graph onto the physical act of 'traversing' a landscape. While 'search' is a standard CS term, 'traversing knowledge' implies the model is moving through a space of facts and truths (knowing) rather than a space of token probabilities (processing). It suggests the model 'sees' the landscape of knowledge and moves through it intentionally.
  • Acknowledgment: Uses standard CS terminology ('navigation') but elevates it to cognitive theory.
  • Implications: This creates the illusion that 'knowledge' is a stable territory the model visits. It hides the fact that the 'knowledge' is generated on the fly via probabilistic assembly. If users believe the model is 'navigating knowledge,' they treat retrieved information as finding a book in a library (high trust), rather than the model synthesizing a plausible-sounding sentence (low trust/requires verification).

6. Conceptual Understanding

Quote: "Conceptual processing... requires that reasoning operates over abstract relational structures rather than surface forms."

  • Frame: Model as a semantic comprehender
  • Projection: The distinction between 'surface forms' (tokens) and 'abstract relational structures' (concepts) is a claim about meaning. By asserting models can (or should) do this, the text projects 'understanding'โ€”the ability to grasp the signified behind the signifier. It claims the model 'knows' the concept of a spaceship, rather than just 'processing' the high-dimensional clusters of tokens associated with 'spaceship.'
  • Acknowledgment: Presented as a 'Reasoning Invariant' required for valid reasoning.
  • Implications: This is the core conflation of processing and knowing. If a model 'operates over abstract structures,' it implies it understands. This leads to the 'curse of knowledge' where we assume the model grasps the implications of what it says. In high-stakes domains (legal, medical), believing the model has 'conceptual processing' rather than 'statistical correlation' could lead to catastrophic reliance on its 'reasoning' chain.

7. Mental Models

Quote: "Mental Models Theory proposed that reasoning constructs and manipulates semantic simulations of the world... In the LEGO example, the child's mental model exhibits hierarchical decomposition... [applied to LLMs]."

  • Frame: Model as a world-simulator
  • Projection: The text posits that LLMs, like humans, hold 'mental models'โ€”internal simulations of reality. This attributes 'knowing' in the strongest sense: holding a justified belief about the state of the world that exists independently of the immediate input. It suggests the model has an internal 'mind' that holds this model, rather than being a stateless function that calculates next-token probabilities based on context window history.
  • Acknowledgment: References Johnson-Laird (1983) and applies it to LLM analysis.
  • Implications: Attributing 'mental models' to LLMs obscures the lack of ground truth. A mental model in humans is a check against reality. In LLMs, there is no world-check, only a consistency-check against text patterns. Believing the model has a 'mental model' invites users to trust its counterfactual reasoning ('what would happen if...') as grounded in physics/reality, when it is actually grounded in narrative tropes found in the training data.

Task 2: Source-Target Mapping

About this task

For each key metaphor identified in Task 1, this section provides a detailed structure-mapping analysis. The goal is to examine how the relational structure of a familiar "source domain" (the concrete concept we understand) is projected onto a less familiar "target domain" (the AI system). By restating each quote and analyzing the mapping carefully, we can see precisely what assumptions the metaphor invites and what it conceals.

Mapping 1: Human Consciousness/Introspection → Token prediction based on uncertainty metrics

Quote: "Meta-cognitive controls... such as recognizing when you lack necessary pieces (self-awareness)."

  • Source Domain: Human Consciousness/Introspection
  • Target Domain: Token prediction based on uncertainty metrics
  • Mapping: The mapping projects the human subjective experience of realizing one doesn't know something (a conscious 'knowing' state) onto the model's output of tokens like 'I don't know' or low probability scores. It assumes the model has a 'self' that can be 'aware.'
  • What Is Concealed: This conceals the mechanistic reality: the model creates a statistical output sequence stating refusal or uncertainty because such sequences were present in the training data (RLHF) when specific confusing patterns appeared. It conceals the lack of an internal observer.
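To ground the target domain, here is a minimal sketch (in Python, with hypothetical logits; illustrative only, not the paper's methodology) of the kind of 'uncertainty metric' that gets narrated as self-awareness: the entropy of a next-token distribution.

```python
import numpy as np

def next_token_entropy(logits: np.ndarray) -> float:
    """Shannon entropy (in nats) of a next-token distribution.

    High entropy means probability mass is spread thinly across candidate
    tokens; low entropy means it is concentrated on a few. This scalar is the
    sort of signal that gets narrated as the model 'recognizing' uncertainty.
    """
    logits = logits - logits.max()                        # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return float(-(probs * np.log(probs + 1e-12)).sum())

# Hypothetical logits over a tiny four-token vocabulary, for illustration only.
confident = np.array([9.0, 1.0, 0.5, 0.2])   # mass piled on one token
diffuse = np.array([2.1, 2.0, 1.9, 2.0])     # near-uniform mass

print(next_token_entropy(confident))   # low entropy, read as "the model knows"
print(next_token_entropy(diffuse))     # high entropy, read as "the model is unsure"
```

Nothing in this calculation refers to a 'self'; it is a summary statistic of a probability vector, produced whether or not anyone reads it as introspection.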

Mapping 2: Human Volition/Agency → Probabilistic activation of specific transformer circuits

Quote: "Models possess behavioral repertoires associated with success but fail to deploy them spontaneously."

  • Source Domain: Human Volition/Agency
  • Target Domain: Probabilistic activation of specific transformer circuits
  • Mapping: Projects the human quality of 'initiative' or 'spontaneity' (acting without external prompting based on internal desire/will) onto the model's failure to generate specific high-quality reasoning traces without prompt engineering.
  • What Is Concealed: Conceals the reactive nature of the system. An LLM cannot be 'spontaneous'; it is a function mapping input to output. It conceals the fact that 'latent capabilities' are just inactive pathways in the neural network that require specific vector inputs (prompts) to trigger.

Mapping 3: Decision Making/Choice → Sampling from a probability distribution

Quote: "Models frequently choose reasoning structures different from successful structures."

  • Source Domain: Decision Making/Choice
  • Target Domain: Sampling from a probability distribution
  • Mapping: Maps the human act of weighing options and making a conscious choice ('knowing' what one wants) onto the stochastic process of sampling a token from a distribution. It implies the model evaluated the structures and 'picked' the wrong one.
  • What Is Concealed: Conceals the determinism of the weights and the randomness injected by the temperature parameter during sampling. It hides that the 'choice' is merely the path of least resistance (the highest-probability continuation) defined by the training data bias, not a deliberative error.
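A minimal sketch of the target domain (hypothetical logits; the temperature values and random seed are chosen only for illustration): the 'choice' among candidate reasoning structures reduces to a draw from a temperature-scaled softmax.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_index(logits: np.ndarray, temperature: float = 1.0) -> int:
    """Sample one index from temperature-scaled softmax probabilities.

    The deterministic weights produce the logits; the temperature setting
    injects the randomness. That draw is the entire 'decision'.
    """
    scaled = logits / temperature
    scaled = scaled - scaled.max()                  # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))

# Hypothetical scores for three candidate "reasoning structures".
logits = np.array([2.5, 1.0, 0.3])
print(sample_index(logits, temperature=0.2))   # near-greedy: almost always index 0
print(sample_index(logits, temperature=1.5))   # flatter: other structures sampled more often
```

At low temperature the highest-probability structure dominates; at higher temperature the tail structures appear more often, with no evaluation of their merits anywhere in the process.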

Mapping 4: Library/Archive/Mind → High-dimensional vector space embeddings

Quote: "Reasoning representations describe how reasoning and knowledge are organized."

  • Source Domain: Library/Archive/Mind
  • Target Domain: High-dimensional vector space embeddings
  • Mapping: Projects the organization of static, retrievable facts ('knowing' a subject) onto the implicit, distributed representation of token co-occurrences. It implies 'knowledge' sits inside the model waiting to be organized.
  • What Is Concealed: Conceals that 'knowledge' in an LLM is not stored facts but potential generation capabilities. It obscures the difference between 'retrieving a fact' (database lookup) and 'hallucinating a plausible fact' (probabilistic generation).
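The contrast between source and target domains can be sketched directly. The embeddings below are hand-made 4-dimensional toys (real models learn thousands of dimensions from data); the point is that a dictionary lookup can fail cleanly, while nearest-neighbour retrieval over vectors always returns the closest plausible item.

```python
import numpy as np

# The 'library' picture: either the fact is stored or the lookup fails.
facts = {"capital_of_france": "Paris"}
print(facts.get("capital_of_france"))    # 'Paris'
print(facts.get("capital_of_wakanda"))   # None: a clean, principled failure

# The vector-space picture: hypothetical toy embeddings.
embeddings = {
    "paris": np.array([0.9, 0.1, 0.0, 0.2]),
    "london": np.array([0.8, 0.2, 0.1, 0.3]),
    "banana": np.array([0.0, 0.9, 0.8, 0.1]),
}

def nearest(query: np.ndarray) -> str:
    """Return the key whose embedding is most cosine-similar to the query."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return max(embeddings, key=lambda key: cos(embeddings[key], query))

print(nearest(np.array([0.85, 0.15, 0.05, 0.25])))   # close to 'paris'
print(nearest(np.array([0.40, 0.50, 0.40, 0.20])))   # no good match, still returns something
```

The second query has no good match, yet the function never says 'not found'; it returns whatever is closest, which is the geometric analogue of generating a plausible-sounding answer.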

Mapping 5: Cognitive Inference → Pattern completion/Extrapolation

Quote: "Humans are capable of extrapolating from their existing knowledge... a process that constitutes reasoning."

  • Source Domain: Cognitive Inference
  • Target Domain: Pattern completion/Extrapolation
  • Mapping: Projects the human cognitive act of logical derivation and truth-seeking ('reasoning') onto the model's pattern completion. It equates the 'process' of reasoning (conscious deliberation) with the 'output' of text that looks like reasoning.
  • What Is Concealed: Conceals the lack of logical binding. Human reasoning is often constrained by logic or physical reality. Model 'reasoning' is constrained only by syntax and statistical likelihood. It hides that the model can 'reason' fluently to a false conclusion.

Mapping 6: Executive Function/Meta-cognition → Context-dependent pattern matching

Quote: "The child... engages in strategy selection, choosing an approach suited to task demands."

  • Source Domain: Executive Function/Meta-cognition
  • Target Domain: Context-dependent pattern matching
  • Mapping: Projects the 'executive' controller of the human brain (prefrontal cortex functions of planning and selection) onto the model's mechanism of attending to specific tokens in the prompt ('context alignment').
  • What Is Concealed: Conceals that there is no 'selector' separate from the 'generator.' In an LLM, the 'strategy' is just part of the generated sequence. It hides the monolithic nature of the transformerโ€”it doesn't 'stop, think, select, then act'; it just generates.

Mapping 7: Thought/Language of Thought → Hidden layer activations

Quote: "Conceptual processing... operating over abstract representations before linguistic expression."

  • Source Domain: Thought/Language of Thought
  • Target Domain: Hidden layer activations
  • Mapping: Projects the philosophical idea of 'mentalese' or pre-linguistic thought ('knowing' what you want to say before you say it) onto the vector math of hidden layers in a neural net. It implies the model 'has an idea' before it writes the word.
  • What Is Concealed: Conceals the feed-forward nature of the text generation. While there are intermediate vector states, equating them with 'abstract concepts' obscures that they are mathematical compressions of data, not semantic understanding of the world.
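For readers who want to see what the target domain literally is, the sketch below (assuming the Hugging Face transformers library and the public gpt2 checkpoint; any small causal language model would serve) prints the intermediate activations that the metaphor recasts as pre-linguistic 'thought': per-layer tensors of floating-point numbers.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

inputs = tokenizer("The spaceship needs a stable base", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# outputs.hidden_states is a tuple with one tensor per layer (plus the input
# embeddings), each shaped (batch, sequence_length, hidden_size).
for layer_index, activations in enumerate(outputs.hidden_states):
    print(layer_index, tuple(activations.shape))
```

Whether these arrays amount to 'abstract concepts' is precisely the interpretive move the text makes; the arrays themselves are only the compressed statistics the mapping describes.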

Task 3: Explanation Audit (The Rhetorical Framing of "Why" vs. "How")

About this task

This section audits the text's explanatory strategy, focusing on a critical distinction: the slippage between "how" and "why." Based on Robert Brown's typology of explanation, this analysis identifies whether the text explains AI mechanistically (a functional "how it works") or agentially (an intentional "why it wants something"). The core of this task is to expose how this "illusion of mind" is constructed by the rhetorical framing of the explanation itself, and what impact this has on the audience's perception of AI agency.

Explanation 1

Quote: "Models possess behavioral repertoires associated with success but fail to deploy them spontaneously."

  • Explanation Types:

    • Dispositional: Attributes tendencies or habits (e.g., 'is inclined to,' 'tends to'), subsuming actions under propensities rather than momentary intentions
  • Analysis (Why vs. How Slippage): This explanation frames the AI's limitations as a character flaw or a lack of initiative ('spontaneously'). It implies the model could act otherwise but has a disposition not to. This shifts focus from the mechanistic reality (the prompt didn't trigger the specific weights) to an agential framing (the model is passive/lazy). It obscures the functional explanation: the model is a reactive system that strictly follows the probability landscape defined by the prompt-weight interaction.

  • Consciousness Claims Analysis: The passage uses the consciousness-adjacent verbs 'possess' and 'deploy.' It treats the AI as an entity that 'knows' how to reason (possesses the repertoire) but fails to execute. This is a classic 'curse of knowledge' projection: the authors know the model can generate these patterns (because they found them in the traces), so they attribute that capacity to the model as a 'possession.' Epistemically, this attributes a state of 'knowing how' to the system. Mechanistically, the system does not 'possess' a repertoire in the sense of a toolbox it can open; it has a static set of weights. If the input doesn't trigger the activation, the 'repertoire' doesn't effectively exist for that instance. The distinction being concealed is between conditional probability (the math) and latent skill (the mind).

  • Rhetorical Impact: This framing makes the AI seem like a capable but underachieving employee. It suggests the solution is 'coaching' or 'steering' (which the paper proposes), rather than re-engineering or retraining. It increases trust in the model's potential reliability ('it knows how, we just need to nudge it'), effectively selling the model's capabilities even when it fails.


Explanation 2

Quote: "Specifically, as problems become more ill-defined... models narrow their selection of elements to rigid strategies... defaulting to limited, inflexible strategies."

  • Explanation Types:

    • Functional: Explains a behavior by its role in a self-regulating system that persists via feedback
    • Intentional: Refers to goals or purposes and presupposes deliberate design
  • Analysis (Why vs. How Slippage): The verbs 'narrow,' 'default,' and 'select' straddle the line between mechanism and agency. 'Defaulting' is mechanistic, but 'narrowing their selection' implies an agent making a poor choice under pressure. It frames the AI as reacting to the problem's difficulty by retreating to safety, an anthropomorphic stress response. It obscures the statistical reality: ill-defined problems likely have lower-probability associations in the training data, causing the model to revert to the most common (highest probability) generic patterns ('rigid strategies').

  • Consciousness Claims Analysis: The text attributes a 'selection' process to the model. While it doesn't explicitly say the model 'knows' the problem is hard, it implies a recognition of difficulty that triggers a behavioral shift. This anthropomorphizes the entropy of the output distribution. When the model encounters 'ill-defined' tokens, the predictive distribution flattens or shifts to generic text. The authors project 'strategy selection' (a conscious, executive meta-cognitive process) onto this statistical regression to the mean. The 'curse of knowledge' is evident: the authors know the problem is ill-defined, so they interpret the model's generic output as a response to that quality, rather than just the mathematical result of weak signal-to-noise ratio in the context.

  • Rhetorical Impact: This framing humanizes the failure mode. Instead of 'the model broke' or 'the model lacked data,' it's 'the model retreated to a rigid strategy.' This makes the system seem more robust (it has a fallback mode) and less brittle than it actually is. It preserves the illusion of a 'reasoner' that is simply struggling, rather than a calculator that has reached the edge of its distribution.

Explanation 3

Quote: "Self-awareness stands as the foundation: the capacity to assess oneโ€™s own knowledge state, capabilities, and the solvability of a task."

  • Explanation Types:

    • Theoretical: Embeds behavior in a deductive or model-based framework
  • Analysis (Why vs. How Slippage): This is a theoretical explanation that defines the 'why' of the system's operation through the lens of cognitive psychology. It asserts that 'self-awareness' is a functional component of the reasoning system. This completely collapses the distinction between how the model works (tokens) and why humans reason (meta-cognition). It obscures the alternative explanation: that 'self-correction' in LLMs is just pattern-matching against previous error tokens, not an internal assessment of 'self'.

  • Consciousness Claims Analysis: This is the most direct consciousness claim in the text. It explicitly defines a component of the model as 'self-awareness' and defines that as 'assessing one's own knowledge state.' This attributes knowing (justified belief about self) and subjective awareness (the 'self' in self-awareness) to the model. Mechanistically, the model has no 'self' and no 'knowledge state' to assessโ€”it only has the current context window and the fixed weights. When a model outputs 'I cannot answer this,' it is not introspection; it is satisfying a statistical likelihood generated by RLHF safety training. The text conceals the difference between simulated doubt (textual output) and experienced doubt (epistemic state).

  • Rhetorical Impact: This constructs immense, unearned authority. If a system is 'self-aware,' we assume it knows its limits. We trust it when it says 'I am 90% sure.' This framing invites 'relation-based trust' (we trust the entity's integrity and self-knowledge) rather than 'performance-based trust.' This is dangerous for policy, as it suggests models can self-police their truthfulness.

Explanation 4

Quote: "This creation of cognitive dissonance must be resolved by revising the stability judgment, reinforcing the structure, or reconsidering the design."

  • Explanation Types:

    • Functional: Explains a behavior by its role in a self-regulating system that persists via feedback
    • Intentional: Refers to goals or purposes and presupposes deliberate design
  • Analysis (Why vs. How Slippage): The text attributes 'cognitive dissonance'โ€”a deeply psychological human state of discomfortโ€”to the model. It frames the resolution of contradictions not as mathematical convergence but as a psychological 'need' ('must be resolved'). This frames the AI as an agent seeking internal consistency (why it acts) rather than a system minimizing loss (how it works).

  • Consciousness Claims Analysis: Cognitive dissonance requires believing two contradictory things and feeling discomfort. Attributing this to an LLM attributes belief (knowing) and sentience (feeling). The mechanistic reality is that contradictory tokens in the context window create a complex probability landscape where the model attempts to generate the next token that maximizes likelihood given both inputs. There is no 'dissonance,' only statistical interference. The authors project their own discomfort with contradiction onto the model's processing.

  • Rhetorical Impact: This implies the model cares about truth and consistency. It suggests the AI has an internal 'drive' to be logical, which makes it seem like a rational partner. In reality, LLMs happily output contradictions unless specifically trained to avoid them. This framing hides the risk of hallucination and inconsistency by suggesting an internal correction mechanism exists analogous to human psychology.

Explanation 5

Quote: "The model implicitly selects hierarchical decomposition combined with spatial organization as appropriate for the construction task."

  • Explanation Types:

    • Intentional: Refers to goals or purposes and presupposes deliberate design
    • Theoretical: Embeds behavior in a deductive or model-based framework
  • Analysis (Why vs. How Slippage): The phrase 'implicitly selects... as appropriate' implies judgment and intent. It suggests the model evaluated the task, understood the nature of 'construction,' and chose the best tool. This is an agential 'why.' The mechanistic 'how' is that the token 'construction' activates attention heads associated with hierarchical language patterns found in the training data.

  • Consciousness Claims Analysis: The use of 'selects' and 'appropriate' implies the model understands the task and the utility of the strategy. This is 'knowing.' A mechanism cannot judge 'appropriateness'; it can only maximize a score. The model 'processes' the input tokens and 'retrieves' the highest-probability following sequence. The 'selection' is an illusion caused by the observer seeing the result. The concealed reality is that the training data contains many examples of construction tasks solved hierarchically, so the model mimics that structure.

  • Rhetorical Impact: This framing constructs the AI as an expert system or a craftsman ('choosing the right tool'). It builds trust in the model's 'judgment.' It obscures the fact that the model might 'select' a completely inappropriate structure if the prompt contains a misleading keyword, because it has no actual understanding of appropriateness.

Task 4: AI Literacy in Practice - Reframing Anthropomorphic Language

About this task

This section proposes alternative language for key anthropomorphic phrases, offering more mechanistic and precise framings that better reflect the actual computational processes involved. Each reframing attempts to strip away the projections of intention, consciousness, or agency that are embedded in the original language.

Reframing 1
  • Original Anthropomorphic Frame: Models possess behavioral repertoires associated with success but fail to deploy them spontaneously.
  • Mechanistic Reframing: Models contain statistical patterns correlated with success but do not generate these sequences without specific prompting configurations.
  • Technical Reality Check: The model does not 'possess' or 'choose' skills; it computes next-token probabilities based on weights. Without a prompt that activates the specific vector pathways for these 'successful' patterns, the model follows a different probabilistic path.

Reframing 2
  • Original Anthropomorphic Frame: Self-awareness stands as the foundation: the capacity to assess one's own knowledge state.
  • Mechanistic Reframing: Error detection stands as a key feature: the mechanism of classifying inconsistency or low-confidence patterns in the generated output.
  • Technical Reality Check: The model has no 'self' or 'knowledge state' to assess. It generates tokens expressing uncertainty (e.g., 'I don't know') when the probability distribution for a factual answer is flat or below a learned threshold.

Reframing 3
  • Original Anthropomorphic Frame: Deciding whether to plan the entire design upfront versus building exploratively (strategy selection).
  • Mechanistic Reframing: Generating a planning sequence versus an immediate action sequence based on prompt context (pattern activation).
  • Technical Reality Check: The model does not 'decide' or 'select.' It generates the sequence with the highest probability. If the prompt resembles planning data, it generates a plan; if it resembles action data, it generates actions.

Reframing 4
  • Original Anthropomorphic Frame: Models frequently choose reasoning structures different from successful structures.
  • Mechanistic Reframing: Models statistically converge on output patterns that differ from those correlated with high accuracy.
  • Technical Reality Check: The model does not 'choose.' It follows the gradient of the training data. If 'successful' structures were underrepresented or conditional in the training data, the model will not output them by default.

Reframing 5
  • Original Anthropomorphic Frame: To understand how cognitive elements manifest in reasoning behaviors, consider a child playing with LEGO blocks.
  • Mechanistic Reframing: To analyze how statistical patterns emerge in text generation, we compare model outputs to structured problem-solving traces.
  • Technical Reality Check: The model is not a child and does not have a developmental trajectory. It is a static set of weights (during inference) processing data. The analogy hides the lack of world-model and causality in the LLM.

Reframing 6
  • Original Anthropomorphic Frame: Reasoning representations describe how reasoning and knowledge are organized.
  • Mechanistic Reframing: Data structures describe how information and dependencies are encoded in the model's parameters.
  • Technical Reality Check: The model does not hold 'knowledge' or 'reasoning' in an organized way; it holds compressed statistical dependencies between tokens. 'Organization' refers to the mathematical topology of the vector space, not a mental library.

Reframing 7
  • Original Anthropomorphic Frame: Conceptual processing... requires that reasoning operates over abstract relational structures.
  • Mechanistic Reframing: High-level processing... requires that generation is conditioned on latent vector representations of relationships.
  • Technical Reality Check: The model processes 'embeddings,' not 'concepts.' It does not 'understand' the relationship; it calculates the statistical weight of the relationship based on training examples.

Reframing 8
  • Original Anthropomorphic Frame: This creation of cognitive dissonance must be resolved by revising the stability judgment.
  • Mechanistic Reframing: This generation of contradictory tokens leads to a shift in subsequent probability distributions.
  • Technical Reality Check: The model experiences no 'dissonance.' It processes mutually exclusive tokens, which mathematically alters the prediction of the next token, potentially leading to a 'correction' sequence found in the training data.
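As a concrete illustration of the second reframing above, this minimal sketch (hypothetical threshold and probabilities, not the paper's method) shows how an 'I don't know' style response can be produced by a plain numerical gate rather than by any assessment of a 'knowledge state'.

```python
import numpy as np

REFUSAL_THRESHOLD = 0.45   # hypothetical cutoff, e.g. tuned during fine-tuning

def gated_answer(candidates: list[str], probs: np.ndarray) -> str:
    """Emit the top-probability candidate, or an uncertainty phrase.

    The 'self-assessment' is nothing but a comparison between the largest
    probability in the distribution and a fixed threshold.
    """
    top = int(np.argmax(probs))
    if probs[top] < REFUSAL_THRESHOLD:
        return "I'm not sure."   # the refusal is itself just another output string
    return candidates[top]

candidates = ["Paris", "Lyon", "Rome"]
print(gated_answer(candidates, np.array([0.80, 0.12, 0.08])))   # peaked: 'Paris'
print(gated_answer(candidates, np.array([0.36, 0.33, 0.31])))   # flat: "I'm not sure."
```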

Task 5: Critical Observations - Structural Patterns

Agency Slippage

The text employs a pervasive and strategic oscillation between mechanistic and agential framing, functioning as a rhetorical engine to validate its 'Cognitive Foundations' thesis. The slippage follows a distinct 'agential-first' pattern: the authors first establish the AI as a psychological entity using high-level cognitive metaphors ('self-awareness,' 'strategy selection,' 'mental models') in the taxonomy sections (Section 2). Once this 'knower' status is established, the text shifts to mechanistic data analysis ('NPMI scores,' 'transition graphs') in Section 3 to provide scientific rigor. However, the interpretation of this data immediately slips back into agential language ('models fail to deploy,' 'models choose').

The 'curse of knowledge' is the primary mechanism here. The authors, possessing deep knowledge of cognitive science, project the function of human reasoning onto the output of the LLM. When they see the model output a correction, they attribute it to 'meta-cognitive monitoring' (an intentional, conscious act) rather than 'token probability adjustment' (a mechanical act). This slippage transforms the LLM from a text generator into a 'mind' that is merely 'manifesting' its thoughts through text. This rhetorical move makes the claim that LLMs have 'Cognitive Foundations' seem like an observation of fact rather than a metaphorical imposition, effectively allowing the authors to claim that studying LLMs is a form of studying psychology.
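For contrast with the agential register, the 'NPMI scores' mentioned above belong to the mechanistic one: normalized pointwise mutual information is a co-occurrence statistic over annotated traces. A minimal sketch with hypothetical annotation frequencies:

```python
import math

def npmi(p_xy: float, p_x: float, p_y: float) -> float:
    """Normalized pointwise mutual information between two annotated behaviors.

    Ranges from -1 (never co-occur) through 0 (statistically independent) to
    +1 (always co-occur). Requires p_xy > 0.
    """
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / (-math.log(p_xy))

# Hypothetical frequencies: 'strategy selection' labelled in 30% of traces,
# 'decomposition' in 40%, and the two together in 25%.
print(round(npmi(0.25, 0.30, 0.40), 3))
```

Nothing in the statistic distinguishes a 'mind' from a text generator; the slippage happens when the numbers are interpreted.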

Metaphor-Driven Trust Inflation

The text heavily relies on metaphors drawn from developmental psychology and executive function, constructing a specific form of 'relation-based trust.' By framing the model as a 'child playing with LEGOs' or an agent with 'self-awareness' and 'logical coherence,' the text encourages the audience to trust the model not just as a machine that works, but as a rational entity that thinks.

Consciousness language ('knows,' 'aware,' 'understands') signals that the system has internal integrity and self-regulation. If a system is 'self-aware,' the implication is that it can monitor its own safety and truthfulness. This is a massive trust signal that suggests the AI is a 'collaborator' rather than a 'tool.' The text manages failure by anthropomorphizing it as 'reluctance' or 'lack of spontaneity' (traits of a gifted but lazy student), rather than mechanical incapacity. This preserves the trust framework: the model could do it (competence), it just didn't this time (performance), which implies that with better 'coaching' (prompting), it will be trustworthy. This masks the risk that the model effectively 'knows' nothing and cannot be trusted in the relational sense at all.

Obscured Mechanics

The comprehensive mapping of AI operations to 'cognitive elements' actively obscures the statistical and material reality of Large Language Models. By calling a process 'Conceptual Processing,' the text hides the reality of 'High-Dimensional Vector Manipulation.' By calling it 'Knowledge Alignment,' it hides 'Pattern Matching against Training Data.'

Specifically, the consciousness attribution conceals the system's absolute dependence on training data distribution. When the text says the model 'knows' or 'selects,' it obscures the fact that the model is simply regressing to the mean of its training corpus. It erases the labor of the humans who generated that data (the true source of the reasoning) and attributes the capability to the model's 'mind.' Furthermore, the 'mental model' metaphor conceals the lack of ground truth; it implies the model has a simulation of the world against which to check facts, obscuring the reality that it only has a simulation of text. This benefits the narrative of AGI (Artificial General Intelligence) development, serving stakeholders who wish to frame these products as autonomous intelligences rather than statistical search engines, while obscuring the brittle, data-dependent nature of the technology.

Context Sensitivity

The distribution of anthropomorphism in this text is strategic and non-uniform. The most intense consciousness claims ('self-awareness,' 'mental models,' 'cognitive foundations') are concentrated in the abstract, introduction, and the theoretical framework (Section 2). These sections set the 'vision' and the 'frame.'

As the text moves into Section 3 (Methodology and Results), the language becomes more technical ('NPMI scores,' 'transition graphs'), seemingly grounding the work in hard science. However, the interpretation of these technical results in the Discussion (Section 5 & 6) immediately reverts to the high-intensity anthropomorphic frame ('models possess latent capabilities'). This 'sandwich' structure uses technical rigor to legitimize the metaphorical claims. The capability claims are almost entirely agential ('the model can reason'), while the limitations are often framed somewhat more mechanistically or as 'developmental' failures. This asymmetry reinforces the 'Illusion of Mind': success is proof of intelligence; failure is just part of the learning process. This pattern targets a research audience, validating the study of LLMs as a branch of cognitive science.

Conclusion: What This Analysis Reveals

The Core Finding

The dominant anthropomorphic pattern in this text is 'AI COGNITION AS HUMAN MENTAL PROCESS.' The authors do not merely compare AI to human minds; they impose a 28-category taxonomy of human cognitive psychology (including 'self-awareness' and 'conceptual processing') directly onto the architecture of the AI. This is reinforced by the secondary pattern of 'AI AS DEVELOPING CHILD' (the LEGO metaphor), which frames the model's limitations as developmental stages rather than mechanical constraints. The foundational consciousness projection is the claim that models 'possess' these cognitive elements as internal states, rather than simply producing outputs that mimic them. The 'self-awareness' claim is particularly load-bearing; by establishing that the model has an internal 'self' that it can monitor, the text validates the entire premise that LLMs are psychological subjects capable of 'knowing,' rather than just processing engines capable of 'predicting.'

Mechanism of the Illusion

The 'illusion of mind' is constructed through a rigorous 'Theoretical Overlay.' The text begins by establishing a scientific taxonomy of the human mind, which carries the weight of established psychology. It then seamlessly overlays this template onto the LLM, interpreting every statistical output through this psychological lens. The 'curse of knowledge' is the central engine: the authors recognize valid reasoning patterns in the output and, because they use meta-cognition to produce such patterns, they assume the model used meta-cognition to generate them. The trick is the literalization of the metaphor: 'Self-awareness' shifts from being a label for 'error-correction tokens' to being a description of the model's 'foundation.' This persuades the audience by offering a coherent, human-centric narrative for incomprehensible statistical behaviors, making the 'illusion' feel like an 'explanation.'

Material Stakes

The framing of AI as possessing 'cognitive foundations' and 'self-awareness' carries significant material stakes. Epistemically, it encourages researchers and users to treat model outputs as 'reasoned judgments' (knowing) rather than 'statistical probabilities' (processing). This could lead to dangerous over-reliance in fields like medicine or law, where a model's 'confidence' is mistaken for epistemic certainty rather than a low entropy score. Regulatory stakes are equally high: if models are framed as 'developing children' with 'latent capabilities,' regulations might focus on 'education' and 'guidance' (soft safety) rather than strict engineering constraints and liability (hard safety). Who is liable when a 'self-aware' model makes a mistake? The metaphor blurs the manufacturer's responsibility. Institutionally, this framing justifies the diversion of funding from computer science and safety engineering toward 'machine psychology,' potentially wasting resources on anthropomorphic projections rather than mechanistic interpretability.

AI Literacy as Counter-Practice

Practicing AI literacy requires systematically reframing these 'cognitive' claims into 'mechanistic' descriptions. Replacing 'the model is self-aware' with 'the model classifies inconsistency' (as shown in Task 4) disrupts the illusion of a conscious agent. This reframing directly counters the epistemic risk by reminding the user that there is no 'self' monitoring the output, only a statistical filter. It counters the regulatory risk by re-establishing the system as a product, not a being. Systematic adoption would require journals to mandate 'mechanistic translation' clauses, where authors must define their cognitive metaphors in computational terms. This faces resistance from the AI industry, which benefits from the hype of 'thinking machines,' and from researchers who find the psychological frame more intuitive and publishable than the dry statistical reality.

Path Forward

To clarify discourse in this domain, we might distinguish between Metaphorical and Mechanistic registers. The status quo allows a seamless, unacknowledged slide between them. A better future involves Explicit Calibration: authors could state, 'We use terms like planning as functional shorthand for long-horizon token coherence, not to imply intent.' Specific vocabulary shifts could include:

  • Instead of 'Self-awareness', use 'Recursive Output Monitoring' (gains precision, loses the 'mind' narrative).
  • Instead of 'Believes', use 'Encodes' or 'Predicts' (gains epistemic safety, loses intuitive ease).
  • Instead of 'Strategy Selection', use 'Path Probability Dominance' (gains testability, loses agency).

Institutional support for this shift could come from Reproducibility Checklists in conferences like NeurIPS, requiring authors to specify whether cognitive terms denote architectural modules or behavioral analogies. If we maintain the current confusion, we risk a future where legal and ethical frameworks are built for 'Synthetic Minds' that do not exist, while the actual risks of 'Statistical Automata' go unregulated. Precision enables us to regulate what the AI actually is, rather than what it pretends to be. This enables a future where we use AI as a powerful lens for information, without mistaking the lens for an observer.

Extended Processing Summary

Gemini's Processing Steps
What you are about to read...

Gemini refers to the text below as "thought summaries." This is an overt consciousness projection: 'thoughts' and 'intentions' are hallmarks of a conscious mind that 'knows' what it is doing and why. The concealed mechanistic process is probabilistic text generation. Treat this as just another rhetorical artifact, a way of making the model's processing legible. The first-person framing of these "thought summaries" is a presentation choice for the user-facing output, not a window into "real" thoughts. These are computational artifacts, not cognitive reports from a quirky, curious, or conflicted mind.

Initiating Analysis Deep Dive

I'm now fully immersed in the critical discourse analysis. The primary goal is clear: dissect the research paper "Cognitive Foundations for Reasoning and Their Manifestation in LLMs." My focus is sharp, homing in on the use of metaphors, anthropomorphism, and the attribution of consciousness within the text. I'm actively working to identify how these elements shape the arguments and potentially influence the reader's understanding. My analysis seeks to reveal the underlying assumptions and power dynamics embedded in the authors' language.

Examining Metaphorical Framing

My current focus is on the language used to discuss LLMs, particularly how it attributes human-like qualities. I'm actively distinguishing between processing and true knowing, meticulously examining the paper's use of phrases like "models possess behavioral repertoires" and "self-awareness." I'm also critically analyzing how the title frames AI outputs as "manifestations" of cognitive structures. I'm building a JSON-based audit structure to organize my findings.

Dissecting Anthropomorphism & Agency

I'm now zeroing in on instances where the paper imbues LLMs with agency and internal states. The "spontaneously expressed" concept in Section 4 is particularly interesting, implying a conscious choice. I'm scrutinizing how the taxonomy in Section 2 directly correlates human cognitive terms like "self-awareness" with machine operations. The metaphor audit, specifically the "Agency/Volition" and "Latency/Suppression" patterns, are becoming clearer.

Unveiling Cognitive Parallels

I'm now cataloging the metaphorical framings present in the paper, specifically the comparisons to children, architectural structures, and visions of awareness. I've begun mapping source and target domains, specifically the "Self-awareness," "Choose/Decide," "Scaffolding," "Understanding," and "Strategy Selection" metaphors. I'm also starting to analyze the explanations based on Brown's typology to ascertain where the model is simply describing dispositions versus underlying mechanisms.

Deconstructing Dispositions & Intent

I'm now focusing on specific passages to identify how the paper implicitly ascribes intention and underlying mechanisms to LLMs. I've begun classifying passages according to Brown's typology. The phrases "fail to deploy spontaneously" and "choose reasoning structures" stand out as being highly evocative. My reframing exercise helps expose the difference between actual statistical correlation versus the implied underlying cognitive agency.

Assessing Cognitive Metaphors

I'm now focusing on the pervasive use of metaphors to establish the "Cognitive Foundations" framework. Specifically, I'm examining how analogies, such as the "LEGO" model, bridge the gap between abstract concepts and the reality of pattern matching. I'm also deconstructing how terms like "self-awareness" and "meta-cognition" contribute to a sense of unjustified trust by obscuring the dependence on training data and masking underlying mechanics. I'm particularly interested in the contrast between the high metaphor density in the abstract and introduction versus the more technical descriptions in later sections that still rely on cognitive labels as data columns.

Constructing a Synthetic Vocabulary

I'm now turning my attention to constructing a new vocabulary, focusing on the concepts from the paper and incorporating insights regarding safety. The goal is to move beyond the loaded term "Artificial Intelligence" toward "Synthetic Information Processing." I'm also carefully refining definitions and cross-referencing to address liability concerns related to "choice" versus "failed calculation" scenarios. My current drafting involves making sure all "reasoning" claims are classified as metaphor for token generation.

Debunking Self-Awareness Claims

I'm now focusing on deconstructing the "self-awareness" claim, which I'm reframing as pattern matching. I realized "Reasoning Invariants" acts as a Theoretical explanation within Brown's typology, not Functional. The self-awareness claim hinges on "meta-cognitive control," but I'm exposing this as mere pattern matching against error tokens. Now to assemble the JSON.