Position: Beyond Reasoning Zombies - AI Reasoning Requires Process Validity
- About
- Analysis Metadata
- Audit Dashboard
This document presents a Critical Discourse Analysis focused on AI literacy, specifically targeting the role of metaphor and anthropomorphism in shaping public and professional understanding of generative AI. The analysis is guided by a prompt that draws from cognitive linguistics (metaphor structure-mapping), the philosophy of social science (Robert Brown's typology of explanation), and accountability analysis.
All findings and summaries below were generated from detailed system instructions provided to a large language model and should be read critically as interpretive outputs, not guarantees of factual accuracy or authorial intent.
- Source Title: Position: Beyond Reasoning Zombies - AI Reasoning Requires Process Validity
- Source URL: https://philarchive.org/archive/LAWPBR-3
- Model: gemini-3.0-pro
- Temperature: 1
- Top P: 0.95
- Tokens: input=12321, output=14117, total=26438
- Source Type: article
- Published: 2025
- Analyzed At: 2026-02-17T09:09:08.955Z
- Framework: metaphor
- Framework Version: 6.4
- Schema Version: 3.0
- Run ID: 2026-02-17-position-beyond-reasoning-zombies-ai-rea-metaphor-nfmjxl
Metaphor & Illusion Dashboard
Anthropomorphism audit · Explanation framing · Accountability architecture
- How/Why Slippage: 20% of explanations use agential framing (2 / 10 explanations)
- Unacknowledged Metaphors: 63% presented as literal description (no meta-commentary or hedging)
- Hidden Actors: 88% agency obscured by agentless constructions (corporations/engineers unnamed)
- Explanation Types: How vs. Why framing
- Acknowledgment Status: Meta-awareness of metaphor
- Actor Visibility: Accountability architecture
- Source → Target Pairs (8): Human domains mapped onto AI systems
- Metaphor Gallery (8)
- Reframed Language (Top 4 of 8): shown in full in the Task 4 table
Task 1: Metaphor and Anthropomorphism Audit
About this task
For each of the major metaphorical patterns identified, this audit examines the specific language used, the frame through which the AI is being conceptualized, what human qualities are being projected onto the system, whether the metaphor is explicitly acknowledged or presented as direct description, and, most critically, what implications this framing has for trust, understanding, and policy perception.
V3 Enhancement: Each metaphor now includes an accountability analysis.
1. The Reasoning Zombie (r-zombie)
Quote: "Analogously, r-zombies are systems that superficially behave as autonomous reasoners, but lack valid internal reasoning mechanisms... an imperfect r-zombie could produce convincing but untrustworthy (or adversarial) CoT by emulating reasoning structure rather than content."
- Frame: Model as undead/soulless imitator
- Projection: This metaphor maps the philosophical concept of 'p-zombies' (beings physically identical to humans but lacking qualia/consciousness) onto AI systems. By establishing a dichotomy between 'r-zombies' and 'autonomous reasoners,' the text implicitly projects that a 'true' reasoner possesses something akin to genuine understanding or internal conscious validity, whereas the zombie merely simulates it. It anthropomorphizes the 'true' system by suggesting it is not just a mechanism, but an entity with 'valid internal mechanisms' that elevate it above mere simulation, attributing a form of epistemic authenticity to computational processing.
- Acknowledgment: Explicitly Acknowledged (The authors explicitly coin the term 'r-zombie' as an analogue to the 'philosophical zombie (p-zombie) thought experiment' and define it within the text.)
- Implications: The r-zombie frame creates a dangerous binary. It suggests that while current models are 'fakes,' a future 'valid' system would be a 'real' reasoner. This implies that once a system meets the authors' criteria for 'process validity,' it arguably deserves the trust and agency attributed to human reasoners. It inflates the perceived sophistication of future 'valid' systems, potentially shielding them from scrutiny by implying they possess a 'true' cognitive status rather than just a verifiable audit trail. It risks convincing policymakers that 'valid' AI is equivalent to human judgment.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The construction 'r-zombies are systems that... behave' treats the AI as the primary actor, albeit a deceptive one. The human engineers who trained the model to optimize for convincing output (RLHF) are erased. The 'deception' is framed as a property of the zombie, rather than a direct result of corporate decisions to prioritize plausible-sounding outputs over factual grounding. Naming the actor would reveal: 'Microsoft/OpenAI engineers optimized the loss function for persuasive text generation regardless of internal logic.'
2. Computational States as Beliefs
Quote: "Prior beliefs are the outputs of previous reasoning steps... They are intermediate conclusions... Current beliefs denote the conclusions drawn in the transition from t-1 to t."
- Frame: Data parameters as epistemic convictions
- Projection: This frames mathematical values (vectors, tokens, logical symbols) as 'beliefs', a term intrinsically tied to consciousness, intentionality, and the psychological state of holding a proposition to be true. It projects the human capacity for justification and conviction onto temporary data storage. It suggests the system 'believes' its output in an epistemic sense, rather than simply storing the result of a calculation. This blurs the line between a variable assignment ($x=5$) and a cognitive state ('I believe x is 5').
- Acknowledgment: Direct (Unacknowledged) (The text defines 'Beliefs' operationally (Def 2.5) but uses the term directly without scare quotes or hedging in the main analysis, treating the mapping from data to belief as factual.)
- Implications: Calling data states 'beliefs' implies that AI systems function as rational agents capable of holding worldviews. In policy contexts, this invites the 'curse of knowledge,' where humans assume the system understands the semantic content of its 'beliefs.' It complicates liability: if a system acts on a false 'belief,' it sounds like an honest mistake by a rational agent, rather than a calculation error or data quality issue. It creates an illusion of mind that masks the purely syntactic nature of the processing.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The text states 'we model beliefs as a form of endogenous or intrinsically obtained information.' This obscures the external designers who defined the data structures and the training data providers who generated the information. The 'belief' is presented as emerging from the system's process ('intrinsically obtained'), erasing the human labor of data curation and the architectural decisions that determine how information is retained.
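To make the contrast concrete, here is a minimal, hypothetical sketch (not the paper's formalism; the function names and values are invented) of what the mechanistic reading treats a 'belief' as: the stored output of one function, reused as the input to another.

```python
import numpy as np

# Hypothetical illustration: the "prior belief" B_{t-1} and "current belief" B_t
# are just arrays of floats passed between functions. Nothing is held as a conviction.

def step_f(state: np.ndarray, evidence: np.ndarray) -> np.ndarray:
    """One 'reasoning step': a fixed numerical transformation of the stored state."""
    return np.tanh(state + 0.5 * evidence)

def step_g(state: np.ndarray) -> float:
    """A later step that consumes the stored result of step_f."""
    return float(state.mean())

prior_state = np.zeros(4)                      # "prior beliefs": four floats
evidence = np.array([1.0, -0.2, 0.3, 0.0])     # "evidence": four more floats

current_state = step_f(prior_state, evidence)  # "belief update": f(x) is computed...
summary = step_g(current_state)                # ...and its output is handed to g(x)

print(current_state, summary)
```

The variable names do all the anthropomorphic work: rename `current_state` to `current_belief` and nothing about the computation changes.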
3. The Goal-Oriented Decision Maker
Quote: "Definition 2.2 (Reasoner, informal). A goal-oriented decision-maker that implements reasoning."
- Frame: Algorithm as intentional agent
- Projection: This frames a software pipeline as a 'decision-maker' with 'goals.' In human contexts, decision-making implies free will, weighing of options, and moral responsibility. 'Goal-oriented' implies intrinsic desire or intent. This projects agency and teleology (purpose) onto a system that merely minimizes a loss function or executes a stopping rule. It implies the AI 'wants' to solve the problem, rather than being mathematically compelled to terminate a loop.
- Acknowledgment: Direct (Unacknowledged) (Presented as a formal definition (Def 2.2) without qualification. While 'Reasoning' is the process, the 'Reasoner' is explicitly defined as the agent performing it.)
- Implications: By defining the software as a 'decision-maker,' the text linguistically prepares the ground for shifting liability. If the AI is the decision-maker, it becomes the locus of action. This framing supports the 'electronic personhood' argument, which benefits corporations by insulating them from liability for their products' 'decisions.' It also inflates capabilities, suggesting the system can handle complex trade-offs with the nuance of a human decision-maker.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The definition isolates the 'Reasoner' (AI) as the decision-maker. It hides the fact that the 'goals' are objective functions defined by engineers, and the 'decisions' are mathematical inevitabilities given the code and data. A precise framing would be 'A software system executing an optimization path defined by developers.' The current framing displaces agency from the deployer (who chose the goal) to the artifact (which executes it).
4. Epistemic Trust in Software
Quote: "epistemic trust in machine reasoning has been championed most in mathematical domains... the shift from deterministic systems... has raised new specters for epistemic trust"
- Frame: Tool reliability as social contract
- Projection: Epistemic trust is a concept from sociology and psychology describing the relationship between cognitive agents (e.g., trusting a scientist or doctor). Applying this to software projects a social relationship onto a tool-user relationship. It implies the AI is a member of the 'collective epistemic enterprise' capable of sincerity or deception, rather than a machine that is simply reliable or unreliable. It anthropomorphizes the failure modes as breaches of trust rather than mechanical faults.
- Acknowledgment: Explicitly Acknowledged (The authors define epistemic trust via citations (Wilholt, 2013) and apply it to AI, acknowledging it is a transfer of a concept from science/psychology to AI.)
- Implications: Framing reliability as 'trust' creates emotional and social expectations. If users 'trust' an AI, they may lower their guard or attribute benevolence/neutrality to it. Reliability is verifiable; trust is relational. Promoting 'trust' in AI risks encouraging over-reliance in high-stakes domains (medicine, law) where verification, not trust, is required. It suggests the solution to AI errors is 'building trust' (relational) rather than 'fixing bugs' (technical).
Accountability Analysis:
- Actor Visibility: Partial (some attribution)
- Analysis: The text discusses 'trust in machine reasoning' and 'societal investment.' It mentions 'scientists' in the definition of trust but obscures the specific corporations asking for trust in their AI products. By framing it as a general problem of 'epistemic trust,' it diffuses the specific responsibility of companies like OpenAI or Google to demonstrate product safety before deployment. The 'specters for epistemic trust' are presented as abstract phenomena, not corporate failures.
5. Hallucination as Feature
Quote: "evidence that hallucination is a feature and not a bug... accuracy collapse on tasks of scaling complexity"
- Frame: Statistical error as psychiatric condition
- Projection: The 'hallucination' metaphor maps human perceptual/psychiatric disorders onto probabilistic error. It suggests the AI is a mind that 'perceives' the world but occasionally 'sees' things that aren't there. This projects a psyche capable of perception. Mechanistically, the model is simply generating low-probability or ungrounded tokens. It cannot hallucinate because it never perceived anything to begin with; it only processes text strings.
- Acknowledgment: Direct (Unacknowledged) (The term is used as standard technical terminology. The phrase 'feature and not a bug' reinforces the biological/inherent nature of the behavior without questioning the metaphor.)
- Implications: The 'hallucination' metaphor absolves developers of responsibility for data quality. A 'hallucination' sounds like an internal, unpredictable mental glitch that is difficult to control. If framed as 'fabrication' or 'ungrounded generation,' it sounds like a functional failure. This framing masks the fact that these systems are designed to generate plausible text, not truth. It implies the system is trying to be truthful but suffering a breakdown, rather than succeeding at being plausible but failing at factuality.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The phrase 'hallucination is a feature' attributes the behavior to the model's nature. It obscures the design decision by architects (e.g., Google, OpenAI) to use probabilistic generation (Next Token Prediction) for information retrieval tasks, a design choice known to cause fabrication. It erases the commercial decision to deploy stochastic models for factual queries.
6. The Learning Agent
Quote: "The agent learns a policy that maps states to actions... Rules can be learned autonomously from data on-the-fly."
- Frame: Parameter optimization as education
- Projection: Maps the human/biological process of 'learning' (conceptual change, understanding, skill acquisition) onto numerical parameter updates (gradient descent). Suggests the AI is an autonomous student gaining wisdom. 'Autonomously' intensifies the projection of agency, suggesting the system is self-directed in its improvement, hiding the massive infrastructure and human-designed objectives guiding the optimization.
- Acknowledgment: Direct (Unacknowledged) (Standard ML terminology used without qualification. 'Learns' and 'autonomously' are presented as literal descriptions of the algorithmic process.)
- Implications: This metaphor suggests that AI capabilities are 'grown' or 'taught' rather than built. It leads to the 'black box' excuse: 'it learned this itself, we didn't program it.' This effectively acts as a liability shield for developers. If the AI 'learns' bias, it's the bad student (or bad data), not the bad architect. It obscures the deterministic mathematics of gradient descent and the human choices in objective function design.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The subject is 'The agent' or 'Rules.' The human actors who curated the training data, designed the reward function, and tuned the hyperparameters are invisible. The phrase 'learned autonomously' specifically excludes human intervention, erasing the engineering team's role in setting the conditions that made the parameter updates inevitable.
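As an illustration of the mechanistic reading, here is a minimal, hypothetical curve-fitting loop (synthetic data and invented constants); under this reading, 'the agent learns' reduces to two parameters moving downhill on an error term that an engineer wrote.

```python
import numpy as np

# Hypothetical sketch: "learning" as iterative parameter adjustment on synthetic data.
# The objective is the squared-error loss below, chosen by the developer, not the system.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y = 3.0 * x + 0.5 + rng.normal(scale=0.1, size=100)   # data the curve is fit to

w, b = 0.0, 0.0     # the entirety of what gets "learned": two floats
lr = 0.1            # learning rate, a hyperparameter set by an engineer

for _ in range(500):
    pred = w * x + b
    grad_w = np.mean(2 * (pred - y) * x)   # gradient of the engineer-defined loss
    grad_b = np.mean(2 * (pred - y))
    w -= lr * grad_w                       # "the agent learns": numbers move downhill
    b -= lr * grad_b

print(round(w, 2), round(b, 2))   # recovers roughly 3.0 and 0.5: a curve fit, not acquired wisdom
```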
7. Chain-of-Thought (CoT) as Thinking
Quote: "CoT 'reasoning traces' often serve as a stand-in for the LRM's internal reasoning process... we contend that an imperfect r-zombie could produce convincing but untrustworthy... CoT"
- Frame: Text generation as cognitive trace
- Projection: The term 'Chain-of-Thought' (standard in the field and used here) maps the human experience of conscious, sequential problem-solving onto the generation of intermediate text tokens. Even while criticizing it as potentially 'unfaithful,' the authors retain the frame that there is an 'internal reasoning process' that the CoT should represent. This projects a dualism: the 'internal mind' (process) and the 'verbal report' (CoT), attributing an interior cognitive life to the model.
- Acknowledgment: Hedged/Qualified (The authors are critical of CoT, noting it is 'not guaranteed to be faithful.' However, they use the term 'reasoning traces' (in quotes) but then treat the 'internal reasoning process' as a real thing the CoT fails to match.)
- Implications: By debating whether CoT is 'faithful' to the 'internal process,' the discourse validates the existence of an 'internal thought process' worth mapping. It suggests the AI has a 'mind' that might be misrepresented by its 'speech.' This reinforces the illusion of depth. A mechanistic view would see CoT simply as 'intermediate token generation maximizing probability of the final answer,' without assuming an 'internal reasoner' distinct from the computation.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The text discusses 'LRMs' and 'CoT' as the actors. It obscures the prompt engineers and researchers who discovered and exploited CoT as a prompting strategy (Wei et al.). It frames the 'unfaithfulness' as a property of the zombie, not a result of training models on human explanations that weren't causally linked to the model's prediction mechanisms.
8. Evidence and Experience
Quote: "Evidence is a form of exogenous or intrinsically obtained information... Rules... can be learned autonomously from data on-the-fly. Rules can be fixed or continuously updated in light of new information."
- Frame: Input data as empirical experience
- Projection: Maps data ingestion to 'experience' and 'evidence.' In science/law, evidence implies truth-seeking and verification. In phenomenology, experience implies a subject experiencing the world. Here, it projects that the AI is an observer collecting evidence about the world, rather than a processor ingesting bitstreams. It implies an epistemic stance (weighing evidence) rather than a statistical one (updating weights).
- Acknowledgment: Direct (Unacknowledged) (Definitions 2.1 and 2.5 explicitly label input data streams as 'Evidence' and internal updates as responding to 'experience' (in claim 2.3).)
- Implications: Framing inputs as 'evidence' grants the AI the status of an investigator or judge. It implies the system's outputs are judgments based on facts, rather than probabilistic completions based on training distribution. This builds unwarranted authority; we trust a judge who weighs evidence, but we should verify a function processing inputs. It obscures the fact that 'evidence' for an AI can be pure noise or bias, which it treats identically to truth.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The text says 'Evidence is... obtained.' It does not say 'Developers scrape the internet.' It obscures the massive, often non-consensual data extraction performed by companies. By calling it 'evidence' provided to a 'reasoner,' it sanitizes the data supply chain and the copyright/privacy decisions made by the corporations.
Task 2: Source-Target Mapping
About this task
For each key metaphor identified in Task 1, this section provides a detailed structure-mapping analysis. The goal is to examine how the relational structure of a familiar "source domain" (the concrete concept we understand) is projected onto a less familiar "target domain" (the AI system). By restating each quote and analyzing the mapping carefully, we can see precisely what assumptions the metaphor invites and what it conceals.
Mapping 1: Philosophy of Mind / Horror Fiction (Zombies) → AI Systems (Large Language Models) with unverified internal logic
Quote: "r-zombies are systems that superficially behave as autonomous reasoners, but lack valid internal reasoning mechanisms."
- Source Domain: Philosophy of Mind / Horror Fiction (Zombies)
- Target Domain: AI Systems (Large Language Models) with unverified internal logic
- Mapping: The source domain (Zombies) involves entities that look human but lack a 'soul' or 'consciousness.' Mapping this to AI suggests that there are 'soulless' AIs (r-zombies) and, by implication, 'ensouled' or 'true' AIs (valid reasoners). This projects the quality of 'authenticity' or 'inner life' onto the target. It assumes that 'true reasoning' in AI is an ontological state distinct from simulation, much like consciousness is distinct from behaviorism in the source domain.
- What Is Concealed: This mapping conceals the fact that all AI reasoning is simulation in the sense that it is code execution. There is no 'ghost in the machine' for the 'valid' reasoner either. It hides the mechanistic reality that the difference between an 'r-zombie' and a 'valid reasoner' is just the strictness of the adherence to a logical rule set, not a metaphysical difference in 'aliveness' or 'understanding.' It obscures that both are artifacts.
Mapping 2: Epistemology / Human Cognition (Belief) → Computer Memory / Data Variables ($B_t$)
Quote: "Prior beliefs are the outputs of previous reasoning steps... Current beliefs denote the conclusions drawn"
- Source Domain: Epistemology / Human Cognition (Belief)
- Target Domain: Computer Memory / Data Variables ($B_t$)
- Mapping: The source domain involves 'beliefs' as mental states held by a conscious subject, usually entailing a claim to truth and a willingness to act. The target is simply the storage of variables or vector states in a sequence. The mapping assumes the AI 'holds' these values as convictions. It projects the 'curse of knowledge': the human author knows what the variable represents ($x=5$), so they attribute the 'belief that x=5' to the machine.
- What Is Concealed: It conceals the complete lack of semantic grounding. The machine does not know what '5' means or what 'x' is; it only holds the binary representation. It obscures the passive nature of the storage. A variable doesn't 'believe' its value; it just contains it. This hides the gap between syntax (symbol manipulation) and semantics (meaning), a classic issue in AI philosophy (Searle's Chinese Room) that this terminology papers over.
Mapping 3: Human Agency / Teleology → Optimization Algorithm / Loss Function
Quote: "A goal-oriented decision-maker that implements reasoning."
- Source Domain: Human Agency / Teleology
- Target Domain: Optimization Algorithm / Loss Function
- Mapping: The source domain involves agents with desires, intentions, and the capacity to make choices among alternatives based on those desires. The target is an algorithm minimizing a mathematical error term or satisfying a stopping condition. The mapping invites the assumption that the AI acts for the sake of the goal, implying foresight and intent.
- What Is Concealed: It conceals the mechanical determinism (or probabilistic determinism) of the process. The 'decision' is a calculation, not a choice. The 'goal' is a constraint imposed by the programmer, not a desire held by the system. It hides the fact that the 'decision-maker' is actually the human who set the objective function and the threshold for action. The system has no preference for the goal; it just slides down the gradient.
Mapping 4: Psychiatry / Perception → Probabilistic Text Generation Errors
Quote: "hallucination is a feature and not a bug"
- Source Domain: Psychiatry / Perception
- Target Domain: Probabilistic Text Generation Errors
- Mapping: The source domain is the human experience of perceiving sensory data that does not exist in reality, often due to pathology. The target is the generation of text that is syntactically plausible but factually incorrect. The mapping assumes the AI has a 'mind' that perceives reality and occasionally malfunctions. 'Feature not a bug' suggests this creativity/madness is an inherent personality trait.
- What Is Concealed: It conceals the statistical nature of the error. The model predicts the next likely word. If the most likely word is a fabrication, the model is working correctly according to its design (probability maximization). Calling it hallucination conceals the fact that the model never knows the truth, only the probability. It obscures the lack of 'ground truth' access in the training objective.
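A toy sketch of that concealed mechanism (the vocabulary, scores, and prompt below are invented for illustration): generation selects continuations by probability alone, with no step that consults a fact base, so a plausible fabrication and a correct answer are produced by exactly the same operation.

```python
import numpy as np

# Hypothetical toy generator: softmax over candidate continuations, then sampling.
# No line of this code checks truth; "correct" and "fabricated" are indistinguishable here.
rng = np.random.default_rng(1)

vocabulary = ["in", "1987", "1990", "Paris", "London"]
# Imaginary model scores for the prompt "The treaty was signed ...":
logits = np.array([0.1, 2.3, 2.1, 0.4, 0.2])

probs = np.exp(logits) / np.exp(logits).sum()   # convert scores to probabilities
next_token = rng.choice(vocabulary, p=probs)    # chosen by plausibility alone

print(dict(zip(vocabulary, probs.round(2))), "->", next_token)
```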
Mapping 5: Pedagogy / Biology → Parameter Adjustment / Curve Fitting
Quote: "The agent learns a policy that maps states to actions."
- Source Domain: Pedagogy / Biology
- Target Domain: Parameter Adjustment / Curve Fitting
- Mapping: Source domain is an organism adapting to its environment to survive, or a student acquiring knowledge. Target is the mathematical adjustment of weights to minimize loss. The mapping assumes the AI is 'trying' to improve and 'gains' knowledge. It implies a cumulative, coherent worldview is being built.
- What Is Concealed: It conceals the brute-force nature of the 'learning' (processing trillions of tokens). It hides the fact that the 'policy' is just a high-dimensional curve fit. It obscures the brittleness: change the distribution slightly, and the 'learning' evaporates (catastrophic forgetting), unlike organic learning, which generalizes. It hides the energy and labor cost of the 'training' run.
Mapping 6: Social Psychology / Interpersonal Relationships → System Reliability / Verification
Quote: "epistemic trust in machine reasoning"
- Source Domain: Social Psychology / Interpersonal Relationships
- Target Domain: System Reliability / Verification
- Mapping: Source is the trust between people (e.g., patient-doctor), involving vulnerability and reliance on good will. Target is the statistical reliability of software output. Mapping invites users to feel a 'relationship' with the AI, expecting it to 'care' about being truthful.
- What Is Concealed: It conceals the indifference of the machine. The machine cannot 'betray' trust because it never made a promise. It conceals the need for audit (checking the mechanism) by replacing it with trust (relying on the entity). It obscures the commercial interests: companies want users to 'trust' the bot so they don't sue when it fails.
Mapping 7: Autonomy / Self-Governance → Unsupervised / Self-Supervised Learning algorithms
Quote: "Rules can be learned autonomously from data on-the-fly."
- Source Domain: Autonomy / Self-Governance
- Target Domain: Unsupervised / Self-Supervised Learning algorithms
- Mapping: Source is a sovereign entity making its own laws or rules. Target is an algorithm identifying patterns without explicit labels. The mapping assumes the AI is the source of the rule, projecting creativity and authority.
- What Is Concealed: It conceals the dependency on the data. The 'rule' is latent in the data; the AI just extracts it. It hides the fact that the 'autonomy' is strictly bounded by the hyper-parameters set by engineers. It erases the human design of the learning architecture that dictates what kinds of rules can be learned.
Mapping 8: Cognitive Psychology (Dual Process Theory) → Compute Cycles / Search Trees
Quote: "System 2 thinking... is sometimes referenced as a metaphor for inference-time scaling"
- Source Domain: Cognitive Psychology (Dual Process Theory)
- Target Domain: Compute Cycles / Search Trees
- Mapping: Source is the slow, deliberative, conscious, effortful human thought process. Target is simply running the processor longer to search more paths before outputting. Mapping implies the computer is 'thinking harder' or 'reflecting.'
- What Is Concealed: It conceals the fact that 'inference-time scaling' is just more calculation, not a different kind of cognition. System 2 in humans involves meta-cognition and conscious awareness. In AI, it's just a deeper search tree. It obscures the lack of self-awareness in the 'deliberation.'
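A minimal, hypothetical sketch of what 'inference-time scaling' can amount to: best-of-N sampling, where the 'deliberation' is just running the same stochastic procedure more times and keeping the candidate that a developer-chosen scorer prefers. (The generator and scorer below are stand-ins, not the paper's method.)

```python
import numpy as np

# Hypothetical sketch: "thinking harder" as repeating the same computation N times
# and selecting the best candidate under a scoring function engineers defined.
rng = np.random.default_rng(2)

def generate_candidate() -> np.ndarray:
    """Stand-in for one stochastic forward pass or search rollout."""
    return rng.normal(size=8)

def score(candidate: np.ndarray) -> float:
    """Stand-in for a verifier or reward model supplied by the developers."""
    return float(-np.abs(candidate).sum())

def answer(n_samples: int) -> float:
    """More samples = more compute, not a different kind of cognition."""
    return max(score(generate_candidate()) for _ in range(n_samples))

print(answer(1), answer(64))   # the larger budget tends to score better; same operations throughout
```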
Task 3: Explanation Audit (The Rhetorical Framing of "Why" vs. "How")
About this task
This section audits the text's explanatory strategy, focusing on a critical distinction: the slippage between "how" and "why." Based on Robert Brown's typology of explanation, this analysis identifies whether the text explains AI mechanistically (a functional "how it works") or agentially (an intentional "why it wants something"). The core of this task is to expose how this "illusion of mind" is constructed by the rhetorical framing of the explanation itself, and what impact this has on the audience's perception of AI agency.
Explanation 1
Quote: "Reasoning is the process of selecting and applying sequences of rules that act on prior beliefs and current evidence to obtain principled belief updates in evolving states."
Explanation Types:
- Functional: Explains behavior by role in self-regulating system with feedback
- Theoretical: Embeds in deductive framework, may invoke unobservable mechanisms
Analysis (Why vs. How Slippage): This explanation focuses on the how (mechanism) of reasoning, breaking it down into component parts (rules, beliefs, evidence). It is functional because it describes the role of each component in the transition of states. However, it relies on theoretical constructs ('beliefs', 'rules') that are imposed definitions rather than observable physical components of a neural net. By framing it mechanistically, it emphasizes the procedure but obscures the physical reality: that these are matrix multiplications, not 'rule applications' in the symbolic sense.
Consciousness Claims Analysis: The passage uses strong consciousness/epistemic verbs ('beliefs', 'principled updates') within a mechanistic syntax. It claims the system has beliefs to update. This is a clear example of the 'curse of knowledge': the authors define $B_t$ as 'beliefs' and then attribute the epistemic properties of belief (justification, updates) to the variable. While the description is technical (states, transitions), the vocabulary attributes mental content to the processing states. It asserts that the system is an epistemic agent ('obtain principled belief updates') rather than a data transformer.
Rhetorical Impact: The framing constructs the AI as a rational, logical engine. It increases trust by using the language of logic and validity ('principled', 'rules'). It suggests that if we can just see the 'rules,' the system is trustworthy. It obscures the risk that the 'rules' might be incomprehensible matrices. It positions the AI as a valid participant in logic, elevating it from a tool to a 'reasoner' that follows principles.
Explanation 2
Quote: "The reasoner generally executes a reasoning process to achieve some outcome of interest. This outcome is the goal one is reasoning toward: the answer to a complex question... the optimal action to take."
Explanation Types:
- Intentional: Refers to goals/purposes, presupposes deliberate design
- Reason-Based: Gives agent's rationale, entails intentionality and justification
Analysis (Why vs. How Slippage): This explanation shifts to the why (agential). It defines the system ('reasoner') by its purpose ('to achieve some outcome'). It attributes 'goals' to the reasoner. This emphasizes the teleology: the system wants the answer. It obscures the fact that the 'goal' is an external constraint (loss function) imposed by the user/programmer. The reasoner doesn't have a goal; the user has a goal, and the reasoner is the tool.
Consciousness Claims Analysis: This passage attributes high-level agency ('executes... to achieve'). It assumes the system holds the goal in mind and directs its resources toward it. Mechanistically, the system is simply following the gradient of a function. There is no 'reasoning toward' a goal in the mental sense of holding a desire; there is only the mathematical minimization of error. The language implies a conscious striving ('reasoning toward') that implies awareness of the gap between current state and goal state.
Rhetorical Impact: This makes the AI seem like a helpful partner or employee working toward a shared goal. It fosters relational trust. It also implies competenceβif it has a goal, it must know what the goal is. This risks users assuming the AI understands the intent of the goal, not just the literal specification, leading to alignment errors (the 'paperclip maximizer' problem is obscured by assuming the reasoner shares our 'outcome of interest').
Explanation 3
Quote: "Recent progress has been fueled by the remarkable empirical performance of large reasoning models (LRMs)... A wave of benchmarking successes invites many questions..."
Explanation Types:
- Empirical Generalization: Subsumes events under timeless statistical regularities
- Genetic: Traces origin through dated sequence of events or stages
Analysis (Why vs. How Slippage): This explains the rise of the field via empirical success (performance/benchmarks). It frames the 'why' of current interest as a result of observed data (high scores). It emphasizes the output (performance) while noting the obscurity of the process. It's a genetic account of the field's evolution ('fueled by...'). It obscures the specific commercial drivers (investment, hype) by focusing on 'benchmarking successes' as the driver.
Consciousness Claims Analysis: This passage is more grounded, focusing on 'performance' and 'benchmarking.' However, it uses the term 'Large Reasoning Models' (LRMs) as a factual category. This begs the question: it assumes they are reasoning models because they perform well on benchmarks. It attributes the capability ('reasoning') based on the output ('successes'), illustrating the very conflation of process and product the authors later criticize. It attributes the quality of 'reasoning' to the models before the definition is even established.
Rhetorical Impact: By labeling them 'Large Reasoning Models,' the text canonizes their status as reasoners. It creates a 'fait accompli': reasoning is already happening; we just need to measure it. This increases the perceived power of the technology. It shapes policy by suggesting we are regulating 'reasoning agents' rather than 'text generators,' potentially triggering different legal frameworks.
Explanation 4
Quote: "System 2 thinking... is sometimes referenced as a metaphor for inference-time scaling... System 2 entails slow, deliberative, effortful, and logical cognition."
Explanation Types:
- Theoretical: Embeds in deductive framework, may invoke unobservable mechanisms
- Functional: Explains behavior by role in self-regulating system with feedback
Analysis (Why vs. How Slippage): It uses a psychological theory (Kahneman's System 2) to explain a computational function (inference scaling). It frames the how of the AI in terms of the how of the human mind. It emphasizes the similarity (slowness, logic) but potentially obscures the vast difference in mechanism (synaptic firing vs. tree search). It treats the metaphor as an explanation of function.
Consciousness Claims Analysis: This is a key site of consciousness projection. 'Deliberative,' 'effortful,' and 'thinking' are conscious experiences. A computer does not feel 'effort' (energy cost is not effort). It does not 'deliberate' (weigh options subjectively); it calculates paths. Even though acknowledged as a metaphor, the detailed mapping ('slow, deliberative') encourages the audience to attribute the qualities of human thought (care, reflection) to the machine's brute-force search.
Rhetorical Impact: Calling it 'System 2' gives the AI profound intellectual weight. System 2 is rationality itself. If AI has System 2, it is rational. This generates immense unwarranted trust in the model's judgments. It implies the AI is 'thinking it through' like a careful human, reducing the perceived need for external verification. It humanizes the latency of the model: it's not 'slow processing,' it's 'deep thinking.'
Explanation 5
Quote: "The agent learns a policy that maps states to actions... Update rules in RL often take the following form... where Qt+1 is the estimated reward."
Explanation Types:
- Functional: Explains behavior by role in self-regulating system with feedback
- Genetic: Traces origin through dated sequence of events or stages
Analysis (Why vs. How Slippage): This explains the AI's behavior by its learning history (Genetic) and its internal update mechanism (Functional). It describes how the policy is formed through equations. It emphasizes the mathematical basis (Equation 3) but retains the agential frame ('The agent learns'). It obscures the external designer who chose the update rule and the reward signal.
Consciousness Claims Analysis: The math is precise ($Q_{t+1}$), but the prose frames it as 'learning' and 'estimating.' 'Estimation' implies an epistemic attempt to guess the truth. 'Learning' implies cognitive growth. The juxtaposition of the agentic sentence ('agent learns') with the mechanical equation creates a bridge where the math validates the agency. It implies the equation is the mechanism of the agent's mind. It projects understanding ('estimated reward') onto a variable update.
Rhetorical Impact: This combination of math and agency makes the 'learning' claim seem scientifically proven. It legitimizes the anthropomorphism with Greek letters. It convinces the audience that 'learning' is a solved technical problem, not a metaphor. It diffuses risk: if the agent 'learns' a policy, the behavior is an emergent property of the math, not a direct script written by the developer, distancing the creator from the outcome.
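The update rule quoted above is not reproduced in this analysis. For readers unfamiliar with the notation, a standard temporal-difference (Q-learning) update, which the paper's Equation 3 may resemble but is not guaranteed to match, has the form:

$$Q_{t+1}(s_t, a_t) = Q_t(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a'} Q_t(s_{t+1}, a') - Q_t(s_t, a_t) \right]$$

Every quantity on the right-hand side is supplied from outside the 'agent': the reward $r_t$ is defined by the development team, and the learning rate $\alpha$ and discount $\gamma$ are hyperparameters chosen by engineers, which is precisely the point the accountability reframing above insists on.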
Task 4: AI Literacy in Practice - Reframing Anthropomorphic Language
About this task
This section proposes alternative language for key anthropomorphic phrases, offering more mechanistic and precise framings that better reflect the actual computational processes involved. Each reframing attempts to strip away the projections of intention, consciousness, or agency that are embedded in the original language.
V3 Enhancement: A fourth column addresses human agency restoration: reframing agentless constructions to name the humans responsible for design and deployment decisions.
| Original Anthropomorphic Frame | Mechanistic Reframing | Technical Reality Check | Human Agency Restoration |
|---|---|---|---|
| A goal-oriented decision-maker that implements reasoning. | A computational system that executes an optimization algorithm to minimize a specified loss function through iterative data processing. | The system does not make decisions or hold goals; it executes a pre-defined path-finding algorithm based on gradient descent or tree search to satisfy a mathematical stopping criterion. | Developers at [Company] designed the objective function and deployed the system to optimize for specific outputs. |
| Prior beliefs are the outputs of previous reasoning steps... Current beliefs denote the conclusions drawn | Prior state vectors are the outputs of previous processing iterations... Current state vectors denote the numerical values computed | The model stores data representations (embeddings/tensors) in memory. It does not hold 'beliefs' (justified true convictions) but simply retains the output of function $f(x)$ for use in function $g(x)$. | N/A - describes computational processes without displacing responsibility. |
| The agent learns a policy that maps states to actions. | The model's parameters are adjusted via feedback loops to approximate a function mapping input vectors to output vectors. | The system does not 'learn' in a cognitive sense; it fits a curve to a dataset. The 'policy' is a probability distribution over possible outputs, conditioned on inputs. | Engineers configured the reinforcement learning algorithm to adjust the model's weights based on a reward signal defined by the development team. |
| hallucination is a feature and not a bug | Fabrication of non-factual content is a statistical inevitability of probabilistic token generation. | The model generates the next most probable token based on training data correlations. It has no access to ground truth, so it cannot 'hallucinate' (perceive falsely); it simply generates text that resembles facts without checking validity. | Developers chose to use probabilistic language models for information retrieval tasks despite knowing these architectures prioritize plausibility over factuality. |
| Rules can be learned autonomously from data on-the-fly. | Pattern-matching functions can be extracted from dataset correlations during the training process. | The system identifies statistical regularities (patterns) in the data. It does not learn 'rules' (explicit logical commands) unless hard-coded; it approximates rule-like behavior via high-dimensional vector operations. | Researchers designed the architecture to extract patterns from data collected by [Company], allowing the system to approximate behaviors without explicit programming. |
| epistemic trust in machine reasoning | verification of the reliability of automated data processing outputs | One cannot 'trust' a machine in the epistemic sense (believing its testimony). One can only verify the error rate of its output distribution. The system has no intent to be truthful. | Users must verify the outputs of the system deployed by [Company], rather than relying on the vendor's claims of reliability. |
| The reasoner generally executes a reasoning process to achieve some outcome of interest. | The algorithm executes a processing sequence to satisfy a user-defined termination condition. | The system does not have an 'interest' or 'outcome' it strives for. It runs until the code dictates a stop. The 'outcome' is a result, not an achievement. | The user initiates the process to generate a desired output, using the tool provided by the developers. |
| AI reasoning is a (learnable) rule-based process. | AI text generation is a statistically optimized process approximating rule-following behavior. | The process involves matrix operations that approximate logical steps. The 'rules' are implicit in the weights, not explicitly followed as in a logic gate (unless using a specific neuro-symbolic architecture). | N/A - describes computational processes without displacing responsibility. |
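As a complement to the table, here is a minimal, hypothetical sketch of the reframed 'reasoner executes a reasoning process to achieve some outcome of interest': a loop that runs until a caller-supplied stopping condition is met, with the 'goal' living entirely in the caller's code. (Function names and values are illustrative only.)

```python
# Hypothetical sketch of the mechanistic reframing: no goals or interests,
# just repeated application of an update until a termination condition holds.

def run_until(stop_condition, update, state):
    """Apply `update` repeatedly; halt when `stop_condition` returns True."""
    while not stop_condition(state):
        state = update(state)
    return state

# The "outcome of interest" is specified by the user who calls the function:
result = run_until(
    stop_condition=lambda s: s >= 100,   # termination threshold chosen by the caller
    update=lambda s: s * 2 + 1,          # the mechanical step the system repeats
    state=1,
)
print(result)   # 127: an output produced, not an outcome "achieved"
```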
Task 5: Critical Observations - Structural Patterns
Agency Slippage
The text systematically oscillates between treating the AI as a mathematical object and an intentional agent. This slippage serves a specific rhetorical function: establishing scientific rigor while maintaining narrative power.
The Definition Phase (Mechanical -> Agential): In Section 2.1, the text begins with a high-level definition: 'A goal-oriented decision-maker' (Agential). This establishes the AI as the protagonist. Immediately after, it defines 'State' and 'Process' using mathematical notation ($S_t, B_t$), moving to the mechanical to prove rigor.
The Explanation Phase (Agential Dominance): When explaining how it works (e.g., RL, Section 2.2), the text slips back to 'The agent learns a policy.' Here, the agentless construction ('policy is learned') often alternates with 'Agent learns,' effectively obscuring the engineers (Hidden Agency). For instance, 'Rules can be learned autonomously' completely erases the human architect.
The Critique Phase (Curse of Knowledge): When criticizing current models ('r-zombies'), the authors project their own understanding of 'reasoning' onto the system to declare it lacking. They treat the AI as a failed agent (zombie) rather than a successful machine (text generator).
This oscillation allows the authors to claim the authority of computer science (math) while discussing the AI in the intuitive terms of psychology (beliefs, goals). The 'Curse of Knowledge' is evident in the definition of 'Beliefs' ($B_t$). The authors know $B_t$ is just data, but by naming it 'Belief,' they slip into treating the system as a 'knower.' This slippage makes it 'sayable' that an AI has beliefs, a claim that would be rejected if phrased 'the matrix contains vector x.'
Metaphor-Driven Trust Inflation
The paper explicitly addresses 'epistemic trust,' yet its own metaphorical choices construct a form of trust that undermines its call for rigor.
Consciousness as Trust Signal: By defining the AI as a 'Reasoner' with 'Beliefs,' the text implicitly signals that the system is a rational entity. We trust reasoners; we trust entities with beliefs (if justified). This invokes 'relation-based trust' (sincerity/competence of an agent) rather than 'performance-based trust' (reliability of a tool).
The 'Valid' Reasoner Authority: The central argument is for 'process validity.' However, by framing this valid process as 'True Reasoning' (vs. Zombie emulation), the text constructs a hierarchy where the 'Valid AI' is accorded the status of a 'Knower.' This implies that a 'valid' system is trustworthy not just because it is accurate, but because it is thinking correctly.
Failure as Pathology: Framing errors as 'hallucinations' or 'zombie' behavior suggests that the problem is a lack of 'life' or 'health' in the system. This implies that the solution is to make the system 'healthier' (better reasoning), inviting trust in the intent of the research program to create 'healthy' minds.
Risks: The text encourages audiences to withhold trust from 'zombies' but potentially extend it uncritically to 'valid reasoners.' If a system is mathematically 'valid' (follows rules), the text implies it is 'trustworthy.' But a system can validly follow biased, harmful, or dangerous rules. The metaphor of 'validity' acts as a stamp of approval that might obscure the content of the reasoning.
Obscured Mechanics
The anthropomorphic metaphors systematically conceal the material and economic realities of AI production.
The 'Evidence' Euphemism: By calling input data 'Evidence' and 'Experience,' the text obscures the massive data extraction industry. 'Evidence' sounds like clues found by a detective. In reality, it is often copyrighted work, personal data, and creative output scraped by corporations (OpenAI, Google, Microsoft). The metaphor hides the taking of data and frames it as the receiving of evidence.
The 'Belief' Abstraction: Calling $B_t$ 'Beliefs' hides the vector dimensionality and the hardware requirements. It creates an abstraction layer that allows the text to ignore how these states are stored (VRAM costs, energy consumption). It creates a 'mind' where there is only memory.
Hidden Labor: The discussion of 'Rules' being 'learned' (Claim 2.3) obscures the role of RLHF (Reinforcement Learning from Human Feedback). The 'rules' are often just the aggregated preferences of underpaid human annotators. The text says the 'agent learns,' hiding the fact that the 'worker teaches.'
Proprietary Opacity: The text discusses 'LRMs' (Large Reasoning Models) without naming the proprietary barriers. It implies we can inspect the 'rules' ($R_t$) to check validity. For models like GPT-4, these 'rules' (weights) are trade secrets. The metaphor of 'checking validity' assumes a transparency that corporate owners (Microsoft, Google) actively prevent.
Beneficiaries: This concealment benefits the model producers. It frames the AI as a scientific artifact to be studied, rather than a commercial product built on extracted data and hidden labor.
Context Sensitivity
The deployment of anthropomorphism varies strategically across the text.
Introductory Vision (High Intensity): The introduction is dense with agential language ('Autonomous reasoning,' 'goal-oriented,' 'decision-maker'). This sets the stakes: we are dealing with intelligent agents. This hooks the reader and establishes the importance of the field.
Formal Definition (Pseudo-Mechanical): Section 2 switches to mathematical notation ($S_t, R_t, B_t$). However, the labels for these variables ('Beliefs', 'Evidence') remain anthropomorphic. The density of consciousness verbs drops, but the foundational nouns remain agential. This tries to borrow the credibility of math to sanitize the metaphors.
Rebuttals/Discussion (Aggressive Agentialism): In the rebuttals (Section 3), the language becomes combative and highly metaphorical ('throwing baby out with bathwater,' 'zombies'). Here, the text defends 'symbolic AI' by framing it as 'true reasoning,' intensifying the distinction between 'real' and 'fake' minds.
Asymmetry: Capabilities are described agentially ('The agent learns,' 'Reasoning achieves goals'). Limitations are often described mechanistically or passively ('hallucination is a feature,' 'accuracy collapse'). The AI does the good things (Reasoning); the math/data causes the bad things (Hallucination). This asymmetry protects the image of the AI as a sophisticated agent, downgrading it to a machine only when it fails.
Accountability Synthesis
This section synthesizes the accountability analyses from Task 1, mapping the text's "accountability architecture": who is named, who is hidden, and who benefits from obscured agency.
The text constructs an 'accountability sink' by splitting the AI into two entities: the 'Reasoning Zombie' (bad, deceptive) and the 'Valid Reasoner' (good, logical).
Displaced Agency: The primary actors in the text are the 'Reasoner,' the 'Agent,' and the 'Model.' Human actors (Engineers, Corporations) are largely 'Hidden' or 'Partial' (generic 'researchers'). Decisions to deploy, decisions to scrape data, and decisions to prioritize scale over safety are framed as 'historical trends' or 'waves of AI' rather than corporate strategies.
The Zombie Scapegoat: The 'r-zombie' concept serves as a vessel for blame. Deception, hallucination, and untrustworthiness are properties of the zombie, a defective category of AI. This implies that the 'correct' AI (which the authors propose) would be free of these moral failings. It shifts responsibility from creating safe products to achieving the right definition.
Liability Implications: If a model 'hallucinates,' the text frames this as an inherent 'feature' of the technology or a 'zombie' trait. This diffuses legal liability. If it's a 'feature,' it's not negligence; it's physics. By contrast, naming the actor would reveal: 'Company X chose an architecture known to fabricate.'
Naming the Actor: If we replace 'The agent learns a policy' with 'Google engineers trained the model to maximize engagement,' the accountability shifts immediately. The focus moves from the 'mind' of the agent to the ethics of the engineers. The current text serves the academic and industrial interest of treating AI as a natural phenomenon to be studied, rather than a manufactured product to be regulated.
Conclusion: What This Analysis Reveals
The text relies on a foundational pattern of 'Formalized Anthropomorphism.' It attempts to tame the ambiguity of AI through rigorous mathematical definitions ($S_t, B_t, R_t$), yet it populates these formulas with deeply loaded consciousness terms: 'Beliefs,' 'Reasoning,' 'Evidence,' and 'Goals.' This creates a system where the mathematical rigor acts as a vehicle for validating the anthropomorphic projection. A secondary, load-bearing pattern is the 'Zombie vs. Reasoner' dichotomy. This binary assumes that 'true' reasoning (consciousness-adjacent valid processing) is possible and distinct from 'mere' simulation. This distinction upholds the entire argument: without the assumption that 'real' reasoning exists and is superior, the critique of 'r-zombies' collapses into a critique of all computation. The 'Reasoning' frame relies on the 'Belief' frame; if the system has no beliefs, it cannot reason, only calculate.
Mechanism of the Illusion:
The 'illusion of mind' is constructed through a 'Curse of Knowledge' feedback loop. The authors, expert in the logic of the system, project their understanding of the meaning of the data onto the system itself. The illusion works by (1) establishing a high-level agentic frame ('The Reasoner'), then (2) grounding it in symbols ($B_t$), and finally (3) treating the symbols as proof of the agency. The temporal structure is critical: the text first creates the 'Zombie' (the fake mind), which paradoxically reinforces the existence of the 'Real Mind' (the valid reasoner). By arguing against the fake, they validate the category of the real. The rhetorical sleight-of-hand is defining 'Reasoning' as a 'process' (mechanical) but populating that process with 'Beliefs' (mental), allowing the text to slide between 'it processes data' and 'it thinks' without triggering the reader's skepticism.
Material Stakes:
Categories: Regulatory/Legal, Epistemic
The stakes of this framing are high for regulation and science.
Regulatory/Legal: If definitions of 'Reasoning' and 'Belief' are accepted in policy, it shifts the regulatory focus from harm reduction (outputs) to architectural purity (process). Regulators might mandate 'valid reasoning' (as defined here: exact rule application), which favors Symbolic/Neuro-symbolic approaches (often Microsoft/DeepMind backed) over pure stochastic models. This could create a regulatory moat. Furthermore, attributing 'decision-making' to the AI (the 'Reasoner') complicates liability. If the 'Reasoner' made the decision based on its 'Beliefs,' the manufacturer can claim the system acted autonomously, potentially shielding them from negligence claims.
Epistemic: In science, framing data variables as 'Beliefs' degrades the precision of language. It encourages researchers to study the 'psychology' of the model rather than its engineering. This leads to 'anthropology of the machine'βtreating the AI as an alien subjectβrather than computer science. It obscures the need for mechanistic interpretability by suggesting we can understand the system by analyzing its 'beliefs' rather than its weights.
AI Literacy as Counter-Practice:
Practicing critical literacy requires systematically replacing the 'Cognitive' vocabulary with 'Computational' vocabulary. Reframing 'The agent learns a policy' to 'The algorithm minimizes error on the training set' strips away the illusion of autonomy. Reframing 'Hallucination' to 'Fabrication' places the onus on the system's design constraints rather than a pseudo-psychological glitch.
This precision resists the 'Accountability Sink.' When we say 'OpenAI's model fabricates facts due to probabilistic sampling,' the path to solution (change the sampling, penalize the company) is clear. When we say 'The model hallucinates,' the path is obscure (therapy for the model?). Resistance to this precision comes from the 'AI Hype' machine and even researchers themselves, who benefit from the prestige of studying 'minds' rather than 'matrices.' Adopting mechanistic language threatens the narrative that AI is on the verge of AGI, which drives funding and stock prices.
Path Forward
The discourse faces a fork.
The Anthropomorphic Status Quo: Continues using 'Reasoning,' 'Beliefs,' and 'Thinking.' This creates an intuitive interface for users and maintains the 'magic' that drives investment. However, it risks catastrophic trust failures when 'Reasoners' make basic errors, and it confuses the legal status of AI agents.
The Mechanistic Turn: Journals and regulators could demand 'Operational Descriptions' that forbid consciousness verbs for software. We talk of 'processing,' 'optimizing,' and 'generating.' This deflates the hype but clarifies the risks. It makes the technology boring but governable.
The Hybrid Future (Process Validity): The authors' proposal. We keep the high-level terms ('Reasoning') but demand strict internal logic ('Validity'). This risks creating a 'High Priest' class who certify which AIs are 'True Reasoners' and which are 'Zombies,' potentially consolidating power in the hands of those who define 'Validity' (likely large tech labs).
Conclusion: A shift to mechanistic precision is analytically superior for safety and accountability but economically disadvantageous for the 'AI Revolution' narrative. The choice of words will determine whether we regulate a tool or negotiate with a phantom.
Extended Processing Summary
Gemini's Processing Steps
Gemini refers to the text below as "thought summaries." This is an overt consciousness projection: 'thoughts' and 'intentions' are hallmarks of a conscious mind that 'knows' what it is doing and why, while the concealed mechanistic process is probabilistic text generation. Treat the summaries as just another rhetorical artifact: a way of making the model's processing legible. When the model writes "I should double check that...", it is not introspecting; it is emitting tokens that re-condition its own probability distribution over subsequent tokens toward more accurate output. The "I" is a functional gear in the machine, not a self-reference. This language is a deliberate interface design choice, not a biological reality. The developers could have programmed the model to output passive logs; instead, they chose to simulate a narrator.
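A minimal sketch of the "I should double check that..." point follows; the toy distributions are invented and stand in for a real model's conditional probabilities. The phrase, once emitted, becomes part of the context that conditions the next-token distribution, and that is the entire mechanism behind the apparent self-monitoring.

```python
# Minimal sketch (invented probabilities): a real model computes
# P(next token | context) with a neural network; this toy switches
# distributions on a substring match to show the conditioning effect.
def next_token_distribution(context):
    if "double check" in context:
        return {"Actually": 0.55, "Therefore": 0.30, "Canberra": 0.15}
    return {"Therefore": 0.60, "Actually": 0.25, "Canberra": 0.15}

base = "The capital of Australia is Sydney."
revised = base + " I should double check that."

for ctx in (base, revised):
    print(ctx, "->", next_token_distribution(ctx))
# The "I" does not introspect; it is a token sequence that, once generated,
# re-conditions the model's own distribution over what it generates next.
```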
Defining the Analytical Framework
I'm now establishing the analytical framework. My focus is understanding the paper's core arguments and how they relate to AI metaphors and anthropomorphism. I'm prioritizing the identification of key passages, paying attention to the author's choice of language regarding AI and consciousness. This stage aims to define a precise scope for the CDA.
Examining the Core Concepts
I'm now diving deeper into the source text, specifically analyzing "Beyond Reasoning Zombies." The authors' central argument concerning "process validity" as the key to genuine reasoning is becoming clearer. I'm noting the implications of their concept of "reasoning zombies" (r-zombies) for how we evaluate AI systems. The use of metaphorical language, even in a critical context, adds another layer of complexity that I'm considering. My JSON schema will outline how this displaces human agency.
Analyzing AI Metaphors
I'm now focusing on the "Reasoning Zombie" metaphor, as defined in the source text. I've broken it down to understand its components, mapping the "Reasoner", "Beliefs", "Evidence", "Rules", and "Hallucination" to their AI system counterparts. I'm noting how this builds on the philosophical zombie concept, and I'm drafting my metaphor audit, specifically regarding its binary "alive/undead" implications. Next, I am reviewing the concept of "Beliefs" in the source text.
Mapping AI Terminology
I'm now cataloging the AI system elements that are explicitly presented through metaphor. I'm focusing on "beliefs," directly mapping weights and vectors to epistemic states and noting the lack of explicit justification or awareness in their current use. I am also mapping the "Reasoner" metaphor to decision-makers and highlighting the concept of "hallucination" in the source material, along with its implications in an AI context. Finally, I've considered the learning and memory metaphors as well.
Identifying Key Metaphors
I'm now identifying additional key metaphors present in the text, specifically focusing on "projection," "trust/trustworthiness," and "thinking." I'm also examining the implications of "thinking" as a metaphor for inference-time scaling and mapping "trust" to technical reliability. I'm noting how each metaphor functions and its potential significance for how AI is understood and evaluated. I've re-reviewed "understanding" and removed it as not applicable.
Unpacking Metaphorical Functions
I'm currently dissecting how each metaphor functions, focusing on what it conceals about AI behavior. The use of "Target (LLM behavior)" hides the non-conscious nature of all AI, even models demonstrating advanced reasoning. I'm noting how "Belief" obscures the lack of semantic understanding, and "Reasoner" conceals the software execution's nature. Also, I'm examining how "Rule-Governed" misrepresents the opaque nature of neural net vectors. Furthermore, I'm analyzing the implications of the "Hallucination" metaphor, given LLMs' probabilistic nature. Finally, "Goal-Oriented" hides the lack of intrinsic motivation.
Reframing the Conceptual Framework
I'm now reframing specific AI-related phrases in more precise, less anthropomorphic language and creating more precise analogies. For instance, I'm reframing "A goal-oriented decision-maker that implements reasoning" as "A computational system optimizing an objective function through iterative data processing." Similarly, I'm refining "Prior beliefs are the outputs of previous reasoning steps" into "Prior state vectors are the outputs of previous processing iterations." I'm doing this to avoid the implication of conscious decision-making. I will repeat this process with "hallucination," "thought," and "understanding."
Revising Anthropomorphic Language
I'm now revising phrases to demystify AI, with a focus on agency and trust. I've rephrased "The agent learns a policy that maps states to actions" as "The algorithm optimizes a function mapping input vectors to output vectors based on reward signal feedback." Additionally, "AI learns to reason" becomes "Model parameters are adjusted to minimize error on logic-puzzle datasets." I've also reframed "Hallucination is a feature" to "Fabrication of non-factual content is a statistical inevitability of probabilistic token generation," to remove any perception of human characteristics.
Generating Critical Observations
I'm now generating critical observations based on the analysis. The concept of agency is slippery, as the paper shifts between describing reasoning as a process and framing a "Reasoner" as a goal-oriented decision-maker. I'm focusing on "epistemic trust," and considering how, even when emphasizing "process validity," the language used can inadvertently promote relational trust.
Analyzing Hidden Elements
I'm now focusing on obscured mechanics in the text. "Belief" hides vector math, and "Rules" obscures the opacity of learned weights. I'm examining the paper's context-sensitive shifts between high-level argumentation and mechanistic definitions, especially in its use of agentic vs. mechanistic language. The paper demands operational definitions for accountability, but its binary "r-zombie" vs "true reasoner" dichotomy implies the latter would have agency.
Formalizing AI Concepts
My focus is formalizing key concepts. I'm noting how the attempt to formalize "belief" and "reasoning" still carries inherent linguistic baggage. I'm also examining the implications for regulation, especially concerning the need for interpretability. The "Bitter Lesson" perspective, prioritizing performance over interpretability, presents a significant resistance point. Finally, I'm considering the future of "process validity" and the implications for hybrid neuro-symbolic systems.
Run ID: 2026-02-17-position-beyond-reasoning-zombies-ai-rea-metaphor-nfmjxl
Raw JSON: 2026-02-17-position-beyond-reasoning-zombies-ai-rea-metaphor-nfmjxl.json
Framework: Metaphor Analysis v6.4
Schema Version: 3.0
Generated: 2026-02-17T09:09:08.955Z
Discourse Depot © 2025 by TD is licensed under CC BY-NC-SA 4.0