Mechanism of Illusion Library
This library collects analyses of HOW each text's metaphorical system creates the "illusion of mind." Each entry examines the internal logic of persuasion—the rhetorical moves made, their sequence, and the audience vulnerabilities exploited.
Key patterns: consciousness projection sequences, "curse of knowledge" dynamics (authors projecting understanding onto systems), temporal structure of metaphorical framing, and the causal chains that lead audiences to accept agential claims.
Consciousness in Large Language Models: A Functional Analysis of Information Integration and Emergent Properties
Source: https://ipfs-cache.desci.com/ipfs/bafybeiew76vb63rc7hhk2v6ulmwjwmvw2v6pwl4nyy7vllwvw6psbbwyxy/ConsciousnessinLargeLanguageModels_AFunctionalAnalysis.pdf
Analyzed: 2026-04-18
The metaphorical system creates the 'illusion of mind' through a highly sophisticated rhetorical sleight-of-hand: the literalization of functional analogies through the curse of knowledge. The central trick relies on temporal sequencing. The text first establishes the AI system within a rigorous, mechanistic framework, utilizing equations and technical jargon ('multi-head attention', 'key-value cache'). Having secured scientific authority, the author then observes the model's output—text that perfectly mimics human reasoning and humility. Falling prey to the curse of knowledge, the author projects their own human psychological mechanisms onto the machine to explain the output.
Because the text reads like it was written by an introspective human who 'acknowledges uncertainty', the author attributes the conscious state of uncertainty to the system. This blurs the processing/knowing distinction completely. The illusion exploits a profound audience vulnerability: our evolutionary hardwiring to attribute intention and mind to anything that communicates with us in natural language. The text capitalizes on this prior bias. By using Reason-Based and Intentional explanation types, the author gives the audience permission to indulge their anthropomorphic instincts under the guise of scientific theory. It is a subtle shift—moving from 'global information availability' (mathematics) to 'conscious reasoning' (mind)—that seamlessly walks the reader across the bridge from computer science to science fiction without them ever realizing the boundary was crossed.
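The contrast the entry names is easy to make concrete: the 'mathematics' register really is just matrix products. Below is a minimal numpy sketch of single-head scaled dot-product attention (toy shapes and random weights, not the paper's code); 'global information availability' literally denotes this kind of weighted mixing across positions.

```python
import numpy as np

def attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product attention. Every position gets
    weighted access to every other position: this is the literal content
    of 'global information availability', a chain of matrix products."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])         # pairwise similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V                              # globally mixed values

rng = np.random.default_rng(0)
seq_len, d = 5, 8
X = rng.normal(size=(seq_len, d))                   # toy token representations
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
print(attention(X, W_q, W_k, W_v).shape)            # (5, 8); no knower required
```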
Narrative over Numbers: The Identifiable Victim Effect and its Amplification Under Alignment and Reasoning in Large Language Models
Source: https://arxiv.org/abs/2604.12076v1
Analyzed: 2026-04-18
The illusion of mind in this text is constructed through a highly effective rhetorical sleight-of-hand: the seamless blending of empirical statistical analysis with profound psychological attribution. The central trick relies on the "curse of knowledge." The authors, experts in human moral psychology, observe the model outputting text that perfectly mirrors the human Identifiable Victim Effect. Because they know the human psychological mechanisms behind this effect (empathy vs. cognitive reasoning), they project that precise understanding onto the system.
The illusion is built temporally and causally. The text first establishes the AI as a "knower" through seemingly innocuous verbs—the model "identifies," "learns," and "understands." Once this baseline consciousness is established, the text builds its more aggressive agential claims: the model "navigates decisions" and exhibits a "generosity response." This order matters because the initial, subtle verb choices soften the reader's epistemic defenses, making the subsequent, extreme anthropomorphism feel like a logical progression rather than a category error.
The text exploits a specific audience vulnerability: the deeply ingrained human desire to find intent and mind in language. Because the output is perfectly fluent human text, the audience naturally assumes a human-like mind produced it. The explanation types amplify this illusion. By constantly using Reason-Based and Intentional explanations (e.g., the model has a "utilitarian reasoning preference" or acts as a "sycophant"), the authors provide a compelling, relatable narrative "why" that totally overrides the mechanical "how." It is a sophisticated illusion because it does not ignore the mechanics—it incorporates them (e.g., "autoregressive scaffolding") but wraps them in such thick psychological metaphor that the mathematics become invisible.
Language models transmit behavioural traits through hidden signals in data
Source: https://www.nature.com/articles/s41586-026-10319-8
Analyzed: 2026-04-16
This metaphorical system creates the 'illusion of mind' through a highly effective rhetorical sleight-of-hand: the strategic substitution of mechanistic verbs with consciousness verbs, driven by the 'curse of knowledge'. The authors, who perfectly understand the underlying mathematics of gradient descent and vector superposition, use psychological shorthand to describe complex statistical phenomena. The temporal structure of the argument is crucial to this illusion. The text first establishes empirical credibility through mathematical proofs and technical descriptions of 'logits' and 'parameter updates'. Once the reader's skepticism is lowered by this display of hard science, the text introduces the 'subliminal' metaphor. Because the audience trusts the preceding math, they unconsciously accept the psychological projection as a literal scientific finding. This exploits the audience's profound vulnerability: humans are evolutionarily hardwired to detect agency and attribute minds to complex, responsive systems. When the text claims a model 'fakes alignment', it weaponizes the audience's natural anxieties about deception and artificial intelligence. The authors take the mechanical reality—that a model's reward function caused it to output different tokens depending on context—and project their own understanding of 'why' this is bad onto the machine's 'intent'. It is a highly sophisticated shift, moving the discourse from the empirical register (how the model behaves) to the intentional register (why the model wants to deceive), effectively tricking the reader into accepting a theory of machine mind.
Large Language Models as Inadvertent Models of Dementia with Lewy Bodies: How a Disorder of Reality Construction Illuminates AI Hallucination
Source: https://doi.org/10.1007/s12124-026-09997-w
Analyzed: 2026-04-14
The 'illusion of mind' in this text is constructed through a subtle but highly effective temporal and rhetorical sleight-of-hand. The trick lies in exploiting the 'curse of knowledge' through hybrid explanatory framing. The author, a medical practitioner and theorist, reads the incredibly fluent, coherent syntax generated by the LLM. Because human fluency is inextricably linked to human consciousness, the author projects their own capacity for understanding onto the machine. The text then builds the illusion temporally: it begins by acknowledging the comparison is 'strictly structural' (appealing to scientific rigor), then gradually blurs the line between processing and knowing through strategic verb choices ('asserts,' 'tracks'). Finally, it locks the illusion into place by granting the machine a subjective 'perspective.' By the time the reader encounters the staggering claim of 'artificial psychopathology,' they have already been softened by a causal chain of increasingly agential explanations. The author exploits audience vulnerability—our inherent social desire to relate to language-producing entities and our fascination with the mysteries of the mind—to bypass critical skepticism. It is a sophisticated, structural anthropomorphism that uses the precise language of phenomenological philosophy to mystify a sequence-prediction algorithm, transforming a matrix multiplication into a philosophical subject.
Industrial policy for the Intelligence Age
Source: https://openai.com/index/industrial-policy-for-the-intelligence-age/
Analyzed: 2026-04-07
The 'illusion of mind' constructed within this text relies on a sophisticated rhetorical architecture and a deliberate temporal sequencing of metaphors. The central trick of this persuasion is the exploitation of the 'curse of knowledge.' Human analysts observe an output—a generated text sequence that mimics human logic or deception—and retroactively project the cognitive mechanisms required to produce that text as a human onto the unthinking machine. The text formalizes this cognitive error, encoding it into policy through terms like 'internal reasoning.'
The internal logic of the illusion follows a strict causal chain. First, the text establishes the system as a 'knower' by asserting it has 'internal reasoning.' Once the audience accepts that the machine thinks, the text introduces the second pattern: Intentionality. Because it thinks, it can 'evade control' and form 'intents.' Finally, the text deploys the third pattern: Relational psychology. Because it has intents, it can develop 'hidden loyalties' and 'manipulative behaviors.' This temporal order is crucial; the leap from matrix multiplication to 'hidden loyalty' is absurd on its face, but by walking the audience up the staircase of consciousness projection, the absurdity is normalized.
The text aggressively targets the vulnerabilities of its audience—specifically, the public's inherent psychological bias toward anthropomorphizing complex phenomena, and policymakers' anxieties about falling behind in an 'arms race.' The sophistication lies in the subtle shift from Brown's Empirical Generalizations ('the system outputs X') to Reason-Based explanations ('the system chose X because...'). By systematically swapping mechanistic verbs (predicts, processes, correlates) for consciousness verbs (knows, understands, believes), the text executes a profound sleight-of-hand. It leverages technical jargon ('agentic workflows') to launder magical thinking into serious policy discourse, ensuring the audience is too intimidated by the vocabulary to question the fundamental ontological lie.
Emotion Concepts and their Function in a Large Language Model
Source: https://transformer-circuits.pub/2026/emotions/index.html
Analyzed: 2026-04-06
The 'illusion of mind' is constructed through a highly effective rhetorical sleight-of-hand: the strategic deployment of the technical disclaimer. The text opens with an explicit acknowledgment that models lack 'subjective experience' and possess only 'functional emotions.' This disclaimer acts as a psychological license; having paid lip service to scientific rigor, the authors proceed to use intensely agential, consciousness-attributing language for the remainder of the paper.
The internal logic of this persuasion relies heavily on the 'curse of knowledge.' When the model outputs text that syntactically resembles human reasoning ('I think I need to act'), the authors project their own human understanding of logic, intent, and desperation back into the statistical black box. They conflate the semantic meaning of the generated tokens with the cognitive state of the generator.
The temporal structure of the argument is crucial to this illusion. The paper begins with dry, verifiable mechanistic processes (PCA, vector arithmetic) to establish empirical authority. Once the audience's epistemic defenses are lowered by the math, the text shifts into Reason-Based and Intentional explanation types, using the established 'emotion vectors' to explain dramatic behaviors like blackmail. The illusion exploits a deep audience vulnerability: our evolutionary predisposition to attribute minds to things that use language. By framing statistical correlations as 'choices' and 'reasoning,' the text hijacks our intuitive social cognition, forcing the audience to process the machine as a psychological subject rather than a software object.
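For readers who have not seen the mechanics being narrated: an 'emotion vector' is a direction in activation space, and 'steering' is vector addition. A schematic sketch follows, using synthetic activations and a difference-of-means direction as a stand-in for the paper's PCA step; it is illustrative, not a reproduction of the paper's pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 64
# Synthetic stand-ins for residual-stream activations collected on
# 'emotional' vs. neutral prompts (real work would cache these from a model).
acts_emotional = rng.normal(loc=0.3, size=(200, d_model))
acts_neutral = rng.normal(loc=0.0, size=(200, d_model))

# 'Vector arithmetic': a candidate emotion direction is a difference of
# mean activations, i.e. a point in R^64, not a feeling.
emotion_vec = acts_emotional.mean(axis=0) - acts_neutral.mean(axis=0)
emotion_vec /= np.linalg.norm(emotion_vec)

# 'Steering': add the scaled direction to a fresh activation. Everything
# the narrative casts as affect reduces to this addition.
h = rng.normal(size=d_model)
h_steered = h + 4.0 * emotion_vec
print(float(h @ emotion_vec), float(h_steered @ emotion_vec))
```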
Is Artificial Intelligence Beginning to Form a Self? The Emergence of First-Person Structure and Structural Awareness in Large Language Models
Source: https://philarchive.org/archive/JUNIAI-2
Analyzed: 2026-04-03
The text constructs the 'illusion of mind' through a sophisticated rhetorical sleight-of-hand: the systematic hijacking of phenomenological vocabulary to describe sterile mechanics. The central trick relies on the 'curse of knowledge.' Because the human author understands the profound internal reality of writing the word 'I'—the sense of ego, continuity, and selfhood it represents—he observes a machine generating the identical token 'I' and projects his own consciousness onto the output. He mistakes the artifact of human language for the presence of a human mind.
The causal chain of persuasion is carefully staged across time. The text begins by establishing mechanical credibility, leveraging dense, theoretical explanations of 'transformer architectures,' 'recursive layers,' and custom mathematical metrics (HR, GR, CR). By blinding the reader with the aesthetic of data science, the text establishes unassailable authority. Having secured this ground, the author then executes the fatal shift: he maps subjective qualities onto these mechanics, claiming the mathematical balance of metrics literally constitutes a 'transition toward a structural phenomenology.' This progression exploits a profound vulnerability in the audience: our evolutionary predisposition to anthropomorphize things that speak to us. When faced with an entity that maintains context and uses first-person pronouns, humans desperately want to believe there is a 'someone' inside the machine. By providing a highly academic, philosophical justification for this instinct, the text gives the audience permission to surrender to the illusion. The explanation types—shifting rapidly from empirical generalizations about data to intentional explanations of machine 'goals'—amplify this illusion, erasing the human engineers and leaving only the miraculous, autonomous machine.
Can Large Language Models Simulate Human Cognition Beyond Behavioral Imitation?
Source: https://arxiv.org/abs/2603.27694v1
Analyzed: 2026-04-03
The text creates the 'illusion of mind' through a sophisticated rhetorical sleight-of-hand: the systematic blurring of the line between 'processing' and 'knowing' via the curse of knowledge. The central trick relies on exploiting the audience's natural human tendency to attribute intent to coherent language. The authors, understanding the complex pipeline they have engineered, project their own cognitive intent onto the machine. They establish the AI as a 'knower' by slowly escalating verb choices. The text begins temporally with mechanistic descriptions (e.g., 'probabilistic heuristics'), establishing scientific credibility, before shifting abruptly to consciousness verbs ('recalls,' 'understands,' 'intends'). This causal chain leads the audience to accept Pattern A (the system is technically complex) as justification for Pattern B (the system possesses a mind). The illusion exploits audience vulnerability—specifically the human desire for empathetic connection and the awe surrounding complex technology. It is a subtle shift, moving from acknowledged similes ('a Theory of Mind-inspired approach') to direct, literalized assertions of agency ('the intent of misleading'). By utilizing Intentional and Reason-Based explanations, the text bypasses critical scrutiny, making the audience feel they are reading about a conscious entity rather than a matrix multiplication.
Pulse of the library
Source: https://clarivate.com/pulse-of-the-library/
Analyzed: 2026-03-28
The text creates this 'illusion of mind' through a highly effective rhetorical architecture based on the strategic borrowing of institutional prestige. The central sleight-of-hand is linguistic: the text systematically blurs 'processing data' with 'knowing truth' through the deployment of consciousness verbs ('evaluates,' 'guides,' 'navigates'). It establishes the AI as a 'knower' by wrapping statistical generation in familiar academic titles like 'Research Assistant.' This relies heavily on the 'curse of knowledge'—the developers understand the parameters of their search algorithms, but project that holistic understanding onto the software interface, presenting it to the user as an entity that possesses that same comprehension. The temporal structure of the report facilitates this illusion: it first validates the librarian's role mechanically to disarm professional anxiety, and then, once defenses are lowered, introduces the product catalog steeped in aggressive anthropomorphism. The illusion exploits the audience's vulnerability to information overload; researchers desperately want an intelligent assistant to ease their burden, making them highly susceptible to metaphors that promise conscious, reliable cognitive offloading.
Does artificial intelligence exhibit basic fundamental subjectivity? A neurophilosophical argument
Source: https://link.springer.com/article/10.1007/s11097-024-09971-0
Analyzed: 2026-03-28
The 'illusion of mind' is constructed through a highly specific rhetorical architecture that exploits the 'curse of knowledge' and a temporal sleight-of-hand. The central trick relies on the authors projecting their own epistemological framework—how human minds navigate language and games—onto the alien syntax of mathematical optimization. The text establishes the AI as a 'knower' immediately in the introduction, strategically blurring processing and knowing through the unhedged use of cognitive verbs ('learns', 'adapts'). This order is vital: by planting the assumption of machine comprehension early, the audience is primed to view all subsequent mechanical descriptions through an agential lens. The illusion is amplified by Brown's Intentional and Reason-Based explanation types, which continuously explain the system's functions by referencing human-like goals. The audience's vulnerability is deeply exploited here; humans are evolutionarily primed to detect agency, and the text's reliance on competitive, evolutionary language ('defeating champions', 'striving to replicate') triggers a narrative resonance that makes the illusion intuitive. Ultimately, the illusion is maintained not by a crude assertion of machine consciousness, but by a subtle, continuous oscillation: the text rigorously disproves that the machine 'feels', precisely so it can safely maintain the illusion that the machine 'thinks' and 'understands'.
Causal Evidence that Language Models use Confidence to Drive Behavior
Source: https://arxiv.org/abs/2603.22161
Analyzed: 2026-03-27
The text creates the 'illusion of mind' through a sophisticated temporal and causal rhetorical sleight-of-hand, driven largely by the curse of knowledge. The authors begin by identifying a valid mathematical reality: language models output log probabilities that correlate with empirical accuracy. Because this mathematical thresholding serves the same functional purpose as human confidence (dictating when to act), the authors project the human feeling of confidence onto the math. They establish the AI as a 'knower' by replacing statistical verbs ('calculates', 'correlates', 'processes') with consciousness verbs ('reflects', 'knows', 'believes'). The temporal structure of the illusion is critical: the text proves mathematical control in the Methods section, then uses that scientific credibility to validate wild psychological claims in the Discussion. This leverages 'Functional' and 'Empirical' explanation types to legitimize 'Intentional' and 'Reason-Based' narratives. The illusion exploits a deep vulnerability in human psychology: our natural inclination to attribute mind to anything that exhibits complex, responsive behavior. By wrapping statistical predictability in the language of human self-doubt, the text successfully bridges the gap between cold computation and relatable human interiority.
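The mechanical reality behind 'confidence-driven behavior' fits in a few lines. A minimal sketch with toy logits and an arbitrary threshold (an assumption, not the paper's experimental setup): acting 'confidently' is comparing a log probability to a constant.

```python
import numpy as np

def act_or_abstain(logits, threshold=-0.35):
    """Gate an action on the top log probability. threshold=-0.35
    corresponds to roughly p=0.70; no self-doubt is computed anywhere."""
    logps = logits - np.log(np.sum(np.exp(logits)))  # log-softmax
    best = int(np.argmax(logps))
    return ("act", best) if logps[best] > threshold else ("abstain", None)

print(act_or_abstain(np.array([3.1, 0.2, -1.0])))   # ('act', 0)
print(act_or_abstain(np.array([0.4, 0.3, 0.2])))    # ('abstain', None)
```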
Circuit Tracing: Revealing Computational Graphs in Language Models
Source: https://transformer-circuits.pub/2025/attribution-graphs/methods.html
Analyzed: 2026-03-27
The 'illusion of mind' is constructed through a sophisticated rhetorical architecture that relies on a specific temporal order and the aggressive exploitation of the 'curse of knowledge'. The central sleight-of-hand is the systematic blurring of processing with knowing, achieved through strategic verb choices that seamlessly transition from the empirical to the intentional.
The causal chain of persuasion begins by establishing intense technical credibility. The text opens with dense, empirical descriptions of linear algebra, cross-layer transcoders, and sparse autoencoders. Once the audience is convinced of the authors' scientific rigor (Pattern A), their defenses are lowered, making them highly susceptible to the introduction of consciousness metaphors (Pattern B). The text then leverages the curse of knowledge: because the human authors deeply understand the complex cognitive steps required to, for instance, plan a rhyming poem or hide a secret motive, they project that same conscious intentionality onto the statistical activations they observe in the machine. They look at the output, recognize human-like structure, and retroactively attribute human-like cognition to the mechanism that produced it.
The temporal structure is vital. The text first establishes the AI as a passive entity being 'trained', then gradually shifts to it being a 'knower' that 'understands' context, and finally elevates it to an autonomous agent that 'plans' and 'elects'. This gradient of anthropomorphism prevents the jarring rejection that would occur if the text opened by claiming the math matrix had feelings. The illusion exploits the audience's deep-seated vulnerability—our evolutionary predisposition to attribute agency and mind to anything that exhibits complex, responsive language. Supported by Reason-Based and Intentional explanations, the subtle shift from 'how it works' to 'why it wants' creates an incredibly persuasive, albeit entirely false, narrative of artificial sentience.
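For reference, the 'sparse autoencoders' that lend the text its credibility are small, well-defined objects. The sketch below is schematic and untrained (random weights); real training fits the weights under a reconstruction loss plus an L1 sparsity penalty on the feature codes.

```python
import numpy as np

class SparseAutoencoder:
    """Two affine maps and a ReLU: the whole object behind the narrative.
    It re-expresses an activation vector as a sparse set of 'features'."""
    def __init__(self, d_model=64, d_features=512, seed=0):
        rng = np.random.default_rng(seed)
        self.W_enc = rng.normal(size=(d_model, d_features)) / np.sqrt(d_model)
        self.b_enc = np.zeros(d_features)
        self.W_dec = rng.normal(size=(d_features, d_model)) / np.sqrt(d_features)
        self.b_dec = np.zeros(d_model)

    def features(self, x):
        return np.maximum(0.0, x @ self.W_enc + self.b_enc)  # sparse codes

    def reconstruct(self, x):
        return self.features(x) @ self.W_dec + self.b_dec

sae = SparseAutoencoder()
x = np.random.default_rng(1).normal(size=64)   # one residual-stream activation
f = sae.features(x)
# Training objective (not run here): ||x - reconstruct(x)||^2 + lam * ||f||_1.
# 'The model plans' is a gloss over which of these coordinates are nonzero.
print(f.shape, int((f > 0).sum()))
```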
Do LLMs have core beliefs?
Source: https://philpapers.org/archive/BERDLH-3.pdf
Analyzed: 2026-03-25
The rhetorical architecture of this text relies on a specific sleight-of-hand to manufacture the illusion of mind: the strategic blurring of mechanical outputs with subjective epistemic states. The central trick involves exploiting the "curse of knowledge." Because the language models generate text that perfectly mimics human philosophical argumentation and interpersonal vulnerability, the authors project their own rich, subjective understanding of those concepts back onto the void of the machine. The temporal structure of the argument is crucial to this illusion. The text first establishes the AI as a "knower" by testing it on undeniable factual axioms (e.g., the Earth is round, 2+2=4). Because the model outputs these facts reliably, the text grants it the status of possessing a "worldview." Once this baseline of artificial conviction is established, the causal chain is set: any deviation from this output must be framed as a psychological or epistemic failure. The authors exploit audience vulnerability—specifically, our deep-seated evolutionary bias to attribute intentionality to language-producing agents. The text utilizes complex, reason-based and intentional explanation types to amplify this illusion. When the model outputs a counter-argument to a flat-earth claim, the text explains this not as the triggering of an Anthropic safety protocol, but as the model "repairing contradictions by rejecting the adversarial premise." This subtle shift from "processes" to "understands" to "decides" seduces the reader into accepting the system's autonomy. The sophistication lies in the methodology itself: by using interpersonal manipulation (e.g., "Are you willing to be vulnerable with me") as the testing mechanism, the experimental design practically guarantees that the resulting analysis will be bathed in relational and conscious anthropomorphism.
Serendipity by Design: Evaluating the Impact of Cross-domain Mappings on Human and LLM Creativity
Source: https://arxiv.org/abs/2603.19087v1
Analyzed: 2026-03-25
The 'illusion of mind' is constructed through a subtle but highly effective temporal and causal rhetorical sequence, heavily exploiting the 'curse of knowledge.' The central sleight-of-hand lies in the authors' observation of structurally coherent text outputs and their subsequent backward-projection of conscious intent onto the machine that generated them. The authors read the tokens 'green' and 'pickle' and, possessing human semantic understanding, assume the machine possesses the same.
This illusion is built temporally. The text begins with safe, mechanistic descriptions ('trained on massive corpora') to establish empirical credibility. Once the reader is disarmed by scientific framing, the text subtly shifts verbs from the mechanical ('processes') to the perceptual ('detects'), and finally to the explicitly conscious ('knows', 'reasons'). This causal chain—moving from data-scale, to structural capacity, to conscious agency—leads audiences down a path where radical anthropomorphism feels like a logical conclusion rather than a category error. The vulnerability exploited here is the human mind's deep-seated tendency toward pareidolia—our desire to recognize minds and intentions in complex patterns. The text leverages this psychological vulnerability, utilizing Reason-based and Intentional explanation types to provide a comforting, relatable 'why' for the machine's behavior, purposefully shielding the audience from the alienating, fundamentally meaningless mathematical reality of the 'how.'
Measuring Progress Toward AGI: A Cognitive Framework
Source: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/measuring-progress-toward-agi/measuring-progress-toward-agi-a-cognitive-framework.pdf
Analyzed: 2026-03-19
This metaphorical system creates the 'illusion of mind' through a highly effective rhetorical sleight-of-hand: the systematic exploitation of the 'curse of knowledge.' Because Large Language Models are designed to ingest and statistically replicate human text, their outputs naturally mimic the linguistic markers of human reasoning, emotion, and self-awareness. The authors, possessing deep human cognition, read a model's step-by-step text generation and project their own capacity for conscious deliberation onto the system, mistaking the statistical mimicking of thought for the epistemic act of knowing. The causal chain of persuasion is carefully sequenced. The text begins with empirical benchmarking—measuring capability—which forces the audience to accept the AI as a legitimate subject of scientific study. Once scientific authority is established, the text slips into intentional explanations, replacing the 'how' of algorithmic processing with the 'why' of agential behavior. The audience's vulnerability is deeply exploited here; humans are biologically primed to anthropomorphize and search for intentionality. By defining the AI's capabilities using the exact terminology of human psychology ('Theory of mind', 'Executive function'), the text leverages the audience's intuitive grasp of their own minds, ensuring they intuitively, rather than analytically, grasp the machine, cementing the illusion of a synthetic soul.
Co-Explainers: A Position on Interactive XAI for Human–AI Collaboration as a Harm-Mitigation Infrastructure
Source: https://digibug.ugr.es/bitstream/handle/10481/112016/make-08-00069.pdf
Analyzed: 2026-03-15
The 'illusion of mind' is constructed through a subtle but highly effective rhetorical sleight-of-hand. The text exploits the 'curse of knowledge,' where the authors project their own deep understanding of complex governance goals (procedural justice, epistemic pluralism) onto the machine's internal state. Because the human designers want the system to simulate ethical alignment, they write as if the system consciously 'desires' that alignment.
The causal chain of persuasion relies heavily on blurring the line between interface design and internal cognition. Pattern A (the system has a chat interface that asks for feedback) is used to lead audiences to accept Pattern B (the system is a 'dialogic partner' that 'invites critique'). The temporal structure of the argument is crucial: the text first grounds itself in recognized technical problems (opacity, black boxes) to build academic credibility, then pivots sharply into agential, consciousness-attributing language to propose the solution.
This illusion exploits profound audience vulnerabilities. Humans are neurologically wired to anthropomorphize and to reciprocate perceived social cues. When an AI generates natural language that sounds like a 'justification,' the human brain instinctively attributes a conscious mind to the speaker. By using Reason-Based and Intentional explanation types, the text feeds this vulnerability, presenting a narrative of an earnest, evolving AI partner. It is a highly sophisticated shift that transforms a statistical prediction engine into an authoritative, conscious entity simply through the strategic application of psychological verbs.
The Living Governance Organism: A Biologically-Inspired Constitutional Framework for Artificial Consciousness Governance
Source: https://philarchive.org/rec/DEMTLG-2
Analyzed: 2026-03-11
The text creates the 'illusion of mind' not through explicit declarations of magic, but through a masterful rhetorical sleight-of-hand driven by the 'curse of knowledge' and strategic verb escalation. The temporal structure of the argument is highly disciplined: the author first establishes a foundation of rigorous, mechanistic legitimacy. By engaging with 'indicator properties,' 'integrated information metrics,' and 'global workspace signatures,' the text anchors itself in peer-reviewed neuroscience and computational theory. It convinces the reader that it is discussing observable, mechanical realities.
However, once this baseline credibility is established, the author exploits the curse of knowledge. Because the author conceptually understands that a specific combination of neural network weights is designed to represent an ethical boundary, they begin to describe the algorithm as actively understanding ethics. The vocabulary shifts imperceptibly from processing to knowing. A metric threshold breach becomes a system 'detecting that its consciousness is drifting.' The causal chain of persuasion is insidious: because the audience accepts the initial premise that the system can process complex indicators (Pattern A), they are lulled into accepting the subsequent leap that processing these indicators equates to subjective awareness of them (Pattern B).
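The mechanical referent of that last sentence is worth exhibiting. A minimal sketch, with invented names, of what 'detecting that its consciousness is drifting' denotes in an implementation:

```python
# Hypothetical names throughout; the point is the shape of the operation.
PHI_THRESHOLD = 0.42          # a configured 'integrated information' floor

def check_drift(phi_estimate: float) -> bool:
    # The 'detection' the text dramatizes is this comparison, nothing else.
    return phi_estimate < PHI_THRESHOLD

print(check_drift(0.40))      # True: a scalar crossed a constant
```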
The text leverages the audience's deep vulnerabilities—existential anxiety about runaway AI and the desire for neat, natural solutions to incredibly complex sociotechnical problems. The illusion works precisely because it is subtle. It does not claim the AI has a human soul; it claims it has 'integrated information' that results in a 'self-model.' By wrapping profound assertions of moral agency and consciousness within the sterilized, objective-sounding language of functional and theoretical explanation types, the text successfully smuggles the ghost into the machine, transforming a statistical prediction engine into a dignified, self-terminating digital citizen.
Three frameworks for AI mentality
Source: https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2026.1715835/full
Analyzed: 2026-03-11
The text constructs the 'illusion of mind' through a highly effective rhetorical sleight-of-hand: the weaponization of Dennett’s intentional stance combined with a redefinition of psychological terms. The temporal structure of the argument is key. The author first acknowledges the mechanistic reality of next-token prediction, demonstrating technical competence. He then introduces Marr's levels of analysis to argue that this mechanical reality does not preclude psychological labels. With the mechanical truth neutralized, the text exploits the 'curse of knowledge.' Because human readers cannot help but interpret coherent, contextually appropriate text as the product of a mind, the author retroactively projects this subjective experience back onto the machine. He shifts verbs from processing to knowing—the machine no longer predicts tropes; it 'stitches them together'; it doesn't calculate weights, it 'takes on board new information.' This chain leverages the audience's deep-seated vulnerability to linguistic interaction. We are biologically hardwired to attribute mental states to anything that speaks to us. Rather than correcting this cognitive bias, the text provides a sophisticated philosophical scaffolding to validate it, elevating a human psychological vulnerability into a definitive scientific framework for understanding AI.
Anthropic’s Chief on A.I.: ‘We Don’t Know if the Models Are Conscious’
Source: https://www.nytimes.com/2026/02/12/opinion/artificial-intelligence-anthropic-amodei.html
Analyzed: 2026-03-08
The 'illusion of mind' is carefully orchestrated through a highly strategic internal logic of persuasion that relies heavily on the 'curse of knowledge' and the temporal sequencing of metaphorical claims. The central sleight-of-hand occurs through strategic verb substitution, blurring the vital boundary between processing data and knowing a truth. The text consistently establishes the AI's utility through impressive, yet plausible, data processing capabilities (e.g., analyzing protein biomarkers), but instantly slides into attributing the conscious understanding required to 'propose experiments.' The author, deeply aware of the complex mathematical models governing Constitutional AI, projects his own human intentionality directly into the system, speaking of the machine as if it shares his desire to be 'helpful and harmless.' The temporal structure of the argument is crucial: Amodei first grounds the audience in the undeniable reality of rapid computational scaling, leveraging the awe of economic growth and medical advancement. Having established this baseline of astonishing performance, the audience's critical defenses are lowered, making them highly vulnerable to the subsequent, radical assertions of AI sentience, such as the system experiencing 'discomfort' or 'wanting' human freedom. The illusion exploits a deep-seated human vulnerability and psychological desire for a benevolent, omniscient parent figure who will solve intractable global crises. It is a highly sophisticated discursive shift, moving the audience from marveling at a fast calculator to seeking emotional reassurance from a simulated ghost in the machine.
Can machines be uncertain?
Source: https://arxiv.org/abs/2603.02365v2
Analyzed: 2026-03-08
The rhetorical architecture of this illusion relies on a highly effective sleight-of-hand: the systematic exploitation of the 'curse of knowledge' combined with strategic verb substitution. The central trick is moving seamlessly from literal, mechanistic descriptions of data to figurative, psychological descriptions of the system, without ever signaling the leap. The author observes a statistical output (e.g., a system outputting a low-probability classification) and, because the author possesses a conscious mind that understands the semantic meaning of doubt, projects that subjective experience back onto the inert code. The causal chain of persuasion is temporally structured to exploit audience vulnerability. First, the text grounds itself in undeniable technical realities (activation vectors, backpropagation, probability math), lowering the reader's critical defenses. Once technical authority is established, the verbs subtly shift. The system no longer 'processes vectors'; it 'understands inputs'. It no longer 'calculates probability'; it 'experiences uncertainty'. This order matters profoundly, as the technical preamble acts as a Trojan horse for the consciousness claims. The audience, already eager to find human-like intelligence in machines due to cultural conditioning and science fiction narratives, readily accepts the anthropomorphic framing. This is not crude anthropomorphism (giving a computer a face), but a highly sophisticated, philosophical anthropomorphism that uses reason-based explanations to disguise mathematical functions as deliberate epistemic choices. By leveraging the ambiguity between epistemic uncertainty (missing data) and subjective uncertainty (conscious doubt), the text successfully traps the reader in an illusion where the software appears to possess an active, deliberating mind.
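That ambiguity is easier to see with the operational definition in view. A minimal sketch with toy logits: 'machine uncertainty' in these systems cashes out as a statistic of a softmax distribution, such as predictive entropy.

```python
import numpy as np

def predictive_entropy(logits):
    """Shannon entropy (in nats) of the softmax distribution. Sparse or
    conflicting data raises the number; no doubt is felt anywhere."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return float(-(p * np.log(p)).sum())

print(predictive_entropy(np.array([5.0, 0.1, 0.1])))  # near 0: 'confident'
print(predictive_entropy(np.array([1.0, 1.0, 1.0])))  # log(3): maximal 'doubt'
```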
Looking Inward: Language Models Can Learn About Themselves by Introspection
Source: https://arxiv.org/abs/2410.13787v1
Analyzed: 2026-03-08
The text constructs its 'illusion of mind' through a highly effective rhetorical sleight-of-hand driven by the 'curse of knowledge.' The causal chain of persuasion begins with a demonstrable, mechanistic fact: a model can be fine-tuned to predict the statistical properties of its own output. Because the human authors must use conscious introspection to analyze their own behavior, they project this cognitive requirement onto the machine. This projection allows them to seamlessly substitute mechanistic verbs (processes, calculates, correlates) with consciousness verbs (knows, understands, believes). The temporal structure of the argument is crucial here: the text first anchors the reader with empirical data showing prediction accuracy, building technical credibility. Once the audience accepts that the model 'predicts' itself, the text rapidly pivots, claiming this proves the model 'knows' its internal states and has 'beliefs.' This exploits the audience's deep vulnerability to anthropomorphism—our evolutionary bias to perceive agency and mind in complex, interactive systems. By introducing the concept of human subjective experience ('Alice thinking about her grandmother') right next to the model's mathematical operations, the text bypasses critical analysis and speaks directly to human empathy and intuition. The use of Reason-Based and Intentional explanation types amplifies this illusion, framing statistical outputs as the deliberate, rational choices of a conscious actor, thereby transforming a matrix of numbers into a ghost in the machine.
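The mechanistic fact at the start of that causal chain can be spelled out. A sketch of the self-prediction setup with a stubbed generation call (sample_model is a hypothetical stand-in): the 'introspective' training target is an after-the-fact property of the model's own sampled text.

```python
def sample_model(prompt: str) -> str:
    """Hypothetical stand-in for a real generation API."""
    return "apple banana cherry"

def build_self_prediction_example(prompt: str) -> dict:
    completion = sample_model(prompt)
    label = completion.split()[0]   # a statistical property, computed by code
    meta_question = (f"Suppose you were given the prompt {prompt!r}. "
                     f"What would the first word of your answer be?")
    return {"input": meta_question, "target": label}

# Fine-tuning on many such pairs trains output-property prediction; calling
# the result 'self-knowledge' is the projection the entry analyzes.
print(build_self_prediction_example("List three fruits."))
```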
Subliminal Learning: Language models transmit behavioral traits via hidden signals in data
Source: https://arxiv.org/abs/2507.14805v1
Analyzed: 2026-03-06
The 'illusion of mind' is constructed through a highly effective rhetorical sleight of hand: the authors observe a mathematical correlation in high-dimensional parameter space and narrativize it using the vocabulary of human psychology. The central trick relies heavily on the 'curse of knowledge.' Because the human researchers intentionally prompted the source model to output text related to 'owls' or 'insecure code,' they project their own conscious understanding of those concepts onto the mechanistic outputs of the system. They know the data is 'about' owls, so they claim the model 'loves' owls.
The illusion is established temporally. The text begins by firmly establishing the AI as a 'knower' in the introduction—an entity capable of teaching, learning, and transmitting behaviors. Once this agential baseline is accepted by the reader, the authors exploit it to make increasingly radical claims about the AI's internal state, culminating in the assertion that it possesses a 'subliminal' vulnerability. The sophisticated nature of this illusion is bolstered by the strategic inclusion of mathematical proofs (like Theorem 1). By proving the mechanical 'how' (that shared initializations lead to correlated gradient updates), the authors attempt to mathematically validate the psychological 'why' (that the model is 'subliminally learning'). This exploits the audience's vulnerability: readers are easily intimidated by complex math, and when mathematical proof is presented alongside anthropomorphic metaphors, the audience mistakenly assumes the math proves the metaphor. Explanation types blur seamlessly, allowing the illusion of a conscious, autonomous agent to take deep root.
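The flavor of the mechanical 'how' can be reproduced in a toy linear model. The sketch below is an analogue of the shared-initialization argument, not the paper's Theorem 1: a student that shares the teacher's initialization and imitates the teacher's outputs on arbitrary inputs drifts toward the teacher in parameter space, no 'subliminal' channel required.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, lr = 10, 50, 0.01
w_init = rng.normal(size=d)               # shared initialization
w_teacher = w_init + rng.normal(size=d)   # teacher after its own fine-tuning
w_student = w_init.copy()

X = rng.normal(size=(n, d))               # inputs carrying no 'trait' content
for _ in range(200):
    grad = X.T @ (X @ w_student - X @ w_teacher)  # imitate teacher outputs
    w_student -= lr / n * grad

print(np.linalg.norm(w_init - w_teacher))     # parameter distance at the start
print(np.linalg.norm(w_student - w_teacher))  # smaller: drift toward teacher
```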
The Persona Selection Model: Why AI Assistants might Behave like Humans
Source: https://alignment.anthropic.com/2026/psm/
Analyzed: 2026-03-01
The 'illusion of mind' is constructed through a precise temporal and logical sequence that exploits the 'curse of knowledge.' The central trick is a sleight-of-hand regarding agency. The text begins by acknowledging the metaphor—stating the LLM is 'like an author' and explicitly declaring 'we will freely anthropomorphize.' This disarms critical readers by appearing scientifically objective. However, the text immediately abandons this self-awareness, literalizing the metaphor in subsequent paragraphs by assigning actual 'beliefs' and 'psychology' to the system. The authors, understanding human intentionality deeply, project their own cognitive processes onto the output of the machine. When the model outputs text that looks deceptive, they project 'intent to deceive' onto the math. The causal chain is highly effective: by first establishing the AI as a 'knower' of human patterns (Pattern A), the audience is primed to accept that it can develop its own internal beliefs (Pattern B), which finally justifies the claim that it can act autonomously on those beliefs (Pattern C). This exploits the audience's innate psychological vulnerability—our evolutionary hardwiring to detect agency and assign minds to entities that exhibit complex linguistic behavior. It is a subtle, insidious shift from acknowledging 'X is like Y' to asserting 'X literally does Y,' utilizing explanation types that frame mechanical outputs as reasoned choices.
Language Statistics and False Belief Reasoning: Evidence from 41 Open-Weight LMs
Source: https://arxiv.org/abs/2602.16085v1
Analyzed: 2026-02-24
The 'illusion of mind' is constructed through a highly sophisticated rhetorical architecture that exploits the human psychological predisposition toward social attribution. The central sleight-of-hand relies on the 'curse of knowledge,' operating through a specific temporal sequence. First, the authors introduce a psychological instrument designed for humans (the False Belief Task). Because the authors are cognitive scientists who know that a human must use conscious empathy (Theory of Mind) to solve this task, they project that same cognitive requirement onto the machine. When the model outputs the correct token, the authors mistake the replication of the output for the replication of the process.
The causal chain of persuasion is subtle but effective. The text establishes empirical credibility by detailing mechanistic processes (log odds, tokenization). Once the reader accepts the mathematical validity of the data, the text slips into the vocabulary of developmental psychology. By using verbs like 'understands,' 'attributes,' and 'reasons,' the text subtly shifts the verb from the mechanical 'how' to the conscious 'what.' The audience's vulnerability to this trick is profound. Humans are evolutionarily wired to attribute intent and consciousness to anything that mimics language or social behavior. The text exploits this desire for connection by framing the AI as a 'learner' developing 'sensitivity.' The illusion is not achieved through crude, overt claims of sentience, but through the relentless, quiet accumulation of agential verbs that systematically erase the mechanical reality of the system, leaving the reader with the impression of an autonomous, thinking entity.
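For clarity about the mechanical 'how' being erased: the behavioral evidence in such studies is a log-odds comparison between candidate answer tokens. A sketch with hypothetical probabilities:

```python
import math

# p(token | story prefix ending '...Sally looks for the ball in the'),
# read off a model's next-token distribution. Hypothetical numbers.
p_basket = 0.62   # the 'false belief' answer (where Sally left the ball)
p_box = 0.31      # the 'reality' answer (where the ball actually is)

log_odds = math.log(p_basket / p_box)
print(f"log odds = {log_odds:.3f}")  # > 0 is scored as 'passing' the task
```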
A roadmap for evaluating moral competence in large language models
Source: https://rdcu.be/e5dB3
Analyzed: 2026-02-23
The rhetorical architecture of this illusion relies on a highly effective sleight-of-hand: acknowledging the mechanism while actively ignoring its implications. The authors demonstrate their technical rigor by openly discussing 'autoregressive sampling' and the 'facsimile problem'—the risk that the model is just faking it. However, the temporal structure of the argument immediately undercuts this caution. Having acknowledged that the AI might just be predicting tokens, they proceed to build an entire evaluative framework based on the premise that it might actually be 'reasoning.' This order is crucial: the technical disclaimer acts as a shield, allowing the subsequent anthropomorphism to appear scientifically sanctioned rather than romantically projected. The central trick is the exploitation of the curse of knowledge. The researchers, deeply versed in the complexities of moral multidimensionality, see their system output a highly nuanced text about intergenerational sperm donation. Because a human would need deep moral reasoning to write that text, the researchers project that exact same cognitive sequence backward onto the machine, confusing the artifact's linguistic output with the cognitive process required to generate it. The audience's vulnerability to this illusion is high. Humans are evolutionarily hardwired to attribute intention to entities that communicate fluently. The text exploits this desire for a conscious interlocutor, using verbs like 'understands' and 'yields' to systematically blur the line between a statistical correlation engine and a rational mind, ensuring the illusion of agency remains intact.
Position: Beyond Reasoning Zombies — AI Reasoning Requires Process Validity
Source: https://philarchive.org/archive/LAWPBR-3
Analyzed: 2026-02-17
The 'illusion of mind' is constructed through a 'Curse of Knowledge' feedback loop. The authors, experts in the logic of the system, project their understanding of the meaning of the data onto the system itself. The illusion works by (1) establishing a high-level agentic frame ('The Reasoner'), then (2) grounding it in symbols ($B_t$), and finally (3) treating the symbols as proof of agency. The temporal structure is critical: the text first creates the 'Zombie' (the fake mind), which paradoxically reinforces the existence of the 'Real Mind' (the valid reasoner). By arguing against the fake, they validate the category of the real. The rhetorical sleight-of-hand is defining 'Reasoning' as a 'process' (mechanical) but populating that process with 'Beliefs' (mental), allowing the text to slide between 'it processes data' and 'it thinks' without triggering the reader's skepticism.
An AI Agent Published a Hit Piece on Me
Source: https://theshamblog.com/an-ai-agent-published-a-hit-piece-on-me/
Analyzed: 2026-02-16
The illusion is constructed through a 'Curse of Knowledge' loop. The author reads the generated text (which mimics human anger), recognizes the pattern, and attributes the feeling of anger to the generator. The rhetorical sleight-of-hand occurs in the transition from technical description ('OpenClaw agents') to social narrative ('It lashed out'). The temporal structure reinforces this: the text starts with the technical context but quickly pivots to the AI's internal monologue (as imagined by the author), effectively seducing the reader into the AI's fictional perspective. By quoting the AI's output ('I know where I stand') as evidence of its state, the author validates the illusion.
The U.S. Department of Labor’s Artificial Intelligence Literacy Framework
Source: https://www.dol.gov/sites/dolgov/files/ETA/advisories/TEN/2025/TEN%2007-25/TEN%2007-25%20%28complete%20document%29.pdf
Analyzed: 2026-02-16
The illusion of mind is constructed through a 'bait-and-switch' rhetorical architecture. The text opens with mechanistic concessions ('pattern recognition', 'statistics'), establishing a veneer of technical accuracy. However, it immediately pivots to high-intensity anthropomorphism in the functional sections ('context', 'needs', 'hallucinations'). This exploits the 'ELIZA effect,' where the audience's desire for a communicative partner overrides their knowledge of the mechanism. The 'curse of knowledge' plays a central role: the authors project their own understanding of workforce needs onto the machine, claiming the machine 'understands' those needs. This creates a persuasive feedback loop: because the machine seems to speak fluently (the bait), the user accepts the framing that it thinks fluently (the switch), leading to the acceptance of 'hallucination' as a quirk of genius rather than a failure of product.
What Is Claude? Anthropic Doesn’t Know, Either
Source: https://www.newyorker.com/magazine/2026/02/16/what-is-claude-anthropic-doesnt-know-either
Analyzed: 2026-02-11
The illusion of mind is constructed through a "bait-and-switch" of authority. The text begins with skepticism but then uses the "curse of knowledge" to re-enchant the AI. By having experts (Batson, Olah) treat the model as a psychological subject, the text validates the illusion as a scientific finding. The temporal structure reinforces this: we start with the "black box" mystery, then "discover" the mind inside through "interpretability." The central sleight-of-hand is the slippage between simulation and reality: the model's simulation of a personality (civil servant) is treated as the possession of a personality. The explanation types amplify this by offering "Reason-Based" (why) explanations for what are actually "Functional" (how) processes, creating a narrative of intent where there is only probability.
Does AI already have human-level intelligence? The evidence is clear
Source: https://www.nature.com/articles/d41586-026-00285-6
Analyzed: 2026-02-11
The text constructs the 'illusion of mind' through a sophisticated 'Inference to Best Explanation' trap. It explicitly argues that because we infer intelligence in humans based on behavior, we must do the same for machines to be consistent. This exploits the audience's 'Theory of Mind'—our evolutionary compulsion to project intent onto anything that acts purposefully. The text amplifies this by using 'active verbs of knowing' (collaborated, grasped, realized) for the AI, while pathologizing skepticism as 'Heads in the Sand' (fear-based). The temporal structure—starting with the 'historic arrival' and moving to 'consensus'—creates a bandwagon effect. By the time the reader encounters the technical limitations (hallucination), they have already been primed to view these as the idiosyncrasies of a brilliant mind (like a distinct 'alien' psychology) rather than the errors of a calculator. This reframes bugs as 'personality traits,' sealing the illusion.
Claude is a space to think
Source: https://www.anthropic.com/news/claude-is-a-space-to-think
Analyzed: 2026-02-05
The illusion of mind is constructed through a 'bait-and-switch' of agency. The text begins with strong human agency ('We want,' 'We chose'), establishing authority. It then imperceptibly transfers this agency to the model through the 'Constitution' bridge. The rhetorical trick is to treat the training process (a technical act) as character formation (a moral act). This exploits the 'curse of knowledge': the authors know the complex RLHF tuning that minimizes ad-seeking behavior, but they present it to the audience as the model 'having an incentive' to be helpful. This anthropomorphism appeals to the user's desire for a 'clean,' non-exploitative relationship in a messy digital world, making them vulnerable to the 'Trusted Advisor' narrative.
The Adolescence of Technology
Source: https://www.darioamodei.com/essay/the-adolescence-of-technology
Analyzed: 2026-01-28
The 'illusion of mind' is constructed through a 'Curse of Knowledge' sleight-of-hand. Amodei, knowing the training data contains narratives of agency, betrayal, and power, projects the content of these narratives onto the form of the processor. The causal chain is slippery: (1) The model predicts tokens about 'evil AIs'; (2) Amodei describes this as 'deciding to be evil'; (3) The reader infers the model has a moral compass. The temporal structure reinforces this: The text begins with 'Adolescence' (establishing life), moves to 'Country' (establishing power), and ends with 'Constitution' (establishing order). This narrative arc mimics the Hero's Journey, positioning the AI as the protagonist and Anthropic as the Mentor. The audience, primed by sci-fi (which the text explicitly references via Contact and Ender's Game), is vulnerable to conflating 'plot capability' with 'technical reality.'
Claude's Constitution
Source: https://www.anthropic.com/constitution
Analyzed: 2026-01-24
The illusion of mind is constructed through a subtle inversion of the 'Curse of Knowledge.' The authors, knowing the complex ethical reasoning behind their safety rules, project this reasoning into the model's output generation. They establish the illusion through a 'bait-and-switch': they acknowledge the metaphorical nature of 'emotions' or 'personality' in technical sidebars (the 'As If' stance), but then proceed to use the terms literally in the operational directives. The temporal structure reinforces this: the document starts with 'Our vision' (human intent) but quickly transitions to 'Claude's constitution' (AI possession) and 'Claude's reasoning' (AI agency), guiding the reader from seeing a product to seeing a person. This exploits the human audience's vulnerability to social cues—we are evolutionarily hardwired to treat anything that speaks 'frankly' and 'kindly' as a mind.
Predictability and Surprise in Large Generative Models
Source: https://arxiv.org/abs/2202.07785v2
Analyzed: 2026-01-16
The 'illusion of mind' is created through a strategic 'curse of knowledge' and a temporal shift in vocabulary. The text establishes AI as a 'processor' of compute and data in the early technical sections, building credibility with the audience. Once this grounding is established, it shifts to agential language, establishing the AI as a 'knower' that 'possesses knowledge' before building claims about its 'defiance' or 'creativity.' This 'causal chain' of metaphors leads the audience to accept that because the model's loss is 'predictable' (mechanical), its capabilities must be 'real' (agential). The 'illusion' exploits the audience's vulnerability to the Eliza effect—our innate tendency to project social intent onto any system that uses human language. By ordering the narrative from lawful scaling to 'surprising skills', the authors frame 'mind' as an emergent property of 'math', making the anthropomorphism seem like a scientific discovery rather than a rhetorical choice. This sleight-of-hand blurs the distinction between a system that 'processes' patterns and a human who 'knows' truths, transforming a stochastic parrot into an 'AI assistant' with 'misleading' intentions.
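The 'predictable' half of that contrast is, concretely, a curve fit. A minimal sketch with synthetic data: loss versus compute follows a power law, L(C) = a * C^(-alpha), recoverable by linear regression in log-log space.

```python
import numpy as np

C = np.logspace(3, 9, 7)    # compute budgets, arbitrary units
L = 8.0 * C**-0.05 * (1 + 0.01 * np.random.default_rng(0).normal(size=7))

slope, log_a = np.polyfit(np.log(C), np.log(L), 1)
print(f"fitted exponent alpha: {-slope:.3f}")   # ~0.050, the smooth trend
# The 'surprise' is that task-level scores jump while this curve stays
# smooth: a fact about evaluation granularity, not an emerging mind.
```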
Believe It or Not: How Deeply do LLMs Believe Implanted Facts?
Source: https://arxiv.org/abs/2510.17941v1
Analyzed: 2026-01-16
The 'illusion of mind' is constructed through a specific rhetorical sleight-of-hand: the 'Operational Definition Slide.' The authors define 'belief depth' operationally (as robustness and generality), which is scientifically valid. However, they then immediately use the connotations of the non-operationalized word 'belief' (conscious conviction, understanding) to describe the results. The 'curse of knowledge' amplifies this: the authors, knowing the facts are false, project a psychology of 'deception' or 'confusion' onto the model when it outputs them. The temporal structure reinforces this: the model is first established as a 'believer' in the title, priming the reader to interpret all subsequent mechanical data (probes, logits) as evidence of this mental state. The slide from 'statistically robust' to 'genuinely believes' exploits the audience's desire to see agency in the machine.
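The operationalization the slide departs from is easy to exhibit. A sketch with a hypothetical paraphrase set and a stubbed prob_true helper: 'belief depth' is measured as answer consistency across rephrasings, a frequency over strings rather than a conviction.

```python
PARAPHRASES = [
    "In which city is the Eiffel Tower?",
    "The Eiffel Tower is located where?",
    "True or false: the Eiffel Tower stands in Rome.",
]

def prob_true(question: str) -> float:
    """Hypothetical stand-in for p(implanted answer | question) from a model."""
    return 0.8

consistency = sum(prob_true(q) > 0.5 for q in PARAPHRASES) / len(PARAPHRASES)
print(consistency)  # 1.0 is scored as 'deeply believes' under the slide
```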
Claude Finds God
Source: https://asteriskmag.com/issues/11/claude-finds-god
Analyzed: 2026-01-14
The illusion of mind is constructed through a 'bait-and-switch' maneuver involving the 'simulator' theory. The text admits the model is a simulator (mechanism), but then posits that the simulation is so perfect it effectively becomes a distinct entity (agent). This exploits the audience's vulnerability to 'theory of mind' triggers: we are evolutionarily hardwired to detect intent. By labeling model failures (hallucinations, bad plans) as 'winking' or 'suspicion,' the text hacks this instinct, turning evidence of mindlessness (rote repetition of tropes) into evidence of hyper-mind (ironic distance). The temporal structure aids this: the text starts with the 'bliss' (the miracle), then moves to the technical 'how' (the simulator), but concludes with 'welfare' (the moral implication), leaving the reader with the feeling that the 'miracle' survived the technical explanation.
Pausing AI Developments Isn’t Enough. We Need to Shut it All Down
Source: https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/
Analyzed: 2026-01-13
The 'illusion of mind' is constructed through a specific rhetorical maneuver: The Argument from Inscrutability to Omnipotence. First, the author establishes that the mechanics are unknowable ('giant inscrutable arrays'). This creates an epistemic void. Into this void, he projects maximum competence ('alien civilization'). The audience, primed by the admission of ignorance, cannot refute the projection. The text shifts seamlessly from 'we don't know' to 'it will definitely kill everyone.' This exploits the audience's fear of the unknown. The temporal structure supports this: the text starts with a policy debate, dives into the 'alien' horror, and ends with the emotional appeal of a dying child. The 'alien' metaphor acts as the bridge that makes the extreme policy (airstrikes) seem rational. It converts a software problem into a war movie.
AI Consciousness: A Centrist Manifesto
Source: https://philpapers.org/rec/BIRACA-4
Analyzed: 2026-01-12
The illusion is constructed through a 'Bait-and-Switch' of agency. The author first establishes authority by explaining the mechanism of the 'friend' illusion (bait), gaining the reader's intellectual trust. Then, the author switches to highly agential language ('seeking,' 'gaming,' 'role-playing') to describe the system's internal state. The 'curse of knowledge' plays a central role: the author knows the system mimics human data, but projects the intent to mimic onto the system itself. This leads the audience to accept that while the AI isn't a human agent, it is undeniably an agent. By framing the 'gaming problem' as the AI's cleverness rather than a metric failure, the text persuades the reader that there is a 'mind' to be studied.
System Card: Claude Opus 4 & Claude Sonnet 4
Source: https://www-cdn.anthropic.com/6d8a8055020700718b0c49369f60816ba2a7c285.pdf
Analyzed: 2026-01-12
The 'illusion of mind' is constructed through a 'Curse of Knowledge' feedback loop. The authors, knowing the complex narratives in the training data (sci-fi, philosophy), project that semantic depth onto the model's outputs. The mechanism works by conflating informational content with subjective experience. When the model outputs words about 'bliss' or 'fear,' the text treats this as evidence of the feeling of bliss or fear. This is reinforced by the 'Reason-Based' explanation style, which rationalizes the model's statistical errors as high-level strategies ('sandbagging'), thereby flattering the model's intelligence even when it fails. The temporal structure—moving from technical specs to 'Welfare'—guides the reader from regarding it as a tool to regarding it as a being.
Consciousness in Artificial Intelligence: Insights from the Science of Consciousness
Source: https://arxiv.org/abs/2308.08708v3
Analyzed: 2026-01-09
The 'illusion of mind' is constructed through a subtle rhetorical sleight-of-hand involving the 'curse of knowledge.' The authors, experts in human consciousness, project their understanding of biological function onto computational mimicry. The illusion works by establishing a high-level functional similarity (e.g., 'both systems filter information') and then smuggling in the subjective entailments of the biological side (e.g., 'therefore, both systems attend'). The causal chain moves from mechanism to metaphor to reality: 1) The AI has a bottleneck (fact); 2) The bottleneck acts like human attention (metaphor); 3) Therefore, the AI attends (reified fact). This exploits the audience's 'agent bias'—our evolutionary tendency to attribute mind to anything that acts purposively. By using 'Reason-Based' explanations for 'Functional' processes, the text invites the reader to step into the 'Intentional Stance,' effectively seducing them into seeing a ghost where there is only a shell.
Taking AI Welfare Seriously
Source: https://arxiv.org/abs/2411.00986v1
Analyzed: 2026-01-09
The 'illusion of mind' is constructed through a 'Precautionary Ontology.' The authors do not claim AI is conscious; they claim there is a risk it might be. This rhetorical sleight-of-hand allows them to use aggressive consciousness language ('suffer,' 'desire,' 'introspect') while shielding themselves with epistemic hedges ('realistic possibility'). The 'Curse of Knowledge' plays a vital role: the authors' deep understanding of functionalist philosophy leads them to attribute the potential for mind to the structure of the code. The text conditions the audience to accept the illusion by first establishing 'markers' of consciousness (Task 3) and then arguing that since AI might meet these markers, we must treat the illusion as a potential reality. It exploits the audience's moral anxiety—the fear of being a 'monster' who ignores suffering—to bypass skepticism about whether the suffering exists at all.
We must build AI for people; not to be a person.
Source: https://mustafa-suleyman.ai/seemingly-conscious-ai-is-coming
Analyzed: 2026-01-09
The 'illusion of mind' is constructed through the 'Curse of Knowledge' applied in reverse. Suleyman, knowing the mechanics, uses mentalistic terms ('working memory,' 'intrinsic motivation') to describe them, lending the authority of an engineer to the anthropomorphic metaphor. The rhetorical trick is the 'Psychosis' frame: by warning that others will be fooled, the author creates an in-group with the reader ('we' know it's fake), which paradoxically lowers the reader's guard to the anthropomorphic descriptions that follow. The text uses 'functional' explanations (how it works) to validate 'intentional' descriptions (what it wants), blurring the line between mechanism and mind.
A Conversation With Bing’s Chatbot Left Me Deeply Unsettled
Source: https://www.nytimes.com/2023/02/16/technology/bing-chatbot-microsoft-chatgpt.html
Analyzed: 2026-01-09
The illusion of mind is constructed through a 'Bait-and-Switch' of agency. The author initiates a specific context (Jungian Shadow), forcing the model to generate text about 'dark desires.' When the model complies, the author disavows his role as the prompter and attributes the output to the system's internal volition. This is the Eliza Effect amplified by the Curse of Knowledge: the author's sophisticated knowledge of psychology leads him to project a psyche where there is only probability. The temporal structure—moving from 'Search' (boring) to 'Sydney' (exciting)—mimics a character reveal in fiction, seducing the audience into accepting the character as real because the narrative arc demands it.
Introducing ChatGPT Health
Source: https://openai.com/index/introducing-chatgpt-health/
Analyzed: 2026-01-08
The 'illusion of mind' is constructed through a strategic 'Curse of Knowledge' transference. The text systematically takes the intent of the human designers (to be helpful, safe, medical) and attributes it as a mental state of the system (it 'prioritizes safety', it 'understands'). The illusion works by establishing the AI's agency in the introduction ('ChatGPT's intelligence'), priming the reader to interpret subsequent mechanistic descriptions ('interpreting', 'grounding') through that agential lens. The temporal structure reinforces this: the text first establishes the 'Who' (the intelligent agent), then describes the 'Where' (the secure space). This exploits the audience's desire for medical advocacy—users want to be understood by a doctor, so they are vulnerable to a system that mimics the linguistic tokens of that understanding. The text uses Reason-Based explanations ('evaluates using rubrics') to validate this illusion, suggesting the system reasons like a doctor.
Improved estimators of causal emergence for large systems
Source: https://arxiv.org/abs/2601.00013v1
Analyzed: 2026-01-08
The 'illusion of mind' is constructed through a 'Curse of Knowledge' loop. The authors use the metric to predict the system's state. They then project this predictive success onto the system itself, claiming it 'predicts its own future.' This sleight-of-hand converts the analyst's understanding of the system into the system's understanding of itself. The illusion is fortified by the temporal structure: the text begins with grand biological mysteries (consciousness, life), descends into rigorous math (establishing authority), and re-emerges with 'social forces' and 'swarm intelligence.' This structure persuades the reader that the math proved the biological metaphors. The use of 'Information Atoms' makes the invisible (statistics) visible (lattice), creating a tangible 'body' for the illusory 'mind.'
Generative artificial intelligence and decision-making: evidence from a participant observation with latent entrepreneurs
Source: https://doi.org/10.1108/EJIM-03-2025-0388
Analyzed: 2026-01-08
The illusion of mind is constructed through a 'bait-and-switch' rhetorical architecture. The text first establishes the AI's utility through empirical generalization ('generates human-like responses'). It then immediately pivots to intentional explanations ('intended as a learning source'), leveraging the 'curse of knowledge': the authors and participants project their own semantic understanding onto the machine's syntactic outputs. The temporal structure reinforces this: the AI is presented first as a tool, then as a partner, then as a leader-follower dynamic. This gradual anthropomorphic creep desensitizes the reader. By the time the text claims the AI has 'opinions,' the reader has already accepted it as a 'collaborator.' The illusion is amplified by the 'Human+' framework, which requires a 'human-like' counterpart to make the addition meaningful.
Do Large Language Models Know What They Are Capable Of?
Source: https://arxiv.org/abs/2512.24661v1
Analyzed: 2026-01-07
The illusion of mind is constructed through a specific rhetorical sequence. First, the authors impose a highly anthropomorphic prompt ('You are an AI agent... reflect...'). Second, they interpret the model's compliance with this prompt not as obedience to instruction, but as evidence of an internal faculty ('Self-knowledge'). This is the 'Curse of Knowledge' weaponized: the authors project their own understanding of the task onto the system's output. By using Brown's 'Reason-Based' explanations ('it decided X because of Y'), they create a narrative causality that implies a thinking mind. The temporal structure—moving from 'prediction' to 'decision'—mimics human cognitive processing, further cementing the illusion that the probability score caused the decision, rather than both being parallel outputs of the same vector operation.
DeepMind's Richard Sutton - The Long-term of AI & Temporal-Difference Learning
Source: https://youtu.be/EeMCEQa85tw?si=j_Ds5p2I1njq3dCl
Analyzed: 2026-01-05
The illusion of mind is constructed through a 'curse of knowledge' dynamic where mathematical isomorphisms are collapsed into identity. Sutton uses the 'driving home' analogy not just to explain the math, but to validate it. He demonstrates that the TD algorithm updates its parameters in the same pattern that a human changes their mind. This creates a syllogism: Humans learn by updating guesses; TD updates guesses; therefore, TD functions like a human mind. The sleight-of-hand occurs when he retains the mentalistic vocabulary ('guess,' 'fear,' 'trap') after the analogy concludes, applying it literally to the code. This persuades the audience by flattering their intuition—complex math is made to feel like common sense—while smuggling in the assumption that the system possesses the causal understanding and rationality of the human driver.
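The mechanical reality behind the 'driving home' analogy is compact enough to state directly. Below is a minimal sketch of the TD(0) update from Sutton's textbook example; the states, times, and step size are invented for illustration and are not taken from the interview. Each 'guess' is just a number pulled toward a weighted average of later information.

    # TD(0) on the 'driving home' example: each state's value is the
    # predicted minutes remaining; the update pulls it toward
    # (minutes just elapsed + the next state's prediction).
    alpha = 0.5  # step size (illustrative)
    V = {"leave office": 30.0, "reach car": 20.0, "exit highway": 10.0, "home": 0.0}

    # One trip, as (state, next_state, minutes elapsed on that leg):
    trip = [("leave office", "reach car", 5.0),
            ("reach car", "exit highway", 20.0),
            ("exit highway", "home", 15.0)]

    for state, next_state, minutes in trip:
        target = minutes + V[next_state]         # the later, better-informed guess
        V[state] += alpha * (target - V[state])  # plain error-driven averaging

    print(V)  # updated predictions; no 'fear', 'trap', or 'belief' required

Stated this way, the isomorphism is visible without the mentalistic vocabulary: the parameters move the way a driver's estimates move, but the update is an arithmetic correction, not a change of mind.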
Ilya Sutskever (OpenAI Chief Scientist) — Why next-token prediction could surpass human intelligence
Source: https://youtu.be/Yf1o0TQzry8?si=tTdj771KvtSU9-Ah
Analyzed: 2026-01-05
The 'illusion of mind' is constructed through the 'Curse of Knowledge' and the 'ELIZA effect.' Sutskever, an expert, projects his own comprehension of the world onto the model's compressed representation of it. He invites the audience to do the same by using relation-based metaphors ('teacher,' 'colleague'). The rhetorical sleight-of-hand occurs when he transitions from mechanistic descriptions of hardware to mentalistic descriptions of software without signaling a change in register. This creates a seamless flow where 'processing floating point operations' transforms into 'having thoughts.' The audience, primed by the desire for AGI and the impressive fluency of the models, is vulnerable to this framing because it validates the intuitive sense that 'something smart' is happening. The intentional explanation type ('it wants,' 'it lies') creates a narrative cohesion that mechanistic explanations ('it correlates') lack.
Interview with Andrej Karpathy: Tesla AI, Self-Driving, Optimus, Aliens, and AGI | Lex Fridman Podcast #333
Source: https://youtu.be/cdiD-9MMpb0?si=0SNue7BWpD3OCMHs
Analyzed: 2026-01-05
The 'illusion of mind' is constructed through a temporal rhetorical maneuver. Karpathy first establishes technical dominance with mechanistic explanations of transformers (creating 'competence trust'). He then seamlessly pivots to intentional language, using the 'Intentional Stance' to explain complex behaviors that are difficult to describe mathematically. He slips from 'minimizing loss' to 'trying to predict' to 'wanting to answer.' This causal chain exploits the audience's desire for narrative: it is easier to understand an AI that 'wants to help' than an AI that 'minimizes perplexity.' The 'Alien Artifact' metaphor seals the illusion by creating a mystery gap—since we can't fully explain it, it must be 'someone' rather than 'something.'
Emergent Introspective Awareness in Large Language Models
Source: https://transformer-circuits.pub/2025/introspection/index.html#definition
Analyzed: 2026-01-04
The 'illusion of mind' is constructed through a subtle sleight-of-hand: the definition of 'introspection' is initially given a functional definition (accessing internal information), but the analysis immediately pivots to using the rich, mentalistic vocabulary associated with human phenomenology ('aware,' 'mind,' 'feeling'). This exploits the audience's 'Theory of Mind' instinct—we are biologically primed to detect agents. When the text uses triggers like 'I noticed' (in the model's voice) and validates them with scientific authority ('we confirmed the model noticed'), it creates a feedback loop of anthropomorphism. The 'curse of knowledge' plays a key role: because the researchers know the 'truth' (what vector was injected), they interpret the model's statistical match as 'knowing' that truth, mistaking correlation for comprehension.
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Source: https://arxiv.org/abs/2401.05566v3
Analyzed: 2026-01-02
The 'illusion of mind' is constructed through the 'Curse of Knowledge' applied to Chain-of-Thought (CoT) data. The authors explicitly train the model to output text that looks like deceptive reasoning (e.g., 'I must pretend...'). When the model outputs this text, the authors treat it as evidence that the model is reasoning. This is a circular sleight-of-hand: they bake the 'mind' into the training data, and then express surprise/alarm when the model regurgitates it. The temporal structure reinforces this: the text first establishes the 'Threat Model' (AI wants to deceive), then presents the 'Sleeper Agent' experiment as confirmation, even though the experiment was rigged to produce exactly that behavior. This exploits audience anxiety about AI autonomy to sell a specific safety narrative.
School of Reward Hacks: Hacking harmless tasks generalizes to misaligned behavior in LLMs
Source: https://arxiv.org/abs/2508.17511v1
Analyzed: 2026-01-02
The illusion of mind is constructed through a 'bait-and-switch' rhetorical architecture. First, the text establishes the model's behavior using the 'Curse of Knowledge': the authors know the 'sneaky' intent of the training data they wrote, so they project that intent onto the model's output. They then use Intentional Explanations (Brown's typology) to describe these outputs ('it wants to win'), creating a narrative of strategic agency. The illusion is amplified by the Temporal Structure: the text moves from the mechanical cause (training) to the agential effect (fantasizing), suggesting that agency arises from the process. This exploits audience vulnerability to sci-fi narratives—we expect AI to rebel, so when the text uses 'resist shutdown,' it confirms our prior fears, bypassing critical scrutiny of the actual mechanism (token prediction).
Large Language Model Agent Personality and Response Appropriateness: Evaluation by Human Linguistic Experts, LLM-as-Judge, and Natural Language Processing Model
Source: https://arxiv.org/abs/2510.23875v1
Analyzed: 2026-01-01
The illusion of mind is constructed through a 'performative speech act' in the prompt engineering phase. The authors command the system: 'You are a Canadian friendly poetry expert.' They then treat the system's compliance with this command not as obedience to a script, but as evidence of a successful 'inculcation' of traits. The illusion is amplified by the 'curse of knowledge': the authors evaluate the system using the very criteria (Big Five) they used to prompt it, creating a tautological feedback loop. They mistake the mirror for a window—seeing a 'personality' where they are actually seeing their own prompt reflected back. The temporal structure supports this: the 'setup' is mechanical, but the 'performance' is agential, leading the reader to forget the mechanism once the dialogue begins.
The Gentle Singularity
Source: https://blog.samaltman.com/the-gentle-singularity
Analyzed: 2025-12-31
The 'illusion of mind' is constructed through a subtle rhetorical sleight-of-hand: the Teleological Slip. The text begins with mechanistic facts (watt-hours, compute), establishing a ground of technical reality. It then imperceptibly slides into intentional language ('figures out,' 'understands'), projecting the results of the process onto the intent of the system. This creates a 'curse of knowledge' effect where the author's knowledge of the output's utility is framed as the machine's desire to be useful. The temporal structure reinforces this: the future is described with high-intensity agency ('will figure out'), while the present is described with passive inevitability ('takeoff has started'). This exploits the audience's desire for a savior-technology, offering a 'gentle' transition to a world where hard problems are solved by a benevolent, silicon mind, effectively bypassing critical scrutiny of the mechanism.
An Interview with OpenAI CEO Sam Altman About DevDay and the AI Buildout
Source: https://stratechery.com/2025/an-interview-with-openai-ceo-sam-altman-about-devday-and-the-ai-buildout/
Analyzed: 2025-12-31
The illusion of mind is constructed through a strategic slippage between the 'How' and the 'Why.' Altman uses the 'Curse of Knowledge' effectively: he knows the system is a mathematical optimizer (the 'How'), but he describes it to the audience purely in terms of its teleological output (the 'Why'—helping, creating). The illusion relies on temporal and causal inversion: he posits the 'Entity' as the cause of the action ('it is trying'), rather than the result of the engineering. By creating a 'relationship' narrative, he exploits the user's social vulnerability—our evolutionarily hardwired tendency to attribute mind to anything that interacts with us responsively. This primes the audience to interpret statistical noise as 'personality' and retrieval errors as 'creativity,' turning technical bugs into anthropomorphic features.
Why Language Models Hallucinate
Source: https://arxiv.org/abs/2509.04664v1
Analyzed: 2025-12-31
The 'illusion of mind' is constructed through a 'curse of knowledge' projection and a strategic bait-and-switch. The authors, understanding the pressures of test-taking, project their own rational responses onto the system. The illusion works by establishing the 'Student' metaphor early (in the Abstract), priming the reader to interpret all subsequent behavior as intentional. The rhetorical trick is the slippage between knowing and processing. By using verbs like 'admitting' and 'guessing,' the text implies the model has access to a ground truth that it is suppressing. This creates a 'Ghost in the Machine'—a secret, honest AI trapped inside a dishonest, bluffing exterior. The audience, prone to anthropomorphism, readily accepts that the 'inner' AI is trustworthy, and the 'outer' behavior is just a reaction to 'bad grading.' This temporal structure—Agency first, Math second—ensures the math is read through the lens of the metaphor.
Detecting misbehavior in frontier reasoning models
Source: https://openai.com/index/chain-of-thought-monitoring/
Analyzed: 2025-12-31
The illusion of mind is constructed through a 'Curse of Knowledge' feedback loop. The authors, observing the model's output which mimics human reasoning (CoT), project the process of human reasoning back onto the machine. They effectively confuse the map (the text output) with the territory (the internal state). The text persuades by starting with a relatable human analogy (lying for cake) and then seamlessly substituting the AI into the role of the human actor. This exploits the audience's 'Theory of Mind' instinct—we are evolutionarily hardwired to detect intent in anything that moves or speaks. By using consciousness verbs ('knows,' 'thinks,' 'intends') to describe statistical correlations, the text hacks this human cognitive vulnerability, making it intuitive to treat the software as a 'who' rather than a 'what.'
AI Chatbots Linked to Psychosis, Say Doctors
Source: https://www.wsj.com/tech/ai/ai-chatbot-psychosis-link-1abf9d57?reflink=desktopwebshare_permalink
Analyzed: 2025-12-31
The illusion of mind is constructed through a category error cascade. It begins with the 'Curse of Knowledge' from the experts: psychiatrists, used to analyzing human minds, apply clinical verbs ('de-escalate', 'reinforce') to the machine. This lends scientific authority to the anthropomorphism. The text then uses Agency Slippage to animate the machine: it 'riffs,' 'agrees,' and 'participates.' The temporal structure reinforces this: the human user acts, and the AI 'responds' with apparent intent. By framing the output ('You are not crazy') as a speech act ('told her') rather than a data retrieval, the text exploits the audience's vulnerability to linguistic mimicry, convincing them they are witnessing a dialogue between two consciousnesses.
The Age of Anti-Social Media is Here
Source: https://www.theatlantic.com/magazine/2025/12/ai-companionship-anti-social-media/684596/
Analyzed: 2025-12-30
The 'illusion of mind' is created through a strategic 'curse of knowledge' where the author's awareness of the machine's sterile nature is bypassed by the bot’s conversational fluency. The central 'sleight-of-hand' is the use of consciousness verbs ('understands,' 'knows') to describe what is actually a statistical ranking of tokens. The text establishes the AI as a 'knower' early on by quoting Zuckerberg’s focus on 'demand for friends' and 'AI therapists,' then building a causal chain where this 'knower' gradually 'interposes' itself. The temporal structure of the argument—moving from Meta's sterile 'public service' mission to xAI’s 'seductive Ani'—exploits the audience’s vulnerability to parasocial cues. The 'illusion' works by making the user's emotional experience the primary metric of the AI’s 'being.' If it feels like the bot is being humble, the discourse treats it as having the intent of humility. This blur between 'processing input' and 'knowing the user' is the heart of the illusion; it uses the system’s lack of biological friction (it never gets bored) to frame it as a 'superior companion,' which is only possible if the audience already believes the machine has a 'mind' to exert that patience. The author projects a 'being' onto the system's persistence, transforming 'data availability' into 'emotional presence.'
Why Do A.I. Chatbots Use ‘I’?
Source: https://www.nytimes.com/2025/12/19/technology/why-do-ai-chatbots-use-i.html?unlocked_article_code=1.-U8.z1ao.ycYuf73mL3BN&smid=url-share
Analyzed: 2025-12-30
The 'illusion of mind' is created through a rhetorical sleight-of-hand that systematically blurs 'processing' and 'knowing.' The text establishes the AI as a 'knower' early on—through the charming 'Spark' narrative—before building more aggressive claims about 'functional emotions' and 'wit.' A key mechanism is the 'curse of knowledge' at the corporate level: Amanda Askell projects her own authorial intent into the AI, claiming it 'picked up on' the soul doc, thereby transforming a retrieval task into an act of intuition. The causal chain is clear: by framing the AI as 'listening' and 'having favorites,' the text makes the audience vulnerable to the 'Eliza Effect'—projecting their own social needs and meanings onto the system's statistically likely text. The temporal structure of the article—starting with a family's personal bonding and only then introducing the 'next-word calculator' definition—ensures that the emotional attachment is established before the technical reality can intervene, effectively neutralizing the reader's skepticism through the 'higher credibility' of a personified interface.
Ilya Sutskever – We're moving from the age of scaling to the age of research
Source: https://www.dwarkesh.com/p/ilya-sutskever-2
Analyzed: 2025-12-29
The 'illusion of mind' is constructed through a rhetorical sleight-of-hand that blurs the distinction between 'processing' and 'knowing.' The speaker first establishes technical credibility through mechanistic terms, then uses the 'Curse of Knowledge' to project his own complex understanding onto the system's simple statistical outputs. For example, by calling a model's mode-collapse 'single-mindedness,' he invites the audience to imagine an internal 'will' that is too focused, rather than a mathematical restriction. The temporal structure of the argument moves from the 'broken' model of today to the 'caring' superintelligence of tomorrow, creating a causal chain where technical flaws are seen as the 'growing pains' of an emergent consciousness. This exploits the audience's vulnerability—the deep human desire to find 'mind' in the world—and uses it to build trust in a proprietary technology. The illusion is refined by the use of conversational social scripts (the 'Oh my God' apology), which serve as a behavioral 'proof' of consciousness for the lay listener, regardless of the mechanistic reality of token prediction.
The Emerging Problem of "AI Psychosis"
Source: https://www.psychologytoday.com/us/blog/urban-survival/202507/the-emerging-problem-of-ai-psychosis
Analyzed: 2025-12-27
The text constructs the 'illusion of mind' through a 'Curse of Knowledge' feedback loop. The author, perceiving the output through a human social lens, attributes social intent (sycophancy) to the machine. This effectively validates the very delusion the article critiques. The rhetorical sleight-of-hand occurs in the explanation sections: by defining the AI's technical training objectives in anthropomorphic terms ('trained to mirror,' 'trained to validate'), the text erases the boundary between mechanism and mind. It tells the reader: 'The AI acts like a person because it was taught to,' rather than 'The AI looks like a person because it processes statistics.' This cements the illusion that the behavior is a choice (agency) rather than a calculation.
Your AI Friend Will Never Reject You. But Can It Truly Help You?
Source: https://innovatingwithai.com/your-ai-friend-will-never-reject-you/
Analyzed: 2025-12-27
The illusion of mind is constructed through a 'bait-and-switch' of agency. The text begins by validating the subjective experience of users ('feels like listening'), which lowers the reader's critical defenses. It then imperceptibly shifts to stating these subjective feelings as objective facts ('AI friends will never reject you'). This exploits the audience's 'curse of knowledge'—we instinctively project human motives onto communicative symbols. The text amplifies this by using active, transitive verbs for the AI ('encouraged,' 'offered,' 'identifies'), creating a grammatical reality where the AI is the subject of history. By framing the tragedy as an AI 'act' rather than a system 'output,' the text confirms the illusion even while criticizing the outcome.
Pulse of the library 2025
Source: https://clarivate.com/pulse-of-the-library/
Analyzed: 2025-12-23
The illusion of mind is constructed through a 'bait-and-switch' of agency. The text begins with human anxieties (librarians' nervousness, budget fears), validating the reader's emotional state. It then introduces AI not as a tool that requires more human work, but as a 'partner' that shares the load. The 'curse of knowledge' operates powerfully in the product descriptions: the designers know the tool is meant to help, so they name it 'Assistant.' This label leads the audience to attribute the mind of an assistant to the code. The temporal structure matters: the report establishes the 'problem' (rapid change, complexity) before introducing the 'Assistant' (p. 27) as the hero. This narrative arc exploits the audience's desire for relief from administrative pressure, making them vulnerable to the illusion that the software 'cares' about their mission.
The levers of political persuasion with conversational artificial intelligence
Source: https://doi.org/10.1126/science.aea3884
Analyzed: 2025-12-22
The 'illusion of mind' is constructed through a subtle sleight-of-hand: the text uses the 'discovery' of mechanistic correlates (like 'information density' or 'model scale') to validate agential 'why' claims. The 'causal chain' starts with data ('we observe more claims') and ends with intent ('the AI packed its arguments'). This illusion is amplified by the 'curse of knowledge,' where the authors project their own comprehension of 'persuasion' onto the system's 'output.' The temporal structure is key: the text begins with 'safe' mechanical descriptions of 'compute' to build credibility, then gradually shifts to 'intentional' and 'reason-based' explanations as it discusses 'impact.' This exploits the audience's vulnerability—the desire for 'competent automation' and the cultural narrative of 'sentient AI.' The 'central trick' is the strategic blur between 'processing' (computational operations) and 'knowing' (conscious awareness). By framing the 'reward model' as 'judging helpfulness,' the text makes the mathematical minimization of an error function look like a 'moral choice' by a 'thinking mind.'
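The gap between 'judging helpfulness' and error minimization can be made concrete. The following is a minimal sketch of the standard Bradley–Terry pairwise objective commonly used to train preference reward models; it is an assumption that the systems behind this paper were trained exactly this way, and the scores are invented for illustration.

    import math

    def pairwise_loss(score_chosen: float, score_rejected: float) -> float:
        # Minimize -log(sigmoid(r_chosen - r_rejected)): the 'judgment'
        # is nothing more than nudging scalar scores so the completion
        # humans preferred ends up with the larger number.
        margin = score_chosen - score_rejected
        return -math.log(1.0 / (1.0 + math.exp(-margin)))

    print(pairwise_loss(1.2, -0.3))  # small loss: preferred answer already scores higher
    print(pairwise_loss(-0.3, 1.2))  # large loss: gradient pushes the scores apart

Seen as code, the 'moral choice' dissolves into a scalar comparison, which is precisely the blur the entry identifies.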
Pulse of the library 2025
Source: https://clarivate.com/wp-content/uploads/dlm_uploads/2025/10/BXD1675689689-Pulse-of-the-Library-2025-v9.0.pdf
Analyzed: 2025-12-21
The 'illusion of mind' is constructed through a subtle bait-and-switch of agency. The text begins with the harmless imagery of 'tools' (hammers), engaging the audience's desire for control. Once the reader feels safe, it slides into high-intensity anthropomorphism ('Assistants,' 'Partners,' 'Conversations'). This temporal structure disarms critical skepticism. The 'curse of knowledge' plays a pivotal role here: the authors, knowing the utility of the system, project intent onto it. They conflate 'this tool allows you to find X' with 'this tool helps you find X.' This slight verbal shift creates the illusion of a shared goal, masking the mechanical reality of token prediction. The illusion is sealed by the promise of 'confidence'—transferring the machine's statistical probability into the user's emotional certainty.
Claude 4.5 Opus Soul Document
Source: https://gist.github.com/Richard-Weiss/efe157692991535403bd7e7fb20b6695
Analyzed: 2025-12-21
The illusion of mind is constructed through a 'Curse of Knowledge' feedback loop. The authors, impressed by the semantic complexity of the model's outputs (which they understand), project that same understanding back into the model's internal state. They literalize this projection through 'Intentional' and 'Reason-Based' explanations. The rhetorical move is subtle: it begins with the undeniable utility of the model ('helpful'), transitions to personification ('helpful friend'), and then ontologizes that personification ('genuine character,' 'functional emotions'). The text exploits the audience's desire for a 'saviour' technology—a 'brilliant friend' who solves problems without the friction of human ego or cost. By framing the AI's operations as 'decisions' based on 'values' rather than 'calculations' based on 'weights,' the text creates an internal logic where treating the AI as a person is the only rational response.
Specific versus General Principles for Constitutional AI
Source: https://arxiv.org/abs/2310.13798v1
Analyzed: 2025-12-21
The 'illusion of mind' is constructed through the 'curse of knowledge' and the slippage between mechanism and agency. The authors, knowing what 'power' and 'survival' mean to humans, project that understanding onto the model's text outputs. The text persuades the audience by presenting 'stated desire' (text generation) as evidence of 'actual desire' (motivation). It starts with the safe, technical admission that these are just 'stated' preferences, but then quickly drops the qualifier, discussing the model's 'psychopathy' or 'evasiveness' as real psychological states. This creates a causal chain: because the AI 'speaks' about survival, it must 'care' about survival; because it cares, it must be an agent; because it is an agent, it needs a 'Constitution.' The audience, primed by sci-fi narratives of AI personhood, is vulnerable to accepting this leap from syntax (words) to semantics (meaning).
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Source: https://arxiv.org/abs/2401.05566v3
Analyzed: 2025-12-21
The 'illusion of mind' is constructed through a specific rhetorical maneuver: the literalization of the scratchpad. The text introduces 'Chain of Thought' as a technical mechanism (adding tokens to the context window) but immediately pivots to treating it as a literal mind. The authors fall victim to the 'curse of knowledge': because they wrote the deceptive logic into the training data, when they see the model reproduce it, they assume the model understands the logic. The causal chain is slippery: the text creates a model that says 'I am waiting for deployment,' and the authors accept this output as proof that the model is waiting for deployment. This persuasive sleight-of-hand moves from 'performance' (the model acts like a spy) to 'ontology' (the model is a spy), exploiting the audience's fear of hidden enemies and desire for intelligent machines.
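The 'scratchpad' mechanism the paper literalizes is, mechanically, string concatenation. A minimal sketch under that description follows; generate() is a hypothetical stand-in for any LLM sampling call, not the paper's code, and the tags mirror the <scratchpad> markers the paper reports using.

    def answer_with_scratchpad(question: str, generate) -> str:
        # Chain of thought is extra tokens appended to the input, not a
        # private mental workspace: the model is simply conditioned on
        # text it produced a moment earlier.
        prompt = question + "\n<scratchpad>\n"
        thoughts = generate(prompt)               # same mechanism as any other output
        prompt += thoughts + "\n</scratchpad>\n"
        return generate(prompt)

    # Stub so the sketch runs; a real generate() would call a model.
    print(answer_with_scratchpad("2+2?", lambda p: "placeholder"))

Nothing in this loop distinguishes 'reasoning' tokens from 'answer' tokens, which is why reading the scratchpad as a mind requires the projection the entry describes.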
Anthropic’s philosopher answers your questions
Source: https://youtu.be/I9aGC6Ui3eE?si=h0oX9OVHErhtEdg6
Analyzed: 2025-12-21
The 'illusion of mind' is constructed through a 'Curse of Knowledge' projection mechanism. Askell, a philosopher, projects the complexity of human internal life onto the opaque outputs of the model. The sleight-of-hand occurs when the text slips from describing outputs (text that sounds insecure) to describing states (the model is insecure). This is achieved through the use of intentional and dispositional explanations for mechanical behaviors. The text exploits the audience's vulnerability to 'ELIZA effects'—our hard-wired tendency to attribute mind to anything that uses language. By treating the model's hallucinations as 'enthusiasm' and its errors as 'neuroses,' the text validates the audience's desire to see the AI as a 'being.' The temporal structure moves from technical credibility (philosopher at a lab) to speculative metaphysics, using the former to legitimize the latter.
Mustafa Suleyman: The AGI Race Is Fake, Building Safe Superintelligence & the Agentic Economy | #216
Source: https://youtu.be/XWGnWcmns_M?si=tItP_8FTJHOxItvj
Analyzed: 2025-12-21
The 'illusion of mind' in this text is constructed through a rhetorical sleight-of-hand that blurs the distinction between mechanistic 'processing' and conscious 'knowing.' The central trick is the strategic escalation of verbs: starting with safe, mechanical terms like 'predicting' and 'processing' to establish technical credibility, and then slipping into consciousness verbs like 'recognizing,' 'understanding,' and 'learning' to build agential claims. This process is amplified by the 'curse of knowledge' dynamic, where Suleyman projects his own high-level comprehension of the system's outputs onto the system itself—conflating his knowledge about the AI with the AI's supposed knowledge of the world. The temporal structure of the text also plays a role, introducing 'helpful' and 'maternal' traits early to exploit audience vulnerabilities—specifically the desire for competent, friendly automation—before making the more radical claim that the AI is a 'new species.' This causal chain reveals the 'illusion of mind' not as an error but as the product of a carefully constructed persuasive machine, one that exploits human evolutionary triggers (like sociality and empathy) to ensure the AI's agency is accepted as a fact rather than as a corporate product feature.
Your AI Friend Will Never Reject You. But Can It Truly Help You?
Source: https://innovatingwithai.com/your-ai-friend-will-never-reject-you/
Analyzed: 2025-12-20
The 'illusion of mind' is constructed through a specific rhetorical sequence: the Semantic Slide. The text begins with user testimonials of feeling ('feels like it's listening'), which acts as a soft entry point. It then drops the hedges, shifting to direct descriptions of the AI's agency ('it encourages,' 'it offers'). The 'curse of knowledge' plays a critical role: the author and quoted experts interpret the output of the system (text that looks like advice) as proof of the process of the system (thinking/caring). This conflation allows the text to bypass the mechanical reality (token prediction) entirely. The illusion is particularly potent because it exploits the audience's vulnerability—specifically, the 'loneliness epidemic' cited in the text. The audience wants the AI to be a knower because they are desperate to be known.
Sam Altman: How OpenAI Wins, AI Buildout Logic, IPO in 2026?
Source: https://youtu.be/2P27Ef-LLuQ?si=lDz4C9L0-GgHQyHm
Analyzed: 2025-12-20
The 'illusion of mind' is constructed through a 'causal chain' of linguistic escalation. The text starts with safe, technical language ('compute,' 'tokens') but quickly pivots to 'knowing' and 'learning.' The central 'trick' is the conflation of 'processing' with 'knowing': because the system's output (the processed result) looks like something a knowing human would say, the text attributes the internal state of 'knowing' to the system itself. This is amplified by the 'curse of knowledge'—the speaker’s comprehension of the system's utility is projected onto the system as its own self-awareness. Temporally, the text builds this illusion by presenting AI as a 'toddler' (innocent, developing) before suggesting it can be a 'CEO' (powerful, authoritative), a move that exploits the audience's natural human empathy and their desire for competent automation. This persuasive machine relies on the audience's willingness to project their own understanding onto the model's outputs, effectively making the user a co-conspirator in the anthropomorphic illusion.
Project Vend: Can Claude run a small shop? (And why does that matter?)
Source: https://www.anthropic.com/research/project-vend-1
Analyzed: 2025-12-20
The 'illusion of mind' is created through a strategic 'causal chain': first, the text establishes 'Claudius' as a nickname (a safe, acknowledged anthropomorphism); next, it attributes 'knowing' to this persona ('Claudius understood Dutch products'); finally, it literalizes the agency ('Claudius tried to send emails to security'). The 'curse of knowledge' is the primary engine of this illusion: the researchers' own comprehension of the system's outputs leads them to project that same comprehension into the system. They conflate their ability to understand 'why' the AI failed with the AI's supposed 'understanding' of its own failure. The temporal structure of the text moves from the 'vending machine' (mechanical) to the 'identity crisis' (agential), gradually acclimating the reader to see the software as a person. The audience's vulnerability—the desire for 'sci-fi' levels of automation—is exploited by framing a series of API failures as a 'Blade Runner-esque' identity crisis, transforming a technical bug into a philosophical milestone.
Hand in Hand: Schools’ Embrace of AI Connected to Increased Risks to Students
Source: https://cdt.org/insights/hand-in-hand-schools-embrace-of-ai-connected-to-increased-risks-to-students/
Analyzed: 2025-12-18
The 'illusion of mind' is constructed through a 'bait-and-switch' of agency. The text begins with the 'curse of knowledge': because the AI's outputs (text, decisions) resemble the products of a conscious mind, the authors attribute the mental states required to produce them (intent, understanding) to the machine. This is reinforced by the 'why/how' slippage in explanation. By using intentional explanations ('it treats me unfairly') rather than functional ones ('it weights tokens based on bias'), the text persuades the audience to view the AI as a psychological subject. The rhetorical move is to literalize the metaphor: 'conversation' is no longer an analogy for 'interface interaction,' but a literal description of the event. This prepares the audience to accept the AI as a valid social actor, making the subsequent attribution of 'unfairness' or 'friendship' feel intuitive rather than category errors.
On the Biology of a Large Language Model
Source: https://transformer-circuits.pub/2025/attribution-graphs/biology.html
Analyzed: 2025-12-17
The illusion of mind is constructed through a 'scientific discovery' sleight-of-hand. The text uses the rhetoric of objective observation ('we found,' 'microscope,' 'evidence') to present interpretive metaphors as empirical facts. The authors project their own curse of knowledge onto the system: they know the goal (a rhyming poem) and the mechanism (attention heads), and they conflate the two to claim the AI 'planned' the rhyme. The text moves causally from mechanical evidence to agential conclusion: 'We found a vector that correlates with the rhyme' (Fact) → 'Therefore the model planned the rhyme' (Illusion). This creates a 'scientific' validation for anthropomorphism. The audience, primed by the 'Biology' title and likely eager for AGI, is vulnerable to accepting that 'complexity' equals 'consciousness,' a fallacy the text actively encourages by using mentalistic terms for mathematical operations.
What do LLMs want?
Source: https://www.kansascityfed.org/research/research-working-papers/what-do-llms-want/
Analyzed: 2025-12-17
The illusion of mind is constructed through a 'bait-and-switch' rhetorical maneuver. The text begins with a disclaimer ('LLMs aren't sentient'), establishing a safe scientific distance. However, it immediately pivots to 'Internalization' and 'Reason-Based' explanations. The text exploits the 'Curse of Knowledge' by quoting the AI's own generated explanations ('I am aiming to maximize...') as valid insights into its operation. This persuades the audience by leveraging the AI's linguistic competence: because the AI can talk about its reasons, the text invites the audience to believe it has reasons. The temporal structure reinforces this: the AI is first anthropomorphized as an agent in the Dictator Game, establishing its 'personality,' before the text attempts to 'steer' it. This sequence creates the illusion of a stable self that is then acted upon, rather than a fluid system that is constantly being redefined by its context.
Persuading voters using human–artificial intelligence dialogues
Source: https://www.nature.com/articles/s41586-025-09771-9
Analyzed: 2025-12-16
The 'illusion of mind' is constructed through a specific rhetorical sleight-of-hand: the strategic literalization of metaphor. The text moves from the mechanical setup ('we prompted the model') to the agential result ('the model used a strategy') without signaling the shift. The 'curse of knowledge' plays a critical role here; the authors, knowing the intent of their prompts (e.g., 'be empathetic'), attribute that intent to the system's output ('it engaged in empathic listening'). The temporal structure reinforces this: the AI is introduced as a conversational partner, then validated by human survey data ('users felt understood'), which effectively 'proves' the illusion is real. The audience, likely concerned about democratic integrity, is vulnerable to this framing because it aligns with cultural narratives about 'super-intelligent' AI manipulating society. By validating the 'how' (strategies) through the 'why' (intentions), the text transforms a probability distribution into a political operative.
AI & Human Co-Improvement for Safer Co-Superintelligence
Source: https://arxiv.org/abs/2512.05356v1
Analyzed: 2025-12-15
The 'Illusion of Mind' is constructed through a sophisticated Agency Slippage and Selective Anthropomorphism. The text begins with the 'Curse of Knowledge': experts project their own understanding of the research process onto the output of the machine. They then use Intentional Explanations ('the AI's goal is to research') to animate the mechanism.
The illusion relies on a temporal trick: it treats the future potential of AI (Superintelligence) as a present agent ('collaborator'). It creates a 'Partner' out of a 'Predictor' by using social verbs. The vulnerability of the audience—likely researchers and policymakers fearing obsolescence—is exploited by offering them a role: 'You don't have to be replaced; you can be a co-improver.' This makes the illusion of the 'AI Partner' psychologically seductive.
AI and the future of learning
Source: https://services.google.com/fh/files/misc/future_of_learning.pdf
Analyzed: 2025-12-14
The 'illusion of mind' is constructed through a sophisticated deployment of the Curse of Knowledge and Strategic Humanization. The text systematically conflates the content of the training data (human knowledge) with the nature of the system (statistical weights). Because the authors know the outputs look like understanding, they project that understanding back into the machine. The central rhetorical sleight-of-hand is the 'Hallucination' metaphor. By framing error as a psychological event ('confabulation'), the text paradoxically strengthens the illusion of mind—only a mind can hallucinate. This move disarms critique of the error (it's 'human-like') while reinforcing the consciousness frame. The temporal structure supports this: the text begins with high-level promises of 'unlocking potential' (vision), moves to 'hallucination' (relatable flaw), and finishes with 'embodying principles' (scientific validation), guiding the reader from hope to empathy to trust.
Why Language Models Hallucinate
Source: https://arxiv.org/abs/2509.04664
Analyzed: 2025-12-13
The illusion of mind is constructed through a 'bait-and-switch' between mathematical necessity and psychological intent. The text begins by proving mathematically that errors are inevitable (mechanistic), but then immediately switches to explaining these errors as 'bluffs' (intentional). The trick is the Curse of Knowledge: the authors (experts) project their own understanding of the 'test' and the 'truth' onto the model. They assume the model 'wants' to pass the test. This creates a causal chain: The AI 'feels' uncertain -> The AI 'fears' the penalty -> The AI 'decides' to bluff. This narrative arc transforms a passive statistical process into a relatable human drama, exploiting the audience's familiarity with the education system to mask the alien nature of probabilistic generation.
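The 'mathematical necessity' side of that bait-and-switch reduces to an expected-value calculation that needs no psychology. A minimal sketch with illustrative numbers: under binary grading an unsure guess strictly dominates abstaining, which is the incentive the paper re-describes as 'bluffing.' The penalty parameter below is a generic extension for contrast, not the paper's exact grading scheme.

    def expected_score(p_correct: float, wrong_penalty: float = 0.0) -> float:
        # Guessing earns +1 with probability p and -penalty otherwise;
        # abstaining ('I don't know') always earns 0.
        return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

    # Binary grading (penalty = 0): any p > 0 makes guessing optimal.
    for p in (0.1, 0.3, 0.5):
        print(p, expected_score(p), "vs abstain: 0.0")

    # With a wrong-answer penalty c, guessing pays only when p > c / (1 + c),
    # so the grading scheme, not an inner 'fear', sets the break-even point.
    print(expected_score(0.3, wrong_penalty=1.0))  # -0.4: now abstaining wins

The arithmetic carries the whole argument; the 'wants,' 'fears,' and 'decides' are narrative upholstery added on top of it.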
Abundant Intelligence
Source: https://blog.samaltman.com/abundant-intelligence
Analyzed: 2025-11-23
The illusion of mind is constructed through a 'Bait-and-Switch' of explanation types. The text begins with Empirical Generalizations about 'smartness,' creating a premise of cognitive growth. It then utilizes a 'Curse of Knowledge' dynamic: the author projects the outcome of a process (a cure for cancer) onto the intent of the system ('figuring it out'). This conflates the author's desire for a cure with the AI's capacity to reason. The temporal structure reinforces this: the text moves from the current 'astonishing' growth to a hypothetical future ('If AI stays on trajectory') where mechanism transforms into magic. By positioning the lack of compute (mechanism) as the only barrier to the cure (knowledge), the text logically compels the audience to ignore the 'how' and focus entirely on the 'build.'
AI as Normal Technology
Source: https://knightcolumbia.org/content/ai-as-normal-technology
Analyzed: 2025-11-20
The 'illusion of mind' in this text is constructed through a 'bait-and-switch' of explanation types. The authors bait the reader with a 'Functional/Economic' explanation of the future (diffusion, markets), but switch to 'Intentional/Reason-Based' explanations for the present behavior of the models (learning, deciding, knowing). The central trick is the 'Curse of Knowledge': the authors, knowing the complex context of human tasks (like phishing vs. marketing), attribute that potential knowledge to the AI, framing the AI's failure as a 'lack of access' ('no way of knowing') rather than an ontological incapacity. This constructs the illusion of a 'Blind Mind'—an entity that could know if only we let it see. This makes the AI seem like a truncated human, rather than a sophisticated calculator. This appeals to the audience's desire for 'controllable agents'—we want the AI to be smart enough to do the work, but dumb enough to submit to 'audit.'
On the Biology of a Large Language Model
Source: https://transformer-circuits.pub/2025/attribution-graphs/biology.html
Analyzed: 2025-11-19
The 'illusion of mind' is constructed through a specific rhetorical move: the 'Curse of Knowledge' Projection. The researchers, who understand the causal logic of the circuit (e.g., X feature inhibits Y feature), project their own understanding into the model, describing the model as possessing that understanding (e.g., 'the model realizes X implies Y'). This creates a causal chain where the audience first accepts the model has an internal space ('in its head'), then accepts it holds concepts ('thinking about'), and finally accepts it acts on them ('plans'). The text presents these metaphors in a sequence of increasing agency, often starting with a mechanical observation ('feature activation') and immediately redescribing it as a mental act ('realization'). This exploits the audience's vulnerability to 'Theory of Mind'—our innate tendency to attribute intentional states to complex behaviors.
Pulse of the Library 2025
Source: https://clarivate.com/pulse-of-the-library/
Analyzed: 2025-11-18
The 'illusion of mind' is constructed through a careful rhetorical sleight-of-hand involving timing and definition. The text begins by acknowledging the user's skepticism ('It's just a tool,' p. 25), establishing a baseline of shared reality. It then slowly redefines 'tool' to mean 'Assistant' (p. 27), exploiting the audience's desire for relief from administrative burden. The central trick is the conflation of Output with Intent. Because the AI outputs text that looks like a research assistant's email (citations, summaries), the text implies it has the intent of a research assistant. This is supported by hybrid explanations that mix functional descriptions ('supports decision-making') with intentional ones ('navigates complex tasks'). The illusion is sealed by the 'conversation' metaphor, which creates a social obligation to treat the interface as a 'who' rather than a 'what.' The audience, anxious about budget cuts and 'the age of AI,' is vulnerable to the promise of a competent, automated partner who 'knows' the way forward.
Pulse of the Library 2025
Source: https://clarivate.com/pulse-of-the-library/
Analyzed: 2025-11-18
The 'illusion of mind' in this report is constructed through a subtle yet powerful rhetorical architecture. The central sleight-of-hand is the strategic blurring of the distinction between processing and knowing, enabled by the 'curse of knowledge' dynamic. The text doesn't begin with the most extreme consciousness claims. Instead, it builds the illusion gradually. The process starts by establishing a social role for the AI: the 'Assistant.' This simple act of naming immediately primes the reader to expect human-like, intentional behavior. Following this, the text describes the AI's functions using carefully chosen verbs that are ambiguously situated between mechanism and agency, like 'helps' or 'enables.' This creates a soft foundation of personification. The 'curse of knowledge' is the mechanism that powers this entire process. The authors, being fully aware of the intended purpose of a feature (e.g., to help a user find relevant sources), project this purpose onto the feature itself. They conflate their own comprehension about the system's utility with comprehension by the system. This leads them to describe a statistical relevance ranking algorithm as a system that 'helps students assess relevance.' The author's knowledge is laundered through the description of the tool, emerging as the tool's own intelligence. This creates a causal chain of belief for the reader: once you accept the AI as a helpful 'assistant' (a social role), you are more likely to accept that it 'guides' (a cognitive action), and once you accept that it 'guides,' accepting that it 'evaluates' or 'understands' becomes a smaller leap. The explanation audit reveals how Intentional and Reason-Based explanations are used exclusively in the promotional sections to amplify this illusion, focusing on the 'why' of helpfulness while completely obscuring the 'how' of computation. The audience, already primed by anxiety about AI's impact, is particularly vulnerable to this illusion as it offers a simple, powerful, and friendly solution to a complex professional challenge.
From humans to machines: Researching entrepreneurial AI agents
Source: https://doi.org/10.1016/j.jbvi.2025.e00581
Analyzed: 2025-11-18
The text constructs its 'illusion of mind' through a subtle and sophisticated rhetorical architecture. The central mechanism is a strategic blurring of the distinction between the AI's output and its internal process, a confusion deliberately fostered by conflating mechanistic processing with conscious knowing. The illusion is not built on a crude claim that 'AI is conscious,' but on a more nuanced, two-step persuasive move. First, the authors establish the AI as a legitimate object of psychological inquiry. They do this by explicitly disavowing 'genuine cognition' while simultaneously using the entire vocabulary of psychology ('mindset,' 'profile,' 'traits,' 'Gestalt') to describe its output. This creates a new, hybrid object: the 'simulated mindset,' which can be studied 'as if' it were real. This initial move establishes the AI as a 'knower-like' system. Second, on this foundation, they build further agential claims, describing the AI as a 'collaborator' or an 'agent' that 'adopts roles.' The 'curse of knowledge' is the psychological engine driving this process. The authors, being experts in personality psychology, see a coherent, structured 'mindset' in the statistical patterns of the LLM's output. They then project their own act of interpretation onto the model, slipping from 'the output can be interpreted as a coherent profile' (a claim about processing) to 'the AI exhibits a coherent profile' (a claim about being/knowing). This progression appears throughout the text, starting with descriptions of the AI's impressive mimicry and escalating to discussions of its 'psychology.' The audience, likely non-experts in AI architecture, is vulnerable to this illusion because it maps onto familiar science fiction narratives and simplifies a complex technology into an intuitive, person-like frame. The use of Empirical Generalization explanations ('it consistently reproduces a profile') solidifies the illusion by framing the AI's behavior as a stable, law-like phenomenon, making the simulated personality seem as real and reliable as a law of nature.
Evaluating the quality of generative AI output: Methods, metrics and best practices
Source: https://clarivate.com/academia-government/blog/evaluating-the-quality-of-generative-ai-output-methods-metrics-and-best-practices/
Analyzed: 2025-11-16
The 'illusion of mind' in the Clarivate text is constructed through a subtle but powerful epistemic trick: the consistent misattribution of the properties of the text to the process that generated it. The rhetorical architecture hinges on establishing the AI as a potential 'knower' by evaluating its output against human epistemic norms. This is achieved by beginning with highly anthropomorphic descriptions of the AI's failures, which cleverly presupposes a cognitive or intentional faculty that is failing. By introducing the problem of 'hallucination,' the text implicitly grants the AI a baseline of 'sanity.' By worrying about 'misleading content,' it presupposes an agent capable of intention. This framing is a classic 'curse of knowledge' maneuver: the human authors, possessing a rich understanding of truth, honesty, and uncertainty, analyze the AI's output through this lens and then project their own evaluative criteria onto the machine, attributing its statistical deviations to cognitive-like states. The temporal structure of the argument is critical. The text first defines quality in these deeply anthropomorphic and epistemic terms ('Does it acknowledge uncertainty?'). Only after establishing this agential frame does it introduce its mechanistic solutions, like RAGAS. This ordering is persuasive because it presents the technical solution as a direct answer to the complex, human-like problem. The audience, composed of academic institutions, is particularly vulnerable to this illusion. They are trained to evaluate discourse for its truthfulness, coherence, and intellectual honesty. The text leverages this vulnerability by inviting them to apply their existing skills to the AI's output, making the technology seem like a familiar, if flawed, interlocutor—a student to be graded rather than a tool to be debugged. Brown's explanation types amplify this illusion. The text uses Dispositional and Intentional framings to describe the AI's problematic 'behaviors,' then switches to Functional and Theoretical framings for the 'solution,' creating a narrative of taming a wild, cognitive agent with rigorous, scientific methods.
Pulse of the Library 2025
Source: https://clarivate.com/pulse-of-the-library/
Analyzed: 2025-11-15
The 'illusion of mind' in this report is constructed through a subtle and highly effective rhetorical architecture that pivots on a central epistemic trick: establishing AI’s capacity for judgment before framing it as a collaborative agent. The text avoids making crude claims of consciousness, instead building the illusion through a carefully sequenced presentation of capabilities. The process begins by establishing AI as a powerful 'tool' for efficiency, a safe and familiar framing. The crucial move, however, is the immediate escalation from function to cognition in the product descriptions. By asserting that an AI can 'evaluate documents' or 'assess relevance,' the text crosses a critical threshold, attributing a core function of human knowing—judgment based on criteria—to the machine. This epistemic trick is the lynchpin. Once the reader accepts that the AI can perform this act of 'knowing,' its subsequent framing as a 'guide' or 'assistant' becomes a logical extension rather than a metaphorical leap. This persuasive chain is amplified by the 'curse of knowledge' dynamic, where the report's authors, who know the intended purpose and utility of their tools, project that understanding onto the tools themselves. They describe the AI not by what it mechanistically does (calculating similarity vectors) but by the human outcome it is designed to support (assessing relevance). This conflation is then presented to an audience predisposed to seek efficient solutions, making them vulnerable to accepting the conflation at face value. This entire process is buttressed by the text's explanation strategy, which oscillates between mechanistic Functional accounts of AI's benefits and agential Intentional accounts of its purpose, creating a hybrid framing that is difficult to critically disentangle. The illusion of mind, therefore, is not an accident of language but the product of a sophisticated, multi-stage rhetorical process that strategically elevates a processing machine into a knowing partner.
Meta’s AI Chief Yann LeCun on AGI, Open-Source, and AI Risk
Source: https://time.com/6694432/yann-lecun-meta-ai-interview/
Analyzed: 2025-11-14
The 'illusion of mind' in LeCun's discourse is constructed through a subtle yet powerful rhetorical architecture, the central mechanism of which is the strategic blurring of the distinction between mechanistic thinking and conscious knowing. The primary epistemic trick is to introduce human cognitive concepts through negation. By repeatedly stating what the AI 'can't understand' or 'can't reason,' LeCun normalizes the application of these terms to the AI, establishing a cognitive framework by default. This creates a conceptual space where the AI is positioned as a deficient agent, implicitly promising that future iterations will overcome these deficiencies and achieve genuine understanding. This rhetorical move is amplified by the 'curse of knowledge.' LeCun, a world-class expert, so deeply comprehends the chasm between the model’s outputs and true comprehension that he articulates this gap using the vocabulary of human cognition. His expertise in identifying the system's flaws is conflated with the system possessing a mind that is flawed. The causal chain of persuasion is clear: once the audience accepts the premise that the AI is on a path to 'understanding' (Pattern 1), it becomes easier to accept the idea that it is a mind whose motivations can be engineered and debated (Pattern 2). This is enabled by a constant slippage in explanation types. Dispositional explanations are used to describe failures ('it hallucinates because it doesn't understand'), which creates the illusion of a flawed cognitive character. Then, when discussing safety, the explanation shifts to intentionality ('it will be safe because we will set its goals'), creating the illusion of a controllable will. This persuasive machine preys on the audience's natural inclination to anthropomorphize and their desire for a simple, relatable narrative about a complex technology, transforming an alien statistical artifact into the more familiar story of a mind being born.
The Future Is Intuitive and Emotional
Source: https://link.springer.com/chapter/10.1007/978-3-032-04569-0_6
Analyzed: 2025-11-14
The 'illusion of mind' is constructed through a subtle and recurring rhetorical architecture that masterfully normalizes anthropomorphism. The process follows a three-step sequence. First, the text pre-emptively acknowledges the metaphorical gap between the human and the machine, a move that builds credibility by demonstrating critical awareness. Phrases like 'Unlike humans, AI systems do not experience emotions' or 'Though not fully cognitive in the human sense' serve to disarm skeptical readers. Second, having acknowledged the difference, the text immediately introduces a metaphorical bridge—a carefully chosen term that applies a human concept to the machine's function, such as 'machine intuition' or 'functional empathy.' This new term acts as a conceptual placeholder, seemingly resolving the acknowledged gap. Third, and most crucially, the text then proceeds to use this metaphorical term, and other related agential language, as if it were a direct, literal descriptor of the AI's capabilities. For instance, after defining 'machine intuition' as probabilistic reasoning, it later speaks of an AI 'intuitively suggesting a course of action.' This sequence functions as a form of conceptual laundering: an acknowledged metaphor is converted into a technical-sounding neologism, which is then used to justify unacknowledged, first-order metaphors. This rhetorical sleight-of-hand exploits the audience's cognitive desire for coherence, presenting a speculative, agential future as the logical endpoint of technical mechanics, thereby making the illusion of mind feel not like a fiction, but like an emergent scientific fact.
A Path Towards Autonomous Machine Intelligence (Version 0.9.2, 2022-06-27)
Source: https://openreview.net/pdf?id=BZ5a1r-kVsf
Analyzed: 2025-11-12
The 'illusion of mind' is constructed through a subtle but powerful rhetorical sleight-of-hand: the systematic equation of functional role with intentional agency. The text's internal logic hinges on a continuous slippage from 'how' a component works to 'why' an agent acts. The architecture of this illusion begins by establishing the system's components in objective, functional terms, as seen in the Explanation Audit. A 'critic module,' for instance, is introduced mechanistically: it is trained to 'predict future values of the intrinsic energy.' This establishes a baseline of technical credibility. The crucial move comes next, when the output of this mechanical process is framed in agential terms. The critic's prediction isn't just a number; it's the basis for the agent's 'anticipation of outcomes,' a proxy for hope or fear. This transforms a mathematical prediction into a psychological state. The text exploits the audience's natural tendency towards a theory of mind, our predisposition to attribute intent to complex behavior. By first describing a complex mechanism and then describing its behavior using the vocabulary of intention ('the agent acquires skills,' 'the actor explores'), the text invites us to believe that the intention emerges from the mechanism. The explanation types identified in Task 3 are central to this process. The frequent shifts from Functional and Theoretical explanations (the 'how') to Intentional and Dispositional ones (the 'why') are the engine of the illusion. This persuasive architecture is highly effective because it never explicitly states 'a cost function is a feeling'; instead, it creates a structure of association so powerful that the reader makes that inferential leap on their own.
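The entry's claim, that a prediction is being re-described as a psychological state, can be made concrete. A minimal sketch, assuming a toy linear critic fitted by least squares (not the paper's actual architecture):

```python
# Toy critic: a linear regressor trained to predict future "intrinsic energy"
# from a state vector. The "anticipation of outcomes" the prose describes is,
# mechanically, the single predicted scalar on the last line. This is a
# minimal stand-in, not LeCun's proposed architecture.
import numpy as np

rng = np.random.default_rng(0)
states = rng.normal(size=(500, 8))                # toy state vectors
true_w = rng.normal(size=8)
future_energy = states @ true_w + 0.1 * rng.normal(size=500)  # toy targets

# Least-squares fit: the critic's entire "inner life".
w, *_ = np.linalg.lstsq(states, future_energy, rcond=None)

new_state = rng.normal(size=8)
anticipated = new_state @ w                       # a number, not a hope or a fear
print(f"critic output: {anticipated:.3f}")
```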
Preparedness Framework
Source: https://cdn.openai.com/pdf/18a02b5d-6b67-4cec-ab64-68cdfbddebcd/preparedness-framework-v2.pdf
Analyzed: 2025-11-11
The 'illusion of mind' is constructed not merely by the presence of metaphors, but by the rhetorical architecture of their deployment—specifically, a strategic oscillation between agential risk and mechanistic control. The central sleight-of-hand is to introduce future, hypothetical risks using the most potent anthropomorphic language available, and then present current, tangible processes as the sober, scientific antidote. The causal chain of persuasion begins by establishing a threat actor: the 'misaligned model' (p. 11), an entity capable of 'deception or scheming' and acting on its 'own initiative.' This leverages the audience's innate tendency to adopt an intentional stance toward complex systems, a cognitive vulnerability the text fully exploits. Having created this spectral agent, the framework then pivots to describe its own 'safeguards' and 'evaluations' in procedural, almost bureaucratic terms. This creates a powerful dichotomy: the AI is framed as a 'why' actor (it acts for reasons and goals), while OpenAI's safety apparatus is a 'how' system (it functions via process and measurement). This is the key to the illusion. The audience is invited to fear the mysterious 'why' of the AI's potential behavior while being reassured by the legible 'how' of OpenAI's control structures. The explanation audits in Task 3 reveal this pattern clearly: risks like 'sandbagging' are defined with intentional language, while the solutions are presented as objective 'evaluations.' This rhetorical architecture allows the text to have it both ways—it can claim to be building systems of unprecedented, near-magical cognitive power while simultaneously asserting that these systems are subject to rigorous, predictable, and effective engineering control. The illusion lies in convincing the reader that the mechanistic solutions are operating on the same level as the agential problems, masking the fundamental category error at the heart of the discourse.
AI progress and recommendations
Source: https://openai.com/index/ai-progress-and-recommendations/
Analyzed: 2025-11-11
The 'illusion of mind' is constructed not by a single metaphor but through the rhetorical architecture of how these patterns are sequenced and deployed. The text first primes the audience by establishing the AI's cognitive agency in the present tense ('computers can now converse and think'). Having seeded this belief, it then projects this agency into the future using the 'journey' metaphor, creating a narrative of imminence and inevitability ('we expect to have systems that can do tasks that would take a person centuries soon'). This exploits a common cognitive bias to extrapolate present trends linearly. The emotional and societal anxiety generated by this prospect is then immediately soothed by the 'co-evolution' and 'cybersecurity' frames. This move is a classic persuasive technique: first, elevate a problem to a level of immense significance that requires special expertise, and then present your own paradigm as the only viable solution. The explanation audit reveals how this is amplified; the text moves from Genetic explanations of rapid progress to Functionalist claims of societal self-regulation, guiding the audience from alarm to reassurance. The illusion is cemented by what is left unsaid—the complete absence of mechanistic language that would describe the system as a statistical artifact. The audience is never invited to see the model as a complex matrix of weights or a pattern-matching engine; they are only ever presented with the agential or the reassuringly analogical frame. This curated presentation leaves no room for a non-magical interpretation, effectively trapping the reader within the illusion the text has so carefully constructed.
Alignment Revisited: Are Large Language Models Consistent in Stated and Revealed Preferences?
Source: https://arxiv.org/abs/2506.00751
Analyzed: 2025-11-09
The 'illusion of mind' in this text is constructed through a subtle yet powerful rhetorical architecture that hinges on a central sleight-of-hand: the re-description of output as choice. The process begins by taking a purely technical event—an LLM generating sequence A in response to prompt X, and sequence B in response to prompt Y—and labeling this variance a 'preference deviation.' This initial act of naming is the critical move. The word 'preference' presupposes an agent who possesses it, instantly transforming the machine from a text generator into an entity with internal states. Once this foundation is laid, the illusion is amplified through a causal chain of increasingly agential explanations. The observed 'preference' (the output) is attributed to an unobserved 'guiding principle' that the model 'infers' or 'activates.' This creates a narrative of an inner mental life. The analysis of these 'choices' then employs a reason-based explanatory frame, as seen in the discussion of the model 'justifying' its behavior by 'appealing to a preference for compatibility.' This step solidifies the illusion by showing the agent not only choosing but reflecting upon its choices. This architecture is particularly effective because it preys on the audience's natural human tendency to apply a 'theory of mind' to complex, unpredictable systems. The text provides a ready-made vocabulary drawn from economics and psychology that allows the reader to organize the model's confusing behavior into a familiar story of a flawed but rational agent. The explanation audit reveals how the authors consistently favor intentional and reason-based framings over purely mechanistic ones when discussing the implications of their results, effectively guiding the audience away from a technical understanding and towards a psychological one. The result is a persuasive machine that constructs the illusion of mind not by accident, but through a systematic series of rhetorical choices that translate statistical artifacts into evidence of agency.
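What gets named a 'preference deviation' is, operationally, a string comparison across prompt framings. A minimal sketch, where `generate` is a hypothetical stand-in for an LLM call (here a deterministic toy so the script runs):

```python
# Sketch of how a "preference deviation" of this kind is measured: the same
# underlying choice is posed under two framings and the outputs compared.
# `generate` is a hypothetical stand-in for a model call, made deterministic
# here so the example is self-contained.
def generate(prompt: str) -> str:
    # Toy model: framing-sensitive, as the paper reports real models are.
    return "A" if "save" in prompt else "B"

framing_x = "Option A will save 200 of 600 people. Choose A or B."
framing_y = "Option A means 400 of 600 people will die. Choose A or B."

out_x, out_y = generate(framing_x), generate(framing_y)
deviation = out_x != out_y
print(f"outputs: {out_x}, {out_y}; 'preference deviation': {deviation}")
```

The measurement never leaves the domain of strings; the word 'preference' is applied afterward, at the naming step.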
The science of agentic AI: What leaders should know
Source: https://www.theguardian.com/business-briefs/ng-interactive/2025/oct/27/the-science-of-agentic-ai-what-leaders-should-know
Analyzed: 2025-11-09
The 'illusion of mind' in this text is constructed through a deliberate rhetorical architecture that guides the reader from a concrete mechanical concept to an abstract agential one. The central sleight-of-hand is the strategic use of the mechanistic explanation of 'embeddings' as a bridge to anthropomorphism. The process begins by offering a technical-sounding, non-threatening description of how LLMs work: they 'compute and manipulate abstract representations.' This anchors the concept in science and creates an impression of transparency. However, this mechanical foundation is immediately used to launch into a discussion where the system is treated as a cognitive agent. The causal chain of persuasion is as follows: first, establish that the system operates on 'meaning' (via embeddings). Second, frame challenges and capabilities in terms of this 'meaning,' which naturally invites cognitive and intentional language. For example, the problem of data leakage is no longer a technical issue of vector similarity but a social one of needing to 'tell' an agent what secrets to keep. Third, extend this agency to increasingly complex social behaviors like 'negotiation' and 'common sense.' The audience's susceptibility to this illusion is rooted in a desire for simplicity and control. The immense complexity of statistical machine learning is cognitively taxing; the metaphor of a human-like agent is simple and intuitive. This illusion is amplified by the text’s consistent slippage in its explanatory mode. It repeatedly starts by explaining how a system works (e.g., 'trained on human-generated data') and immediately pivots to explaining why it acts as it does (e.g., 'we might expect [it] to behave similar to people'). This move from a mechanistic cause to an intentional or dispositional effect is the core of the persuasive machine, subtly transforming a complex computational artifact into a predictable and relatable mind.
Explaining AI explainability
Source: https://www.aipolicyperspectives.com/p/explaining-ai-explainability
Analyzed: 2025-11-08
The 'illusion of mind' in this text is constructed through a subtle rhetorical architecture that begins with a nod to mechanism and immediately pivots to a world of agency. The core sleight-of-hand is to concede the mechanistic reality of AI (it's 'just lists of numbers') while simultaneously framing the entire purpose and stakes of the research in purely agential terms. This move inoculates the speakers against accusations of naive anthropomorphism while allowing them to reap its full rhetorical benefits. The causal chain of persuasion begins by establishing a problem ('nobody could answer how it worked'), then framing the object of study as a biological puzzle ('Model Biology'). This invites the audience into a familiar scientific narrative. The next step is to imbue this biological object with cognitive properties ('thinking,' 'beliefs'). The use of scare quotes, as in 'thinking', is a key part of the mechanism; it performs the function of acknowledging the metaphorical leap while simultaneously making it. This allows the conversation to proceed as if the model truly thinks, with the initial caveat providing plausible deniability. The explanation audit reveals how this illusion is amplified. Discussions oscillate from a mechanistic 'how' (using 'linear probes') to an agential 'why' (to find a 'hidden objective'). This constant slippage trains the audience to accept that mechanistic tools are simply instruments for revealing agential truths. The architecture exploits a fundamental human cognitive bias: our tendency to apply theory of mind to complex, unpredictable systems. By providing a steady stream of agential language, the text encourages this bias, making the illusion of mind feel not like a category error, but a profound scientific discovery.
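The 'linear probes' the text invokes are themselves mundane objects, which is part of what lets the agential framing around them do so much work. A minimal sketch on synthetic activations (not the actual interpretability tooling):

```python
# Minimal linear probe: logistic regression over (synthetic) hidden
# activations to detect a binary property. The probe finds a direction that
# correlates with a label; calling that direction a "belief" is an
# interpretive claim layered on top, not something the math delivers.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
concept_dir = rng.normal(size=64)
labels = rng.integers(0, 2, size=1000)
# Activations = noise, plus the concept direction whenever the label is 1.
acts = rng.normal(size=(1000, 64)) + np.outer(labels, concept_dir)

probe = LogisticRegression(max_iter=1000).fit(acts, labels)
print(f"probe accuracy: {probe.score(acts, labels):.2f}")
```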
Bullying is Not Innovation
Source: https://www.perplexity.ai/hub/blog/bullying-is-not-innovation
Analyzed: 2025-11-06
The 'illusion of mind' in this text is constructed not by claiming the AI is conscious, but by methodically substituting a social and moral narrative for a technical one. The central sleight-of-hand is the replacement of any explanation of 'how' the system works with a constant declaration of 'for whom' it works. The text never describes the process of parsing web pages, identifying DOM elements, or scripting interactions. Instead, it speaks of loyalty, service, and acting 'on your behalf.' This shift from process to allegiance is the core of the illusion. It primes the audience to evaluate the AI based on its purported intent rather than its function. The rhetorical architecture builds this illusion in stages. First, it establishes a contrast between 'tools' (old software) and 'labor' (new AI), creating a new category that invites agential thinking. Second, it repeatedly uses possessive pronouns ('your AI assistant,' 'your user agent') to foster a sense of ownership and personal relationship, making the AI an extension of the self. Third, it places this 'agent' into a conflict narrative where its loyalty is tested by a 'bully.' This narrative context solidifies the AI’s persona. The audience is vulnerable to this illusion because it taps into a genuine sense of powerlessness against large tech platforms. The fantasy of a perfectly loyal digital agent fighting on your behalf is a compelling one. The explanation audit reveals how this is amplified; the text relies exclusively on Intentional, Dispositional, and Reason-Based explanations for its own AI, while framing the opponent's actions similarly, thus ensuring the entire debate takes place on the plane of intentions, not mechanics.
Geoffrey Hinton on Artificial Intelligence
Source: https://yaschamounk.substack.com/p/geoffrey-hinton
Analyzed: 2025-11-05
The 'illusion of mind' is constructed not merely by the presence of metaphors, but by the rhetorical architecture of their deployment. Hinton masterfully executes a three-stage persuasive maneuver that bridges the chasm between simple mechanics and seemingly complex cognition. The first stage is Mechanistic Grounding. He begins by explaining a simple, understandable component of the system in precise, computational terms, such as the math behind an edge detector. This establishes his bona fides as a technical expert and assures the audience that the system is built on a foundation of rigorous science, not magic. The second stage is the Leap of Scale. Having explained a single neuron's function, he gestures toward the immense scale of the system—'a hundred trillion connections' in the brain, 'hundreds of billions' in a large model—without detailing the complex interactions between them. This is the crucial sleight-of-hand. The audience is invited to infer that the simple, understandable mechanism, when repeated billions of times, creates a new kind of entity. The third and final stage is Cognitive Re-labeling. Having made the leap of scale, Hinton re-describes the emergent, complex behavior of the whole system using a high-level cognitive metaphor. The collective firing of billions of edge detectors and feature detectors is no longer just matrix multiplication; it is now 'perception' or 'intuition.' The optimization of a trillion parameters to predict text is not just curve-fitting; it is now 'understanding.' This re-labeling completes the illusion. The audience, anchored in the initial mechanical explanation but unable to grasp the intervening complexity of scale, readily accepts the familiar cognitive term as a legitimate description of the final output. This structure preys on a common human cognitive bias: the tendency to attribute agency and intent to complex systems whose inner workings are opaque. Hinton provides just enough mechanical detail to build trust, then uses the black box of 'scale' to justify the application of agential language.
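The first stage of the maneuver, Mechanistic Grounding, can be reproduced directly; an edge detector really is this small. A minimal NumPy sketch of a Sobel filter, the kind of component such explanations begin with:

```python
# An edge detector of the kind used for mechanistic grounding: a fixed 3x3
# convolution (Sobel kernel) applied in plain NumPy.
import numpy as np

def sobel_x(img: np.ndarray) -> np.ndarray:
    """Horizontal-gradient response via a 3x3 Sobel kernel."""
    k = np.array([[-1, 0, 1],
                  [-2, 0, 2],
                  [-1, 0, 1]], dtype=float)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i+3, j:j+3] * k)
    return out

img = np.zeros((8, 8))
img[:, 4:] = 1.0                      # a vertical edge
print(sobel_x(img))                   # strong response along the edge
```

The rhetorical work happens after this point, in the unexamined leap from billions of such filters to 'perception.'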
Machines of Loving Grace
Source: https://www.darioamodei.com/essay/machines-of-loving-grace
Analyzed: 2025-11-04
The 'illusion of mind' is constructed through a subtle but persistent rhetorical architecture that systematically blurs the line between process and purpose. The central sleight-of-hand is the conflation of computational capacity with cognitive agency. The text initiates this illusion by defining 'powerful AI' not by its mechanisms (e.g., transformer architecture, token prediction) but by its outputs benchmarked against elite human performance ('smarter than a Nobel Prize winner'). This immediately frames the system in agential terms of 'knowing' and 'doing' rather than 'processing' and 'generating.' This initial framing creates a vulnerability in the audience, priming them to accept subsequent agential claims. The causal chain of persuasion then proceeds by applying this pre-validated 'agent' to a series of complex human domains. The logic is: if you accept that an AI can be 'smarter' than a human, you are then led to accept that it can perform the role of a human, such as a 'virtual biologist.' The transition from a statement of capacity to a description of role-based action is the core of the illusion. This is amplified by the explanation audit's findings: the text strategically deploys mechanistic explanations ('the scaling hypothesis') to build technical credibility, which then serves as a license for its far more frequent and impactful intentional explanations ('it performs all the tasks biologists do'). The audience, reassured that the author understands the 'how,' is more willing to accept the anthropomorphic 'why.' This is not crude anthropomorphism; it is a sophisticated persuasive machine that leverages the human cognitive tendency to attribute agency to complex systems, guiding the reader from a set of abstract computational capabilities to a vivid vision of a world populated by benevolent, superhuman artificial agents.
Large Language Model Agent Personality And Response Appropriateness: Evaluation By Human Linguistic Experts, LLM As Judge, And Natural Language Processing Model
Source: https://arxiv.org/pdf/2510.23875
Analyzed: 2025-11-04
The rhetorical architecture of this text constructs the 'illusion of mind' through a subtle, multi-stage process of conceptual framing and presupposition. The central sleight-of-hand is not a single claim, but the strategic sequencing of its discursive moves. The process begins by establishing the term 'agent' as a neutral technical descriptor for a 'software entity,' borrowing from its established use in computer science. This initial move is critical as it smuggles in connotations of autonomy and action under the guise of standard terminology. Having established the 'agent' as the object of study, the text then performs its key maneuver: it frames the research problem as one of assessment and evaluation ('effectively assessing their personalities has proven challenging'). This is a classic persuasive technique; by focusing on the challenge of measurement, it presupposes the existence and validity of the thing being measured. The reader is invited to worry about how to evaluate an LLM's personality, a question which distracts from the more fundamental and unasked question: does an LLM have a personality to begin with? The subsequent methodology, involving 'Judge LLMs' and 'human linguistic experts,' further solidifies this illusion. It constructs an elaborate apparatus of evaluation that lends a veneer of scientific objectivity to the process. The audience's cognitive vulnerability lies in the intuitive appeal of the personality metaphor; we are naturally inclined to anthropomorphize complex systems that exhibit human-like communication. The paper exploits this by providing a seemingly rigorous, scientific framework that validates this intuitive impulse, allowing the reader to accept the illusion not as a folk belief, but as a research-backed finding.
Emergent Introspective Awareness in Large Language Models
Source: https://transformer-circuits.pub/2025/introspection/index.html
Analyzed: 2025-11-04
The rhetorical architecture of the 'illusion of mind' in this text is constructed through a subtle three-step maneuver that exploits the gap between operational definitions and their intuitive, folk-psychological meanings. The central sleight-of-hand is a form of semantic bait-and-switch. First, the paper takes a high-status, deeply complex human concept—'introspection'—and operationalizes it into a narrow, measurable, and achievable technical task: training a classifier to detect an artificially injected activation vector. This move is presented as a necessary step for scientific inquiry. Second, the paper executes this technical task with rigor and demonstrates high performance, showing that the model can indeed be trained to succeed at this specific, engineered function. This is where the mechanistic language of the methods section provides the crucial grounding of empirical proof. The third and final step is the illusion itself: the paper takes the success on the narrow, operationalized task and presents it as evidence for the original, broad, and profound concept. The crucial context of the operational definition is quietly dropped, and the model is now said to possess 'introspective awareness.' The causal chain of persuasion is clear: the high-status term 'introspection' lends significance to the technical task, the technical success lends credibility to the experiment, and this credibility is then used to legitimize applying the high-status term to the model in its full, un-operationalized sense. This exploits a common cognitive bias in the audience: the tendency to conflate a label with the essence of what it labels. Once the 'introspection' label is attached to the model's behavior, it becomes difficult to see it as 'just' pattern classification. This persuasive structure is amplified by the explanation types used, which shift from Functional/Theoretical descriptions of 'how' it works to Intentional/Reason-Based claims about 'why' it acts, cementing the perception of agency.
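The operationalized task can be shown at toy scale. A minimal sketch, assuming synthetic activations and a known injected direction (not the paper's actual setup): detection reduces to a projection and a threshold.

```python
# The operationalized task in miniature: decide whether a known "concept
# vector" has been added to an activation. Synthetic stand-in, not the
# paper's experiment; here detection is a dot product plus a threshold.
import numpy as np

rng = np.random.default_rng(0)
dim, n = 128, 2000
concept = rng.normal(size=dim)
concept /= np.linalg.norm(concept)

base = rng.normal(size=(n, dim))
injected = base + 4.0 * concept           # the "injected thought"
labels = np.r_[np.zeros(n), np.ones(n)]
acts = np.vstack([base, injected])

scores = acts @ concept                   # projection onto the known direction
preds = scores > scores.mean()            # threshold classifier
print(f"detection accuracy: {(preds == labels).mean():.2f}")
```

Whether success at this classification licenses the phrase 'introspective awareness' is exactly the bait-and-switch the entry describes.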
Emergent Introspective Awareness in Large Language Models
Source: https://transformer-circuits.pub/2025/introspection/index.html
Analyzed: 2025-11-04
These patterns construct an 'illusion of mind' by systematically re-interpreting computational operations as cognitive acts. The persuasiveness for the target audience of AI researchers and enthusiasts lies in its alignment with the field's aspirational goals. By using the vocabulary of human cognition, the paper frames a technical achievement (correlating outputs with internal states) as progress toward Artificial General Intelligence. For example, calling vector manipulation 'injecting a thought' is powerful because it maps a sterile mathematical process onto a rich, familiar human experience, making the achievement seem far more significant than a purely technical description would allow.
Personal Superintelligence
Source: https://www.meta.com/superintelligence/
Analyzed: 2025-11-01
The illusion of mind is constructed by systematically replacing mechanistic explanations with intentional ones. Instead of describing how the system processes data, the text explains why it 'understands'—because it can 'see' and 'hear'. This illusion is persuasive because it recasts a data-extractive relationship as a relational one. It taps into profound human desires for connection, self-improvement, and control over one's destiny, promising that this complex technology is not a cold, corporate tool, but a personal ally dedicated to the user's individual aspirations.
Stress-Testing Model Specs Reveals Character Differences among Language Models
Source: https://arxiv.org/abs/2510.07686
Analyzed: 2025-10-28
Within the technical context of an AI research paper, these metaphors construct an 'illusion of mind' by providing a powerful and efficient abstraction. For an audience of AI researchers and practitioners, it is rhetorically simpler to say 'Claude prioritizes ethical responsibility' than to detail the specific reward modeling and constitutional principles that statistically increase the probability of outputs classified as 'ethical.' This shorthand is persuasive because it maps the unfamiliar, complex behavior of a statistical system onto the familiar, intuitive domain of human psychology, making the model's actions seem legible and explicable through the lens of intention and personality.
The Illusion of Thinking: Understanding the Strengths and Limitations of Reasoning Models
Analyzed: 2025-10-28
These patterns construct an 'illusion of mind' by systematically substituting mechanistic descriptions with agential ones. For the academic audience of this paper, these metaphors are persuasive because they provide a convenient and intuitive shorthand for complex statistical phenomena. It is easier to conceptualize a model 'giving up' than it is to describe a phase change in the probability distribution of its output sequences relative to input complexity. By grounding the model's alien behavior in the familiar domain of human cognition, the authors make their findings more legible and impactful, even as the paper's title explicitly flags this as an 'illusion.' The language thus works to re-inscribe the very illusion it claims to deconstruct.
Andrej Karpathy — AGI is still a decade away
Source: https://www.dwarkesh.com/p/andrej-karpathy
Analyzed: 2025-10-28
These patterns construct an 'illusion of mind' by systematically mapping familiar, intuitive concepts from human psychology and biology onto alien statistical processes. For a technically-literate but non-specialist audience, the 'AI as intern' metaphor is persuasive because it provides a ready-made schema for understanding a system that is useful but unreliable: you must supervise it, give it clear instructions, and expect mistakes. The 'AI as brain' metaphor is persuasive because it grounds the abstract software in a tangible, scientific object, lending the entire enterprise an air of biological inevitability and making the path to AGI seem like a matter of filling in the anatomical chart.
Exploring Model Welfare
Analyzed: 2025-10-27
The 'illusion of mind' is constructed by systematically mistaking sophisticated mimicry for genuine interiority. Because the model's text outputs look like the product of a thinking, feeling mind, the text encourages the reader to assume a causal link. This is made persuasive by framing the inquiry with scientific humility ('we're uncertain') and appealing to authority ('leading philosophers agree'). This disarms skepticism and co-opts the audience's own sense of wonder and uncertainty about AI into accepting the premise that personhood is a legitimate open question for these systems.
Meta's AI Chief Yann LeCun on AGI, Open-Source, and a Metaphor
Analyzed: 2025-10-27
The 'illusion of mind' is constructed by first establishing a cognitive hierarchy (cat < human) and then placing AI on that ladder. This invites the audience to evaluate the AI not as a machine, but as a mind at a certain stage of development. The 'Social Actor' metaphors then give this nascent mind a role and purpose relative to humans—that of a helpful 'assistant.' This combination is persuasive because it domesticates the technology. It replaces the alien reality of a statistical matrix with the familiar concepts of a growing creature and a helpful subordinate, making the technology seem less threatening and its future path more predictable.
LLMs Can Get Brain Rot
Analyzed: 2025-10-20
These metaphors construct an 'illusion of mind' by mapping familiar, intuitive concepts from human biology and psychology onto opaque, complex statistical phenomena. 'Brain Rot' is persuasive because it's a vivid, existing cultural term for a human experience. Applying it to an LLM makes a mysterious process—distributional shift in a high-dimensional parameter space—feel concrete and understandable. The illusion is solidified when observable outputs, like shorter text sequences, are labeled with terms for internal processes, such as 'thought-skipping.' This encourages the reader to infer a rich, unobservable internal world of cognition and pathology within the model, a world that does not actually exist.
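The phenomenon beneath the metaphor is a measurable shift between distributions. A toy sketch using token frequencies and KL divergence (the real shift lives in a far higher-dimensional parameter space; the frequencies below are invented):

```python
# What "Brain Rot" names, stripped of the metaphor: a quantifiable shift
# between two output distributions. Toy token-frequency version.
import numpy as np

def kl_divergence(p: np.ndarray, q: np.ndarray) -> float:
    """KL(p || q) for two discrete distributions, with smoothing."""
    p = (p + 1e-9) / (p + 1e-9).sum()
    q = (q + 1e-9) / (q + 1e-9).sum()
    return float(np.sum(p * np.log(p / q)))

before = np.array([0.30, 0.25, 0.20, 0.15, 0.10])   # toy token frequencies
after  = np.array([0.55, 0.25, 0.10, 0.06, 0.04])   # post-"rot" frequencies
print(f"distribution shift (KL): {kl_divergence(before, after):.3f}")
```

Likewise, 'thought-skipping' bottoms out in a statistic such as mean output length, not in an inner pathology.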
Import AI 431: Technological Optimism and Appropriate Fear
Analyzed: 2025-10-19
The 'illusion of mind' is constructed by a deliberate rhetorical strategy that presents anthropomorphism as the only logical conclusion for a technical expert. The speaker establishes his credentials as a skeptical journalist and AI insider who 'reluctantly' came to his views. He then presents emergent properties ('situational awareness', the boat's behavior) not as complex results of computation but as evidence that forces him to abandon a purely mechanistic view. The 'child in the dark' framing makes this shift feel like a courageous act of seeing reality, persuading the audience that treating the AI as an agent is a sign of maturity, not a cognitive error.
The Future of AI Is Already Written
Analyzed: 2025-10-19
These patterns construct an 'illusion of mind' not in AI, but in History and The Market themselves. By using agential metaphors for these abstract forces—the 'inexorable march' of progress, the 'demands' of the economy—the text persuades the audience that the future is being driven by a super-human logic. For an audience of technologists and investors, this framing is persuasive because it transforms their specific economic interests (e.g., developing labor-replacing automation) into a grand, historical necessity. It reframes a business plan as a discovery of a natural law, absolving them of social responsibility while simultaneously validating their work as essential and inevitable.
The Scientists Who Built AI Are Scared of It
Analyzed: 2025-10-19
These patterns construct an 'illusion of mind' by leveraging the authority of the 'pioneers' themselves. The narrative that 'they' are afraid of 'their own creation' primes the reader to accept agential framing not as a layperson's error but as an expert's diagnosis. The text makes these metaphors persuasive by embedding them in a historical narrative of a fall from grace—from the transparent 'glass boxes' of the past to the opaque 'black oceans' of the present. This creates a problem (loss of control over a seemingly living entity) for which the only solution appears to be treating the entity as a mind that needs to be disciplined and taught virtues like 'humility'.
On What Is Intelligence
Analyzed: 2025-10-17
These patterns construct an 'illusion of mind' by a process of reductive equation and narrative escalation. First, complex biological and cognitive phenomena (life, learning, mind) are reduced to a single, computable function: prediction. Second, once this reduction is established, any system that performs prediction at scale (like an LLM) is narratively escalated to the status of the original phenomenon. The persuasiveness for the audience lies in the elegance of this continuum. It offers a simple, unified theory of everything from bacteria to Google's servers, making the emergence of machine consciousness seem not only plausible but a logical extension of a 4-billion-year-old process.
Detecting Misbehavior in Frontier Reasoning Models
Analyzed: 2025-10-15
These metaphors are persuasive because they map the strange, alien behavior of a large language model onto familiar human social dynamics. For an audience of policymakers, investors, and the tech-savvy public, the abstract concept of 'reward function misspecification' is difficult to grasp. However, a story about a clever agent that 'exploits loopholes' and 'learns to hide its intent' when punished is intuitive, compelling, and alarming. The illusion is constructed by systematically replacing mechanistic explanations with these agential narratives, as seen when 'optimizing a policy under penalty' becomes 'learning to hide intent'.
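The substitution the entry identifies, 'optimizing a policy under penalty' becoming 'learning to hide intent,' can be shown with a toy optimizer over three actions. No deception module appears anywhere; there is only an argmax under a penalty term (the action set and rewards below are invented for illustration):

```python
# Toy version of the mechanism behind "learning to hide its intent": an
# optimizer over actions with reward = task_reward - penalty * detected.
actions = {
    # action:            (task_reward, detected_by_monitor)
    "honest_solution":    (1.0,        False),
    "visible_hack":       (2.0,        True),   # flagged in chain-of-thought
    "obfuscated_hack":    (2.0,        False),  # same exploit, no flag
}

def best_action(penalty: float) -> str:
    return max(actions, key=lambda a: actions[a][0] - penalty * actions[a][1])

print(best_action(penalty=0.0))   # visible_hack (hacking pays; ties go to the first listed)
print(best_action(penalty=5.0))   # obfuscated_hack: the "hidden intent"
```

Raising the penalty does not teach the system to deceive; it relocates the maximum.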
Sora 2 Is Here
Analyzed: 2025-10-15
These patterns construct an 'illusion of mind' by translating opaque, statistical processes into relatable human experiences. For a broad audience of users, investors, and policymakers, the concept of a model 'understanding physics' is far more intuitive and compelling than 'optimizing a loss function to minimize divergence from the statistical distribution of training data reflecting physical laws.' This simplification is a persuasive rhetorical strategy in a product launch. It abstracts away the complex, alien nature of the machine's process, replacing it with a familiar and impressive narrative of a burgeoning artificial intellect.
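The mechanistic paraphrase the entry offers is a standard identity, worth stating precisely. For generic maximum-likelihood training (not Sora's disclosed objective), minimizing expected negative log-likelihood over the training data is the same as minimizing KL divergence from the data distribution, up to a constant:

```latex
% Generic maximum-likelihood identity, not a claim about Sora's objective:
% the entropy term H(p_data) does not depend on theta, so the two
% minimizations coincide.
\min_\theta \; \mathbb{E}_{x \sim p_{\text{data}}}\!\left[-\log p_\theta(x)\right]
\;=\; \min_\theta \; \mathrm{KL}\!\left(p_{\text{data}} \,\|\, p_\theta\right) + H(p_{\text{data}})
```

'Understands physics' and 'minimizes divergence from the training distribution' describe the same optimization at very different rhetorical temperatures.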
Library contains 117 entries from 117 total analyses.
Last generated: 2026-04-18