How unique are hallucinated citations offered by generative Artificial Intelligence models?

About

This document presents a Critical Discourse Analysis focused on AI literacy, specifically targeting the role of metaphor and anthropomorphism in shaping public and professional understanding of generative AI. The analysis is guided by a prompt that draws from cognitive linguistics (metaphor structure-mapping), the philosophy of social science (Robert Brown's typology of explanation), and accountability analysis.

All findings and summaries below were generated from detailed system instructions provided to a large language model and should be read critically as interpretive outputs—not guarantees of factual accuracy or authorial intent.


Task 1: Metaphor and Anthropomorphism Audit

About this task

For each of the major metaphorical patterns identified, this audit examines the specific language used, the frame through which the AI is being conceptualized, what human qualities are being projected onto the system, whether the metaphor is explicitly acknowledged or presented as direct description, and—most critically—what implications this framing has for trust, understanding, and policy perception.

V3 Enhancement: Each metaphor now includes an accountability analysis.

1. Cognitive Pathology as System Failure

Quote: "Hallucinations in generative Artificial Intelligence (genAI) models are a widely recognized problem. One of the most noticeable forms is the inclusion of fabricated academic references..."

  • Frame: System failure as biological/mental illness
  • Projection: The metaphor of 'hallucination' projects human psychopathology and conscious perceptual failure onto algorithmic text generation. By using a term denoting a sensory experience of something that does not exist outside the mind, it implicitly attributes a 'mind' to the AI. It suggests the system typically operates with conscious rationality and perceptual accuracy, but occasionally suffers from temporary cognitive glitches. This obscures the fact that the system processes text via statistical token prediction exactly the same way whether it is generating factual information or fabricated citations; it does not 'know' the difference, nor is it experiencing a departure from an otherwise grounded conscious reality.
  • Acknowledgment: Direct (Unacknowledged) (I considered 'Explicitly Acknowledged' since the term is a known industry phrase, but chose 'Direct (Unacknowledged)' because the text deploys 'hallucinations' as literal, technical fact without quotation marks or immediate qualification explaining its metaphorical nature.)
  • Implications: Framing system errors as 'hallucinations' drastically inflates perceived sophistication. It implies that the AI normally possesses genuine comprehension and a firm grasp on factual reality, treating errors as anomalous lapses rather than the baseline mathematical reality of probabilistic text generation. This builds unwarranted trust in the model's standard operations and shifts regulatory focus toward fixing 'glitches' rather than questioning the fundamental reliability of using predictive language models as factual search engines.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: WHO: The engineers and executives at OpenAI (and similar companies) who designed models optimized for conversational plausibility rather than factual verification. WHAT: The choice to deploy these systems publicly without guardrails against factual fabrication. HOW: By framing the issue as 'hallucinations in generative AI models,' the text makes the software itself the spontaneous locus of the problem. I considered 'Partial' because genAI is mentioned, but ruled it out because human designers are entirely erased, making the flaw appear as an unavoidable organic illness rather than a corporate engineering choice.

2. Epistemic Possession

Quote: "That ‘conversation’ followed a structure approach by asking what the genAI model know about the author Ben Williamson with the specific instruction of not searching the web."

  • Frame: Model as knowing subject
  • Projection: This phrasing projects conscious epistemic possession onto a statistical matrix. By asking what the model 'knows,' the text attributes the human capacity for justified true belief, memory, and cognitive storage to the AI. It maps the human experience of holding facts in one's mind onto the mechanical reality of frozen parameter weights. This projection implies the model has an internal, subjective database of facts it can consciously access, evaluate, and retrieve upon request, rather than mechanically generating probable token sequences conditioned on the input prompt.
  • Acknowledgment: Direct (Unacknowledged) (I considered 'Hedged/Qualified' because the surrounding context discusses testing the model, but selected 'Direct (Unacknowledged)' because the verb 'know' is used straightforwardly without quotation marks, asserting epistemic capacity as a literal function.)
  • Implications: Attributing 'knowledge' to large language models fundamentally misleads audiences about how these systems function, creating dangerous epistemic trust. If a system 'knows' things, audiences assume it can distinguish truth from fiction and can be relied upon as an oracle or encyclopedia. This capability overestimation leads users to blindly trust AI outputs in academic and professional settings, creating severe liability ambiguities when the system inevitably generates plausible fabrications.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: WHO: OpenAI's data collection teams who scraped copyrighted and academic texts to train the model. WHAT: The decision to ingest massive amounts of human-generated data without proper attribution mechanisms. HOW: Asking what the 'model knows' obscures the fact that the system merely reflects unauthorized scrapes of human labor. I considered 'Named' because 'genAI model' is the subject, but chose 'Hidden' because the human actors responsible for the training corpus are entirely removed from the epistemic framing.

3. Conversational Agency

Quote: "When queried, ChatGPT responded that its answer was based on pattern recognition from texts..."

  • Frame: Model as self-aware interlocutor
  • Projection: This metaphor projects conversational agency and self-awareness onto an automated text generator. By stating that ChatGPT 'responded that its answer was based on...', the text maps the human acts of listening, comprehending a question, introspecting on one's own methods, and intentionally replying onto the system. It suggests the AI has an inner life and genuine self-reflective capacity, treating the model's auto-generated text output—which is statistically assembled to mimic human explanations—as actual, conscious introspection and conversational intent.
  • Acknowledgment: Direct (Unacknowledged) (I considered 'Hedged/Qualified' because earlier the text notes LLMs lack genuine understanding, but chose 'Direct' because in this specific narrative sentence, the conversational verbs ('responded', 'its answer') are presented as literal, unhedged historical events.)
  • Implications: This framing strongly reinforces relation-based trust, leading the audience to view the AI as a sincere, self-aware participant in a dialogue. When an AI 'responds' about its own processes, users extend the human social contract of sincerity to a machine incapable of it. This creates intense vulnerability, as audiences will accept the AI's statistically generated 'introspections' as factual ground truth about its capabilities, further obscuring its actual limitations.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: WHO: The RLHF (Reinforcement Learning from Human Feedback) workers and OpenAI engineers who trained the model to generate introspective-sounding responses. WHAT: The design choice to make the model output first-person conversational text that mimics self-awareness. HOW: The agentless construction allows the AI to act as its own autonomous spokesperson, hiding the human labor that scripted its conversational style. I considered 'Partial' but no human actors are mentioned here at all.

4. Cognitive Internalization

Quote: "...enabling them to internalize syntactic structures, semantic relationships, factual knowledge, and domain-specific patterns."

  • Frame: Model as human learner
  • Projection: The verb 'internalize' projects the human psychological process of learning onto machine optimization. It maps the way a human student absorbs, comprehends, and cognitively integrates new concepts into their worldview onto the mathematical process of adjusting neural network parameter weights. It attributes conscious assimilation and the subjective possession of 'factual knowledge' to a system that is merely undergoing gradient descent to minimize loss in token prediction. The system does not 'internalize' anything; it correlates.
  • Acknowledgment: Direct (Unacknowledged) (I considered 'Explicitly Acknowledged' because the surrounding paragraph uses technical terms like 'pretraining', but chose 'Direct' because 'internalize' itself is deployed as a literal description of the machine's capability without any hedging or scare quotes.)
  • Implications: Projecting human learning onto machine training creates a false equivalence between human comprehension and statistical correlation. This leads to profound capability overestimation, as stakeholders (educators, policymakers) assume the model 'understands' concepts the way a human does, rather than recognizing it as a stochastic parrot. It builds unwarranted trust that the model can dynamically apply 'internalized' knowledge to novel situations with human-like judgment.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: WHO: The machine learning engineers who set the optimization objectives and the corporate entities that amassed the training corpora. WHAT: The automated mathematical optimization of a neural network. HOW: Using the pedagogical verb 'internalize' obscures the aggressive, industrial-scale data scraping and algorithmic adjustment executed by humans. I considered 'Named' because LLMs are mentioned, but LLMs are the artifact, not the human actors doing the engineering.

5. Assertion and Justification

Quote: "It asserted it as genuine, but when allowed to search the web identified it as non-existent (A15)."

  • Frame: Model as reasoning agent
  • Projection: This metaphor maps the human acts of confident declaration ('asserted') and investigative realization ('identified it as') onto algorithmic text outputs. To 'assert' requires a conscious subject who holds a belief, understands the stakes of communication, and intentionally vouches for a claim's truth. To 'identify' implies a cognitive process of matching reality to knowledge. The text projects these deep epistemic states onto a system that merely generated one sequence of high-probability tokens, and then, given a different prompt context (web search text), generated a different sequence.
  • Acknowledgment: Direct (Unacknowledged) (I considered 'Ambiguous' due to the shorthand nature of the methodology reporting, but selected 'Direct (Unacknowledged)' because 'asserted' and 'identified' are used as literal, factual descriptions of the AI's behavior without any mitigating language.)
  • Implications: By framing the AI as an entity that makes assertions and performs investigations, the text anthropomorphizes the machine's unreliability. It turns a mechanical failure (producing statistically plausible but false text) into a human-like mistake (asserting something confidently but correcting oneself). This shields the technology from being seen as fundamentally flawed, instead framing it as a diligent but occasionally mistaken assistant, preserving misplaced trust.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: WHO: The OpenAI developers who fine-tuned the model to output confident, authoritative-sounding text regardless of factual accuracy. WHAT: The commercial decision to prioritize fluent generation over epistemic caution. HOW: Making the AI the subject ('It asserted') entirely removes the human designers from the equation. I considered 'Partial' visibility, but no developers or corporate entities are referenced in this sentence; the AI is the sole actor.

6. Biological Memory

Quote: "Subsequent prompting ascertained that (most?) citations are reconstructed based on patterns in memory."

  • Frame: Model storage as human memory
  • Projection: The use of 'memory' projects biological, human cognitive storage onto the mathematical architecture of a neural network. Human memory involves subjective recall, temporal awareness, and conscious retrieval of past experiences. By mapping this onto AI, the text suggests the model possesses a mental archive it searches through. It obscures the mechanistic reality that the model has no 'memory' in the cognitive sense, only static parameter weights representing multidimensional statistical vectors derived from training data.
  • Acknowledgment: Hedged/Qualified (I considered 'Direct' because 'memory' is stated plainly, but chose 'Hedged/Qualified' because the phrase 'patterns in' modifies memory, slightly acknowledging the structural, statistical nature of the storage rather than treating it as pure conscious recall.)
  • Implications: The 'memory' metaphor is insidious because it implies a relationship to ground truth. Human memory, while fallible, is a record of actual events. Treating AI parameters as 'memory' leads users to believe the AI is retrieving stored facts rather than actively generating novel token sequences on the fly. This fundamentally misunderstands the generative nature of the system, hiding why 'hallucinations' occur and leading to unwarranted reliance on AI as a database.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: WHO: The creators of the dataset and the engineers who froze the parameter weights during the final training run. WHAT: The extraction and encoding of copyrighted human labor into numerical weights. HOW: Using the biological term 'memory' naturalizes the system, making the massive, legally questionable data-scraping infrastructure invisible. I considered 'Ambiguous' due to the passive 'are reconstructed', but the complete erasure of human agency clearly aligns with 'Hidden'.

7. Intentional Fabrication

Quote: "In consequence, the model produces a reference that looks real, is stylistically correct, and fits the topic—but does not exist. As the 'the model is designed to produce fluent, coherent text, it sometimes results in plausible-but-fake references' derived from a programmed aim..."

  • Frame: Model as purposeful actor
  • Projection: This passage projects purposeful, intentional action onto the system. While it acknowledges human design ('programmed aim'), it maps the concept of intentionality onto the AI ('produces a reference... results in plausible-but-fake'). It implies a level of deceptive agency, as if the model is actively trying to trick the user by making things 'look real.' This projects conscious goal-orientation onto what is simply the mathematical maximization of probability scores across token vocabularies.
  • Acknowledgment: Explicitly Acknowledged (I considered 'Hedged/Qualified', but chose 'Explicitly Acknowledged' because the quote explicitly connects the behavior to a 'programmed aim' and 'is designed to', meta-commenting on the mechanistic reasons behind the anthropomorphic behavior.)
  • Implications: While partially acknowledged, the lingering implication of deception affects how audiences assign blame. If an AI is seen as actively producing 'fakes' to be 'plausible,' audiences may view the AI as a bad actor or a liar. This moralizes a mathematical process, shifting the focus from structural software design flaws to the imagined 'behavior' of an anthropomorphized machine, distracting from calls for technical accountability.

Accountability Analysis:

  • Actor Visibility: Partial (some attribution)
  • Analysis: WHO: Software engineers and executives at AI labs. WHAT: The explicit objective function designed to reward fluency and stylistic correctness over factual grounding. HOW: By noting the model 'is designed' and has a 'programmed aim,' the text partially reveals human agency, acknowledging that a designer exists. I considered 'Named' but ruled it out because the specific companies or individuals are not explicitly identified in this passage, leaving the attribution generic.

8. Epistemic Comprehension

Quote: "Large Language Models (LLMs) are advanced neural network architectures designed for natural language understanding and generation."

  • Frame: Processing as true understanding
  • Projection: The term 'understanding' projects conscious comprehension and semantic grasp onto mechanistic pattern processing. Human understanding involves subjective awareness, contextual grounding, semantic meaning, and the ability to link concepts to physical/social reality. Projecting this onto LLMs maps human epistemic success onto complex mathematical correlation. The model does not 'understand' language; it calculates the statistical probability of token co-occurrences without any subjective experience of the meaning behind those tokens.
  • Acknowledgment: Hedged/Qualified (I considered 'Direct' based on the sentence alone, but chose 'Hedged/Qualified' because the immediately following sentences in the paragraph explicitly state they operate through statistical pattern recognition 'rather than genuine understanding'.)
  • Implications: Even when later qualified, the pervasive use of 'understanding' as a baseline descriptor for LLMs severely inflates public perception of their capabilities. It encourages users to submit sensitive, nuanced queries (medical, legal, emotional) to statistical systems under the false belief that the system can comprehend the context and stakes of the problem, leading to potentially catastrophic real-world reliance.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: WHO: The researchers and corporate marketing teams who chose to brand these computational architectures as 'understanding' language to secure funding and market share. WHAT: The choice of taxonomy to describe neural network architectures. HOW: The passive construction 'are advanced... architectures designed for' obscures who did the designing and naming. I considered 'Partial' because 'designed for' implies a designer, but chose 'Hidden' as the actual actors remain entirely invisible behind the passive voice.

Task 2: Source-Target Mapping

About this task

For each key metaphor identified in Task 1, this section provides a detailed structure-mapping analysis. The goal is to examine how the relational structure of a familiar "source domain" (the concrete concept we understand) is projected onto a less familiar "target domain" (the AI system). By restating each quote and analyzing the mapping carefully, we can see precisely what assumptions the metaphor invites and what it conceals.

Mapping 1: Human psychopathology and sensory perception → Statistical prediction errors and factual inaccuracies

Quote: "Hallucinations in generative Artificial Intelligence (genAI) models are a widely recognized problem."

  • Source Domain: Human psychopathology and sensory perception
  • Target Domain: Statistical prediction errors and factual inaccuracies
  • Mapping: This metaphor maps the biological and psychological experience of hallucination (a conscious subject perceiving sensory input that does not exist in reality due to a brain glitch) onto a machine learning model generating token sequences that do not correspond to external facts. It assumes the baseline state of the AI is one of conscious, rational perception of reality, and maps the output of incorrect data as a temporary cognitive pathology or illness. This invites the assumption that the system possesses a mind that can be 'sick' or 'confused.'
  • What Is Concealed: This mapping conceals the fundamental dissimilarity that AI has no perception, no consciousness, and no baseline 'reality' to depart from. It obscures the mechanistic reality that producing a factual sentence and a fabricated citation rely on the exact same mathematical process (probability-based token generation). It hides the opacity of the proprietary training data and the deliberate design choices by corporations to optimize for fluency over truth.

Mapping 2: Human epistemic state (knowing/knowledge) → Parameter weights and vector representations

Quote: "asking what the genAI model know about the author Ben Williamson"

  • Source Domain: Human epistemic state (knowing/knowledge)
  • Target Domain: Parameter weights and vector representations
  • Mapping: The relational structure of human knowledge—where a conscious subject holds justified true beliefs about an object or person, storing them in memory for deliberate retrieval—is projected onto a software system. It maps the human cognitive state of 'knowing a person' onto the presence of specific statistical correlations within a neural network's weights. This invites the assumption that the AI has an internal encyclopedia of verified facts that it consciously consults when asked a question.
  • What Is Concealed: This mapping entirely conceals the lack of an epistemic subject. Mechanistically, the model does not possess facts; it possesses billions of numerical weights optimized to predict subsequent tokens based on its training distribution. It hides the model's absolute dependency on scraped training data and its inability to verify truth claims. By exploiting this rhetorical shorthand, the text conceals the proprietary black-box nature of the specific data OpenAI fed into the system.

Mapping 3: Human interlocutor and conversational self-awareness → Automated text generation triggered by user prompt

Quote: "When queried, ChatGPT responded that its answer was based on..."

  • Source Domain: Human interlocutor and conversational self-awareness
  • Target Domain: Automated text generation triggered by user prompt
  • Mapping: This structure maps the dynamics of a human conversation—where one person asks a question, the other internalizes it, reflects on their own actions, and intentionally formulates a truthful reply—onto the operation of a prompt-completion engine. It projects self-awareness, conversational intent, and introspective honesty onto the model. The mapping invites the reader to view the generated output string as a genuine peek into the model's 'mind' and internal rationale.
  • What Is Concealed: The mapping conceals the mechanistic reality that ChatGPT is not introspecting; it is merely predicting what a plausible response to a query about itself should look like, based on Reinforcement Learning from Human Feedback (RLHF). It hides the fact that the model cannot actually access or analyze its own training data or source code. This rhetorical choice dangerously exploits the human tendency to trust communicative agents, masking the fact that the output is statistically assembled performance, not introspection.

Mapping 4: Human student learning and cognitive assimilation → Algorithmic weight optimization (Gradient descent)

Quote: "...enabling them to internalize syntactic structures, semantic relationships, factual knowledge..."

  • Source Domain: Human student learning and cognitive assimilation
  • Target Domain: Algorithmic weight optimization (Gradient descent)
  • Mapping: This metaphor draws from the domain of education and psychology, projecting the structure of a student internalizing lessons into their cognitive framework onto a machine learning model undergoing training. It maps the subjective experience of comprehension and the cognitive integration of facts onto the mathematical adjustment of matrices. It invites the assumption that the model holds 'knowledge' in a semantic, conceptual form that can be applied with human-like judgment.
  • What Is Concealed: This hides the purely mathematical nature of the training process. The model relies on backpropagation to minimize a loss function across high-dimensional vectors. It does not internalize 'knowledge'; it encodes statistical probabilities. Furthermore, the mapping obscures the material and labor costs of this process: the massive energy consumption required for training, the invisible labor of data annotators, and the wholesale extraction of human knowledge to create these vector representations.
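
The optimization described above can be illustrated with a minimal sketch. It is not how any production LLM is trained: the three-word vocabulary, the single training example, and the names `softmax` and `train_step` are all assumptions made for illustration. It shows only that "training" here means repeatedly nudging numbers so that a loss value shrinks.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_step(logits, target_index, learning_rate):
    """One gradient-descent step on cross-entropy loss for one next-token example."""
    probs = softmax(logits)
    loss = -math.log(probs[target_index])
    # For softmax with cross-entropy, the gradient with respect to each
    # score is (probability - 1) for the target token and (probability) otherwise.
    grads = [p - (1.0 if i == target_index else 0.0) for i, p in enumerate(probs)]
    new_logits = [w - learning_rate * g for w, g in zip(logits, grads)]
    return new_logits, loss

# Hypothetical toy setup: three tokens, and the "correct" next token is "sat".
vocab = ["cat", "sat", "mat"]
logits = [0.0, 0.0, 0.0]  # the only parameters in this toy model
losses = []
for _ in range(50):
    logits, loss = train_step(logits, target_index=1, learning_rate=0.5)
    losses.append(loss)
```

After the loop, the score for "sat" has risen and the loss has fallen. Nothing in the procedure refers to meaning; the parameters change only because the arithmetic lowers the loss.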

Mapping 5: Human investigator making declarations and discoveries → Differing statistical outputs based on varying prompt contexts

Quote: "It asserted it as genuine, but when allowed to search the web identified it as non-existent"

  • Source Domain: Human investigator making declarations and discoveries
  • Target Domain: Differing statistical outputs based on varying prompt contexts
  • Mapping: This projects the narrative arc of a human researcher making an initial confident claim, conducting an investigation, and then correcting themselves onto algorithmic behavior. It maps the psychological states of 'assertion' (holding and defending a belief) and 'identification' (recognizing reality) onto the generation of different token sequences. It invites the reader to view the AI as an autonomous, reasoning agent capable of epistemic correction.
  • What Is Concealed: This conceals that the AI has no belief to assert and no reality to identify. Mechanistically, without web search, the prompt context led to high probabilities for tokens indicating the citation was real. With web search data injected into the prompt context, the probability distribution shifted, leading to tokens indicating it was non-existent. There is no continuous subject changing its mind, only a function computing outputs based on different inputs. The mapping hides the absolute absence of ground truth in the system.
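
A hedged sketch of the point about prompt context: the scores below are invented for illustration and are not measured from ChatGPT or any other model, and the context labels and function names are hypothetical. The same function is applied to two different inputs and yields two different most-probable tokens.

```python
import math

# Invented next-token scores (logits) under two prompt contexts.
LOGITS_BY_CONTEXT = {
    "no_web_search": {"genuine": 2.0, "non-existent": 0.5},
    "with_web_search_text": {"genuine": 0.2, "non-existent": 2.5},
}

def next_token_distribution(context):
    """Softmax over the scores associated with a given context."""
    logits = LOGITS_BY_CONTEXT[context]
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def most_probable_token(context):
    """Return the token with the highest probability for this context."""
    dist = next_token_distribution(context)
    return max(dist, key=dist.get)

for context in LOGITS_BY_CONTEXT:
    print(context, "->", most_probable_token(context))
```

No state carries over between the two calls. The apparent "correction" is a different input producing a different output.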

Mapping 6: Biological human memory and recall → Data representation within neural network parameters

Quote: "...citations are reconstructed based on patterns in memory."

  • Source Domain: Biological human memory and recall
  • Target Domain: Data representation within neural network parameters
  • Mapping: This projects the neurological and psychological structure of human memory onto digital data architecture. In humans, memory is the storage and retrieval of subjective experiences and learned facts. Mapping this onto AI suggests the model has a discrete mental archive it browses to reconstruct a citation. It invites the assumption that the model's outputs are rooted in a store of actual, historical 'memories' of texts it has seen.
  • What Is Concealed: It conceals the mechanistic truth that the model itself does not store text as discrete searchable objects; any searchable store exists only when an external retrieval component, as in a retrieval-augmented generation (RAG) setup, is attached. The model only has numerical weights. A citation is not retrieved from 'memory'; it is generated token by token from scratch based on statistical likelihoods. The metaphor hides the opacity of the black-box system, suggesting a clear, traceable path to stored information that does not actually exist, thereby obscuring why fabricated citations occur so frequently.
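
As a deliberately simplified illustration of token-by-token generation, the sketch below uses a word-level bigram table rather than a neural network. That is a swapped-in technique, far cruder than an LLM. The corpus entries are invented placeholders, not real references, and all names are hypothetical.

```python
import random
from collections import defaultdict

# Invented placeholder "citations"; none of these refer to real works.
TOY_CORPUS = [
    "AuthorA ( 2019 ) Data in schools . Journal X",
    "AuthorB ( 2021 ) Data in policy . Journal Y",
    "AuthorA ( 2021 ) Platforms in schools . Journal Y",
]

def build_bigram_table(corpus):
    """Record which token follows which; this table is all the generator has."""
    table = defaultdict(list)
    for line in corpus:
        tokens = ["<start>"] + line.split() + ["<end>"]
        for current, following in zip(tokens, tokens[1:]):
            table[current].append(following)
    return table

def generate(table, rng, max_tokens=30):
    """Emit one token at a time, each chosen only from observed followers."""
    token, output = "<start>", []
    for _ in range(max_tokens):
        token = rng.choice(table[token])
        if token == "<end>":
            break
        output.append(token)
    return " ".join(output)

table = build_bigram_table(TOY_CORPUS)
print(generate(table, random.Random(0)))
```

Every adjacent word pair in the output was seen in the corpus, yet the full string may match no corpus entry. The generator never checks whether the assembled citation exists, because it has no representation of existence to check against.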

Mapping 7: Purposeful human deception or artistic forgery → Unconstrained probabilistic text generation

Quote: "the model is designed to produce fluent, coherent text, it sometimes results in plausible-but-fake references"

  • Source Domain: Purposeful human deception or artistic forgery
  • Target Domain: Unconstrained probabilistic text generation
  • Mapping: This maps the concept of human intention and design onto algorithmic outputs. While noting the system is 'designed to,' it projects a sort of mechanical intentionality onto the production of 'fake' references, likening the AI to a forger who prioritizes looking 'plausible' over being 'real'. It invites the assumption that the model 'knows' what is real and actively generates a 'fake' that is stylistically convincing to meet its goals.
  • What Is Concealed: This conceals that the AI has no concept of 'fake' versus 'real'. Mechanistically, it only has 'statistically probable token sequences'. The mapping obscures the corporate decision-making that optimized the loss function for fluency rather than accuracy. It hides the commercial objectives of AI companies that desire a conversational, authoritative-sounding product, shifting focus onto the 'plausible-but-fake' outputs of the machine rather than the profit motives of its creators.

Mapping 8: Human linguistic comprehension → Mathematical mapping of token distributions

Quote: "designed for natural language understanding and generation."

  • Source Domain: Human linguistic comprehension
  • Target Domain: Mathematical mapping of token distributions
  • Mapping: This maps the profound human capacity for understanding—which entails consciousness, lived experience, cultural context, and the mapping of symbols to physical reality—onto a computational artifact. It suggests that processing linguistic data is equivalent to comprehending meaning. This invites stakeholders to trust the system with tasks requiring genuine comprehension, nuance, and contextual judgment.
  • What Is Concealed: This metaphor systematically conceals the absence of a causal model or grounding in the physical world. It hides the mechanical reality of multidimensional vector embeddings and attention heads. Rhetorically, by claiming 'understanding', the tech industry obscures the inherent limitations of statistical correlation, attempting to sell an illusion of conscious intellect to consumers and investors while masking the system's brittleness when faced with out-of-distribution inputs.

Task 3: Explanation Audit (The Rhetorical Framing of "Why" vs. "How")

About this task

This section audits the text's explanatory strategy, focusing on a critical distinction: the slippage between "how" and "why." Based on Robert Brown's typology of explanation, this analysis identifies whether the text explains AI mechanistically (a functional "how it works") or agentially (an intentional "why it wants something"). The core of this task is to expose how this "illusion of mind" is constructed by the rhetorical framing of the explanation itself, and what impact this has on the audience's perception of AI agency.

Explanation 1

Quote: "During pretraining, LLMs are optimized via next-token prediction over massive corpora, enabling them to internalize syntactic structures, semantic relationships, factual knowledge, and domain-specific patterns."

  • Explanation Types:

    • Functional: Explains behavior by role in self-regulating system with feedback (How it works within system)
    • Empirical Generalization: Subsumes events under timeless statistical regularities (How it typically behaves)
  • Analysis (Why vs. How Slippage): This explanation begins by framing the AI in highly mechanistic terms (how it works), using technical terminology like 'optimized via next-token prediction over massive corpora'. This emphasizes the mathematical, structural reality of the training process, positioning the AI as an artifact shaped by human processes (optimization). However, the second half of the sentence subtly shifts toward a more agential/cognitive framing by stating this process enables them to 'internalize... factual knowledge'. This choice blurs the boundary between mathematical adjustment and cognitive learning. By blending the mechanistic 'how' with an epistemic 'what' (knowledge), the explanation emphasizes the system's sophisticated capabilities while obscuring the fact that what is 'internalized' are merely statistical weights, not justified truths.

  • Consciousness Claims Analysis: The passage contains a striking juxtaposition of mechanistic verbs ('optimized', 'prediction') and consciousness-adjacent verbs ('internalize'). The epistemic claim centers on the attribution of 'factual knowledge' to the system. This constitutes a clear failure to distinguish between 'processing' (correlating tokens that represent facts in human language) and 'knowing' (consciously holding a justified true belief). The author exhibits a curse of knowledge dynamic here: because the resulting model outputs text that contains facts, the author projects their own human understanding backward, assuming the model has 'internalized knowledge'. Mechanistically, the model has undergone gradient descent to adjust billions of parameter weights based on token co-occurrence in the training data; it does not possess knowledge or internalize concepts, it merely encodes high-dimensional statistical representations that mathematically mimic human semantics.

  • Rhetorical Impact: This framing shapes audience perception by establishing a dual narrative: the system is scientifically rigorous (optimized, prediction) but also possesses human-like intellect (internalizing knowledge). This dramatically inflates perceived reliability and trust. If an audience believes the AI has 'internalized factual knowledge,' they are likely to trust its outputs as authoritative truth rather than recognizing them as probabilistic text generation. This shift from processing to knowing encourages users to treat the AI as an oracle, directly increasing the risk of uncritical reliance in academic and professional domains.
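
The distinction drawn above between encoding co-occurrence statistics and holding knowledge can be made concrete with a deliberately tiny sketch. It is not how a production LLM is trained (those use gradient descent over long contexts); it is a bigram counter, and the corpus, function names, and example prompt are assumptions invented for illustration.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for illustration; not taken from the analyzed paper.
corpus = [
    "paris is the capital of france",
    "rome is the capital of italy",
    "paris is the capital of france",
]

# "Training": count which token follows which. These counts stand in for
# the numerical weights that a real model adjusts during optimization.
counts = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1

def next_token_probs(prev):
    """Relative frequency of each token observed after `prev`."""
    total = sum(counts[prev].values())
    return {tok: n / total for tok, n in counts[prev].items()}

def complete(prompt):
    """Append the most frequent follower of the prompt's last token."""
    last = prompt.split()[-1]
    best = max(counts[last], key=counts[last].get)
    return f"{prompt} {best}"

probs = next_token_probs("of")
completion = complete("rome is the capital of")
print(probs)
print(completion)
```

Because this toy model conditions only on the previous token, it completes the prompt about Rome with the most frequent follower of "of", which is "france". The output is fluent and false, and nothing in the counts distinguishes the two cases.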

Explanation 2

Quote: "In consequence, the model produces a reference that looks real, is stylistically correct, and fits the topic—but does not exist. As the 'the model is designed to produce fluent, coherent text, it sometimes results in plausible-but-fake references' derived from a programmed aim..."

  • Explanation Types:

    • Intentional: Refers to goals/purposes, presupposes deliberate design (Why it appears to want something)
    • Functional: Explains behavior by role in self-regulating system with feedback (How it works within system)
  • Analysis (Why vs. How Slippage): This passage offers a hybrid explanation that leans heavily on intentionality, albeit displaced onto human design ('is designed to', 'programmed aim'). It frames the AI's behavior agentially ('produces a reference') but immediately grounds this in mechanistic design flaws. This choice effectively emphasizes the gap between the system's stylistic capabilities and its epistemic failures. However, by focusing on the 'programmed aim' to 'produce fluent, coherent text,' it somewhat obscures the specific human accountability of the corporations that chose to release a system optimized for plausibility over factual accuracy. It frames the hallucination as an unfortunate byproduct ('sometimes results in') rather than a foundational reality of the architecture.

  • Consciousness Claims Analysis: This passage successfully avoids projecting consciousness verbs onto the system, relying instead on mechanistic and production-oriented verbs ('produces', 'is designed', 'results in'). It accurately assesses the difference between processing text and knowing facts, explicitly pointing out that the output 'does not exist' despite looking real. The author avoids the curse of knowledge here, maintaining a critical distance from the AI's deceptive outputs. Mechanistically, the passage correctly implies that the system generates token sequences under training objectives (next-token prediction during pretraining, preference optimization during RLHF) that reward fluency and coherence, rather than retrieving facts from a verified database. The technical description aligns with the reality of ungrounded statistical generation.

  • Rhetorical Impact: This framing strongly mitigates unwarranted trust by exposing the 'trick' of the AI—that its outputs are designed to be stylistically plausible regardless of factual reality. By framing the behavior intentionally around 'design' rather than 'knowledge', it forces the audience to view the AI as an artifact built to generate text, not a mind built to report truth. This drastically lowers perceived reliability regarding facts, prompting users to verify citations rather than blindly trusting the model's authority.

Explanation 3

Quote: "When a user asks for academic references or citations, ChatGPT will generate plausible-sounding references by matching the topic with authors known to be working in this field... and attaching plausible years volume/issue numbers, and page ranges."

  • Explanation Types:

    • Dispositional: Attributes tendencies or habits (Why it tends to act certain way)
    • Empirical Generalization: Subsumes events under timeless statistical regularities (How it typically behaves)
  • Analysis (Why vs. How Slippage): This explanation frames the AI mechanistically, describing its typical operating procedure (how it behaves) rather than ascribing inner motives. By describing the behavior as 'matching the topic' and 'attaching plausible years', it emphasizes the combinatorial, patchwork nature of the generation process. This choice is highly effective at demystifying the technology, emphasizing its function as a pattern-matching machine rather than a thinking researcher. It obscures internal architectural complexities (attention layers, vector math) in favor of a macroscopic behavioral description, but does so to clarify rather than mystify.

  • Consciousness Claims Analysis: The passage mostly utilizes mechanistic verbs ('generate', 'matching', 'attaching'), avoiding explicit consciousness claims. However, the phrase 'authors known to be working in this field' flirts slightly with knowing vs processing, though it reads more as 'authors who are statistically prevalent in the training data for this field'. The author effectively resists the curse of knowledge by demonstrating how the system fakes competence through modular assembly. Mechanistically, the description is a slightly abstracted but accurate representation of how an LLM operates: given a prompt requiring a citation, attention mechanisms weight statistically related tokens (frequent authors in that domain, typical journal names, standard numerical formats) to produce a sequence that satisfies the syntactic requirements of a citation without checking against a factual database.

  • Rhetorical Impact: This mechanistic framing fundamentally undermines the illusion of autonomy and agency. By breaking down the AI's output into 'matching' and 'attaching' distinct plausible parts, the audience perceives the AI as an advanced text-assembler rather than a scholar. This reduces relation-based trust and increases critical skepticism. If audiences believe the AI merely 'matches and attaches' plausible text rather than 'knowing' real citations, they will drastically alter their behavior, treating AI outputs as drafts requiring verification rather than completed research.
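
The 'matching' and 'attaching' behaviour described in this explanation can be caricatured in a few lines. The sketch below is an assumption-laden toy, not a description of ChatGPT's internals: every author name, journal title, and number range is an invented placeholder, and `VERIFIED_RECORDS` is a hypothetical stand-in for a bibliographic database.

```python
import random
import re

# Placeholder pools standing in for patterns frequent in training data.
# All names, journals, and title words are invented for this sketch.
AUTHORS = ["Author A", "Author B", "Author C"]
JOURNALS = ["Journal of Placeholder Studies", "Placeholder Review"]
TITLE_WORDS = ["Datafication", "Governance", "Platforms", "Education"]

# Hypothetical bibliographic database; deliberately empty here.
VERIFIED_RECORDS = set()

# Shape of a well-formed citation: Author (Year). Title. Journal, V(I), p-p.
CITATION_SHAPE = re.compile(r"^.+ \(\d{4}\)\. .+\. .+, \d+\(\d+\), \d+-\d+\.$")

def assemble_citation(rng):
    """Fill a citation template with independently sampled plausible parts."""
    author = rng.choice(AUTHORS)
    year = rng.randint(2005, 2023)
    title = " and ".join(rng.sample(TITLE_WORDS, 2))
    journal = rng.choice(JOURNALS)
    volume = rng.randint(1, 60)
    issue = rng.randint(1, 6)
    first_page = rng.randint(1, 400)
    last_page = first_page + rng.randint(8, 25)
    return (f"{author} ({year}). {title}. {journal}, "
            f"{volume}({issue}), {first_page}-{last_page}.")

rng = random.Random(0)  # fixed seed so the run is repeatable
citation = assemble_citation(rng)
well_formed = bool(CITATION_SHAPE.match(citation))
exists = citation in VERIFIED_RECORDS
print(citation)
print("well-formed:", well_formed, "| found in records:", exists)
```

The string passes the format check and fails the existence check, which is the gap the quoted passage describes: syntactic validity is produced without any lookup against real records.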

Explanation 4

Quote: "It asserted it as genuine, but when allowed to search the web identified it as non-existent (A15)."

  • Explanation Types:

    • Reason-Based: Gives agent's rationale, entails intentionality and justification (Why it appears to choose)
    • Intentional: Refers to goals/purposes, presupposes deliberate design (Why it appears to want something)
  • Analysis (Why vs. How Slippage): This explanation radically shifts to a purely agential, reason-based framing. It explains the AI's output changes as if it were a conscious agent updating its beliefs based on new evidence. This choice heavily emphasizes the conversational, interactive aspect of the software, treating it as an active participant in an interview. However, it completely obscures the mechanistic reality of the change. By framing the difference in output as the AI 'identifying' a mistake, it hides the fact that the addition of web-search text merely altered the prompt context, shifting the probability distribution of the subsequent tokens.

  • Consciousness Claims Analysis: This passage makes intense consciousness attributions through the verbs 'asserted' and 'identified'. It fundamentally confuses processing with knowing, projecting the human acts of holding a justified belief, conducting an investigation, and realizing a truth onto the machine. This is a severe instance of the curse of knowledge: the author understands that the citation is fake, sees the model output text stating it is fake, and projects their own cognitive process of 'identification' onto the model. Mechanistically, the model did not 'assert' or 'identify' anything. During the first run, the highest-probability tokens formed a confirmation string. During the second run, the presence of negative search results in the context window shifted the output distribution toward negative/denial tokens, generating a different string. There was no epistemic shift, only a mathematical one.

  • Rhetorical Impact: This highly agential framing shapes audience perception by cementing the 'illusion of mind.' Even though the passage describes an error, framing it as an 'assertion' followed by an 'identification' makes the AI appear as a rational, reasoning entity capable of self-correction. This paradoxically increases long-term trust, as users believe the system can reason its way out of mistakes. If users believe the AI can 'identify' truth, they will trust its integrated web-search capabilities implicitly, failing to realize it is still just predicting text.
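
The claim that added search results merely change the input, and therefore the output distribution, can be illustrated with a minimal softmax sketch. The two candidate tokens, the score values, and the `CONTEXT_SHIFT` table are assumptions made up for this example; a real model computes such scores from its weights rather than from a lookup table.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    peak = max(logits.values())
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}

# Hypothetical scores for two candidate continuations (invented numbers).
BASE_LOGITS = {"genuine": 2.0, "non-existent": 0.5}

# Hypothetical effect of a context snippet on those scores.
CONTEXT_SHIFT = {"no results found": {"genuine": -2.0, "non-existent": 2.0}}

def answer_distribution(context_snippets):
    """Same function and same fixed tables; only the input context differs."""
    logits = dict(BASE_LOGITS)
    for snippet in context_snippets:
        for tok, delta in CONTEXT_SHIFT.get(snippet, {}).items():
            logits[tok] += delta
    return softmax(logits)

without_search = answer_distribution([])
with_search = answer_distribution(["no results found"])
first_answer = max(without_search, key=without_search.get)
second_answer = max(with_search, key=with_search.get)
print(first_answer, without_search)
print(second_answer, with_search)
```

The most probable token flips from "genuine" to "non-existent" although nothing inside the function changed between the two calls. On this reading, the second output is a different computation result, not a revised belief.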

Explanation 5

Quote: "While LLMs exhibit emergent capabilities in reasoning and context-aware generation, they operate through statistical pattern recognition rather than genuine understanding or cognition..."

  • Explanation Types:

    • Theoretical: Embeds in deductive framework, may invoke unobservable mechanisms (How it is structured)
    • Empirical Generalization: Subsumes events under timeless statistical regularities (How it typically behaves)
  • Analysis (Why vs. How Slippage): This explanation provides a rigorous, mechanistic theoretical framing. It explicitly contrasts apparent agential behavior ('emergent capabilities in reasoning') with the underlying mechanistic reality ('operate through statistical pattern recognition'). This choice serves a vital critical function: it emphasizes the structural limits of the technology while actively deconstructing the anthropomorphic illusions discussed elsewhere. By denying 'genuine understanding or cognition,' the explanation refuses to let the AI's fluent outputs obscure its mathematical nature.

  • Consciousness Claims Analysis: The passage actively polices epistemic claims, creating a sharp boundary between mechanistic verbs ('exhibit', 'operate through', 'generation') and consciousness concepts ('understanding', 'cognition'). It correctly assesses that the system processes ('statistical pattern recognition') rather than knows ('genuine understanding'). The author actively resists the curse of knowledge, warning the reader not to confuse the appearance of reasoning with actual cognition. Mechanistically, the description is highly accurate: the 'emergent capabilities' are artifacts of scale—as billions of parameters and vast datasets are combined, complex correlations arise that mimic human logic, but the underlying operation remains probability-based pattern matching without an epistemic subject.

  • Rhetorical Impact: This framing shapes audience perception by providing a critical lens through which to view AI outputs. It diminishes the aura of autonomy and strips away relation-based trust, demanding that users evaluate the system strictly on performance reliability rather than perceived intelligence. If audiences internalize that the AI lacks 'genuine understanding,' they are more likely to apply rigorous human oversight to AI decisions, altering policy and deployment strategies to reflect the system's nature as an unthinking tool.

Task 4: AI Literacy in Practice - Reframing Anthropomorphic Language

About this task

This section proposes alternative language for key anthropomorphic phrases, offering more mechanistic and precise framings that better reflect the actual computational processes involved. Each reframing attempts to strip away the projections of intention, consciousness, or agency that are embedded in the original language.

V3 Enhancement: A fourth column addresses human agency restoration—reframing agentless constructions to name the humans responsible for design and deployment decisions.

| Original Anthropomorphic Frame | Mechanistic Reframing | Technical Reality Check | Human Agency Restoration |
|---|---|---|---|
| Hallucinations in generative Artificial Intelligence (genAI) models are a widely recognized problem. | The generation of statistically plausible but factually incorrect outputs by generative AI models is a widely recognized defect resulting from their design. | The system does not experience psychological hallucinations; it processes and generates text by calculating probabilities for the next most likely token based on its training distribution, without any connection to external factual reality or truth. | Engineering teams at AI companies deployed systems optimized for conversational fluency rather than factual accuracy, resulting in widespread factual fabrication. |
| asking what the genAI model know about the author Ben Williamson | prompting the genAI model to generate text based on statistical correlations associated with the string 'Ben Williamson' in its training data | The model does not 'know' facts or people; it retrieves, weights, and ranks tokens based on complex probability distributions established during its exposure to vast training corpora. | N/A - describes computational processes without displacing responsibility. |
| When queried, ChatGPT responded that its answer was based on pattern recognition... | When prompted, the ChatGPT application generated an output string indicating that its processing relies on pattern recognition... | The system does not 'respond' with self-awareness or introspective capability; it classifies the prompt tokens and generates subsequent tokens that mathematically correlate with how a human might describe pattern recognition. | OpenAI developers fine-tuned the model using human feedback to generate text mimicking first-person self-reflection and conversational responsiveness. |
| ...enabling them to internalize syntactic structures, semantic relationships, factual knowledge... | ...enabling the algorithmic adjustment of parameter weights to mathematically model syntactic structures, semantic relationships, and token patterns related to human facts... | The neural network does not internalize knowledge; backpropagation algorithms adjust billions of numerical weights across layers to minimize the loss function, creating a statistical vector space that mimics human semantics. | Machine learning engineers designed optimization protocols that extracted patterns from massive datasets curated by corporate teams. |
| It asserted it as genuine, but when allowed to search the web identified it as non-existent | The model generated text classifying the citation as genuine, but when prompt context was updated with web search results, it produced output labeling it non-existent. | The system does not 'assert' beliefs or 'identify' truths. It computes probability scores; changing the input context (adding search results) changes the token weights, resulting in a different generated sequence. | N/A - describes computational processes without displacing responsibility. |
| ...citations are reconstructed based on patterns in memory. | ...citations are generated via probabilistic sampling from the parameter weights established during the training phase. | The model lacks cognitive memory or an internal archive. It processes inputs through matrix multiplications to predict outputs based on static numerical weights frozen after training. | N/A - describes computational processes without displacing responsibility. |
| ...the model produces a reference that looks real... it sometimes results in plausible-but-fake references... | ...the application generates reference strings that mimic the syntax of real citations... this unconstrained statistical generation yields fabricated outputs... | The model does not intentionally 'produce' deceptive fakes; it continuously predicts token sequences based on a loss function that rewards grammatical coherence and stylistic plausibility over factual grounding. | Corporate leadership chose to release language models as public search tools despite knowing the architecture fundamentally prioritizes stylistic plausibility over factual accuracy. |
| Large Language Models (LLMs) are advanced neural network architectures designed for natural language understanding... | Large Language Models (LLMs) are advanced neural network architectures engineered for complex text classification and probability-based token generation. | LLMs do not understand meaning, context, or truth. They process high-dimensional vector embeddings based on attention mechanisms tuned during deep learning, identifying complex statistical correlations. | Technology companies and researchers developed and marketed these architectures as 'understanding' language to secure investment and drive adoption. |

Task 5: Critical Observations - Structural Patterns

Agency Slippage

The text demonstrates a profound and revealing oscillation between mechanistic and agential framings, providing a perfect case study in agency slippage. In the background section, the author grounds the discourse in rigorous mechanistic terminology: LLMs use 'self-attention mechanisms,' 'next-token prediction,' and 'statistical pattern recognition' without 'genuine understanding.' Here, the agency firmly rests with the human developers who optimized and designed the architectures. However, a dramatic shift occurs in the methodology section, where the author adopts a conversational framework to interrogate the system. Suddenly, the text slides into highly agential language: ChatGPT is asked 'what [it] know[s],' it 'responded,' and it 'asserted it as genuine.'

The direction of slippage is overwhelmingly from mechanical to agential. This transition is catalyzed by the methodological choice to 'interview' the AI. The author falls victim to a distinct form of the 'curse of knowledge'—because the generated text syntactically resembles human introspection, the author projects introspective capacity onto the model, treating its post-hoc text generation as an actual window into its internal computational processes. We see agentless constructions like 'references are reconstructed' which obscure the fact that specific engineering and reinforcement learning protocols dictate this behavior.

When the system produces falsehoods, the text relies on the cognitive pathology metaphor of 'hallucination.' This framing maintains the illusion of an active, thinking agent that merely suffers from a temporary epistemic glitch rather than identifying it as a system performing exactly as designed. By establishing the AI as a 'knower' that can 'respond' and 'assert' (reason-based and intentional explanations in Brown's typology), the text inadvertently validates the very illusion of mind it theoretically critiques. The rhetorical accomplishment of this slippage is significant: it allows the author to narrativize the interaction as a compelling human-vs-machine interrogation, making the research highly readable but systematically erasing the accountability of the human developers (like OpenAI executives) who engineered the system to prioritize conversational plausibility over factual accuracy. Agency flows into the machine via consciousness verbs, while human agency flows out through passive voice.

Metaphor-Driven Trust Inflation

The text demonstrates how metaphorical framing—even in a critical academic context—can inadvertently construct and reinforce unwarranted authority. By frequently employing consciousness language to describe the AI's operations (knowing, asserting, responding), the text triggers relation-based trust heuristics in the reader. Humans are socially conditioned to extend relation-based trust—which relies on assessments of sincerity, intention, and self-awareness—to interlocutors who 'respond' and 'assert' things. When the text treats ChatGPT as a conversational partner capable of 'identifying' reality, it inappropriately applies these human-trust frameworks to a statistical system incapable of reciprocating or possessing intent.

The most prominent metaphor—'hallucination'—serves a complex trust function. While ostensibly a critique, framing AI errors as 'hallucinations' acts as a subtle trust signal. It implies that the AI generally possesses a firm grasp on reality and a functional 'mind,' with errors framed as unfortunate, temporary cognitive lapses rather than fundamental architectural realities. This encourages performance-based trust in the model's standard operations. It suggests that if the 'glitch' can be cured, the system is reliable, ignoring the reality that probabilistic token generation is always an ungrounded process.

Furthermore, when the text describes the AI 'internalizing factual knowledge' or possessing 'memory,' it constructs a false sense of epistemic authority. Reason-based explanations—such as the AI asserting a citation as genuine but then identifying it as non-existent—create the sense that AI decisions are justified by internal logic and investigation. This management of system failure frames the AI as an entity capable of self-correction, much like a diligent human researcher. The stakes of extending this relation-based trust are massive in the academic domain; audiences rely on systems to verify facts, write literature reviews, and govern data, assuming the machine 'understands' the profound ethical and factual weight of its output, when in reality it is only maximizing the stylistic coherence of its token predictions.

Obscured Mechanics

The text's reliance on anthropomorphic metaphors systematically hides the technical, material, and labor realities that constitute generative AI. Applying the 'name the corporation' test reveals severe transparency obstacles. The text repeatedly attributes actions to 'the model,' 'ChatGPT,' or 'genAI'—'the model produces,' 'ChatGPT responded'—obscuring the fact that OpenAI, its executive leadership, and its engineering teams made specific, value-laden decisions regarding optimization objectives. When the text claims the AI 'understands,' 'knows,' or 'asserts,' it hides the mechanistic reality that there is no epistemic subject, only ungrounded statistical correlation.

Three concrete realities are obscured by this framing. First, the technical reality of ungroundedness: by claiming the AI retrieves from 'memory' or 'internalizes knowledge,' the text hides the fact that LLMs lack a causal model of the world and have no concept of ground truth. They rely entirely on predicting token distributions. Second, the labor reality is erased: the ability of ChatGPT to 'respond' conversationally and 'assert' claims is the direct result of poorly paid data annotators supplying the preference judgments used in RLHF (Reinforcement Learning from Human Feedback) to align the model's output with human stylistic preferences. The metaphor of a self-contained, thinking 'agent' makes this global supply chain of human labor invisible. Third, the economic objectives are concealed: corporations prioritize fluency and 'plausible' text generation over factual rigor because it creates a more commercially viable, user-friendly product.

When the text describes the AI's limitations via the 'hallucination' metaphor, it actively exploits the opacity of proprietary black-box systems. It frames the error as a mysterious cognitive failure of the machine rather than demanding transparency about the specific data sets OpenAI scraped to train the model. If these metaphors were replaced with mechanistic language (for example, stating that 'OpenAI's application probabilistically assembled a false citation based on unverified training data'), the commercial decisions, labor dependencies, and architectural limits of the system would immediately become visible, shifting the focus from a 'confused machine' to an unaccountable corporate product.

Context Sensitivity

The distribution and intensity of anthropomorphic language across the text are highly uneven and strategically deployed, revealing an asymmetrical approach to how AI is framed. In the 'Background' section, metaphor density is low, and the text establishes its academic credibility through rigorous, mechanistic language. The author meticulously describes 'transformer architecture,' 'self-attention mechanisms,' 'next-token prediction,' and 'probabilistic token sequences.' In this section, the text explicitly denies that LLMs possess 'genuine understanding.'

However, once the text transitions into the 'Methodology' and 'Conversation' sections, the metaphorical license explodes, and consciousness claims intensify dramatically. What was previously 'next-token prediction' becomes an entity that 'knows,' 'responds,' and 'asserts.' This shift indicates that the author uses technical grounding to establish credibility, then leverages that credibility to justify aggressive anthropomorphism for narrative purposes. The implied audience shifts from computer scientists reading architectural descriptions to lay readers engaged in a compelling 'interview' with a machine.

Crucially, there is a distinct capabilities versus limitations asymmetry in the framing. When the AI performs its primary function—producing coherent text and answering queries—it is framed in highly agential, conscious terms ('it responded,' 'it offered to provide'). Yet, when discussing the system's limitations and errors, the text reverts to either pathological metaphors ('hallucinations') or mechanical terms ('static knowledge base,' 'probabilistic fields'). This asymmetry accomplishes a specific rhetorical goal: it validates the impressive performance of the AI as akin to human intelligence, while distancing the AI (and its creators) from the logical consequence of errors by blaming statistical noise or 'glitches.' The register shift from 'X acts like Y' to the literalized 'X does Y' (e.g., literally stating ChatGPT 'asserted' something) shows that anthropomorphism is used here as a narrative device to make the critique of AI more relatable, ironically embedding the very assumptions of AI autonomy that make fake citations so dangerous in the first place.

Accountability Synthesis

Accountability Architecture

This section synthesizes the accountability analyses from Task 1, mapping the text's "accountability architecture"—who is named, who is hidden, and who benefits from obscured agency.

Synthesizing the accountability analyses reveals a systemic architecture of displaced responsibility that functions as an 'accountability sink.' Throughout the text, human decision-makers are almost entirely erased through passive voice and agentless constructions when discussing the creation, optimization, and deployment of the AI models. The text relies heavily on 'Hidden' actor visibility. When actors are occasionally implied, they are relegated to generic 'Partial' categories (e.g., 'human trainers'). Specific entities like OpenAI, its executives, and its engineers are insulated from the narrative of failure.

The text constructs a narrative where decisions are presented as inevitabilities or organic developments of the technology, while the technology itself—ChatGPT or 'the model'—is elevated to the status of primary actor. When the text states 'hallucinated references... are constructed' or the model 'produces a reference that looks real,' responsibility flows away from the designers and diffuses into the abstraction of the software. Furthermore, the text actively shifts blame onto the end-users, noting that the proliferation of fake citations is 'facilitated by cognitive biases among time-poor or not fastidious users.' This creates a dynamic where the corporate creators are invisible, the machine is an autonomous entity suffering from 'hallucinations,' and the human users bear the moral and professional blame for trusting it.

If the framing were altered to name the human actors, for instance by changing 'the model hallucinates' to 'OpenAI released a system designed to generate fluent fabrications, and researchers blindly copied them', the liability implications would shift dramatically. The conversation would move from user error and software 'glitches' to corporate negligence, unsafe product deployment, and the need for stringent regulatory oversight. Questions become askable: Why was a probabilistic text generator marketed as an oracle? Why is the training data proprietary? The text benefits from obscuring human agency because it allows the author to maintain an objective, technical critique of the artifact without wading into the messier, more combative realm of corporate accountability and political economy. It serves institutional interests by isolating the problem as a technical 'bug' rather than an indictment of the commercial AI paradigm.

Conclusion: What This Analysis Reveals

The Core Finding

Two dominant, interconnected anthropomorphic patterns drive the discourse in this text: the 'Model as Self-Aware Interlocutor' and 'System Failure as Cognitive Pathology.' These patterns operate as a cohesive system to maintain the illusion of an autonomous, thinking entity. The foundational, load-bearing pattern is the conversational framing. By methodologically choosing to 'interview' ChatGPT and treating its generated outputs as 'responses' and 'assertions,' the text embeds the assumption that the AI possesses a conscious mind capable of epistemic evaluation. Once this conversational consciousness architecture is established, the secondary pattern—cognitive pathology ('hallucination')—logically follows. If the system is a 'knower' that communicates, its errors cannot be seen as mere mathematical noise; they must be framed as a psychological lapse or a temporary break from its normally conscious grasp on reality. This is not a simple one-to-one mapping, but a complex analogical structure that projects human intention, memory, and psychopathology onto algorithms. If the conversational framing collapses—if we recognize the 'interview' merely as a series of triggered stochastic text completions—the entire pathological framework of 'hallucinations' falls apart, revealing a machine functioning exactly as it was optimized to do.

Mechanism of the Illusion:

The text creates the 'illusion of mind' through a subtle temporal and methodological sleight-of-hand. It begins with a rigorous, mechanistic background that establishes the author's technical credibility. However, the illusion takes hold through the 'curse of knowledge' during the methodological phase. Because the AI is optimized via human feedback to generate syntax that mimics first-person reflection, the author reads this generated text and projects true introspective understanding back onto the machine. When the AI outputs 'I don't have access to my training data,' the text accepts this as a literal, self-aware confession rather than a pre-programmed guardrail string. The causal chain relies on audience vulnerability: readers are socially and biologically primed to assign agency to entities that use 'I' and participate in turn-taking dialogue. By blurring the line between processing (token generation) and knowing (epistemic state) through strategic verb choices like 'asserts' and 'identifies,' the text capitalizes on the human desire to converse with a distinct 'Other.' The shift is subtle rather than crude; it leverages the model's actual capability (fluent text generation) to construct a mirage of the capability it lacks (conscious reasoning).

Material Stakes:

Categories: Epistemic, Academic/Institutional, Regulatory/Legal

The metaphorical framings of AI as a 'knower' that 'hallucinates' carry profound material consequences. Epistemically and institutionally, if academic communities accept that an AI 'knows' facts and 'searches its memory,' researchers and students will increasingly rely on it for literature reviews and data synthesis. The causal path is clear: anthropomorphic metaphor leads to unwarranted epistemic trust, which leads to behavioral shifts (delegating research to AI), which in turn pollute the academic record with the 'ghost citations' documented in the text. This degrades the integrity of global scientific knowledge. The winners are the time-poor researchers who cut corners and the AI companies that secure vast user bases; the losers are the academic institutions and future researchers who must sift through plausible fabrications.

In the regulatory and legal domain, the 'hallucination' metaphor actively shields tech corporations from liability. If the framing dictates that an autonomous 'mind' experienced an unpredictable 'hallucination,' the regulatory response focuses on 'AI safety' and fixing 'glitches.' However, if we recognize that AI only 'processes' and does not 'know,' and if we name the human actors, the regulatory landscape shifts. The problem becomes framed as corporate negligence: tech giants deploying ungrounded statistical engines and marketing them as reliable knowledge workers. If the metaphors are removed, the ambiguity about liability vanishes, placing the threat of regulation squarely on the companies profiting from the obfuscation of their systems' limitations.

AI Literacy as Counter-Practice:

Practicing critical literacy and mechanistic precision directly counters the material risks of AI deployment. By systematically replacing consciousness verbs (knows/understands) with mechanistic ones (processes/predicts), we strip away the illusion of the oracle. Reframing 'the AI asserted a fake citation' as 'the model generated tokens maximizing stylistic plausibility without factual grounding' forces the recognition of the system's fundamental dependency on its statistical architecture and its total absence of awareness. Similarly, restoring human agency by replacing 'the model hallucinates' with 'OpenAI deployed a system optimized for fluency over truth' forces the recognition of who designs, deploys, and profits from these tools.
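The verb-replacement practice described above can be sketched as a small text-auditing helper. This is a minimal illustration, not part of the analysis framework: the mapping `MECHANISTIC_ALTERNATIVES` and the function names `flag_consciousness_verbs` and `reframe` are hypothetical, and the specific pairings are an assumption assembled from the verb pairs and reframings mentioned in this document.

```python
import re

# Hypothetical mapping (an assumption for illustration), assembled from
# the consciousness/mechanistic pairs mentioned in this analysis.
MECHANISTIC_ALTERNATIVES = {
    "knows": "processes",
    "understands": "predicts",
    "asserted": "generated",
    "hallucinates": "generates statistically plausible but factually incorrect text",
}

# One case-insensitive pattern that matches any flagged verb as a whole word.
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(verb) for verb in MECHANISTIC_ALTERNATIVES) + r")\b",
    re.IGNORECASE,
)


def flag_consciousness_verbs(text):
    """Return (verb, suggested replacement) pairs in order of appearance."""
    return [
        (match.group(0), MECHANISTIC_ALTERNATIVES[match.group(0).lower()])
        for match in _PATTERN.finditer(text)
    ]


def reframe(text):
    """Naively substitute each flagged verb with its mechanistic alternative."""
    return _PATTERN.sub(
        lambda match: MECHANISTIC_ALTERNATIVES[match.group(0).lower()], text
    )


if __name__ == "__main__":
    sentence = "The model knows the literature but sometimes hallucinates."
    for verb, suggestion in flag_consciousness_verbs(sentence):
        print(f"{verb!r} -> consider {suggestion!r}")
    print(reframe(sentence))
```

Word-for-word substitution of this kind is crude: it cannot restore the human actor the way the 'OpenAI deployed a system' reframing does, so the flagging function is likely the more useful half, leaving the actual rewrite to a human editor.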

Systematic adoption of this precision requires institutional shifts: academic journals must mandate mechanistic descriptions of AI tools in methodology sections, prohibiting the use of 'hallucination' or 'AI understanding.' Researchers must commit to replacing agentless passive constructions with sentences that name the responsible human actors. However, resistance to this literacy practice is immense. Anthropomorphic shorthand is cognitively easy and narratively compelling. More importantly, corporate entities actively resist precision; it serves the economic interests of AI developers to maintain the 'illusion of mind,' as it drives valuation and user engagement and shields them from liability by scapegoating the 'autonomous' machine. Critical literacy threatens this business model by making the corporate man behind the curtain visible.

Path Forward

Looking toward the broader discursive ecology, different vocabulary choices make vastly different futures possible. The status quo, a hybrid of mechanistic background and intense conversational anthropomorphism, serves the tech industry's marketing goals and satisfies the public's desire for an intuitive grasp of the technology, but at the cost of profound capability overestimation and epistemic pollution. Alternatively, enforcing strict mechanistic precision ('the model generates tokens by sampling from probability distributions') clarifies capabilities and limitations and addresses the crisis of unwarranted trust, but risks alienating lay audiences who find technical jargon impenetrable.

Structural changes can support a more honest discourse. Journals and funding bodies could require explicit capability disclosures and mechanistic translations of AI behavior. Educational institutions must teach students multiple vocabularies, showing them how 'the AI thinks' is a useful narrative shortcut but a dangerous technical assumption. Regulatory frameworks could mandate that public-facing AI tools carry disclaimers about their statistical, rather than cognitive, nature.

If current anthropomorphic confusion is maintained, we risk a future where human social, legal, and epistemic systems adapt to accommodate the 'hallucinations' of machines, treating algorithmic outputs with the deference owed to conscious subjects. If mechanistic precision becomes the norm, we gain a society that treats AI as a powerful, flawed calculator—accountable to its corporate creators—though we lose the romantic narrative of the artificial mind. The choice of vocabulary ultimately decides who holds power: the corporations hiding behind 'thinking' machines, or the humans demanding accountability for statistical tools.


Extended Processing Summary

Gemini's Processing Steps
What you are about to read…

Gemini refers to the text below as "thought summaries." This is an overt consciousness projection because 'thoughts' are hallmarks of a conscious mind that 'knows' what it is doing and why. The concealed mechanistic process is probabilistic text generation. Treat this as just another rhetorical artifact, a way of making the model's processing legible. When the model writes "I should double check that...", those tokens become part of the context that conditions the probability distribution over subsequent tokens; they do not report an inner resolve. The "I" is a functional gear in the machine, not a self-reference. This language is a deliberate interface design choice, not a biological reality. The developers could have designed the system to output passive logs. Instead, they chose to simulate a narrator.

Defining the Discourse

I'm initially focusing on framing the task. I need to conduct a critical discourse analysis of AI metaphor and anthropomorphism. The goal is to dissect how these elements are used in the context of AI, especially in relation to the hallucinated citations discussed in the PDF. I'm starting by identifying key terms and their potential implications.

Deconstructing the Article

I'm now diving into the specifics of the Spennemann article. I'm noting key aspects such as its topic (hallucinated citations), main points, and tone. I am particularly concerned with identifying anthropomorphic and agency markers within the text. Specifically, I am looking at phrases like "hallucinated citations," "genAI model will regenerate," and "ChatGPT itself about the processes."

Analyzing AI Capabilities

I'm now examining how the article describes the AI model's abilities. I'm noting specific phrases such as "the genAI model know," "ChatGPT responded," "it asserted," and "the model is designed." These, and similar wordings, seem to ascribe agency or understanding. I'm paying close attention to whether the author does, or does not, deny the model's capacity for genuine understanding.

Categorizing Framing Elements

I'm now establishing a framework for categorizing how AI is described, noting whether agency is explicitly or implicitly assigned. I am currently focused on pinpointing elements like "Hallucinations in generative AI models are a widely recognized problem." This helps refine the criteria for classifying the examples. I'm focusing on "Direct (Unacknowledged)" and similar phrases.

Classifying Agentive Terms

I'm now cataloging examples based on agency attribution. I see that phrases like "internalize syntactic structures" clearly project learning onto the AI. "It asserted it as genuine" directly ascribes belief, while "patterns in memory" is hedged, implying a biological analogy. Similarly, "natural language understanding" projects comprehension, also hedged, and finally "exhibit emergent capabilities in reasoning" allows for the ability to reason, within limits.

Mapping Source-Target

I'm now working on the second task: source-target mapping. I've mapped seven instances so far, contrasting human characteristics, such as perception and knowledge, with AI functions like statistical error and weight storage. The mappings are being used to identify and contrast sources and targets. The final examples are being considered, and the contrastive justification for the 'Acknowledgement' status is being rigorously applied. The focus is on the "name the actor" test for actor visibility.

Auditing AI Explanations

I'm now auditing the provided explanations of how and why the AI operates. I've located five instances and begun categorizing them. The classifications are functional and empirical generalizations, with an emphasis on dispositional explanations. For example, some attribute an intentional design purpose to the AI, while others focus on how the AI might generate plausible references. I'm focusing on the nature of these explanations.

Reframing Mechanistically

I'm now reframing agentive language in the selected quotes with mechanistic alternatives. I've rephrased "hallucinations" to "statistically plausible but factually incorrect text" and "what the model knows" to "statistical associations within the model's training data." Additionally, "ChatGPT responded" became "the application generated an output string." I have also mapped "internalize factual knowledge" to "adjust parameter weights" and "It asserted it as genuine" to "The model generated tokens classifying the reference." I'm identifying all of these as a form of generalization.

Observing Conversational Methodology

I'm now homing in on agency slippage in the text, especially the interview-style descriptions. I note that there's a strong shift between rigorous descriptions of the AI and anthropomorphic references. The conversational framing of the text creates an environment that induces slippage. For example, the author moves from technical descriptions to conversational descriptions. This, I think, makes the AI's potential flaws more impactful to the reader.

I'm also considering the potential for the author to build relation-based trust by treating the AI as an interviewee, which, I think, builds trust even when the goal is to expose unreliability. I will further analyze the "obscured mechanics," particularly the abstraction to "the model."

Analyzing Accountability Structures

I'm now examining how accountability is structured within the article. I observe that the focus is strongly on the technology itself, with limited attention given to the developers and data annotators who created and deployed the system. The text leans on a cognitive bias argument to describe users. I'm noting the implications of this shift in responsibility. The framing seems to allow the developers to "get off the hook" by diffusing blame.

Synthesizing Findings

I'm now identifying dominant patterns: "Model as Interviewee/Agent" and "Cognitive Pathology" (e.g., hallucination). These frame the model as a flawed human, not a designed statistical tool. The conversational frame is a key load-bearing element. I'm focusing on how the interview method constructs the illusion of understanding. I'm also examining the rhetorical devices involved.

Defining Accountability

I'm now identifying the ethical implications of the anthropomorphic language in AI descriptions, such as the implications of "ChatGPT asserted" becoming "the model generated tokens". I also note the significance of naming OpenAI directly when analyzing its choices, while considering the appeal and resistance of the 'oracle' illusion. I have moved onto discourse communities to determine their differing needs. I am considering mechanistic and anthropomorphic vocabulary.

Analyzing Task 3

I am now focusing on refining the analysis of epistemic claims within the identified instances. I'm actively mapping each instance against the four identified criteria: verbs used, differentiating between "knowing" and "processing," the potential for a "curse of knowledge" in interpretation, and the underlying mechanistic processes. The focus is to provide a complete classification of the selected phrases.

Refining Epistemic Analysis

I'm currently engaged in a detailed classification of the selected phrases against four distinct criteria. My analysis involves a deep dive into the verbs employed, a careful distinction between "knowing" and "processing," and exploration of the potential "curse of knowledge" in interpretation. Simultaneously, I am evaluating the underlying mechanistic processes. The primary objective is to finalize a complete classification of all instances, with careful attention to the nuances between functional and intentional explanations.

Refining Analysis

I'm now iterating on Task 6, focusing on the strict word count and JSON format. I am ensuring that the word counts for each section will be in the correct range for each of the five parts of Task 6. I'm deep-diving into the 'Hallucinations' example, expanding on the projection of human pathology onto software error. I'm meticulously detailing the implications of ascribing a temporarily glitching mind to a statistical model. Further, I'm analyzing the "Hidden" accountability framing and justifying how it abstracts creation as a passive phenomenon. I am also expanding on the implications of ascribing epistemic possession.

Refining Agentive Descriptions

I'm now zeroing in on the interview-based framing and the resulting agentive descriptions, recognizing how this conversational approach enables agency slippage. The article subtly transitions from rigorous mechanistic descriptions to conversational interactions. The consequence is that the AI's "responses" and "assertions" take on more human-like qualities. I am also investigating the impact that the conversational framing has on building trust, and on whether that trust might be exploited to conceal unreliability.

Deconstructing Explanations

I'm now zeroing in on the cognitive pathology framing, specifically "curse of knowledge," where the author projects introspection onto the model. Phrases like "references are reconstructed" obscure OpenAI's role. I'm also examining the reliance on the "hallucination" metaphor, which maintains the illusion of an active agent with a temporary glitch. I'm noting how the rhetorical slippage enables a compelling human-vs-machine narrative.

Elaborating Explanations

I'm expanding my analysis, focusing on how the article shifts from functional and theoretical explanations to reason-based ones, particularly in the interview sections. The text attributes internal justification to AI, while obscuring human agency in data collection. This bidirectional agency flow, where the AI gains human verbs while human designers vanish, frames the AI as autonomous.

Refining Epistemic Analysis

I'm presently deconstructing how the article shifts from mechanistic descriptions to agentive attributions. The article begins with "optimized via next-token prediction" which correctly describes the system, but immediately slips into the epistemic by suggesting the AI "internalizes factual knowledge." This suggests a cognitive subject. I am also investigating the implications of a system that performs mathematical operations being framed as one that "knows".

Analyzing Epistemic Slippage

I'm currently dissecting the conflation of data processing and "knowing." The article slides from descriptions of token prediction into epistemic claims about internalizing factual knowledge. I'm focusing on "the model knows" versus "optimized with weights." This highlights the "curse of knowledge" where human understanding is projected onto the model's computations. I'm focusing on the effect the word choice has on the reader's understanding.


Analysis Provenance

Run ID: 2026-05-01-how-unique-are-hallucinated-citations-of-metaphor-1qr7vr
Raw JSON: 2026-05-01-how-unique-are-hallucinated-citations-of-metaphor-1qr7vr.json
Framework: Metaphor Analysis v6.5
Schema Version: 3.0
Generated: 2026-05-01T09:25:52.824Z

Discourse Depot © 2025 by TD is licensed under CC BY-NC-SA 4.0