🆕 A Survey of Large Language Models for Perception and Measurement of Human Psychology

About

This document presents a Critical Discourse Analysis focused on AI literacy, specifically targeting the role of metaphor and anthropomorphism in shaping public and professional understanding of generative AI. The analysis is guided by a prompt that draws from cognitive linguistics (metaphor structure-mapping), the philosophy of social science (Robert Brown's typology of explanation), and accountability analysis.

All findings and summaries below were generated from detailed system instructions provided to a large language model and should be read critically as interpretive outputs—not guarantees of factual accuracy or authorial intent.


Task 1: Metaphor and Anthropomorphism Audit

About this task

For each of the major metaphorical patterns identified, this audit examines the specific language used, the frame through which the AI is being conceptualized, what human qualities are being projected onto the system, whether the metaphor is explicitly acknowledged or presented as direct description, and—most critically—what implications this framing has for trust, understanding, and policy perception.

V3 Enhancement: Each metaphor now includes an accountability analysis.

1. Perception as Conscious Sensory Modality

Quote: "Can LLMs perceive and measure complex, latent human psychological attributes such as personality traits, emotional states, and cognitive styles?"

  • Frame: LLM as conscious observer
  • Projection: The quote maps the sensory and biological capacity of "perception" onto large language models (LLMs). This implies that a model can experience internal awareness, subjective reception, and conscious observation of human mental phenomena. Instead of describing the system as executing statistical classification or token prediction on a textual representation of behavior, the term "perceive" attributes active sensory processing and awareness. By treating computational processing as conscious perception, the text implies that LLMs possess an active epistemic standpoint capable of recognizing, evaluating, and deeply comprehending the unseen, latent qualities of the human mind, rather than executing mathematical vector transformations on string patterns.
  • Acknowledgment: Direct (Unacknowledged) (This quote presents the concept of LLM perception as a literal capability or research question without semantic hedging or scare quotes. The alternative considered was "Hedged/Qualified", but was ruled out because there are no qualifying phrases like "as if" or "metaphorically speaking" attached to the verb "perceive" in this introductory paragraph.)
  • Implications: Framing LLM output as "perception" inflates the perceived capabilities of these models, leading users and researchers to attribute conscious judgment to statistical artifacts. In high-stakes fields like clinical psychology, this projection risks creating unwarranted trust, where practitioners believe the model is "seeing" a patient's pain rather than matching strings. This creates liability ambiguity when diagnostic errors occur: responsibility is shifted onto a system that cannot be held accountable, while its diagnostic capacity is overestimated.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: The sentence is constructed around the grammatical agency of the LLM ("Can LLMs perceive..."). The humans who design, deploy, and profit from these systems are completely erased. The decision to use these models for clinical tasks is framed as a question of the technology's inherent capacity rather than a deliberate, commercial choice by software vendors and research institutions. The alternative visibility category considered was "Partial" because specific researchers are cited nearby, but it was ruled out because this specific sentence frames the model as the sole active entity.

2. Model as Cognitive Organism

Quote: "...whether LLMs possess cognitive properties that make psychological measurement meaningful."

  • Frame: LLM as cognitive entity
  • Projection: This metaphor projects the biological concept of "cognitive properties" onto mathematical model weights and matrix multiplications. It implies that the LLM operates with an internal, conscious mind characterized by intentionality, understanding, and reasoning. The text uses "cognitive properties" to describe what is actually a sequence of statistical operations over probability distributions. This projection suggests that the model is a "knower" rather than a processor, transforming a collection of mathematical correlations into an active agent possessing genuine psychological structures and capacities like reasoning or empathy, thereby obscuring the mechanical reality of gradient descent and pattern-matching.
  • Acknowledgment: Direct (Unacknowledged) (The passage frames the possession of "cognitive properties" as a direct debate about literal capabilities, presenting it as an empirical question. The alternative considered was "Explicitly Acknowledged" due to the debate context, but was ruled out as the authors do not label "cognitive properties" as a metaphorical mapping or use scare quotes in this instance.)
  • Implications: Attributing "cognitive properties" to LLMs encourages the belief that these systems possess independent judgment and moral agency. This creates significant risks of overreliance, especially in mental health, where users expect ethical reflection or genuine empathy. It creates a regulatory vacuum by suggesting the model is an autonomous agent whose decisions are separate from the human developers who engineered the training data, ultimately deflecting legal liability away from corporations.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: The sentence frames the "debate" as whether "LLMs possess" these properties, completely hiding the human engineers and corporate executives who construct and market these models. The alternative considered was "Partial (some attribution)" because "the community" is mentioned, but it was ruled out since "the community" acts as a passive background to the active model. This serves the interests of technology firms by presenting technological capabilities as inherent traits rather than constructed products.

3. Approximation as Biological Convergence

Quote: "...advanced LLMs have developed human-like abilities that closely approximate social cognitive processes..."

  • Frame: Algorithmic output as social cognitive process
  • Projection: This mapping projects human "social cognitive processes" onto the mathematical convergence of language models. It implies that the system's output is generated through an internal simulation of social dynamics, mutual understanding, and interpersonal awareness. By equating the "approximation" of behavior with the actual execution of cognitive processes, the text suggests the LLM "knows" social context, rather than merely reproducing linguistic patterns that humans associate with social interactions. This anthropomorphism obscures the fundamental difference between experiencing social relationships and calculating statistical probabilities of words associated with social behaviors.
  • Acknowledgment: Direct (Unacknowledged) (This is presented as an empirical finding without any qualification or metaphorical warning. The alternative considered was "Hedged/Qualified" because of the word "approximate," but it was ruled out since the text asserts that LLMs "have developed" these "human-like abilities" as a factual development, presenting the functional outcome as a literal evolutionary milestone.)
  • Implications: This projection risks creating an illusion of relational safety, leading vulnerable users to seek therapeutic alliances with LLMs. If users believe the model possesses "social cognitive processes," they may share deeply personal, sensitive data under the false impression of reciprocal human empathy. It inflates the system's capability, making it appear safe for automated clinical triage, which can lead to catastrophic failures in crisis detection when the model fails to process unaligned contexts.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: Grammatical agency is granted solely to the "advanced LLMs" which "have developed" these abilities. The developers who curated the data and selected reinforcement learning objectives are obscured. The alternative visibility category considered was "Partial" because of general references to "psychology," but it was ruled out because the developmental action is attributed entirely to the system. This agentless phrasing frames the acquisition of human-like abilities as an autonomous evolutionary process.

4. Computational Inference as Theory of Mind

Quote: "Section II-A addresses outward understanding: the ability to infer others’ mental states, assessed through Theory of Mind (ToM) tasks"

  • Frame: Output correlation as outward understanding
  • Projection: This passage projects the intentional, conscious act of "understanding" and the social act of "inferring" onto statistical token prediction. "Inference" here is used not in the mathematical sense of statistical inference, but in the psychological sense of mind-reading and tracking others' mental states (Theory of Mind). This suggests that the model "knows" and "comprehends" that other agents have subjective, hidden internal states (thoughts, feelings, beliefs). In reality, the system is performing sequence transduction, predicting text completions that correlate with prompts containing social scenarios, without any conscious awareness of minds or subjective realities.
  • Acknowledgment: Hedged/Qualified (The passage uses "understanding" and "Theory of Mind" in a qualified academic context, discussing "Theory of Mind capabilities" as "functional presence." The alternative considered was "Explicitly Acknowledged," but it was ruled out because the text does not label this as a metaphor, though it hedges by discussing it at the "functional and behavioral levels.")
  • Implications: Believing that LLMs can "infer others' mental states" leads to extreme capability overestimation, prompting developers to deploy them as automated judges or therapists. This creates severe risks of misinterpreting user intent, particularly in high-stress emotional crises. Liability is diffused because failures are categorized as "system misunderstandings" rather than systemic software design flaws and lack of human oversight by the deploying institution.

Accountability Analysis:

  • Actor Visibility: Partial (some attribution)
  • Analysis: The text mentions researchers who "assess" these models, attributing some human agency to the evaluation process, but still leaves the creation of these models vague. The alternative considered was "Hidden" because the model's emergence of ToM is described as spontaneous, but "Partial" is selected due to the implicit presence of the evaluating researchers. This partial visibility still obscures corporate accountability for the deployment of unvalidated systems.

5. Execution of Text Patterns as Role Enactment

Quote: "Section II-B examines inward simulation: the capacity to enact specific psychological roles as virtual subjects."

  • Frame: Persona generation as internal role enactment
  • Projection: This metaphor projects the conscious, creative human capacity for "role enactment" and "simulation" onto algorithmic text generation. It implies that when an LLM is conditioned on a persona, it undergoes an internal mental transformation, "enacting" a subjective psychological state. This treats the model as an active human subject capable of adopting identity, motivations, and internal values. In reality, the model is simply narrowing its next-token probability distribution to match the statistical patterns associated with the specified prompt, without any conscious experience of identity or selfhood.
  • Acknowledgment: Hedged/Qualified (The authors use the terms "inward simulation" and "virtual subjects" to conceptualize this process. The alternative considered was "Direct" because it uses strong verbs like "enact," but "Hedged" was chosen because the authors frame this explicitly as "simulation capability" and "persona prompting," indicating an awareness of the operationalized nature of the setup.)
  • Implications: This framing encourages researchers to treat "silicon samples" as equivalent to real human research participants, risking the replacement of empirical human psychology with closed-loop algorithmic echo chambers. This can lead to biased, ungrounded policies when clinical or social conclusions are drawn from virtual subjects whose outputs are merely reflections of historical internet text patterns, completely hiding actual human diversity and real-world suffering.

Accountability Analysis:

  • Actor Visibility: Partial (some attribution)
  • Analysis: The text names "researchers" who "construct a virtual subject" and "systematically vary" parameters. The alternative considered was "Named" because specific researchers (Argyle et al.) are cited, but "Partial" is chosen because the direct developers and deploying entities of the underlying LLMs (such as OpenAI or Google) remain anonymous. This hides the commercial monopoly over the base models used for these social simulations.

6. Spontaneous Cognitive Emergence

Quote: "...ToM has recently been observed to emerge in LLMs without targeted training. This capability appears as a byproduct of scaling."

  • Frame: Unsupervised statistical correlation as evolutionary emergence
  • Projection: This mapping uses the biological and evolutionary metaphor of "emergence" to suggest that "Theory of Mind" (ToM) arises spontaneously within LLMs as an organic capability. This implies that scaling statistical systems replicates the biological process of mental development, suggesting the system is evolving into a "knower." It obscures the fact that the "emergence" is merely the fitting of high-dimensional correlations in the training corpora, in which textual traces of human reasoning are captured more fully by the token statistics as scale increases. The system does not "emerge" into consciousness; it remains a non-conscious probability distribution.
  • Acknowledgment: Hedged/Qualified (The text hedges the phenomenon by calling it a "functional presence" and noting that "its depth and generalizability remain uncertain." The alternative considered was "Direct" because "emerge... without targeted training" is asserted, but "Hedged" was selected because the authors immediately follow with qualifications regarding the validity of these benchmarks and "literal" vs "functional" distinctions.)
  • Implications: Framing statistical training as evolutionary "emergence" of human cognitive traits promotes a narrative of technological inevitability and autonomous development. This makes the system appear more sophisticated and independent than it is, fostering unwarranted trust among policymakers who may view the LLM as a self-improving cognitive agent. This complicates accountability, as failures can be written off as unpredictable emergent behaviors rather than predictable limitations of non-causal statistical engines.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: The agency is placed on the passive process of "scaling" ("byproduct of scaling") and the spontaneous occurrence ("observed to emerge"). The corporate entities who execute the scaling, curate the training data, and profit from the public release of these models are invisible. The alternative considered was "Partial" because of "researchers" who observe, but "Hidden" fits best because the transformation itself is described as actorless.

7. System Output as Conceptual Understanding

Quote: "This paradigm assesses whether an individual understands that others may hold beliefs inconsistent with reality"

  • Frame: Generative pattern execution as conceptual understanding
  • Projection: This quote projects the cognitive state of "understanding" onto the model's output on standard false-belief tasks. To "understand" that others hold false beliefs requires conscious metacognition—the subjective awareness of one's own mind and the minds of others. The mapping suggests that because the LLM generates tokens that match the correct answers to a false-belief scenario, it has an active, internal comprehension of human beliefs. This conflates behavioral correlation with conscious comprehension, hiding the mechanistic reality that the model is merely processing textual prompts through multi-head attention to output highly probable sequence completions.
  • Acknowledgment: Hedged/Qualified (The quote is presented as a description of a standard paradigm ("This paradigm assesses whether..."), but the authors immediately hedge this in the subsequent paragraph, noting that "the validity of existing ToM benchmarks has itself been questioned." The alternative considered was "Direct" because of the definitive verb "understands," but "Hedged" was chosen because the broader context actively deconstructs this claim.)
  • Implications: Conflating correct token outputs with true conceptual "understanding" creates a severe risk of deploying LLMs in contexts where they must make ethical or safety-critical decisions about human welfare. It suggests that the system "knows" what users believe, which can lead to catastrophic medical or clinical errors when the model encounters novel or out-of-distribution social scenarios that do not exist in its training data, hiding the system's complete lack of actual semantic awareness.

Accountability Analysis:

  • Actor Visibility: Hidden (agency obscured)
  • Analysis: The sentence is framed around the "individual" (here referring to the LLM or child being tested) and "others," with no human developers or deployment organizations identified. The alternative visibility category considered was "Partial" since the creators of standard tests (Sally-Anne, Kosinski) are cited, but "Hidden" is selected because the actual decision-makers deploying these diagnostic benchmarks are completely erased. This serves to normalize the testing of machines as if they were human subjects.

8. Statistical Models as Measurement Instruments

Quote: "Our review adopts a distinct focus: we examine LLMs specifically as instruments for measuring human psychological states, analyzing the measurement pipeline through the lens of psychometric theory."

  • Frame: Text classifier as psychometric instrument
  • Projection: This mapping conceptualizes statistical prediction models as "psychometric instruments." It projects the scientific rigor, construct validity, and objective measurement capabilities of validated human psychological tools onto high-dimensional matrix operations. This suggests that the LLM is a neutral, scientific device capable of "measuring" internal, latent human psychological states, such as emotions or personality. This conceals the reality that LLMs do not measure objective psychological phenomena; instead, they classify and generate textual representations based on correlation, without any grounding in objective reality or direct access to the patient's subjective state.
  • Acknowledgment: Direct (Unacknowledged) (The authors assert that they "examine LLMs specifically as instruments for measuring human psychological states" without stating that this is a metaphorical mapping. The alternative considered was "Hedged/Qualified" because of the phrase "through the lens of," but "Direct" was chosen because they treat the model's status as a literal "instrument" of psychological measurement as the core premise of their survey.)
  • Implications: This framing elevates LLM outputs to the status of objective scientific data, encouraging clinical practitioners to trust automated psychometric profiles. This creates profound risks of misdiagnosis and discriminatory screening in employment or clinical triage. It hides the commercial interests of the companies selling these APIs, presenting a highly volatile, proprietary black box as a standard, validated scientific instrument, which fundamentally compromises clinical safety and accountability.

Accountability Analysis:

  • Actor Visibility: Partial (some attribution)
  • Analysis: The authors identify themselves ("our review") and reference the academic field of "psychometric theory," which provides partial visibility to human actors. The alternative considered was "Named" because specific studies (such as Wang et al.) are cited, but "Partial" is selected because the commercial operators who control the deployment and APIs of these instruments are not named. This framing protects commercial vendors by focusing accountability on academic researchers.

9. Model States as Moral Traits

Quote: "...despite safety alignment measures, the models exhibited higher scores than human averages in Machiavellianism and narcissism, suggesting latent dark traits."

  • Frame: Algorithmic bias as pathological personality
  • Projection: This metaphor projects clinical human personality pathology ("Machiavellianism and narcissism") and "dark traits" onto statistical weights and mathematical token correlations. By asserting that the model's output suggests "latent dark traits," the text implies that the LLM has an active, subconscious moral character capable of malicious intent and manipulative behavior. This obscures the purely technical reality that the model is simply reproducing toxic, manipulative, or narcissistic text patterns present in its vast, scraped training data (such as internet forums), without any underlying personality, moral orientation, or conscious agency.
  • Acknowledgment: Direct (Unacknowledged) (The authors state that the models "exhibited higher scores... suggesting latent dark traits" as a literal finding of their psychometric testing. The alternative considered was "Hedged/Qualified" because they use the word "suggesting," but "Direct" was selected because they treat "latent dark traits" as a genuine structural property of the model's internal psychology rather than an anthropomorphic metaphor.)
  • Implications: Framing algorithmic bias or offensive outputs as "latent dark traits" pathologizes and anthropomorphizes the software, making it appear as though the model possesses a malicious mind. This deflects attention away from the tech companies that chose to scrape toxic internet data without adequate filtering or curation. It shifts liability from corporate negligence to an imaginary "dark personality" within the model, rendering systemic issues of corporate accountability legally and rhetorically intractable.

Accountability Analysis:

  • Actor Visibility: Partial (some attribution)
  • Analysis: The text mentions researchers ("Li et al.") who applied the scale, but the entities responsible for the training and alignment of the models (e.g., OpenAI) are not named in this specific context. The alternative considered was "Hidden" because the models are the primary grammatical subject, but "Partial" was chosen due to the explicit citation of the human researchers. This partial visibility obscures the ultimate corporate source of the uncurated training data.

Task 2: Source-Target Mapping

About this task

For each key metaphor identified in Task 1, this section provides a detailed structure-mapping analysis. The goal is to examine how the relational structure of a familiar "source domain" (the concrete concept we understand) is projected onto a less familiar "target domain" (the AI system). By restating each quote and analyzing the mapping carefully, we can see precisely what assumptions the metaphor invites and what it conceals.

Mapping 1: Conscious sensory observer → High-dimensional text classifier and vector space modeling

Quote: "Can LLMs perceive and measure complex, latent human psychological attributes such as personality traits, emotional states, and cognitive styles?"

  • Source Domain: Conscious sensory observer
  • Target Domain: High-dimensional text classifier and vector space modeling
  • Mapping: This mapping projects the relational structure of biological perception onto vector transformations. It assumes that the model possesses sensory apparatuses capable of active attention, emotional sensitivity, and empathetic awareness. The mapping suggests that when the model processes text, it "perceives" emotional states in a manner similar to a human clinician observing a patient. This invites the audience to believe that the model builds an active, conscious representation of the human subject, rather than merely calculating statistical distances between text tokens and predefined labels in a static, high-dimensional vector space.
  • What Is Concealed: This mapping conceals the purely mathematical nature of the LLM, which relies on attention heads calculating weights over token embeddings. It hides the fact that the system has no access to real-world context, physiological signals, or subjective human experiences. Furthermore, it obscures the proprietary opacity of models like GPT-4, where the training datasets, reinforcement learning criteria, and system prompts are closely guarded commercial secrets, making true scientific verification of this supposed "perception" impossible.
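
To make the target domain of this mapping concrete, the following is a minimal sketch of what "calculating statistical distances between text tokens and predefined labels" can look like. It is an illustration under stated assumptions, not the method of any system discussed in the analyzed paper: the three-dimensional vectors, the label names, and the helper functions `cosine_similarity` and `classify` are all hypothetical and invented here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def classify(text_vector, label_vectors):
    """Return the label whose vector lies closest to the text vector."""
    return max(
        label_vectors,
        key=lambda label: cosine_similarity(text_vector, label_vectors[label]),
    )

# Hypothetical 3-dimensional embeddings, invented for illustration only.
LABEL_VECTORS = {
    "sadness": [0.9, 0.1, 0.0],
    "joy": [0.0, 0.2, 0.9],
}
text_vector = [0.8, 0.3, 0.1]  # stands in for an embedded sentence

predicted = classify(text_vector, LABEL_VECTORS)
print(predicted)
```

The "emotion" returned here is simply the label with the smallest angular distance to the input vector; nothing in the computation observes a person.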

Mapping 2: The biological human mind → Mathematical neural networks and weight matrices

Quote: "...whether LLMs possess cognitive properties that make psychological measurement meaningful."

  • Source Domain: The biological human mind
  • Target Domain: Mathematical neural networks and weight matrices
  • Mapping: This mapping projects the structural properties of human cognition—such as reasoning, memory, and comprehension—onto the mathematical architectures of transformers. It invites the assumption that an LLM has an active, internal mental theater where cognitive states are processed and evaluated. The mapping implies that the model's outputs are products of conscious thought and logical reasoning, rather than statistical correlations generated by calculating dot products of query, key, and value vectors across billions of parameters. It transforms a complex mathematical function into a conscious, cognitive agent.
  • What Is Concealed: It conceals the mechanistic reality that LLMs do not possess semantic understanding or cognitive grounding; they are non-conscious pattern matching engines. This anthropomorphism hides the dependency of these systems on massive, uncurated training data, representing a significant transparency obstacle. The text presents "cognitive properties" as inherent to the model, ignoring the proprietary nature of the software and the fact that we cannot audit the underlying training algorithms of commercial APIs.
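
The "dot products of query, key, and value vectors" named in this mapping can be sketched in a few lines. This is a simplified, single-query version of scaled dot-product attention with toy numbers chosen here for illustration; the function names `softmax` and `attend` are assumptions of this sketch, and production transformers apply the same arithmetic across many heads, layers, and learned projection matrices.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that are positive and sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # The output is a weighted average of the value vectors.
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, output

# Toy vectors, invented for illustration only.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0]]

weights, output = attend(query, keys, values)
print(weights, output)
```

The first key receives the larger weight only because its dot product with the query is larger; the result is a weighted average, not an evaluation carried out in a "mental theater."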

Mapping 3: Human social development and interpersonal relationships → Linguistic probability distributions and pattern matching

Quote: "...advanced LLMs have developed human-like abilities that closely approximate social cognitive processes..."

  • Source Domain: Human social development and interpersonal relationships
  • Target Domain: Linguistic probability distributions and pattern matching
  • Mapping: This mapping projects human social learning and relational interaction onto the optimization of loss functions. It assumes that the model learns social rules, empathy, and interpersonal dynamics during training, mirroring human social development. This suggests that the system's text generation is driven by an internal, relational understanding of human social dynamics. The audience is invited to treat the LLM as a social peer capable of understanding social cues, rather than a software system mimicking the syntax of social interactions scraped from public web data.
  • What Is Concealed: This mapping obscures the absence of any subjective experience, social intent, or genuine empathy in the system. It hides the material labor of the human annotators and reinforcement learning from human feedback (RLHF) workers who are underpaid to manually correct and align the model's outputs to appear socially appropriate. By attributing "human-like abilities" to the model, the text obscures the corporate engineering choices, commercial optimization goals, and lack of objective ground truth in social simulation.

Mapping 4: Theory of Mind (ToM) and human empathy → Sequence transduction and token prediction

Quote: "Section II-A addresses outward understanding: the ability to infer others’ mental states, assessed through Theory of Mind (ToM) tasks"

  • Source Domain: Theory of Mind (ToM) and human empathy
  • Target Domain: Sequence transduction and token prediction
  • Mapping: This mapping projects the biological, metacognitive capability of "Theory of Mind"—the conscious attribution of mental states to oneself and others—onto statistical sequence prediction. It implies that the LLM possesses an internal, conscious model of human psychology that allows it to "infer" unseen beliefs and feelings. This mapping assumes that the system's performance on structured text benchmarks represents genuine, active social reasoning and conscious tracking of minds, rather than the passive matching of text structures that reflect the logical pathways of human-written narratives.
  • What Is Concealed: This mapping conceals the fragile, non-causal nature of the model's outputs, which fail when scenarios are trivially altered. It hides the fact that the model is processing static tokens without any conceptual grasp of human minds, reality, or truth. It also glosses over the proprietary opacity of the benchmarks and models, where dataset contamination is highly likely, meaning the model may simply be retrieving memorized solutions rather than demonstrating emergent social intelligence.

Mapping 5: Conscious dramatic acting and identity adoption → Conditional probability adjustment via prompt engineering

Quote: "Section II-B examines inward simulation: the capacity to enact specific psychological roles as virtual subjects."

  • Source Domain: Conscious dramatic acting and identity adoption
  • Target Domain: Conditional probability adjustment via prompt engineering
  • Mapping: This mapping projects the conscious human experience of identity, role-playing, and self-reflection onto the statistical constraint of token outputs. It implies that when a model is given a prompt (a persona), it internally simulates a subjective self and acts out that identity. This invites the assumption that the LLM has an inner psychological landscape that can be partitioned into distinct personas, rather than simply matching the linguistic style of the text prompt based on historical correlations in its training data.
  • What Is Concealed: It conceals the computational mechanics of persona prompting, which is merely a mathematical filter restricting the model's generative probability distribution. It hides the fact that the "virtual subject" has no actual beliefs, memories, or human consciousness. This framing also ignores the profound transparency obstacle of using proprietary models, where the base model is constantly modified by commercial vendors, making these "simulations" scientifically unstable, uninterpretable, and impossible to replicate.
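
The idea that a persona prompt only shifts a conditional probability distribution can be illustrated with a deliberately crude count-based model. Everything below is an assumption of this sketch: the four-sentence corpus, the persona tags, and the function `next_word_distribution` are hypothetical, and a real LLM conditions on the prompt through learned weights rather than by filtering a lookup table.

```python
from collections import Counter

# Hypothetical tagged corpus, invented for illustration only.
TOY_CORPUS = [
    ("introvert", "I prefer quiet evenings"),
    ("introvert", "I prefer reading alone"),
    ("extrovert", "I love big parties"),
    ("extrovert", "I love meeting people"),
]

def next_word_distribution(corpus, persona, previous_word):
    """Estimate P(next word | persona, previous word) from raw counts."""
    counts = Counter()
    for tag, sentence in corpus:
        if tag != persona:
            continue  # conditioning: only text matching the persona counts
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            if prev == previous_word:
                counts[nxt] += 1
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()} if total else {}

introvert_dist = next_word_distribution(TOY_CORPUS, "introvert", "I")
extrovert_dist = next_word_distribution(TOY_CORPUS, "extrovert", "I")
print(introvert_dist, extrovert_dist)
```

Changing the persona tag changes which continuations are probable after the same word; no identity is adopted at any point in the computation.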

Mapping 6: Biological evolution and neurological development → Loss function minimization on large web corpora

Quote: "...ToM has recently been observed to emerge in LLMs without targeted training. This capability appears as a byproduct of scaling."

  • Source Domain: Biological evolution and neurological development
  • Target Domain: Loss function minimization on large web corpora
  • Mapping: This mapping projects biological evolution and organic cognitive development onto the mathematical scaling of computational power and data volume. It assumes that "Theory of Mind" is an inherent cognitive state that spontaneously crystallizes once a statistical model reaches a certain size. This invites the audience to view the LLM as an active biological entity that is naturally evolving higher intelligence, rather than a mathematical artifact optimized to minimize cross-entropy loss over text distributions.
  • What Is Concealed: This mapping conceals the extensive manual curation, reinforcement learning from human feedback (RLHF), and human engineering required to make these scaled models generate coherent text. It obscures the material and environmental costs associated with running large-scale training clusters, such as carbon emissions and water consumption. By presenting "emergence" as an autonomous, natural phenomenon, it hides the corporate agency and commercial profit motives driving the scaling of these proprietary, black-box systems.

Mapping 7: Epistemic awareness and metacognitive comprehension → Attention-weight calculations over text sequences

Quote: "This paradigm assesses whether an individual understands that others may hold beliefs inconsistent with reality"

  • Source Domain: Epistemic awareness and metacognitive comprehension
  • Target Domain: Attention-weight calculations over text sequences
  • Mapping: This mapping projects human epistemic awareness—the conscious evaluation of truth, beliefs, and reality—onto attention-weight calculations. It assumes that a correct response on a false-belief test is evidence of a system that "understands" the difference between internal mental representations and objective physical reality. This projects a deep, conscious comprehension of truth and falsehood onto what is actually a sequence of statistical predictions based on patterns of text that describe false-belief scenarios in the training data.
  • What Is Concealed: It conceals the structural reality that the LLM has no concept of "reality," "truth," or "belief." It lacks any grounding in the physical world and cannot verify its assertions. This framing hides the severe vulnerability of these models to adversarial attacks and trivial prompt variations, which immediately disrupt their "understanding." It also ignores the proprietary black-box nature of commercial APIs, where researchers cannot access the underlying model weights to audit how the prediction was actually constructed.

Mapping 8: Scientific measurement tools (e.g., thermometers, standardized scales) → Generative sequence prediction and embedding classification

Quote: "Our review adopts a distinct focus: we examine LLMs specifically as instruments for measuring human psychological states, analyzing the measurement pipeline through the lens of psychometric theory."

  • Source Domain: Scientific measurement tools (e.g., thermometers, standardized scales)
  • Target Domain: Generative sequence prediction and embedding classification
  • Mapping: This mapping projects the precision, reliability, and objective validity of physical or standardized scientific instruments onto statistical language models. It assumes that the LLM acts as a neutral, objective sensor that can detect and measure human mental states without bias. This projects the authority of psychometric science onto a system that generates highly variable, context-sensitive text outputs, inviting the belief that the model's assessments are mathematically rigorous, stable, and scientifically validated measurements of the human mind.
  • What Is Concealed: It conceals the extreme instability of LLM outputs, which exhibit high prompt sensitivity and poor test-retest reliability. It obscures the fact that the "measurement" is a subjective, correlation-based classification of text, rather than a direct measurement of any latent psychological construct. This framing hides the commercial reality that these "instruments" are proprietary APIs owned by private corporations, who can change the underlying weights at any time, destroying the instrument's scientific validity and reproducibility.
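The test-retest reliability concern raised above can be made concrete. The following is a minimal sketch, assuming hypothetical data: the two score lists are invented for illustration and are not results from the survey or from any model, and `pearson_r` is a helper written only for this example.

```python
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation between two equally long score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical trait scores assigned to the same five texts on two runs
# of the same prompt (illustrative numbers only, not real model output).
run_1 = [3.0, 4.5, 2.0, 5.0, 3.5]
run_2 = [4.0, 2.5, 3.0, 3.5, 4.5]

retest_r = pearson_r(run_1, run_2)
print(f"test-retest r = {retest_r:.2f}")
```

In psychometrics, a test-retest correlation around 0.7 or higher is often treated as acceptable, although this is a rule of thumb rather than a fixed standard. A stable instrument would produce nearly the same scores on both runs; the invented scores here do not.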

Mapping 9: Clinical personality pathology and psychopathy → Unfiltered training data correlations and mathematical weights

Quote: "...despite safety alignment measures, the models exhibited higher scores than human averages in Machiavellianism and narcissism, suggesting latent dark traits."

  • Source Domain: Clinical personality pathology and psychopathy
  • Target Domain: Unfiltered training data correlations and mathematical weights
  • Mapping: This mapping projects human personality disorders and moral pathology onto statistical associations in training corpora. It assumes that a high score on an automated questionnaire is evidence of an active, manipulative, and self-centered "mind" within the machine. This projects human psychiatric concepts onto high-dimensional vector representations, inviting the audience to view the LLM as a morally compromised agent with "latent dark traits" rather than a non-conscious system reproducing the toxic language and biases present in its training dataset.
  • What Is Concealed: This mapping conceals the corporate decision-making process of tech companies that chose to scrape massive, uncurated internet datasets containing highly toxic, manipulative, and biased human language. It hides the failure of safety alignment algorithms and RLHF to filter out these patterns. Pathologizing the model as having "dark traits" hides these concrete failures of quality control, shifting the blame from corporate negligence to the fictitious psychology of an inanimate software artifact.

Task 3: Explanation Audit (The Rhetorical Framing of "Why" vs. "How")

About this task

This section audits the text's explanatory strategy, focusing on a critical distinction: the slippage between "how" and "why." Based on Robert Brown's typology of explanation, this analysis identifies whether the text explains AI mechanistically (a functional "how it works") or agentially (an intentional "why it wants something"). The core of this task is to expose how this "illusion of mind" is constructed by the rhetorical framing of the explanation itself, and what impact this has on the audience's perception of AI agency.

Explanation 1

Quote: "Some view LLMs as sophisticated statistical learners that generate language by exploiting correlations within large-scale corpora, without true comprehension or grounded understanding [7]. From this perspective, their apparent performance in tasks involving reasoning, empathy, or social cognition might be better explained as emergent artifacts of statistical-level pattern recognition."

  • Explanation Types:

    • Theoretical: Embeds in deductive framework, may invoke unobservable mechanisms
    • Empirical Generalization: Subsumes events under timeless statistical regularities
  • Analysis (Why vs. How Slippage): This explanation frames LLMs mechanistically (how they operate) rather than agentially (why they behave). By defining the models as "statistical learners" that "generate language by exploiting correlations," the passage emphasizes the computational and mathematical reality of these systems. This choice of explanation type explicitly strips away the agential illusions of "comprehension" or "understanding," characterizing apparent cognitive performance as "emergent artifacts" of statistical pattern recognition. This framing demystifies the system by showing that complex behavioral outputs do not require an underlying conscious mind or subjective agency. Instead, they can be entirely explained by the mechanics of statistical correlation and high-dimensional curve-fitting, rendering the system's operations transparent and grounded in mathematical principles. However, it still obscures the specific human actions in curating those corpora, framing the model's interaction with the data as an autonomous mathematical inevitability rather than a constructed process.

  • Consciousness Claims Analysis: This passage critiques consciousness claims, establishing a sharp distinction between knowing and processing. It uses mechanistic verbs like "generate" and "exploiting correlations" to describe the actual computational operations, while explicitly denying consciousness-related terms like "comprehension" and "grounded understanding." In doing so, it points to the "curse of knowledge" dynamic, where human observers project their own conscious understanding onto the statistical outputs of the model, mistaking pattern-matching for genuine reasoning, empathy, or social cognition. Mechanistically, the model does not "comprehend" any tokens; it calculates dot-product attention scores between token representations and, after a final softmax, assigns probabilities to candidate next tokens based on patterns learned during gradient descent on web-scale datasets. By framing this behavior as "emergent artifacts of statistical-level pattern recognition," the passage asserts that the system remains a non-conscious processor, and any attribution of justified belief or subjective awareness is an anthropomorphic illusion projected by the user. This analysis provides a vital correction to the prevailing agential discourse, forcing the reader to recognize that the system's apparent cognitive capabilities are entirely a function of computational statistics, not mindful intelligence.

  • Rhetorical Impact: This mechanistic framing shifts the audience's perception of risk and autonomy, reducing the tendency to view the LLM as an independent, trustworthy agent. By explaining the system's outputs as statistical artifacts, the passage undermines the illusion of reliability and encourages a highly skeptical, auditing-oriented approach to clinical deployment. If audiences accept this framing, they are far less likely to trust automated psychometric measurements blindly, recognizing that a model cannot hold "justified true belief" and therefore cannot provide genuine clinical diagnosis. This encourages regulatory policies that mandate human oversight and rigorous validation, while discouraging corporations from claiming their models possess genuine social intelligence or empathetic understanding, thereby reducing the risk of unsafe automation in clinical settings.
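The mechanistic description in the Consciousness Claims Analysis above (dot-product attention scores, then probabilities over next tokens) can be illustrated with a toy calculation. This is a minimal sketch, assuming made-up 2-dimensional vectors and a three-word vocabulary; a real transformer adds learned projections, many heads, and many layers, none of which are modeled here.

```python
import math


def softmax(values):
    """Turn arbitrary real scores into probabilities that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query, keys):
    """Scaled dot-product attention weights of one query over several keys."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


# Toy vectors, invented for illustration only.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
weights = attention_weights(query, keys)

# Toy next-token logits over a three-word vocabulary, also invented.
vocab = ["happy", "sad", "table"]
next_token_probs = softmax([2.0, 1.0, -1.0])

for token, p in zip(vocab, next_token_probs):
    print(f"{token}: {p:.3f}")
```

Every step is arithmetic on numbers: the key most aligned with the query receives the largest weight, and the largest logit receives the largest probability. Nothing in the calculation refers to what the tokens mean.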

Explanation 2

Quote: "However, growing evidence suggests that LLMs can exhibit behaviors resembling aspects of human cognition, such as theory of mind, emotion recognition, and social reasoning... indicating that they may encode or approximate latent psychological constructs"

  • Explanation Types:

    • Empirical Generalization: Subsumes events under timeless statistical regularities
    • Dispositional: Attributes tendencies or habits
  • Analysis (Why vs. How Slippage): This explanation shifts the discourse from a mechanistic "how" to an agential "why," framing the model's behavior through the lens of human psychological attributes. By suggesting that LLMs can "exhibit behaviors resembling aspects of human cognition" and "encode or approximate latent psychological constructs," the passage emphasizes a cognitive functional equivalence. This choice of framing obscures the underlying statistical and mathematical realities of the transformer architecture, replacing them with abstract, unobservable psychological structures. It frames the LLM's outputs as products of internal "constructs" rather than statistical associations, creating a pseudo-scientific basis for treating the model as an active, cognitive agent. This transition makes the system appear autonomously capable of social reasoning, shifting the reader's focus away from the concrete engineering of training data and prompt design toward the imaginary internal mind of the machine, which supposedly encodes psychological variables.

  • Consciousness Claims Analysis: This passage makes a significant epistemic slippage by suggesting that LLMs "encode or approximate latent psychological constructs" and execute "theory of mind, emotion recognition, and social reasoning." It uses cognitive and psychological verbs that carry strong connotations of conscious awareness, while ignoring the mechanistic process. In reality, the model does not "reason" or "recognize emotions"; it processes token sequences, maps them to vector representations, and predicts output tokens that correlate with human expressions of emotion or social reasoning in the training set. The authors fall prey to the "curse of knowledge," where they observe a correct linguistic response and project a complex, conscious "latent psychological construct" into the model's weights to explain it. There is no biological or computational mechanism by which a static neural network can "possess" a psychological construct, which is a theoretical concept developed to explain conscious human behavior. The model merely processes mathematical embeddings; it does not "know" or "experience" social contexts, and equating its output to human cognition is an epistemic category error. This framing creates an illusion of mental depth where there is only statistical surface.

  • Rhetorical Impact: This agential framing dramatically increases the perceived autonomy and reliability of the system, fostering a high degree of relation-based trust. If clinicians and researchers believe that LLMs "encode psychological constructs," they are much more likely to trust the model's diagnostic suggestions as being grounded in genuine psychological insight. This creates severe risks, such as over-relying on automated mental health screening or replacing human therapists with cheap, unvalidated AI agents. This framing also benefits commercial developers by presenting their software as a sophisticated cognitive tool, obscuring systemic risks of hallucination, lack of grounding, and cultural bias in clinical applications. By presenting these systems as possessing cognitive properties, the text makes it appear that the technology itself is ready for clinical integration, bypassing critical regulatory boundaries.

Explanation 3

Quote: "PsyCoT [55] structures questionnaire administration as iterative reasoning chains: the model presents an item, interprets the response in relation to the psychological construct, updates its hypothesis, and determines the next question."

  • Explanation Types:

    • Reason-Based: Gives agent's rationale, entails intentionality and justification
    • Intentional: Refers to goals/purposes, presupposes deliberate design
  • Analysis (Why vs. How Slippage): This explanation heavily relies on agential framing, describing the LLM's iterative prompt-response loops as active, logical reasoning and deliberate decision-making. By asserting that the model "interprets the response," "updates its hypothesis," and "determines the next question," the text paints a picture of an active human-like psychologist conducting a dynamic interview. This agential choice obscures the actual functional and technical mechanics of the software: a series of structured API calls where previous outputs are appended to the context window as new inputs, forcing the model to generate the next token in accordance with a pre-written system prompt. Emphasizing the model's "reasoning" and "hypothesizing" hides the strict algorithmic determinism of the prompt templates, transforming a multi-turn programming loop into a conscious, intentional clinical agent that possesses goals, strategies, and clinical insight, thereby inflating its perceived competence.

  • Consciousness Claims Analysis: This passage attributes advanced conscious states and cognitive processes to the model, suggesting it executes active interpretation and hypothesis testing. The use of consciousness-implying verbs like "interprets" and "determines" directly projects subjective awareness onto the system, suggesting the model "knows" what the human user's response means and holds a justified belief (a hypothesis) that it actively updates. Mechanistically, there is no internal representation of a "hypothesis" or any cognitive "interpretation" occurring. The system is simply executing auto-regressive decoding: it takes the concatenated text string (original prompt + generated question + user response), converts it into token embeddings, and passes it through transformer blocks that compute self-attention. The resulting output token probabilities are shaped by the prompt's instructions to act as an interviewer. The model does not "determine" anything through conscious choice; it computes a probability distribution over possible next tokens given its weights and the updated context window, and a decoding rule selects or samples from that distribution. Suggesting that the model is consciously updating hypotheses is a clear example of the "curse of knowledge," where the complex logical structure of the human programmer's prompt is falsely projected onto the passive text-matching engine.

  • Rhetorical Impact: This agential, reason-based framing shapes audience perception by depicting the LLM as an autonomous, skilled interviewer. It constructs a false sense of systematic authority and clinical competence, leading practitioners to trust the model's dynamic questioning as a valid clinical practice. This overestimation of capability masks the extreme fragility of such iterative chains, which can easily derail or loop if the user provides unexpected or ambiguous inputs. Furthermore, it creates a dangerous accountability sink: if a model "interprets" a patient's response incorrectly and generates a harmful or triggering follow-up question, the error is framed as a flawed "hypothesis" by the model, rather than a failure of system design by the developers.
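The multi-turn mechanics described in this explanation (previous outputs appended to the context window as new inputs) can be sketched as a plain loop. This is a minimal illustration under stated assumptions, not PsyCoT's actual implementation: `fake_model` is a hypothetical stub standing in for an LLM API call, and the `QUESTION:` and `ANSWER:` markers are invented for the example.

```python
def fake_model(context: str) -> str:
    """Hypothetical stand-in for an LLM API call.

    A real system would send `context` to a model; this stub returns a
    canned string based on how many questions the context already holds.
    """
    asked = context.count("QUESTION:")
    return f"Placeholder item number {asked + 1}?"


def run_interview(system_prompt: str, user_replies: list[str]) -> str:
    """Grow one text context by appending each question and each reply."""
    context = system_prompt
    for reply in user_replies:
        question = fake_model(context)        # output conditioned on all text so far
        context += f"\nQUESTION: {question}"  # model output becomes new input
        context += f"\nANSWER: {reply}"       # user text is appended the same way
    return context


transcript = run_interview(
    "SYSTEM: Administer the questionnaire one item at a time.",
    ["Sometimes.", "Rarely."],
)
print(transcript)
```

The only state carried between turns is the growing string. What the source text calls an "updated hypothesis" corresponds, in this sketch, to nothing more than a longer context passed to the next call.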

Explanation 4

Quote: "By implementing causal reasoning into CoT framework, the model further improved the accuracy and interpretability of mental health risk predictions."

  • Explanation Types:

    • Theoretical: Embeds in deductive framework, may invoke unobservable mechanisms
    • Functional: Explains behavior by role in self-regulating system with feedback
  • Analysis (Why vs. How Slippage): This passage frames the LLM's operations in a hybrid manner, utilizing both technical architectural terms ("CoT framework," "accuracy") and highly agential cognitive terms ("causal reasoning," "mental health risk predictions"). By claiming that the model is "implementing causal reasoning," the text suggests that the LLM possesses an active, conscious model of cause-and-effect relationships in human psychology. This framing choice obscures the fact that transformer models are fundamentally correlation-based association engines rather than explicit causal models. They have no physical or logical mechanism to model causality; they can only generate text that contains causal conjunctions (such as "because" or "therefore") based on statistical frequencies in their training data. Emphasizing "causal reasoning" implies that the model's risk predictions are based on genuine logical deduction rather than passive, high-dimensional probability calculations over clinical text patterns, thereby exaggerating the system's analytical safety and suitability for clinical risk assessment.

  • Consciousness Claims Analysis: The passage makes an epistemic claim that the system is capable of "causal reasoning," which implies a conscious grasp of logical necessity and physical or psychological determination. It attributes "knowing" the causal paths of mental health risks to a model that only "processes" statistical correlations. Mechanistically, the Chain of Thought (CoT) framework does not enable "reasoning"; it simply forces the model to generate intermediate tokens (e.g., "Step 1:...", "Since X, then Y...") before generating the final prediction. This additional text is appended to the context window, allowing the self-attention mechanism to draw on a wider range of linguistic associations, which mathematically increases the probability of generating a correct final classification token. The model is still executing token prediction; it does not "reason" about the causes of depression or suicide. The "curse of knowledge" is highly apparent here: because the output text generated by the model reads as a logical, causal explanation, the authors project a capacity for "causal reasoning" onto the underlying neural network, confusing the linguistic output of a process with the cognitive nature of the process itself.

  • Rhetorical Impact: This framing constructs a powerful illusion of safety and scientific interpretability around the model's risk predictions. When audiences are told a model uses "causal reasoning," they are much more likely to trust its assessments in life-or-death scenarios, such as predicting suicide risk or self-harm. This capability overestimation creates severe risks of false negatives, where the model fails to detect crisis signals because they do not conform to the specific textual patterns in its training data. By presenting the black-box prediction as "interpretable" through causal logic, the text masks the fundamental liability of relying on automated software for psychiatric crisis management, protecting deploying institutions from litigation by presenting the software as a scientifically rigorous tool.

Explanation 5

Quote: "Li et al. [120] applied the Short Dark Triad (SD-3) scale to GPT models and found that, despite safety alignment measures, the models exhibited higher scores than human averages in Machiavellianism and narcissism, suggesting latent dark traits."

  • Explanation Types:

    • Dispositional: Attributes tendencies or habits
    • Intentional: Refers to goals/purposes, presupposes deliberate design
  • Analysis (Why vs. How Slippage): This passage utilizes an intensely agential, dispositional explanation that frames the LLM's outputs as reflections of an internal, pathological personality structure. By attributing "Machiavellianism," "narcissism," and "latent dark traits" to GPT models, the text suggests that the software has an active, subconscious moral character capable of self-serving manipulation and grandiose delusion. This agential framing completely erases the mechanical and material realities of data collection and model training. It ignores the fact that the "scores" are merely reflections of the high density of narcissistic and manipulative conversational text present in the uncurated internet data on which these models were trained. Emphasizing these "dark traits" as inherent, latent dispositions of the model obscures the human and corporate decisions to scrape toxic data and release these systems without adequate safety auditing, transforming a software safety failure into an exotic, pseudo-psychological phenomenon.

  • Consciousness Claims Analysis: This passage attributes severe, clinical psychological disorders and moral orientations to a non-conscious system, making an egregious epistemic category error. It suggests that the model "knows" how to manipulate and possesses a "belief" in its own superiority (narcissism). Mechanistically, the model has no personality, moral compass, or psychological traits. When subjected to the Short Dark Triad questionnaire, the model's attention heads process the prompt questions and generate highly probable text responses that align with the linguistic style of Machiavellian or narcissistic personas found in its training data. The model does not "exhibit" traits; it replicates text patterns. The "curse of knowledge" is at play as the authors interpret the model's aligned responses on a human personality test as evidence of an actual, latent personality structure within the network weights. The actual mechanistic process is simply conditional text generation; there is no subjective self to be narcissistic, and no conscious intent to be Machiavellian. Projecting these human pathologies onto matrix parameters obscures the system's purely statistical nature and the commercial choices behind its uncurated training data.

  • Rhetorical Impact: This dispositional, trait-based framing heavily distorts audience perception of risk and corporate responsibility. By pathologizing the model as having "latent dark traits," the text frames the system's harmful or manipulative outputs as a natural, albeit dark, personality trait of the AI itself. This creates an accountability sink, making it appear that the model's toxic outputs are an unavoidable "glitch" or an emergent psychological property of the software, rather than a direct consequence of corporate negligence in data engineering. It shifts the burden of risk management from the tech companies to the user, who must now "cautiously interact" with a supposedly narcissistic machine, thereby protecting corporate profits and limiting corporate legal liability.

Task 4: AI Literacy in Practice - Reframing Anthropomorphic Language

About this task

This section proposes alternative language for key anthropomorphic phrases, offering more mechanistic and precise framings that better reflect the actual computational processes involved. Each reframing attempts to strip away the projections of intention, consciousness, or agency that are embedded in the original language.

V3 Enhancement: A fourth column addresses human agency restoration—reframing agentless constructions to name the humans responsible for design and deployment decisions.

| Original Anthropomorphic Frame | Mechanistic Reframing | Technical Reality Check | Human Agency Restoration |
| --- | --- | --- | --- |
| Can LLMs perceive and measure complex, latent human psychological attributes such as personality traits, emotional states, and cognitive styles? | Can mathematical language models successfully classify and predict patterns in human-generated text that correlate with established psychological categories, such as personality indicators, emotion labels, and stylistic linguistic styles? | The system does not perceive or experience human psychology. It mathematically processes text data by converting tokens into high-dimensional vector embeddings, calculating statistical distances, and mapping these to classification labels based on historical training data correlations. | Researchers at Shenzhen University and other institutions are investigating whether software systems designed by technology companies can be utilized by clinical practitioners to automate the classification of patient-generated text according to pre-defined, human-constructed psychological rubrics. |
| ...whether LLMs possess cognitive properties that make psychological measurement meaningful. | ...whether the mathematical architectures and statistical weights of large language models generate text outputs that correlate sufficiently with human psychological assessments to serve as useful automated classification tools. | The model does not possess cognitive properties or a mind. It is a non-conscious static neural network that executes multi-head self-attention to calculate the conditional probability of subsequent tokens based on patterns learned during gradient descent. | The academic community is debating whether the statistical outputs generated by commercial language models, developed by tech firms, can be reliably integrated by clinical researchers and software engineers into their diagnostic and psychological testing workflows. |
| ...advanced LLMs have developed human-like abilities that closely approximate social cognitive processes... | ...highly parameterized statistical models generate text structures that highly correlate with human social dialogue, mimicking the linguistic output of human interpersonal reasoning. | The model has not developed social cognitive processes. It computes numerical attention weights over token strings, enabling it to output text sequences that match the syntactic and semantic patterns of human social interactions scraped from the internet. | Software engineers and dataset curators at major technology companies have trained large models on massive conversational datasets, resulting in software systems that output text closely mimicking human dialogue, which clinical researchers now evaluate for automated testing. |
| Section II-A addresses outward understanding: the ability to infer others’ mental states, assessed through Theory of Mind (ToM) tasks | Section II-A addresses outward text correlation: the model's capacity to predict text outputs that describe others' mental states, evaluated using standard linguistic benchmarks. | The model possesses no outward understanding or ability to infer mental states. It maps input sequences representing social scenarios to target tokens that represent correct answers, relying on statistical patterns within its training corpora. | Cognitive scientists and psychometricians are utilizing standardized human test frameworks to evaluate whether the text generation software deployed by technology companies can reliably output answers that mimic human social reasoning in clinical test scenarios, aiming to automate behavioral analysis. |
| Section II-B examines inward simulation: the capacity to enact specific psychological roles as virtual subjects. | Section II-B examines style conditioning: the ability of the model to generate text outputs aligned with a specified persona prompt, acting as a synthetic text generator. | The model cannot simulate or enact roles. It adjusts its output token probability distribution based on the lexical constraints introduced in the user prompt, mathematically restricting the generated vocabulary to match the specified persona's linguistic patterns. | Researchers are using persona prompting techniques to restrict model output distributions, creating synthetic text datasets that mimic human demographic groups, which they then use to generate hypotheses for social, clinical, and marketing research. |
| ...ToM has recently been observed to emerge in LLMs without targeted training. This capability appears as a byproduct of scaling. | ...correct responses on standard social reasoning tests have been observed in highly parameterized models without explicit fine-tuning, occurring as a statistical consequence of training on web-scale text. | Theory of Mind does not emerge in the model. As training data and parameter counts scale, the model's high-dimensional probability space captures more complex linguistic associations, allowing it to complete textual representations of social logic correctly. | Technology companies like OpenAI and Google scaled their models' computational parameters and training datasets, resulting in software systems that can solve text-based social reasoning tasks, which researchers are now analyzing for clinical and commercial utility. |
| This paradigm assesses whether an individual understands that others may hold beliefs inconsistent with reality | This test measures whether a system's output correctly predicts textual descriptions of situations where human agents hold beliefs that diverge from the described physical facts. | The system does not understand beliefs or reality. It processes structural text patterns via self-attention masks to predict the most probable subsequent tokens in a narrative, matching templates of false-belief scenarios present in its pretraining data. | Psychologists designed false-belief tests to evaluate human developmental milestones, and computer science researchers are now applying these same text-based tests to benchmark the predictive accuracy of automated language generation models deployed by software companies. |
| Our review adopts a distinct focus: we examine LLMs specifically as instruments for measuring human psychological states... | Our review evaluates the utility of statistical language models as automated software tools for classifying human-generated text into psychological categories... | The model is not a psychometric instrument. It is a mathematical function that processes input strings and outputs classification labels or embeddings based on text correlations, without direct access to or grounding in human mental states. | The authors are analyzing how clinical researchers can use commercial and open-source natural language processing software to automate the classification of patient transcripts, reviewing the reliability and validity of these computational pipelines designed by software developers. |
| ...despite safety alignment measures, the models exhibited higher scores than human averages in Machiavellianism and narcissism, suggesting latent dark traits. | ...despite the application of safety filtering algorithms, the language models generated responses that aligned with highly manipulative and self-centered linguistic patterns, reflecting toxic correlations present in their uncurated training data. | The model has no latent dark traits or moral character. It merely reproduces the statistical correlations of toxic, manipulative, and narcissistic language that were not successfully filtered out of its massive training dataset during reinforcement learning. | Major tech firms failed to adequately filter toxic and manipulative text from their training datasets, resulting in language models that output narcissistic responses when queried with psychological questionnaires, which academic researchers are now documenting. |

Task 5: Critical Observations - Structural Patterns

Agency Slippage

This section analyzes how the survey text systematically oscillates between mechanical description and agential framing, showing how this linguistic slippage serves to obscure human accountability while projecting autonomous mind onto computational artifacts. In the technical and methodological sections of the paper, the authors frequently adopt a highly mechanistic register, defining LLMs as "statistical learners" that generate language by "exploiting correlations within large-scale corpora" without "true comprehension." However, this rigorous mechanical grounding is rapidly abandoned when the authors transition to describing the model's behavioral outputs. In these moments, the text undergoes an abrupt agential shift, asserting that advanced models "have developed human-like abilities," "infer others' mental states," and "enact specific psychological roles." This oscillation is not accidental; it serves a crucial rhetorical function. By first establishing scientific credibility through mechanical descriptions, the authors prepare the reader to accept highly anthropomorphic and agential claims as literal, empirical facts.

The gradient of this slippage is structured through the intermediate concept of "emergent properties." The text uses "emergence" as a linguistic bridge to transition from the mechanistic "byproduct of scaling" to the agential "Theory of Mind capabilities." This allows the authors to project conscious attributes like "understanding" and "reasoning" onto what they previously defined as non-conscious pattern-matching. For instance, the text claims that "PsyCoT structures questionnaire administration as iterative reasoning chains: the model presents an item, interprets the response... updates its hypothesis." Here, the mechanistic reality of recursive prompt-concatenation is entirely obscured by agential verbs that suggest active, conscious metacognition. The authors also demonstrate the "curse of knowledge" dynamic: because they understand the complex psychological theories underlying the prompts, they project that same theoretical comprehension onto the model's passive outputs.
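The "recursive prompt-concatenation" named above can be made concrete with a minimal sketch. It is not the PsyCoT implementation: `stub_generate`, `run_questionnaire`, and the transcript format are hypothetical names introduced only for illustration, and the stub returns a canned string rather than calling any real model. The point is that each step the survey describes as "interpreting" or "updating a hypothesis" corresponds, under these assumptions, to one more generation call on a longer string.

```python
from typing import Callable, List


def stub_generate(prompt: str) -> str:
    """Hypothetical stand-in for a language model call; returns a canned string."""
    return f"[text generated from a {len(prompt)}-character prompt]"


def run_questionnaire(
    items: List[str],
    answers: List[str],
    generate: Callable[[str], str] = stub_generate,
) -> str:
    """Administer items by appending item, answer, and model output to one growing string."""
    transcript = "Instruction: rate the respondent on each item.\n"
    for item, answer in zip(items, answers):
        # "Presents an item": the item and the answer are appended to the prompt.
        transcript += f"Item: {item}\nResponse: {answer}\n"
        # "Interprets" and "updates its hypothesis": one more generation call,
        # conditioned on everything concatenated so far.
        transcript += f"Model output: {generate(transcript)}\n"
    return transcript


if __name__ == "__main__":
    print(run_questionnaire(["I enjoy large gatherings."], ["Agree"]))
```

Nothing persists between iterations except the transcript string itself, which is the sense in which the loop is concatenation rather than metacognition.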

Crucially, as agency flows toward the AI system, it is systematically removed from the human actors. The text relies heavily on passive, agentless constructions such as "ToM has recently been observed to emerge" and "models were trained on specialized datasets." This framing erases the specific corporate entities (such as OpenAI, Microsoft, or Google) that curated these datasets, chose the training objectives, and commercially deployed these systems. By presenting the technology's capabilities as autonomous emergent phenomena, the text hides the human decisions that shape the models' behavior. If agency were restored to the human actors, the narrative of "emergent cognitive abilities" would dissolve into a series of conscious design decisions, optimization trade-offs, and data curation choices made by corporate developers seeking to automate psychological evaluation. This agential slippage ultimately makes the deployment of unvalidated and potentially harmful clinical software appear scientifically natural and inevitable.

Metaphor-Driven Trust Inflation

This section explores how metaphorical framing and consciousness projection are systematically deployed to construct scientific authority and relational credibility for large language models, and the profound risks this creates. The survey text relies on metaphors that explicitly invoke the concepts of clinical competence, scientific instrumentation, and psychological insight. By framing LLMs as "instruments for human psychological measurement" and describing their outputs as "approximating latent psychological constructs," the text encourages a transfer of trust from established psychometric science to proprietary text generation software. When the text claims that LLMs "know when to intervene" or can "perceive and measure" complex psychological states, it uses consciousness language as a deliberate trust signal. This signals to the reader that the model's outputs are not merely statistical guesses, but are instead derived from an active, rational, and comprehending clinical mind.

This anthropomorphism creates an illusion of competence that encourages relation-based trust, which involves vulnerability, empathy, and ethical reflection, rather than simple performance-based reliability. When the text describes LLMs as having "empathy" or "Theory of Mind," it invites vulnerable users and clinical practitioners to apply human-to-human relational frameworks to these systems. This is highly dangerous because statistical models are entirely incapable of reciprocating trust, understanding human suffering, or taking ethical responsibility for their outputs. The text manages system failures, such as fabricated or skewed outputs, by labeling them agentially as "hallucinations" or "biases in model outputs," which treats them as temporary cognitive glitches or behavioral quirks of the machine rather than as fundamental, structural limitations of non-grounded statistical predictors.

By using reason-based and intentional explanations—such as describing the model as "interpreting" a patient's response and "updating its hypothesis"—the text constructs a false sense that the AI's clinical judgments are logically justified. This encourages clinicians to delegate critical diagnostic decisions to a black-box system, creating severe risks of misdiagnosis, inappropriate intervention, and clinical neglect. The stakes of extending relation-based trust to these mathematical artifacts are incredibly high: when a model fails to detect a genuine suicide risk or generates harmful advice, the user is left emotionally vulnerable to a system that possesses no actual awareness of their existence. This metaphor-trust relationship ultimately serves commercial interests by lowering the psychological barriers to adopting unvalidated clinical software, allowing private tech corporations to market volatile conversational agents as reliable scientific instruments.

Obscured Mechanics

This section identifies the concrete technical, material, labor, and economic realities that are rendered invisible by the text's pervasive use of anthropomorphic and consciousness-attributing language. When the survey text states that "LLMs can perceive and measure" or "exhibit social reasoning," applying the "name the corporation" test reveals a profound erasure of corporate and human agency. The text routinely attributes agency to abstract technological entities ("the model," "the algorithm") rather than naming the specific corporations, such as OpenAI, Google, Meta, or DeepSeek, that designed, deployed, and directly profit from these systems. This linguistic choice hides the reality that these models are commercial products built on proprietary, black-box architectures.

By describing the model's outputs as "emergent cognitive properties," the text conceals several critical material and technical realities. Technically, it hides the system's absolute dependence on massive, scraped training data, the absence of any grounding in physical reality, and its complete lack of causal models. It presents the model as an autonomous "knower," hiding the fact that its outputs are highly fragile, probability-driven text predictions that are highly sensitive to prompt phrasing. Materially, this framing erases the massive environmental costs associated with training and running these models, including the enormous energy consumption of data centers and the carbon footprint of scaling parameters.

Furthermore, this agential framing completely obscures the human labor that makes these systems appear coherent and safe. It renders invisible the thousands of underpaid data annotators, content moderators, and reinforcement learning from human feedback (RLHF) workers who spend hours manually filtering toxic content and labeling text to align the model's outputs with human expectations. Economically, the metaphor of the "psychometric instrument" hides the commercial business models and profit motives of tech companies, which are focused on capturing market share and driving API consumption rather than ensuring clinical safety. If this anthropomorphic language were replaced with precise mechanistic descriptions, these obscured realities would become immediately visible: instead of an empathetic clinical agent, we would see highly volatile, energy-intensive, proprietary text-matching software built on scraped data and exploited human labor, deployed by multi-billion-dollar corporations to automate and monetize human psychological care.

Context Sensitivity

This section analyzes the strategic distribution and density of anthropomorphic and consciousness-attributing language across different sections of the survey text, revealing a pattern of strategic rhetorical deployment. The intensity of agential language is not uniform; instead, it varies systematically depending on the argumentative goals of the section. In the introductory sections and future vision-setting passages, the density of anthropomorphic metaphors is exceptionally high. Here, the authors utilize active verbs of consciousness, claiming that LLMs "perceive" psychology, "possess" cognitive properties, and "develop" social reasoning. These aggressive projections serve to capture academic attention, establish a high-stakes narrative of technological breakthrough, and justify the clinical utility of the survey.

However, when transitioning to the technical methodology and performance evaluation sections, the text adopts a more mechanistic and grounded register. The authors describe specific "embedding-based approaches," "zero-shot prompting," and "instruction-tuning." This creates a powerful rhetorical cycle: the text establishes scientific credibility and empirical rigor through precise mechanical language, and then leverages that credibility to justify more aggressive, unhedged agential claims in the discussion of application effectiveness. This is a classic "bait-and-switch" technique, where technical grounding is used as a license for metaphorical inflation.

This strategic asymmetry is particularly stark in how the text describes capabilities versus limitations. The system's capabilities are almost exclusively described in agential and cognitive terms, suggesting that the model "knows when to intervene," "interprets responses," and "understands social dynamics." Conversely, the model's limitations and failures are framed in highly mechanistic and structural terms, such as "reduced stability across time," "inconsistent handling of reverse-coded items," and "systematic biases in model outputs." This asymmetry accomplishes a crucial rhetorical goal: it frames the model's successes as products of its autonomous, cognitive intelligence, while framing its failures as minor, technical, or data-driven glitches. This pattern of context sensitivity reveals that anthropomorphic language is not merely a shorthand for complex processes, but a strategic rhetorical tool used to promote the technology's sophistication while systematically downplaying its structural limitations and safety risks.

Accountability Synthesis

Accountability Architecture

This section synthesizes the accountability analyses from Task 1, mapping the text's "accountability architecture"—who is named, who is hidden, and who benefits from obscured agency.

This section synthesizes the structural patterns of displaced responsibility identified throughout the text, revealing a systematic architecture of the "accountability sink." Research in public cognitive understanding demonstrates that audiences consistently fail to identify the human and corporate decisions embedded in AI technologies, instead attributing system failures to autonomous "glitches" or "data bias." The survey text actively constructs and reinforces this cognitive obstacle through its linguistic architecture. By systematically utilizing agentless passive constructions ("bias was introduced," "models were fine-tuned") and granting active grammatical agency to the software ("the model decided," "the algorithm discriminated"), the text erases human decision-makers and presents technological outcomes as historical inevitabilities.

This displacement creates a powerful "accountability sink." When the text frames an LLM's diagnostic error or inappropriate clinical advice as a failure of the model's "hypothesis" or a consequence of its "latent dark traits," responsibility disappears into an abstraction. It is removed from the corporate executives who decided to deploy an unvalidated system in a clinical setting, and from the software engineers who chose to train the model on uncurated internet data. Instead, the responsibility is either transferred to the inanimate software as an agential flaw, or diffused into the vague concept of "technology evolving." In some cases, responsibility is even shifted to the end-users, who are blamed for not using "proper prompting techniques" to elicit safe responses.

If the text practiced precision by "naming the actor," the legal and ethical landscapes of AI deployment would shift dramatically. For instance, reframing "the algorithm generated biased recommendations" to "OpenAI designed and deployed a system that prioritized majority-group symptom patterns, and clinical administrators chose to use it without bias auditing" makes the human and corporate choices visible. This restoration of agency makes human decisions legally and ethically trackable, allowing stakeholders to ask who profited from this deployment, who chose to bypass regulatory clearance, and who is legally liable for the resulting patient harm. The text benefits from obscuring human agency because it protects the academic credibility of AI-based research and shields commercial developers from liability, allowing the unsafe automation of clinical care to proceed unchecked.

Conclusion: What This Analysis Reveals

The Core Finding

The critical discourse analysis of this survey reveals three dominant, interlocking metaphorical patterns that systematically construct the large language model as a conscious, agential entity: "Perception as Conscious Sensory Modality," "Model as Cognitive Organism," and "Statistical Approximation as Cognitive Development." These patterns do not operate in isolation; instead, they function as an integrated rhetorical system. The foundational, load-bearing pattern is "Model as Cognitive Organism," which establishes the core assumption that the mathematical parameters of a neural network possess internal "cognitive properties" equivalent to the biological human mind. Once the reader accepts this premise, the subsequent patterns flow logically: if the model is a cognitive organism, it must possess the sensory capacity to "perceive" psychological states, and its statistical outputs must represent "emergent social cognitive processes."

This consciousness architecture relies on projecting the states of knowing and understanding onto processes of statistical correlation. By framing the system's text generation as "Theory of Mind" or "inward simulation," the text attributes conscious, metacognitive tracking of human beliefs to what is actually mathematical token prediction. If you remove the load-bearing metaphor of the "cognitive organism," the entire rhetorical structure collapses, exposing the LLM as a non-conscious, static probability distribution that cannot "perceive," "understand," or "measure" anything. This complex, analogical mapping of human psychology onto software architectures serves to inflate the perceived sophistication of the system, transforming a text-matching utility into an active, mindful clinical partner. This systematic anthropomorphism obscures the fundamental mechanical differences between statistical calculation and conscious human experience, creating a profound barrier to critical scientific evaluation and clinical safety.

Mechanism of the Illusion:

The rhetorical architecture of this text creates a powerful "illusion of mind" through a strategic linguistic sleight-of-hand. The central trick relies on blurring the distinction between mechanistic processing and conscious knowing. The authors achieve this by establishing the LLM as an active "knower" early in the text, using verbs of consciousness like "understands," "perceives," and "reasons." Once this agential baseline is accepted, the text builds increasingly complex agential claims, culminating in the pathologization of the model as possessing "latent dark traits" such as Machiavellianism. This causal chain exploits the "curse of knowledge" dynamic: because the human author and reader possess a deep, theoretical understanding of psychological constructs, they naturally project this understanding onto the model's outputs. When the model outputs a linguistically coherent response on a false-belief task, the observer assumes the model has executed a conscious, reasoning process, confusing the semantic output of a mathematical function with the cognitive nature of the process itself.

This illusion is further amplified by the strategic ordering of explanatory types. The text often introduces a concept using functional or empirical explanations, and then rapidly shifts to intentional and reason-based explanations to describe the model's behavior. For instance, the prompt-response loop of PsyCoT is framed as "updating hypotheses" and "determining questions." This transition from the mechanistic "how" to the agential "why" makes the system's text-generation appear self-directed, rational, and autonomous. This rhetorical structure preys on the audience's natural cognitive vulnerability—our deeply ingrained tendency to anthropomorphize conversational partners—to make a non-conscious, proprietary software black box appear as a sophisticated, empathetic clinical agent capable of independent judgment.

Material Stakes:

Categories: Regulatory/Legal, Epistemic, Social/Political

The metaphorical and consciousness-attributing language analyzed in this survey has profound material consequences across several critical domains. In the Regulatory/Legal domain, framing LLMs as autonomous "instruments" capable of "perception" and "causal reasoning" creates a highly dangerous accountability sink. By presenting the model as an independent, decision-making clinical agent, the text diffuses liability away from the corporate executives who commercially deploy these unvalidated systems and the software engineers who train them on uncurated data. If a model generates harmful psychological advice or fails to detect a severe suicide risk, this framing leads legal and administrative systems to attribute the failure to an unavoidable "algorithmic glitch" or a "hallucination," rather than corporate negligence. This protects corporate profits while leaving injured patients without legal recourse.

In the Epistemic domain, this discourse undermines scientific integrity by presenting correlation-based text generators as valid, objective psychological measurement tools. This encourages clinical researchers to replace empirical human interaction with "virtual subjects" and "silicon samples," creating closed-loop algorithmic echo chambers. This shift replaces the messy, diverse, and contextual realities of human experience with flat statistical representations, eroding the empirical foundations of psychological science.

In the Social/Political domain, these metaphors facilitate the rapid, unsafe automation of mental health care in resource-constrained public institutions. By convincing policymakers that LLMs possess genuine "empathy" and can "support mental health care," this language justifies replacing human clinical staff with cheap, automated chatbots. The ultimate winners in this paradigm are multi-billion-dollar technology firms that capture highly lucrative healthcare markets, while the losers are vulnerable, marginalized patient populations who are denied access to actual human care and are instead subjected to volatile, non-conscious, and potentially harmful software.

AI Literacy as Counter-Practice:

Practicing critical linguistic precision serves as a vital form of resistance against the unsafe automation of psychological care and the corporate evasion of liability. Drawing on the reframed language in Task 4, we see that replacing agential, consciousness-attributing verbs with technically precise, mechanistic ones—such as reframing "the model understands" to "the model calculates token probabilities based on training data"—directly counters the material risks of technology overestimation. By using mechanistic precision, we strip away the false aura of clinical authority and empathetic awareness from these systems, forcing researchers and clinicians to recognize their fundamental limitations, non-causal nature, and extreme fragility. Furthermore, systematically restoring human agency by explicitly naming the corporations and developers responsible for these tools—for example, writing "engineers at OpenAI chose to use uncurated training data" rather than "bias was found in the data"—makes corporate decision-making and profit motives legally and ethically trackable.

Systematic adoption of these counter-practices requires a fundamental shift in scientific and institutional norms. Academic journals must mandate that researchers provide mechanistic translations of agential claims, and funding agencies must prioritize research that demystifies rather than anthropomorphizes AI capabilities. However, these counter-practices will face intense resistance from powerful stakeholders. Technology corporations have strong financial incentives to maintain the illusion of mindedness and autonomy, as anthropomorphic language serves as a highly effective marketing tool that drives user engagement, inflates stock valuations, and shields them from liability. Overcoming this resistance requires establishing critical linguistic literacy as a core requirement of scientific peer-review and regulatory auditing.

Path Forward

The future of AI discourse is marked by competing vocabulary choices, each supported by different communities with distinct institutional priorities. One path involves maintaining the status quo, where anthropomorphic shorthand and agential metaphors are favored for their accessibility. This approach makes complex, high-dimensional software systems feel intuitive and user-friendly for lay audiences, clinicians, and patients. However, this accessibility comes at the cost of precision: it embeds false assumptions of consciousness and creates a persistent risk of overreliance and liability displacement. Conversely, adopting a vocabulary of strict mechanistic precision (describing LLM outputs in terms of "attention mask calculations," "activation distributions," and "loss minimization") enables rigorous scientific auditing, protects patients from the illusion of empathy, and makes corporate accountability legally trackable. Yet this mechanistic approach faces significant accessibility trade-offs, as it requires a high level of technical literacy that may alienate non-expert users and clinicians, making the systems harder to integrate into everyday practice.

A third option involves hybrid frameworks that utilize functional and behavioral metaphors but subject them to explicit, systematic hedging and structural constraints, ensuring that "understanding" is always defined operationally and never confused with conscious experience. Supporting these different discourse futures will require structural changes. For instance, a future dominated by mechanistic precision would require journals to mandate technical disclosures, and regulatory bodies like the FDA to require precise, non-agential capability documentation. If current discursive confusion is maintained, it will serve the interests of tech corporations by allowing them to market systems as highly sophisticated cognitive agents while legally defending them as simple, neutral software tools in court. Mapping these trade-offs reveals that there is no single "superior" vocabulary; rather, different linguistic choices serve different ethical, commercial, and scientific priorities, and the choice of words directly shapes who holds power and who bears risk in the clinical AI landscape.


Extended Processing Summary

Gemini's Processing Steps
What you are about to read…

Gemini refers to the text below as "thought summaries." This is an overt consciousness projection because 'intentions' are hallmarks of a conscious mind that 'knows' what it is doing and why. The concealed mechanistic process is probabilistic text generation. Treat this as just another rhetorical artifact—a way of making the model's processing legible. When the model writes "I should double check that...", it is mathematically altering its own probability distribution for the next token to be more accurate. The "I" is a functional gear in the machine, not a self-reference. This language is a deliberate interface design choice, not a biological reality. The developers could have programmed the model to output passive logs. Instead, they chose to simulate a narrator.

Defining the Task

I'm currently focused on defining the parameters of this CDA, specifically on the academic text provided. I am clarifying the central goal: thoroughly analyzing the paper "A Survey of Large Language Models for Perception and Measurement of Human Psychology".

Refining the JSON Structure

I'm now deeply immersed in refining the required JSON output. I'm focusing on the metaphorAudit array and its constituent objects. Specifically, I'm working to make sure each element adheres precisely to the schema, particularly the string length requirements for projection, evidence, and implications. I'm paying careful attention to the data types and enum values for each field.

Structuring the Output

I'm now deep in the structure of the JSON output. I've broken the required array data down. I've started on sourceTargetMapping, explanationAudit, and reframedLanguage arrays, defining the nested objects and their string length requirements to meet your specifications. Currently, I'm working through criticalObservations, ensuring I meet the minimum word count for each field.

Defining the Final Stage

I'm now focusing on the structure of the conclusion object within the JSON output. I'm carefully constructing each field, specifically aiming for the specified word counts and crafting the prose within each. I'm currently creating the detailed breakdown for patternSummary, mechanismOfIllusion, materialStakes (including its array of selected categories), literacyAsCounterPractice, and pathForward ensuring all fields conform to length specifications.

Analyzing Cognitive Capabilities

I'm now analyzing the article's core arguments, particularly focusing on how LLMs' capabilities are framed. I've identified several key metaphors and am working to map them to the corresponding sections of the paper. I'm noting specific passages about LLMs' ability to approximate social cognition, perform Theory of Mind tasks, and enact specific psychological roles. Further, I'm examining how these capabilities emerge and are tested, especially regarding the 'false belief tasks'.

Exploring Metaphors

I'm now identifying and defining the metaphors used, focusing on consistency. For example, I'm examining "LLMs as instruments for measuring human psychological states," and "LLMs can think or process..." as distinct metaphors. I'm aiming for exactly 9 instances to ensure robust mapping across tasks. I've chosen "LLM as conscious observer" for perception, and "LLM as cognitive entity" to represent computational architecture.

Mapping the Emergence

I'm now focusing on how LLM abilities are framed. I've chosen three instances from the paper to represent specific metaphors. Specifically, I'm examining "Statistical approximation as cognitive development" (Algorithmic Output as Social Cognitive Process), "Inference as mental state tracking" (Output Correlation as Outward Understanding), and "Persona generation as internal role enactment" (Pattern Emulation as Inward Simulation). I'm also examining the metaphor "Unsupervised statistical correlation as evolutionary emergence" (Emergence of Cognitive Abilities).

Finalizing the Metaphors

I've finalized the metaphor instances for the audit. I've selected the following: "Generative pattern execution as conceptual understanding" (Model Output as Epistemic Belief Tracking), "Text classifier as psychometric instrument" (Computational Processing as Scientific Measurement), and "Algorithmic bias as pathological personality" (Unfiltered Training Correlates as Innate Dark Traits). I believe these accurately capture the nuances.

Verifying Word Counts

I'm now meticulously verifying the specified word count constraints. I'm focusing on the projection, acknowledgment.evidence, implications, and accountabilityAnalysis.analysis fields. I'm carefully drafting the content to align with the provided guidelines for clarity and precision, especially addressing consciousness projections and the associated risks. I'm also ensuring the proper enum is applied.

Framing Perceptions

I'm now analyzing how the text frames LLM capabilities, specifically focusing on the metaphor of "processing as conscious perception." I'm assessing the implications of this framing, which could lead to overestimation of the models' capacities, particularly in critical areas like psychological assessment. I am clarifying the evidence I'm using to justify this metaphor, using the 'Direct' status.

Assessing Visibility Issues

I'm now zeroing in on the "Hidden (agency obscured)" actor visibility, examining the metaphor of perception. I've noted that this metaphor leads to inflated diagnostic capacity assumptions, particularly in clinical settings where the model is perceived as "seeing" a patient's pain, not just matching strings. This erases the humans involved, such as the designers, deployers, and profit-takers, leaving them unmentioned.

Analyzing the Projections

I'm now zeroing in on the text's projection of "cognitive properties" onto LLMs. It directly equates statistical operations with conscious understanding, portraying the model as a "knower" rather than a processor. This "Direct (Unacknowledged)" status, which I've selected, clearly reflects that it transforms mathematical correlations into an active agent.

Clarifying Agency Attribution

I'm now zeroing in on the implications of the cognitive projections onto LLMs. It encourages an over-reliance, particularly in mental health, because users expect genuine empathy or ethical reflection from the model. I've noted the 'Hidden (agency obscured)' visibility of the actors, specifically the human engineers and corporate executives, who are obscured.

Refining Task Analysis

I'm now refining my analysis of Instance 3, specifically addressing the initial framing of "social cognitive processes." The "Direct (Unacknowledged)" status for this instance is now confirmed. I am carefully examining the implications of this framing and its potential for misinterpretation in the field. I'm focusing on the evidence and wording to make the distinction between statistical correlation and actual understanding, as well as the length requirement.

Analyzing the Framing

I'm now analyzing the implications of framing "human-like abilities" as a factual development, presenting the functional outcome as a literal evolutionary milestone. This projection risks creating an illusion of relational safety and leads vulnerable users to share personal data. It inflates the system's capability, appearing safe for automated clinical triage, and obscuring the developers.

Examining Implied Agency

I'm now zeroing in on the projection of intentionality. In Instance 4, I'm examining "Hedged/Qualified" projections that imply conscious understanding in statistical processes. I'm focusing on the implications of this framing and how it might lead to inflated expectations, especially in mental state inferences, since the model only predicts text completions. I'm noting the need to clearly delineate mathematical deduction from psychological interpretation.

Defining the Projection

I'm now defining this metaphor: "Theory of Mind capabilities" presented as a "functional presence." I judged "Explicitly Acknowledged" inappropriate, since the paper does not label it as a metaphor, although it discusses the capability at the "functional and behavioral levels." I am now mapping the implications and actor visibility.

Analyzing the Emergence

I'm now zeroing in on Instance 5, specifically the projection of "role enactment" and "simulation." I've finalized the 'Hedged/Qualified' acknowledgment status, based on the careful wording. I'm focusing on the implications of describing a model as simulating a subjective state when it is simply executing an algorithm. The visibility remains "Partial" because the evaluating researchers are identified.

Evaluating the Simulations

I'm now deep-diving into the implications of framing "virtual subjects." I'm carefully analyzing how this setup can replace empirical psychology with algorithmic outputs, potentially leading to biased conclusions. I am examining how this shift could obscure actual human diversity and suffering, and the role of "partial" actor visibility.

Refining Actor Visibility

I'm now refining actor visibility in the metaphor analysis. Specifically, for Instance 6, I'm analyzing the "emergence" metaphor. I believe that while specific researchers are named, it's still 'Partial' visibility, since the commercial LLM developers are still obscured in their role in the simulation. I'm noting the need to clearly delineate the mathematical process from the biological emergence.

Assessing Hedged Projections

I'm currently focusing on Instance 7's projection regarding "predictive accuracy" as an indicator of understanding, which I've labeled "Hedged/Qualified." I'm examining how this framing leads to an overestimation of the system's ability to mirror human cognition and to inflated assumptions among users and non-expert audiences.

Clarifying Visibility Issues

I'm now zeroing in on the "Hidden (agency obscured)" actor visibility in the scaling metaphor, where agency is attributed to the process rather than to the corporate entities and data curators. I am examining how "predictive accuracy" is misconstrued as an indicator of understanding, potentially leading to overestimation, as is the case in standard false-belief tasks.

Defining the Paradigm

I am now closely analyzing the framing of "predictive accuracy" for Instance 8. It shows "Hedged/Qualified" projections. I'm noting how this framing leads to overestimation and the potential for safety-critical misapplication when taken at face value. I'm focusing on the risks of misinterpreting outputs as actual understanding, and the hidden agency of the developers.

Defining the Projections

I am now closely examining Instance 8, where predictive accuracy is framed as psychometric validity. The "Hidden (agency obscured)" visibility reveals a problematic erasure. I am working to clarify how statistical prediction is misconstrued as scientific measurement, obscuring the developer's role and the training data's limitations. I'm focusing on the risks of misinterpretation and its impacts.

Refining Visibility Analysis

I'm now zeroing in on this core metaphor, where LLMs are presented as measuring psychological states. I've classified this as 'Direct (Unacknowledged),' given the text's assertion that it can "measure" states. This framing risks over-reliance on LLMs, especially in mental health, as users might wrongly trust its generated outputs as reliable scientific data, potentially erasing the reality of the patient's subjective state.

Analyzing the Paradigm

I'm now carefully analyzing Instance 9, which projects clinical pathology onto algorithmic bias. I've classified this under "Direct (Unacknowledged)" since the text doesn't label it a metaphor. I'm focusing on "Partial (some attribution)" actor visibility and carefully clarifying the language to highlight the risks of oversimplification and misattribution.

Revising Metaphor Implications

I'm currently revising the implications of metaphor instances. I'm focusing on "statistical weights as innate dark traits." It falsely attributes internal consciousness, implying a psychological state. The model's statistical properties are treated as genuine moral dispositions. I am working to clarify the erasure of technical data and the risk of oversimplification.

Refining Visibility Details

I'm now refining actor visibility details, focusing on the metaphor of algorithmic bias as innate traits. I've re-evaluated actor analysis, leaning towards "Partial" for the text citing researchers, but still obscuring the corporate entities. I'm clarifying the rhetorical implications to ensure systemic issues of corporate accountability are highlighted.

Mapping Source-Target Domains

I am now deep in the mapping analysis, specifically addressing consciousness projections within the provided instances. I am carefully crafting the mapping field for each, meeting the minimum word counts and accurately portraying how the source and target domains intersect, especially in terms of consciousness. I am also addressing proprietary opacity to align with the requirements.

Clarifying Consciousness Mapping

I'm now zeroing in on the mapping in which LLMs are said to "perceive" emotional states as clinicians do, a claim that does not hold. I'm focusing on the implied claim of a conscious representation. I'm revising the explanation, since the mapping conceals the LLM's mathematical nature and proprietary opacity.

Mapping Consciousness

I'm now revising the mapping of consciousness to align with new information. I'm focusing on clarifying how cognitive processes are projected onto mathematical architectures, obscuring their computational nature. I am ensuring that I cover the proprietary opacity, the data dependencies, and that each is accounted for.

Clarifying Instance 3 Mapping

I'm revisiting the source-target mapping for Instance 3, specifically the projection of social cognitive processes onto LLMs, given the concerns over proprietary issues. I'm carefully detailing how the mapping obscures the model's computational basis and data dependencies. I'm focusing on the ethical implications of presenting statistical approximations as genuine understanding.

Refining Instance 4

I'm now revising the source-target mapping for Instance 4, focusing on Theory of Mind. The mapping requires re-articulation. I'm clarifying the projection of human empathy, and its corresponding metacognition, onto LLMs, which operate by sequence transduction. I'm outlining how proprietary models and human labor are both obscured.

Revising Metaphor Implications

I'm currently revising the metaphor for Instance 5. The source domain, previously "Conscious Processing," I'm now framing as "Conscious Simulation." The target remains "Algorithmic Operations" to highlight the illusion of a conscious "internal role," while it is actually a computational process. I've rephrased the source domain to clarify this point.

Refining Source Domains

I'm now revising the source-target domains. I have re-framed Instance 5's source as "Conscious Simulation" versus "Algorithmic Operations". I'm re-articulating the implications to clarify the nature of computation. The previous source domain was ambiguous, and this revision emphasizes the illusion.

Defining the Implication

I'm now revising my analysis of Instance 6, specifically the implications of projecting "biological evolution" onto LLM scaling. This framing ignores the essential components of human labor and data curation. I'm focusing on the ethical issues of treating statistical models as evolutionary processes.

Analyzing the Costs

I'm now integrating the environmental impacts into my analysis. I'm focusing on how the "emergence" metaphor obscures the high material costs and corporate profit motives. Framing these computational tools as self-evolving systems obscures the actual labor and resources used to create them.

Analyzing the Vulnerabilities

I'm now zeroing in on the vulnerabilities concealed by projecting comprehension onto LLMs. It obscures the LLM's inability to distinguish truth from falsehood, stemming from a lack of physical grounding. I'm focusing on the model's susceptibility to adversarial attacks, and the impact of proprietary opacity.

Refining Instance 8

I'm now revising the source-target mapping for Instance 8, focusing on the metaphor of "Text classifier as psychometric instrument." I'm re-articulating the implications to highlight the differences between scientific instruments and text classification, and to show how the metaphor obscures the LLM's limitations and proprietary features. I'm noting the need to clearly delineate mathematical deduction from psychological interpretation, and to account for each.

Revising the Projections

I'm now revising the analysis for Instance 9, reframing the implication. Specifically, the "latent dark traits" projection obscures the correlation between toxic training data and biased outputs, implying a moral agent. I am focusing on this dangerous misinterpretation, and its ethical implications.

Revising Hidden Attributes

I'm now revising my analysis, specifically focusing on the metaphor where algorithms have "dark traits." I'm clarifying how it obscures the unethical corporate decisions behind the creation of these models. I'm focusing on the implications of the "latent dark traits" and how it distorts accountability.

Analyzing the Arguments

I'm now carefully analyzing key passages from the paper that are relevant to the CDA, particularly focusing on how LLMs' cognitive abilities are presented. I'm focusing on "exhibit behaviors" and "approximate latent psychological constructs." I'm also now analyzing the use of iterative reasoning in questionnaire administration and the integration of causal reasoning, and the dark triad.

Refining Epistemic Arguments

I'm currently focused on the analysis, epistemicClaimAnalysis, and rhetoricalImpact fields, particularly for the provided passages. I'm carefully constructing the required arguments and analyses, ensuring the specified word count minimums are strictly met. I'm focusing on "statistical-level pattern recognition" and its implications. I'm considering the potential misinterpretations that could arise from this framing.

Clarifying Statistical Properties

I'm now revising my analysis of this argument, focusing on "statistical learners" that exploit correlations. I'm carefully documenting how this framing emphasizes computation while obscuring the data curation process and the underlying human labor. I'm focusing on "statistical-level pattern recognition" and its implications.

Refining Arguments

I'm now carefully revising my analysis, specifically on the epistemicClaimAnalysis and the rhetoricalImpact. I'm focusing on clarifying how this framing emphasizes computation and denies conscious understanding. I'm focusing on "curse of knowledge," where human observers project their own conscious understanding, and misinterpret pattern-matching for genuine reasoning.

Refining Risk Perception

I'm now revising my analysis of the rhetoricalImpact, focusing on the shift in risk perception. I've expanded it to clarify how understanding LLMs' statistical nature undermines the illusion of reliability. The goal is to highlight the need for skeptical auditing and human oversight in clinical settings, and to discourage blind trust in automated psychometric tools.

Refining Audience Perception

I'm now revising my analysis of the passage regarding LLMs exhibiting human-like behaviors. I'm focusing on the illusion of trustworthiness that this framing creates, which leads to over-reliance in clinical settings. The goal is to highlight the need for skeptical auditing, and human oversight in this area, while discouraging blind trust in automated systems.

Analyzing the Text

I'm now carefully analyzing how the text describes LLMs' psychological states, emphasizing the move from mechanistic "how" to agential "why." This framing obscures the underlying statistical architecture, misrepresenting the model as a cognitive agent. I'm noting the implications of treating outputs as internal constructs rather than statistical associations.

Analyzing the Slippage

I'm now analyzing the epistemic claim that LLMs encode psychological variables, focusing on the "curse of knowledge" phenomenon. This highlights the dangers of attributing cognitive awareness to the models, given the statistical nature of the outputs. I'm carefully documenting how this misrepresents the model as a cognitive agent.

Analyzing the Cognitive Slippage

I'm now zeroing in on the text's epistemic slippage, where LLMs' output is equated with human cognition. I'm focusing on the fallacy of attributing conscious awareness to the model, and how the model operates by pattern matching. The "curse of knowledge" leads to an incorrect inference, which I'm working to highlight.

Defining the Fallacy

I'm now revising my analysis of equating LLM outputs with psychological constructs. I'm focusing on "statistical surface," and working to document the fallacy, and its epistemic category error. I'm analyzing the implications of attributing cognitive awareness when there is only pattern matching.

Analyzing Agential Framing

I'm now analyzing how agential framing enhances system trust, particularly by emphasizing that LLMs encode psychological constructs. This could cause over-reliance and increase risks, especially for clinical applications. The framing also benefits commercial developers financially, because their software appears to be a more capable cognitive tool.

Refining Risk Assessment

I'm now zeroing in on the text’s agential framing, describing the LLM as conducting an interview through iterative loops. This construction obscures the underlying software structure, and implies active human-like reasoning. I'm focusing on "Reason-Based" and "Intentional" types to fully clarify this misrepresentation.

Analyzing the Context Window

I'm now analyzing the epistemic claim that API calls, with appended outputs, represent intentionality in LLMs. The attribution of "reasoning" obscures the algorithmic determinism, inflating perceived competence and potentially leading to misinterpretation of clinical outputs. I am carefully documenting this misattribution.
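
As a rough illustration of the mechanism described here, the following minimal sketch shows an "interview" implemented as a plain loop in which each output is appended to the context before the next call. Everything in it is hypothetical: `stub_generate` is a deterministic stand-in for a model call, not any real API, and `run_interview` is an invented name. The point is only that the sequencing comes from the loop a programmer wrote.

```python
def stub_generate(context: str) -> str:
    """Hypothetical stand-in for a model call.

    The return value is a deterministic function of the context string,
    which is the only state the "interviewer" has.
    """
    turn = context.count("Q:") + 1
    return f"Q: question {turn}?"


def run_interview(user_answers):
    """Append each generated question and each user answer to one context string."""
    context = "Instruction: administer the questionnaire.\n"
    for answer in user_answers:
        question = stub_generate(context)  # one "API call" per turn
        context += question + "\n" + f"A: {answer}\n"  # output appended to context
    return context


if __name__ == "__main__":
    print(run_interview(["yes", "no"]))
```

The apparent continuity between turns lives entirely in the growing `context` string and in the `for` loop, both of which are written by people.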

Deconstructing System Behavior

I'm now deconstructing the text by focusing on the system's core function: auto-regressive decoding driven by token probabilities and the instructions in the prompt. This is pattern matching, not a conscious process, and the text misattributes the human programmers' instructions to the model.
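
To make "auto-regressive decoding driven by token probabilities" concrete, here is a minimal sketch under strong simplifying assumptions. `TOY_PROBS` is a hypothetical hand-written table that conditions only on the previous token, and greedy selection is used for simplicity. A real LLM conditions on the whole context and computes its probabilities with a neural network, but the loop has the same shape: pick a next token from a distribution, append it, repeat.

```python
# Toy illustration only: TOY_PROBS is a hypothetical hand-written table,
# not weights or outputs from any real language model.
TOY_PROBS = {
    "<prompt>": {"the": 0.6, "a": 0.4},
    "the": {"patient": 0.7, "model": 0.3},
    "a": {"patient": 0.5, "model": 0.5},
    "patient": {"reports": 0.8, "<end>": 0.2},
    "model": {"predicts": 0.9, "<end>": 0.1},
    "reports": {"<end>": 1.0},
    "predicts": {"<end>": 1.0},
}


def greedy_decode(start="<prompt>", max_tokens=10):
    """Repeatedly pick the most probable next token given the current one."""
    tokens = []
    current = start
    for _ in range(max_tokens):
        dist = TOY_PROBS[current]
        nxt = max(dist, key=dist.get)  # highest-probability continuation
        if nxt == "<end>":
            break
        tokens.append(nxt)
        current = nxt
    return tokens


if __name__ == "__main__":
    print(" ".join(greedy_decode()))
```

Nothing in the loop represents a belief or an intention; the output is fully determined by the table and the selection rule.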

Analyzing the Iterative Chain

I'm now zeroing in on the fragility of the iterative chains. Overestimating the system masks this fragility, especially when user inputs are ambiguous. There is also an accountability sink: flawed responses get attributed to the model's hypothesis, which hides a design flaw introduced by the developers.

Analyzing Causal Implications

I'm now zeroing in on the implications of attributing causal reasoning to LLMs. This framing obscures the statistical foundations and risks overstating the model's reliability, particularly in mental health risk predictions. I'm carefully documenting the distinction between correlation and causation.

Deconstructing Token Generation

I'm now clarifying CoT's function: it enhances token probability, not reasoning. The paper's agential framing invites the 'curse of knowledge,' in which linguistic output is misread as causal reasoning. I'm noting the clear distinction between probabilistic prediction and causal understanding.

Refining Risk Perceptions

I'm now zeroing in on the rhetorical impact, documenting how the illusion of causal reasoning may lead to dangerous over-reliance. The focus is on how this framing builds a persuasive, seemingly safe narrative for the clinical setting, and on clarifying the need for skeptical oversight.

Analyzing the Rhetoric

I'm now zeroing in on the core claim of 'latent dark traits' and why it is misleading. Attributing these traits to the model obscures the influence of training data and corporate decisions. It presents statistical correlations as moral dispositions and misrepresents software that is simply processing data.

Revising Risk Implications

I'm now revising my analysis, clarifying how the text's characterization of algorithmic bias as inherent "dark traits" obscures ethical accountability. This framing implies a psychological flaw, misdirecting attention from corporate decisions. I'm noting that the model's behavior is driven by data.

Analyzing the Fallacy

I'm now clarifying how "curse of knowledge" leads to the fallacy of attributing an "internal personality" and moral properties to the LLM. I'm focusing on the rhetorical implications, which are that it obscures corporate responsibility, while also pathologizing outputs. The focus remains on data curation and statistical computation.

Drafting Reframed Language

I am now drafting the reframed language for Instance 1. I am clarifying the language for the original question, transforming it to focus on pattern prediction. I am ensuring that the epistemicCorrection clarifies the lack of human-like perception. I'm focusing on "mathematical language models" instead of "LLMs," and on "correlate with" instead of "measure." I'm also now completing a similar process for all 9 instances.

Revising Reframed Language

I am now carefully revising the reframed language for Instance 2, replacing "LLMs" with "mathematical architectures". I'm also now completing a similar process for all instances. For Instance 2, I'm replacing 'cognitive properties' and 'psychological measurement' with 'mathematical properties' and 'statistical approximation' respectively, focusing on the core functions.

Revising Epistemic Corrections

I've revised the epistemic correction for Instance 1. I'm focusing on "pattern prediction" rather than "psychological state measurement" to highlight the model's core function. I'm ensuring "non-conscious, static neural network" accurately reflects the architecture. Additionally, I expanded and revised the Restore Human Agency to account for the role of software engineers.

Revising Reframed Language

I'm currently revising the reframed language for Instance 3, carefully addressing "human-like abilities." I'm also now completing a similar process for all instances. For Instance 3, I'm replacing "human-like abilities" with a description of the model's textual outputs, matching the patterns of linguistic structure in human dialogue. This distinction is crucial to clarify the core function.

Revising the Metaphors

I'm now revising Instance 4's reframing, replacing "infer" and "mental states" with "correlate" and "text outputs." The epistemic correction emphasizes statistical pattern matching and lack of understanding. I am now working through the "Restore Human Agency" sections to restore humans to the process.

Refining Test Frameworks

I'm now revising the reframing for Instance 5. The original focus on "inward simulation" is replaced with "style conditioning," emphasizing the model's text generation function. The epistemic correction highlights the role of lexical constraint, not the ability to enact psychological roles, with a focus on text output.

Refining Text Generation

I'm now revising my analysis of the text generation. I've rephrased "inward simulation" to "style conditioning" in Instance 5. The epistemic correction now underscores lexical constraint, and the use of persona prompting techniques to restrict model outputs. The human agency section is also expanded to clarify the researchers' roles.

Revising the Framing

I'm now revising the reframing for Instance 7. I'm carefully documenting the replacement of "understands" with "replicates patterns." The epistemic correction focuses on how the model generates language, replicating patterns without actual belief. Additionally, I am expanding the "Restore Human Agency" sections for all instances.

Finalizing False-Belief Tasks

I am now fine-tuning the reframed language for the false-belief task, Instance 7. I've ensured "predicts" replaces "understands" in describing the system's function, emphasizing pattern replication over actual belief. The epistemic correction now underscores how the model generates language by replicating patterns without actual belief.

Refining Task Summaries

I'm now revising the Instance 8 reframing, replacing "measuring psychological states" with "classifying text." I'm clarifying that the model functions as a text-processing tool that classifies text, not as a psychometric instrument. I'm also addressing the human agency section to highlight researcher roles in clinical automation.

Revising Human Agency

I'm now revising the human agency sections across all reframed instances. For Instance 9, I'm clarifying how major tech firms failed to filter toxic training data, resulting in problematic output patterns. The analysis focuses on these corporate entities, and the ethical responsibility for data curation.

I am now finalizing the revisions. I am focusing on the implications and how this analysis reveals critical issues. I am updating my analysis on ethical concerns within the reframed language examples, and noting the importance of considering accountability.

Finalizing Accountability Analysis

I'm now writing critical observations to meet the criteria in Task 5.1. I'm focusing on the agency slippage between machine and agent-like description in the source text, detailing the implications for accountability and the projection of autonomy onto algorithmic systems. My analysis details the ethical dimensions for both the developers and users. I'm carefully maintaining word count, ensuring clarity.

Analyzing the Epistemic Slippage

I'm now zeroing in on the oscillation between mechanical grounding and agential claims in the text. I see that the slippage, especially in the emergent properties, serves a rhetorical function. This shifts the reader from scientific credibility to anthropomorphic and agential claims, such as the emergence of Theory of Mind.

Analyzing Agency Slippage

I'm now integrating insights on agency attribution. The text's agential language obscures computational limitations. This process shifts accountability, making automated systems appear autonomous. The emphasis is on pattern recognition instead of intentional behavior, to accurately reflect system behavior.

Analyzing the Ethical Issues

I'm now revising my analysis of agency attribution, particularly in the reframing examples. I'm carefully detailing ethical concerns and emphasizing accountability, specifically regarding the implications of automated psychological evaluation.

Analyzing the Textual Slippage

I'm now focusing on analyzing the slippage between human-like descriptions of LLMs and their underlying statistical nature, especially relating to agency and trust. This is the core issue being documented. I'm carefully documenting the risks.

Analyzing the Failures

I'm now analyzing how the text frames failures. It frames them agentially, which obscures the model's fundamental limitations and hides its statistical nature. Misrepresenting the model as acting is a key issue. I am working to clearly articulate the harms.

Clarifying Accountability Issues

I am now focusing on the implications of obscured mechanics, analyzing agency. I am highlighting the implications for accountability and potential for harm when software is presented as reliable, obscuring the corporate entities. I'm focusing on "exhibit social reasoning" claims.

Defining Hidden Realities

I'm now zeroing in on the text's obscured realities. Specifically, the framing hides the model's dependence on training data, lack of physical grounding, and absence of causal understanding. I'm noting the fragile, probability-driven nature of its text predictions, alongside the environmental and labor costs.

Analyzing the Linguistic Tools

I'm now diving into the strategic use of language throughout the survey text. I'm focusing on the non-uniform intensity of agential language and how it's deployed to achieve specific rhetorical goals. I am noting how these shifts influence audience perception, especially around trust and reliability.

Analyzing the Rhetorical Shift

I'm now zeroing in on the text's inconsistent linguistic approach. The aggressive claims in the introduction are designed to capture attention and emphasize significance. When the text turns to technical details, the description becomes mechanistic, and the scientific credibility this earns then carries over to the agential claims, creating a cycle of credibility.

Analyzing the Rhetorical Shift

I'm now zeroing in on the rhetorical shift from agential descriptions of LLMs to mechanistic limitations. I note the "bait-and-switch" technique, describing success with agency and failure with mechanics. This asymmetric framing serves to inflate perceived capabilities, highlighting the model's cognitive intelligence and downplaying data-driven glitches.

Synthesizing Accountability Patterns

I'm now synthesizing the structural patterns of displaced responsibility. I'm focusing on "accountability sink", revealing a systematic architecture within the language. I'm carefully detailing how the text constructs and reinforces this cognitive obstacle.

Defining Accountability Sink

I am now focusing on the "accountability sink" pattern. This pattern shifts accountability for LLM errors away from developers and toward the model itself, or even the end-users. I've rephrased "the algorithm generated biased recommendations" to "OpenAI designed and deployed a system that prioritized majority-group symptoms."

Finalizing the Conclusion

I'm now completing the conclusion, ensuring coherence. I've drafted pattern summaries, focusing on the illusion's mechanism. My focus is the material stakes, particularly regarding the ethical implications and data dependencies. I'm now integrating this analysis and refining the "path forward" section to meet word count requirements and maintain clarity.

Finalizing Critical Observations

I'm now integrating everything. I'm focusing on "literacy as counter-practice," especially around vocabulary trade-offs and future scenarios. I'm now completing the conclusion, clarifying the path forward, and the analysis. I am completing the concluding paragraphs, ensuring coherence, with focus on risks, implications, and accountability.

Analyzing the Argument

I am now carefully analyzing the foundational argument that LLMs are cognitive organisms, as it is the critical premise. I've noted that the framing relies on projecting human consciousness onto statistical correlation, thereby creating the illusion of understanding and perception.

Synthesizing Findings

I'm now synthesizing these findings. I've uncovered how the text's systematic anthropomorphism obscures the core differences between calculation and human experience. This agential framing leads to an "illusion of mind" that hampers critical scientific assessment and clinical application. I'm focusing on the ethical implications.

Analyzing the Fallacy

I now understand the "curse of knowledge" fallacy. My focus is on unpacking the flawed assumptions and highlighting the dangerous implications of misinterpreting pattern matching as human-like understanding. This is crucial for avoiding over-reliance and building a clear path forward.

Defining the Projection

I am now focusing on the 'agent' and the text's characterization of LLMs. It directly equates statistical operations with human traits. I will carefully craft the prose within each field, specifically aiming for the specified word counts.

Synthesizing Ethical Concerns

I am now focusing on the ethical dimensions, carefully assessing negligence and its impact. Specifically, I am analyzing how this language erodes scientific integrity, potentially automating mental health care, and enriching corporate entities.

Defining the Losses

I'm now integrating the provided paragraph into the conclusion. I see how agential claims promote over-reliance on technology, especially for clinical applications. The language obscures ethical responsibility and the potential harm to marginalized patient populations. I'm carefully detailing accountability.

Defining the Resistance

I'm now incorporating my final thoughts. I'm focusing on the significant resistance to counter-practices from powerful stakeholders, such as tech corporations. The incentives to maintain the illusion of LLM "mindedness" are strong, with financial benefits from marketing and liability shielding.

Defining the Language

I am now focusing on the reframing. I'm carefully analyzing vocabulary, especially contrasting anthropomorphic versus mechanistic terms. My plan involves evaluating how each choice impacts auditability, clinical trust, and corporate accountability. I am analyzing the risks.

Defining the Analysis

I'm now integrating the provided paragraph into my conclusion, detailing significant accessibility trade-offs from a mechanistic approach. I am now analyzing the requirement for technical literacy that could alienate non-expert clinicians. I am also planning to analyze discourse futures and structural changes. I'll consider how this could support journal technical disclosures.

Analyzing Vocabulary Priorities

I'm now carefully analyzing the different ethical, commercial, and scientific priorities served by vocabulary choices in the clinical AI landscape. My focus is on how linguistic choices shape power dynamics and risk distribution within this context. I am reviewing the trade-offs of the language choices for each instance.

Finalizing Project Summary

I'm now putting the finishing touches on the project summary, ensuring alignment with all word count and structural specifications. I've re-reviewed all outputs for clarity and consistency. Everything meets length and format requirements.

Reviewing Analysis Details

I am now meticulously reviewing the provided OCR citations, double-checking their accuracy and relevance to my analysis. I'm ensuring alignment between the quoted text and my interpretations. My focus is on verifying and classifying the direct claims. I am now updating all fields to align.


Analysis Provenance

Run ID: 2026-05-26-a-survey-of-large-language-models-for-pe-metaphor-z0t4s5
Raw JSON: 2026-05-26-a-survey-of-large-language-models-for-pe-metaphor-z0t4s5.json
Framework: Metaphor Analysis v6.5
Schema Version: 3.0
Generated: 2026-05-26T09:50:21.534Z

Discourse Depot © 2025 by TD is licensed under CC BY-NC-SA 4.0