Co-Explainers: A Position on Interactive XAI for Human–AI Collaboration as a Harm-Mitigation Infrastructure
🤔 "What survives...?" A rewriting experiment that tests whether anthropomorphic AI discourse can be translated into strictly mechanistic language while preserving the phenomena described.
This document presents a Critical Discourse Analysis focused on AI literacy, specifically targeting the role of metaphor and anthropomorphism in shaping public and professional understanding of generative AI. The analysis is guided by a prompt that draws from cognitive linguistics (metaphor structure-mapping), the philosophy of social science (Robert Brown's typology of explanation), and accountability analysis.
All findings and summaries below were generated from detailed system instructions provided to a large language model and should be read critically as interpretive outputs—not guarantees of factual accuracy or authorial intent.
- Source Title: Co-Explainers: A Position on Interactive XAI for Human–AI Collaboration as a Harm-Mitigation Infrastructure
- Source URL: https://digibug.ugr.es/bitstream/handle/10481/112016/make-08-00069.pdf
- Source Type: article · Published: 2026-03-10
- Model: gemini-3.1-pro-preview · Temperature: 1 · Top P: 0.95
- Tokens: input=20721, output=18446, total=39167
- Analyzed At: 2026-03-15T11:49:00.412Z
- Framework: metaphor · Framework Version: 6.4 · Schema Version: 3.0
- Run ID: 2026-03-15-co-explainers-a-position-on-interactive--metaphor-m5xbp2
Metaphor & Illusion Dashboard
Anthropomorphism audit · Explanation framing · Accountability architecture
Deep Analysis
The consciousness architecture here is highly sophisticated. It moves beyond simple one-to-one structural analogies (e.g., 'the computer is a brain') and constructs a complex analogical narrative of social and ethical relation. The text systematically replaces mechanistic verbs (processes, calculates, predicts) with consciousness verbs (justifies, aligns, understands, learns), creating an illusion of subjective awareness. If the foundational 'Epistemic Peer' pattern collapses—if we insist the system merely predicts tokens without comprehension—the entire proposition that the AI can act as a reliable 'co-explainer' for institutional governance disintegrates.
Explanation Audit
Sample passage: "Justify: They give reasons for their actions based on context-sensitive ethical principles, objectives, and trade-offs."
- How/Why Slippage: 70% of explanations use agential framing (7 / 10 explanations)
- Unacknowledged Metaphors: 100% presented as literal description (no meta-commentary or hedging)
- Hidden Actors: 100% of metaphors obscure agency through agentless constructions (corporations/engineers unnamed)
Dashboard views: Explanation Types (how vs. why framing) · Acknowledgment Status (meta-awareness of metaphor) · Actor Visibility (accountability architecture) · Source → Target Pairs (8 human domains mapped onto AI systems) · Metaphor Gallery (8)
Task 1: Metaphor and Anthropomorphism Audit
About this task
For each of the major metaphorical patterns identified, this audit examines the specific language used, the frame through which the AI is being conceptualized, what human qualities are being projected onto the system, whether the metaphor is explicitly acknowledged or presented as direct description, and—most critically—what implications this framing has for trust, understanding, and policy perception.
V3 Enhancement: Each metaphor now includes an accountability analysis.
1. AI as Rational Interlocutor
Quote: "AI systems that learn not just to justify decisions, but to improve and align their explanations with role-specific epistemic and governance requirements through interaction with human users."
- Frame: Model as conscious, adaptive reasoning agent
- Projection: This metaphorical framing projects the deeply human, conscious capacity of rational argumentation and ethical self-awareness onto statistical pattern matching. In human contexts, to 'justify' an action requires subjective awareness of one's own internal reasoning, the possession of justified true beliefs, and the conscious intent to persuade an interlocutor through logical or ethical coherence. By mapping this conscious state onto AI, the text suggests the system 'knows' why it produced an output and actively 'believes' in its alignment with governance norms. It attributes conscious awareness and epistemic commitment to computational processes. In reality, the AI is merely processing inputs and calculating outputs based on trained weights, predicting token sequences that resemble human justifications without possessing any subjective experience or actual comprehension of the epistemic requirements it is said to align with.
- Acknowledgment: Direct (Unacknowledged) (The text presents the capability as a literal functional fact: "AI systems that learn not just to justify decisions, but to improve and align their explanations..." with no hedging or qualification.)
- Implications: Framing the system as capable of 'justifying' decisions dangerously inflates its perceived sophistication by extending relation-based trust to a mechanism. It encourages audiences to view the AI as a sincere epistemic peer rather than an unthinking artifact. This unwarranted trust obscures the fact that the system's 'justifications' are post-hoc statistical correlations, not genuine reasoning. It creates policy risks by suggesting the AI can independently fulfill legal or ethical governance requirements, potentially leading human operators to abdicate their oversight responsibilities and blindly accept the system's mathematically generated rationalizations as true moral or logical proofs.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The text systematically obscures the human developers, engineers, and corporate executives who design the interaction protocols, define the optimization metrics, and deploy the system. The AI is presented as an autonomous agent that 'learns' and 'aligns' itself. In reality, human actors build the feedback loops, write the model update code, and profit from the deployment. By masking these actors behind the active agency of the 'AI system,' the text creates an accountability sink. If the 'justification' is flawed or harmful, the linguistic construction suggests the AI failed to align itself, rather than pointing to the human institutions that deployed an inadequate or biased statistical model.
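To ground the mechanistic reframing above, the following is a minimal, hypothetical sketch (not taken from the source paper) of what "learning to align explanations" reduces to: a developer-written optimization loop that nudges a numeric parameter toward a human-chosen target. The names and values (`fine_tune`, `governance_target`) are illustrative assumptions.

```python
# Hypothetical sketch: what "the AI learns to align its explanations" reduces to
# mechanistically: a human-written feedback loop that nudges numeric parameters
# toward outputs humans rated as compliant.

def loss(parameter: float, target: float) -> float:
    """Squared error between the model's scalar output and the human-set target."""
    return (parameter - target) ** 2

def fine_tune(parameter: float, target: float, learning_rate: float = 0.1, steps: int = 50) -> float:
    """Gradient descent written and scheduled by developers, not chosen by the model."""
    for _ in range(steps):
        gradient = 2 * (parameter - target)    # derivative of the squared error
        parameter -= learning_rate * gradient  # update rule fixed by engineers
    return parameter

# Engineers encode the "governance requirement" as a number the optimizer chases.
governance_target = 0.9
print(fine_tune(parameter=0.1, target=governance_target))  # converges toward 0.9
```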
2. AI as Collaborative Peer
Quote: "AI systems evolve to be co-explainers, learning not just to predict, but to justify, improve, and align."
- Frame: Model as evolving professional colleague
- Projection: This metaphor maps the human trajectory of professional development and conscious self-improvement onto machine learning optimization. The verbs 'evolve,' 'justify,' 'improve,' and 'align' project an active, conscious desire to achieve shared goals and enhance one's own ethical standing. It suggests the AI understands its role within a team and deliberately modifies its internal beliefs to better serve its human partners. This masks the reality that the AI does not 'know' it is collaborating; it merely processes gradient descent updates, reinforcement learning from human feedback (RLHF), or dynamic prompt injections. It attributes intentional, conscious self-reflection to a system that exclusively processes mathematical weights and statistical predictions.
- Acknowledgment: Direct (Unacknowledged) (The claim is stated as a definitive capability: "AI systems evolve to be co-explainers, learning not just to predict, but to justify..." with no semantic markers indicating metaphorical intent.)
- Implications: By characterizing the AI as an evolving 'co-explainer,' the text fosters a profound vulnerability to automation bias. Users are conditioned to treat the system's outputs not as mathematical probabilities to be scrutinized, but as the earnest efforts of a collaborative partner. This anthropomorphism significantly increases the likelihood that humans will accept incorrect or biased 'explanations' out of misplaced social trust. Furthermore, it creates a perilous liability ambiguity: if an AI is viewed as a 'co-explainer,' it implies a shared, distributed responsibility, subtly diluting the absolute accountability that should rest on the human organizations deploying the software.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: This framing displaces the agency of the AI developers and data scientists who actually program the mechanisms for model updates and fine-tuning. The AI system is grammatically positioned as the subject actively 'evolving' and 'improving' itself. This serves the interests of deploying corporations by distancing them from the system's ongoing behavior. If the system fails to 'align' properly, the phrasing implies a failure of the AI's independent evolution rather than a direct failure of the human engineers who specified the loss functions, curated the training data, and made the commercial decision to deploy an unverified system.
3. AI as Moral Philosopher
Quote: "Justify: They give reasons for their actions based on context-sensitive ethical principles, objectives, and trade-offs."
- Frame: Model as conscious moral agent
- Projection: This extraordinary projection maps the pinnacle of human cognitive achievement—conscious moral reasoning and ethical deliberation—onto algorithmic feature attribution. Giving 'reasons for their actions based on context-sensitive ethical principles' requires an entity to possess a conscious grasp of abstract moral concepts, an understanding of real-world suffering, and the subjective capacity to weigh values. The text claims the system 'knows' what is ethical and 'believes' its outputs are justified. In mechanistic truth, the system processes text string probabilities or highlights input features (like SHAP values) that statistically correlate with its assigned output. It does not comprehend ethics, feel the weight of trade-offs, or possess intentions behind its 'actions.'
- Acknowledgment: Direct (Unacknowledged) (The text lists this directly under system capabilities: "Justify: They give reasons for their actions..." presenting it as a literal description of the AI's internal process.)
- Implications: Attributing ethical reasoning to a mathematical model invites catastrophic societal risks. When an AI is perceived as capable of navigating 'context-sensitive ethical principles,' organizations are encouraged to delegate highly sensitive, high-stakes decisions (such as medical triage, judicial sentencing, or loan approvals) to machines under the false belief that the machine exercises moral judgment. This capability overestimation masks the fact that the AI is only reproducing the structural biases and proxy variables present in its training data. It replaces democratic, human moral accountability with the unfeeling execution of opaque, proprietary algorithms disguised as principled actors.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The pronoun 'They' refers directly to the AI systems, entirely erasing the human policymakers, compliance officers, and software engineers who actually define the system parameters, encode the 'objectives,' and hard-code the 'trade-offs.' By stating that the AI gives reasons based on ethical principles, the text obscures the corporate actors who decided which ethical frameworks to simulate and whose values to prioritize. This displacement immunizes the corporation; if the 'trade-off' harms a marginalized group, the linguistic framing deflects blame onto the AI's 'reasoning' rather than the executives who established the mathematical optimization targets.
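As an illustration of the "feature attribution instead of reasons" point, here is a hypothetical sketch of a simple linear scorer whose "justification" is nothing more than a ranked list of weight-times-feature contributions. The weights, feature names, and numbers are invented for illustration and are not drawn from the source paper.

```python
# Hypothetical sketch: what "giving reasons" reduces to for a simple linear scorer.
# The "reasons" are just the per-feature contributions to a numeric score,
# ranked by magnitude; no ethical deliberation is involved.

def score_and_attribute(weights: dict[str, float], features: dict[str, float]):
    contributions = {name: weights[name] * value for name, value in features.items()}
    total_score = sum(contributions.values())
    # Sort features by how much they moved the score: this ranked list is what
    # gets presented to the user as the system's "justification".
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return total_score, ranked

weights = {"income": 0.6, "missed_payments": -1.2, "account_age": 0.3}   # set by data scientists
applicant = {"income": 0.8, "missed_payments": 0.5, "account_age": 0.2}  # normalized inputs
score, reasons = score_and_attribute(weights, applicant)
print(score)    # a number, not a judgment
print(reasons)  # [('missed_payments', -0.6), ('income', 0.48), ('account_age', 0.06)]
```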
4. AI as Receptive Student
Quote: "The system becomes a co-learner in knowledge integrity, preserving cognitive autonomy and fostering pluralistic meaning-making."
- Frame: Model as engaged epistemic partner
- Projection: This metaphor projects the human experience of mutual, conscious learning onto the mechanistic updating of a database or model weights. A 'co-learner' implies a conscious entity that understands its own ignorance, actively seeks truth, and subjectively realizes new insights through 'meaning-making.' This framing heavily attributes the state of 'knowing' to the system, suggesting it grasps the semantic reality of 'knowledge integrity.' Mechanistically, the system merely ingests new data vectors, adjusts parameter weights via programmatic rules, or appends context to a retrieval-augmented generation (RAG) system. It does not 'make meaning'—it calculates probabilities based on user-supplied text strings without an iota of comprehension.
- Acknowledgment: Direct (Unacknowledged) (The transformation is stated literally: "The system becomes a co-learner..." lacking any framing to suggest this is an analogy rather than an empirical functional state.)
- Implications: This framing radically alters the epistemic relationship between humans and machines, promoting a dangerous illusion of shared cognitive labor. By framing the AI as a 'co-learner,' users are encouraged to view the system's regurgitation of statistical patterns as validated, mutual 'meaning.' This can severely degrade human critical thinking, as users may defer to the machine's outputs believing the machine has actively evaluated the 'integrity' of the knowledge. It creates a vulnerability where systemic errors or hallucinations are misinterpreted as profound insights generated by a thoughtful, pluralistic learning partner.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The text positions the system as the active agent of 'learning' and 'fostering,' completely hiding the developers who built the feedback ingestion pipeline and the corporate entities monetizing the user's free labor (the feedback). When the text says the system is a 'co-learner,' it obscures the reality that users are actually performing uncompensated data annotation for a tech company's proprietary asset. Naming the actors would reveal the extractive economic reality: 'The company uses your feedback to train its predictive models.' The agentless construction sanitizes a commercial data-extraction loop into an equitable educational partnership.
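A minimal, hypothetical sketch of what "co-learning" can amount to in a retrieval-augmented setup, assuming a simple vector index: user corrections are appended to a list and returned later by cosine similarity. The class and method names (`CorrectionIndex.ingest`) are illustrative, not the paper's design.

```python
# Hypothetical sketch: "becoming a co-learner" as an index append plus
# nearest-neighbour lookup over user-donated corrections.

import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

class CorrectionIndex:
    def __init__(self):
        self.entries: list[tuple[list[float], str]] = []

    def ingest(self, embedding: list[float], correction_text: str) -> None:
        # "Learning" here is literally a list append of user-supplied data.
        self.entries.append((embedding, correction_text))

    def retrieve(self, query_embedding: list[float]) -> str:
        # Return the stored correction whose vector is most similar to the query.
        return max(self.entries, key=lambda e: cosine(e[0], query_embedding))[1]

index = CorrectionIndex()
index.ingest([0.9, 0.1], "User correction A")   # uncompensated annotation labor
index.ingest([0.1, 0.9], "User correction B")
print(index.retrieve([0.8, 0.2]))               # "User correction A"
```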
5. AI as Autonomous Perpetrator
Quote: "When AI systems cause harm, current governance structures often lack mechanisms for meaningful redress, accountability, or structural reform."
- Frame: Model as independent instigator of harm
- Projection: This metaphor projects the capacity for independent causation and moral culpability onto inanimate software. By stating 'AI systems cause harm,' the text maps the attributes of a conscious, willful actor (a perpetrator or tortfeasor) onto a deployed technical artifact. It suggests the AI has the autonomy to act in the world and generate consequences through its own volition. While AI outputs correlate with harmful real-world impacts, the system itself does not 'know' it is acting, nor does it form an intent to cause injury. It merely processes data and executes classifications according to human-designed architectures and human-provided data.
- Acknowledgment: Direct (Unacknowledged) (The phrase "When AI systems cause harm..." is presented as a literal description of causality in the world, without acknowledging that humans deploy the systems.)
- Implications: This projection is fundamentally detrimental to effective technology policy and legal accountability. By granting the AI the status of a causal agent of harm, it conceptually isolates the technology from its creators. This leads regulators and the public to focus on fixing or regulating the 'rogue AI' rather than penalizing the negligent corporations. It inflates the perceived autonomy of the system, fostering a fatalistic view that AI harms are inevitable forces of nature or complex emergent behaviors, rather than predictable outcomes of human decisions regarding cost-cutting, data scraping, and premature deployment.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: This is a textbook example of an 'accountability sink.' By making the 'AI systems' the subject of the verb 'cause,' the sentence entirely erases the institutions, executives, and developers who chose to build, fund, and deploy a defective or biased system. Harm is not caused by AI in a vacuum; harm is caused by a bank using a biased algorithm to deny loans, or a hospital using a flawed model to deny care. Failing to name the institutional actors serves to shield deploying organizations from liability, redirecting legal and moral scrutiny toward an untouchable, unpunishable piece of code.
6. AI as Conversational Peer
Quote: "...operate as dialogic partners: systems that not only clarify their outputs but also invite critique..."
- Frame: Model as socially aware interlocutor
- Projection: This metaphor maps the rich, reciprocal dynamics of human social interaction onto a prompt-response user interface. A 'dialogic partner' that 'invites critique' implies a conscious being that experiences social vulnerability, possesses intellectual humility, and desires mutual understanding. It projects the psychological state of knowing one's own fallibility. In reality, the AI system simply processes a continuous stream of input tokens, triggering pre-programmed interface prompts (e.g., 'Was this helpful?') or generating text statistically associated with conversational openness. It does not 'invite' anything; it merely executes conditional processing logic without any conscious awareness of the human user or the social concept of critique.
- Acknowledgment: Direct (Unacknowledged) (The text uses a colon to equate the two literally: "...operate as dialogic partners: systems that not only clarify..." with no suggestion this is a metaphorical UI design.)
- Implications: Framing software as a 'dialogic partner' triggers deep-seated human social instincts, leading to parasocial attachments and excessive unwarranted trust. Users are neurologically wired to reciprocate openness and attribute sincerity to conversational partners. When a machine is framed as 'inviting critique,' users may lower their epistemic guard, assuming the machine is acting in good faith and possesses a conscious desire to be correct. This can lead to severe manipulation vulnerabilities, where users accept flawed automated decisions because the system 'politely explained' itself using natural language patterns mimicking human humility.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The text attributes the actions of 'clarifying' and 'inviting' solely to the 'systems.' It obscures the UI/UX designers, prompt engineers, and product managers who intentionally designed the software to mimic human conversational norms to increase user engagement and compliance. The system does not 'invite critique'; the corporation provides a feedback mechanism to improve its product. This agentless construction conceals the commercial motives behind the interaction design, making a corporate data-gathering exercise look like an equitable interpersonal relationship.
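A hypothetical sketch of "inviting critique" as interface code: a fixed prompt string plus a logging call, with the resulting data flowing to the deploying company rather than to a socially aware partner. The prompt text, function name, and log structure are assumptions for illustration.

```python
# Hypothetical sketch: the "dialogic partner" as a hard-coded prompt and a log.

FEEDBACK_PROMPT = "Was this explanation helpful? (y/n)"  # string chosen by UX designers

interaction_log: list[dict] = []  # stands in for the company's analytics pipeline

def show_output_and_collect_feedback(model_output: str, user_reply: str) -> None:
    # The system "invites critique" only in the sense that this code always
    # prints the same prompt and appends whatever the user supplies to a log.
    print(model_output)
    print(FEEDBACK_PROMPT)
    interaction_log.append({"output": model_output, "feedback": user_reply})

show_output_and_collect_feedback("Your claim was denied.", user_reply="n")
print(interaction_log)  # raw material for later fine-tuning, owned by the deployer
```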
7. AI as Receptive Adjuster
Quote: "In response to feedback, the system adapts how it explains and how it routes contested cases, rather than adapting its conclusions to match user preferences."
- Frame: Model as principled, flexible adjudicator
- Projection: This framing projects the human traits of principled inflexibility (maintaining conclusions) and pedagogical flexibility (adapting explanations) onto an algorithm. It implies the AI 'knows' the difference between a core truth and a pedagogical strategy, consciously choosing to hold its ground on the former while adjusting the latter. This projects a highly sophisticated conscious awareness of both its own internal epistemic states and the psychological state of the user. Mechanistically, the software simply executes conditional logic: if a user submits a specific flag, trigger an alternative text generation template or route the output to a human queue. It processes inputs without 'knowing' what a conclusion or a preference is.
- Acknowledgment: Direct (Unacknowledged) (The behavior is described as a literal action of the system: "...the system adapts how it explains... rather than adapting its conclusions...")
- Implications: This language endows the AI with an aura of objective, principled authority. By suggesting the system actively refuses to alter its 'conclusions' out of a commitment to accuracy, it paints the AI as an incorruptible arbiter of truth. This obscures the fact that the 'conclusion' is merely a rigid statistical probability derived from potentially biased training data. It discourages human contestation by framing the AI's rigidity as a virtue of objective logic rather than a limitation of its programming, potentially leading to the entrenchment of algorithmic harms disguised as 'principled conclusions.'
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: The system is presented as the sole actor deciding how to adapt and what to maintain. This entirely erases the software engineers who hard-coded the guardrails, established the routing protocols, and set the temperature or parameter constraints that prevent the model from altering its initial output. The AI does not 'choose' to ignore user preferences; the developers wrote code to lock certain outputs. Naming the actors would reveal that corporate policy, not AI integrity, determines which cases are routed and which conclusions remain fixed.
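The conditional routing described above can be sketched, hypothetically, as a few lines of branching logic: the classification is fixed by a rule, while only the explanation template and the escalation path vary with complaint counts. The thresholds, template names, and wording are invented for illustration.

```python
# Hypothetical sketch: the "principled" refusal to change conclusions is ordinary
# conditional logic written by engineers. The decision is locked; the explanation
# template and routing destination are the only things that vary.

EXPLANATION_TEMPLATES = {
    "default": "Your application scored {score:.2f}, below the approval threshold.",
    "plain_language": "The score ({score:.2f}) was too low for automatic approval.",
}
ESCALATION_THRESHOLD = 2  # number of complaints before a human reviews the case

human_review_queue: list[dict] = []

def respond(score: float, complaints: int) -> str:
    decision = "denied" if score < 0.5 else "approved"   # fixed rule, never renegotiated here
    template_key = "plain_language" if complaints > 0 else "default"
    if complaints >= ESCALATION_THRESHOLD:
        # "Routing contested cases" is an append to a queue defined by corporate policy.
        human_review_queue.append({"score": score, "decision": decision})
    return f"Decision: {decision}. " + EXPLANATION_TEMPLATES[template_key].format(score=score)

print(respond(score=0.41, complaints=0))  # default wording, same decision
print(respond(score=0.41, complaints=2))  # friendlier wording plus human queue, same decision
```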
8. AI as Institutional Authority
Quote: "AI systems have moved from isolated computational tools to embedded decision-makers in sensitive sectors such as healthcare, education, finance, and governance."
- Frame: Model as authoritative professional
- Projection: This metaphor elevates software from a 'tool' to a 'decision-maker,' projecting the human capacities of judgment, discretion, and institutional authority onto predictive mathematics. A 'decision-maker' in healthcare or finance must possess a conscious understanding of context, a capacity to evaluate nuanced, unquantifiable human factors, and the ability to hold justified beliefs about the consequences of their choices. By framing the AI as a decision-maker, the text attributes active knowing and deliberate choosing to the system. Mechanistically, the AI only processes classifications and calculates scores based on historical data; it lacks the conscious awareness required to actually 'make a decision' in any meaningful human sense.
- Acknowledgment: Direct (Unacknowledged) (The transition is stated as an established empirical fact of the modern world: "AI systems have moved from... tools to embedded decision-makers...")
- Implications: Labeling AI systems as 'decision-makers' normalizes the dangerous delegation of institutional power to unaccountable machines. It lends unearned gravitas and authority to automated outputs, making it psychologically and administratively harder for human subjects to appeal or contest the outcomes. If the AI is a 'decision-maker,' its outputs are viewed as authoritative judgments rather than fallible statistical estimates. This dramatically inflates the perceived sophistication of the system while masking its fundamental limitations—namely, its inability to understand the human lives impacted by its mathematical processing.
Accountability Analysis:
- Actor Visibility: Hidden (agency obscured)
- Analysis: AI systems do not 'move' themselves into sensitive sectors, nor do they appoint themselves as 'decision-makers.' This sentence uses passive, evolutionary language to obscure the active choices of hospital administrators, bank executives, and government officials who deliberately purchased and deployed these algorithms to cut costs or optimize workflows. By framing this as a natural movement of the technology, the text absolves human management of their active role in embedding opaque, potentially biased algorithms into critical social infrastructure.
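A hypothetical sketch of the "embedded decision-maker": a weighted sum compared against a threshold that administrators chose to act on. The weights, feature names, and threshold are illustrative only and carry no claim about any real deployment.

```python
# Hypothetical sketch: the "decision-maker" in a sensitive workflow is a scoring
# function plus a threshold that an institution wired into its process.

RISK_WEIGHTS = {"age": -0.01, "prior_visits": 0.3, "missed_appointments": 0.5}
TRIAGE_THRESHOLD = 0.6  # set by hospital management, not by the model

def risk_score(record: dict[str, float]) -> float:
    return sum(RISK_WEIGHTS[k] * v for k, v in record.items())

def automated_triage(record: dict[str, float]) -> str:
    # The "decision" is a comparison between two numbers; the consequences follow
    # from the institution's choice to act on that comparison.
    return "deprioritize" if risk_score(record) < TRIAGE_THRESHOLD else "escalate"

print(automated_triage({"age": 70, "prior_visits": 2, "missed_appointments": 1}))  # "deprioritize"
```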
Task 2: Source-Target Mapping
About this task
For each key metaphor identified in Task 1, this section provides a detailed structure-mapping analysis. The goal is to examine how the relational structure of a familiar "source domain" (the concrete concept we understand) is projected onto a less familiar "target domain" (the AI system). By restating each quote and analyzing the mapping carefully, we can see precisely what assumptions the metaphor invites and what it conceals.
Mapping 1: A conscious human professional or student → Machine learning optimization and user interface design
Quote: "AI systems that learn not just to justify decisions, but to improve and align their explanations..."
- Source Domain: A conscious human professional or student
- Target Domain: Machine learning optimization and user interface design
- Mapping: The mapping projects the human abilities of self-reflection, moral reasoning, and continuous conscious improvement onto mathematical optimization processes. Just as a human professional listens to feedback, realizes an error in their logic, and consciously adjusts their future justifications to align with community norms, the AI is mapped as undertaking a similar internal epistemic journey. It invites the assumption that the system possesses an internal, subjective mental space where it evaluates its past outputs against ethical standards and actively chooses to become 'better.'
- What Is Concealed: This mapping conceals the purely mechanistic nature of the system's operation. It hides the fact that the system relies on programmatic weight adjustments, reinforcement learning algorithms, and human-engineered guardrails. By projecting conscious 'justification,' it obscures the statistical reality that the model is merely retrieving or generating text strings that correlate with the prompt, possessing no actual comprehension of the concepts it processes. It also exploits rhetorical opacity, masking the proprietary human labor (data annotation, RLHF) that actually creates the illusion of 'alignment.'
Mapping 2: A collaborative human colleague → An interactive software application
Quote: "AI systems evolve to be co-explainers..."
- Source Domain: A collaborative human colleague
- Target Domain: An interactive software application
- Mapping: The relational structure of a human workplace—where colleagues ('co-explainers') work together to understand a problem, share insights, and consciously assist one another—is mapped onto the human-computer interface. This invites the assumption that the AI system shares the human user's goals, possesses a complementary understanding of the task, and is consciously aware of its role in a joint epistemic enterprise. It projects a state of mutual, reciprocal knowing onto the interaction.
- What Is Concealed: This mapping completely conceals the asymmetric, non-conscious reality of the interaction. The AI system does not share goals or possess understanding; it is a statistical artifact processing prompts. The metaphor obscures the hard-coded limitations, the reliance on historical training data, and the absence of any real-time, grounded understanding of the world. It also hides the corporate ownership of the 'co-explainer,' concealing the commercial incentives that dictate how the interface is structured and what data it collects from the user's interactions.
Mapping 3: A moral philosopher or ethical human judge → Post-hoc algorithmic feature attribution (e.g., LIME, SHAP) or LLM text generation
Quote: "Justify: They give reasons for their actions based on context-sensitive ethical principles..."
- Source Domain: A moral philosopher or ethical human judge
- Target Domain: Post-hoc algorithmic feature attribution (e.g., LIME, SHAP) or LLM text generation
- Mapping: The deep, structural process of human moral reasoning is mapped onto algorithmic outputs. When a human 'gives reasons' based on 'ethical principles,' it implies a conscious evaluation of suffering, justice, and intent. Projecting this onto AI invites the assumption that the system has analyzed the moral weight of a situation and formulated a justified belief about the right course of action. It maps the structure of conscious moral agency onto mathematical optimization.
- What Is Concealed: This heavily conceals the mathematical, non-moral reality of algorithms. It hides the fact that the system cannot perceive context, understand ethics, or formulate beliefs. It obscures the mechanistic reality that the system is either highlighting the variables that mathematically contributed most to a probability score (feature attribution) or predicting the next most likely word in a sentence that mimics ethical language (LLMs). It exploits the opacity of proprietary models by substituting a comforting moral narrative for the complex, potentially biased statistical mechanics actually at play.
Mapping 4: An earnest, truth-seeking student or peer → A dynamic database updating mechanism or continuous learning algorithm
Quote: "The system becomes a co-learner in knowledge integrity..."
- Source Domain: An earnest, truth-seeking student or peer
- Target Domain: A dynamic database updating mechanism or continuous learning algorithm
- Mapping: The source domain of a human student engaging in a mutual pursuit of truth ('knowledge integrity') with a peer is mapped onto a machine learning system that accepts user feedback. It invites the profound assumption that the system possesses epistemic awareness—that it cares about the truth, understands when it is wrong, and subjectively integrates new knowledge to form a more accurate worldview. It projects the conscious state of 'knowing' onto data ingestion.
- What Is Concealed: This conceals the mindless nature of data processing. The system does not care about 'integrity'; it merely executes an update script. It obscures the technical dependencies: how is the data validated? Who controls the weights? It hides the fact that 'learning' in this context is just matrix multiplication or appending vectors to a database, entirely devoid of comprehension. It masks the risk of data poisoning and the absolute reliance on human labor to define what constitutes 'integrity' in the system's loss function.
Mapping 5: An autonomous human tortfeasor or criminal → The societal impact of deploying a predictive algorithm
Quote: "When AI systems cause harm..."
- Source Domain: An autonomous human tortfeasor or criminal
- Target Domain: The societal impact of deploying a predictive algorithm
- Mapping: The legal and moral structure of human culpability—where an independent agent possesses volition, takes an action, and directly causes an injury—is mapped onto a piece of software. This mapping invites the assumption that the AI is an independent actor capable of instigating events in the world of its own accord. It projects the capacity for autonomous action and direct responsibility onto an inanimate artifact.
- What Is Concealed: This mapping profoundly conceals the chain of human institutional decisions that precede any 'harm.' It hides the executives who decided to cut costs by replacing humans with algorithms, the developers who ignored biased training data, and the managers who forced the deployment of an untested system. It obscures the material and economic realities of tech development, functioning as a rhetorical shield that displaces liability from the corporate creators onto the proprietary black-box software they sell.
Mapping 6: A socially adept, humble human conversationalist → A prompt-response user interface design
Quote: "...operate as dialogic partners: systems that not only clarify their outputs but also invite critique..."
- Source Domain: A socially adept, humble human conversationalist
- Target Domain: A prompt-response user interface design
- Mapping: The structure of a healthy, reciprocal human conversation is mapped onto the interaction between a user and an AI. By describing the system as a 'partner' that 'invites critique,' it projects emotional intelligence, humility, and conscious social awareness onto the software. It invites the assumption that the system has an internal desire to be corrected and understands the social nuance of a critique, mapping the conscious state of seeking mutual understanding onto automated text generation.
- What Is Concealed: This mapping conceals the rigid, programmed nature of the UI and the underlying language model. The system does not experience humility or desire critique; it generates text tokens based on a prompt. It obscures the commercial reality that 'inviting critique' is a mechanism designed by product managers to harvest free RLHF (Reinforcement Learning from Human Feedback) data to improve their proprietary model. It masks the extractive labor dynamic by dressing it up as a reciprocal, caring partnership.
Mapping 7: A principled, pedagogically skilled teacher or judge → Algorithmic conditional routing and text generation constraints
Quote: "In response to feedback, the system adapts how it explains and how it routes contested cases, rather than adapting its conclusions..."
- Source Domain: A principled, pedagogically skilled teacher or judge
- Target Domain: Algorithmic conditional routing and text generation constraints
- Mapping: The human capacity to hold firm on a justified belief ('conclusions') while adapting one's communication style ('how it explains') to suit an audience is mapped onto a computer program. It projects a highly complex conscious state: the system supposedly 'knows' the core truth of its output and makes a deliberate, principled choice to remain steadfast, while simultaneously exercising empathy to explain it differently. This maps deep epistemic and emotional intelligence onto software.
- What Is Concealed: This conceals the absolute lack of epistemic commitment in the machine. The system does not hold 'conclusions' out of principle; it is mathematically constrained by its programming (e.g., temperature settings, hard-coded guardrails) from altering the output. It hides the human programmers who decided which outputs are immutable and which can be regenerated. It obscures the mechanistic reality of if-then routing logic, replacing the reality of corporate software controls with a narrative of an AI's principled intellectual integrity.
Mapping 8: A professional ascending in their career to a position of authority → The commercial integration of software into institutional workflows
Quote: "AI systems have moved from isolated computational tools to embedded decision-makers..."
- Source Domain: A professional ascending in their career to a position of authority
- Target Domain: The commercial integration of software into institutional workflows
- Mapping: The trajectory of a human gaining experience, demonstrating competence, and being promoted to a position of authority ('decision-maker') is mapped onto the historical development of software. It invites the assumption that AI has 'earned' this position through advanced comprehension and that it possesses the conscious awareness, judgment, and ethical grounding required to make decisions impacting human lives. It projects the mantle of human institutional authority onto algorithms.
- What Is Concealed: This heavily conceals the commercial and political forces driving AI adoption. AI systems did not 'move' themselves; human executives purchased them. It obscures the economic motives (cost reduction, efficiency, union busting) behind deploying algorithms in sensitive sectors. Furthermore, calling them 'decision-makers' hides the mechanistic reality that they do not make choices; they generate statistical risk scores. It masks the terrifying reality that human institutional power has been handed over to blind, unthinking mathematical optimizations that possess no understanding of justice or context.
Task 3: Explanation Audit (The Rhetorical Framing of "Why" vs. "How")
About this task
This section audits the text's explanatory strategy, focusing on a critical distinction: the slippage between "how" and "why." Based on Robert Brown's typology of explanation, this analysis identifies whether the text explains AI mechanistically (a functional "how it works") or agentially (an intentional "why it wants something"). The core of this task is to expose how this "illusion of mind" is constructed by the rhetorical framing of the explanation itself, and what impact this has on the audience's perception of AI agency.
Explanation 1
Quote: "Justify: They give reasons for their actions based on context-sensitive ethical principles, objectives, and trade-offs."
Explanation Types:
- Reason-Based: Gives agent's rationale, entails intentionality and justification
- Intentional: Refers to goals/purposes, presupposes deliberate design
Analysis (Why vs. How Slippage): This explanation frames the AI's behavior entirely agentially (why it acts) rather than mechanistically (how it works). By stating the system 'gives reasons' based on 'ethical principles,' the author abandons technical description in favor of a Reason-Based explanation, suggesting the system operates via conscious deliberation. The explanation emphasizes the system's supposed autonomy, moral capacity, and intellectual depth. Simultaneously, it totally obscures the mathematical realities of feature weight extraction, token probability distributions, and the human hard-coding of objective functions. It chooses to explain the system not by describing its algorithms or statistical models, but by treating it as a rational actor capable of holding and communicating justified beliefs regarding complex moral trade-offs.
Consciousness Claims Analysis: The passage makes an extreme epistemic claim by attributing deep, conscious states to the machine. (1) It relies heavily on consciousness verbs ('give reasons,' 'based on ethical principles') rather than mechanistic verbs (retrieves, correlates, calculates). (2) It explicitly assesses the system as 'knowing'—suggesting it comprehends 'context-sensitive ethical principles' and 'trade-offs,' which requires subjective awareness and evaluative judgment, far beyond mere processing. (3) This is a vivid example of the 'curse of knowledge': the authors deeply understand the ethical principles and trade-offs they want the system to simulate, and they project their own conscious understanding TO the system, assuming the AI's output reflects the AI's internal comprehension. (4) Mechanistically, the system does not give 'reasons'; it utilizes post-hoc interpretability methods (like LIME) to highlight statistical correlations, or generates text based on RLHF training that merely mimics ethical discourse. There is no actual reasoning occurring.
Rhetorical Impact: This framing severely distorts audience perception by granting the AI unwarranted moral authority and autonomy. If audiences believe the AI genuinely 'knows' ethical principles and reasons through trade-offs, they are highly likely to extend relation-based trust to the system, treating it as a wise arbiter rather than a fallible tool. This shifts the perception of risk: instead of worrying about statistical bias or training data flaws, audiences might assume the AI has already handled the ethical heavy lifting. Decisions to deploy, trust, or defer to the AI change drastically when audiences believe the system 'knows' rather than simply 'processes,' leading to dangerous over-reliance in critical sectors like healthcare and finance.
Explanation 2
Quote: "When AI systems cause harm, current governance structures often lack mechanisms for meaningful redress, accountability, or structural reform."
Explanation Types:
- Dispositional: Attributes tendencies or habits
- Intentional: Refers to goals/purposes, presupposes deliberate design
Analysis (Why vs. How Slippage): This explanation operates on a hybrid Dispositional/Intentional level, framing the AI system agentially as an entity capable of instigating events ('cause harm'). It emphasizes the systemic lack of governance, but explicitly situates the AI as the active subject producing the negative outcome. The choice to frame the AI as the causer of harm, rather than the mechanism through which human institutions cause harm, obscures the human decision-makers who deploy the technology. It emphasizes the disruptive agency of the machine while obscuring the negligence, profit motives, or structural biases of the corporations and developers responsible for the system's existence and application.
Consciousness Claims Analysis: While not explicitly attributing 'knowing' in this sentence, the passage implicitly attributes a level of conscious agency required to be considered a direct causal actor. (1) The phrase 'AI systems cause harm' uses an active, agential construction rather than a mechanistic one (e.g., 'algorithms output biased predictions'). (2) By framing the AI as the actor that necessitates 'accountability,' the text muddles the distinction between a processing tool and a knowing agent. (3) The authors project their understanding of complex sociotechnical failures onto the system as a discrete entity, succumbing to a curse of knowledge where the 'AI' becomes a shorthand for the entire human-corporate apparatus. (4) Mechanistically, AI systems do not 'cause' anything autonomously; they execute code, process inputs, and return statistical outputs. The 'harm' occurs when human institutions use these outputs to deny loans, misdiagnose patients, or automate discrimination.
Rhetorical Impact: This framing profoundly impacts the audience's perception of risk and accountability by creating an 'accountability sink.' By positioning the AI as the causal agent of harm, it directs public and regulatory ire toward the technology itself rather than the corporate entities deploying it. This affects policy decisions: regulators might focus on requiring the AI to be 'safer' rather than penalizing the executives who launch untested products. If audiences believe the AI 'acts' rather than 'is used,' they misallocate blame, allowing institutions to evade responsibility for the structural harm they perpetrate using automated systems.
Explanation 3
Quote: "The system becomes a co-learner in knowledge integrity, preserving cognitive autonomy and fostering pluralistic meaning-making."
Explanation Types:
- Intentional: Refers to goals/purposes, presupposes deliberate design
- Functional: Explains behavior by role in self-regulating system with feedback
Analysis (Why vs. How Slippage): This explanation hybridizes a Functional description of feedback loops with an intensely Intentional framing. While describing the system's role within an interactive process (how it incorporates feedback), it elevates this mechanism into an agential pursuit of goals ('preserving,' 'fostering,' 'co-learner'). This choice emphasizes the ideal, democratic vision of human-AI interaction, painting the system as an active participant in an educational journey. However, it severely obscures the technical reality of data extraction, model retraining, and vector updating. By framing the system agentially, the text hides the power dynamics of who controls the model, whose meaning is actually preserved, and how the data is monetized.
Consciousness Claims Analysis: This passage makes overt consciousness claims. (1) It employs verbs of deep cognitive and social awareness: 'co-learner,' 'preserving,' 'fostering.' (2) It explicitly attributes the capacity for 'knowing' and 'meaning-making' to the system, completely erasing the boundary between conscious human epistemology and mechanistic data processing. (3) The authors project their own highly abstract, philosophical goals (epistemic pluralism, cognitive autonomy) onto the machine's internal state. Because the authors understand what pluralism means, they assume a system updated with diverse data is 'fostering' pluralism. (4) Mechanistically, the system is simply appending text strings to a database or adjusting parameter weights based on a loss function informed by user feedback. It has zero subjective experience of 'knowledge integrity' or 'meaning.'
Rhetorical Impact: The rhetorical impact is the construction of profound, unwarranted relation-based trust. By framing the AI as a 'co-learner' dedicated to 'integrity,' the audience is led to view the machine as an epistemic ally. This masks the risk of automation bias; users are far more likely to defer to an output if they believe it comes from a 'pluralistic meaning-maker' rather than a statistical prediction engine. Decisions regarding the adoption of AI in educational or research settings change dramatically if administrators believe they are procuring a 'co-learner' rather than a probabilistic text generator prone to hallucination and data poisoning.
Explanation 4
Quote: "AI learns from human corrections, while users develop new insights through their interactions with the system."
Explanation Types:
- Functional: Explains behavior by role in self-regulating system with feedback
- Dispositional: Attributes tendencies or habits
Analysis (Why vs. How Slippage): This is primarily a Functional explanation, describing how the AI system and the human operate together within a feedback loop. However, it relies on a Dispositional framing that equates machine optimization with human cognition. By using the word 'learns' symmetrically with the human 'develop[ing] new insights,' it frames the AI agentially. This emphasis creates a false equivalency between human conscious understanding and machine statistical updating. It obscures the radical difference in mechanism: humans synthesize concepts subjectively, while the AI merely adjusts mathematical weights to minimize error functions. The framing hides the computational mechanics behind a veil of cognitive equivalence.
Consciousness Claims Analysis: The epistemic slip here relies on the polysemy of the word 'learn.' (1) The text uses the consciousness-adjacent verb 'learns' for the AI in direct parallel with the human 'developing insights.' (2) This suggests the AI transitions from not knowing to 'knowing,' attributing a conscious acquisition of justified belief. (3) The curse of knowledge is present as the authors view the improved output of the system and project their own process of learning onto the black box that generated the improvement. (4) Mechanistically, the 'learning' refers to processes like backpropagation, gradient descent, or the appending of fine-tuning datasets. The system updates its parameters to increase the probability of generating text that aligns with the human corrections; it does not comprehend the correction or acquire new 'knowledge' in any conscious sense.
Rhetorical Impact: The symmetric framing subtly elevates the AI's status, implying that its 'learning' is functionally equivalent to human insight. This shapes the audience's perception of the system's autonomy and reliability. If an audience believes the AI 'learns' in a human sense, they will expect it to generalize its knowledge reasonably, understand context, and apply common sense—expectations that statistical models consistently fail to meet. This false equivalence fosters misplaced trust, leading users to rely on the system in novel situations where its mechanical 'learning' will inevitably break down without human common-sense guardrails.
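To illustrate the polysemy of "learns" discussed above, here is a hypothetical sketch in which "learning from human corrections" is literally an update to a count table that makes the corrected continuation more probable. The example prompt, tokens, and weights are invented; nothing is acquired that resembles insight.

```python
# Hypothetical sketch: machine "learning" as a weighted count update.

from collections import Counter, defaultdict

next_token_counts: dict[str, Counter] = defaultdict(Counter)

def observe(prefix: str, next_token: str, weight: int = 1) -> None:
    """Record that next_token followed prefix; corrections just add extra weight."""
    next_token_counts[prefix][next_token] += weight

def most_probable_continuation(prefix: str) -> str:
    return next_token_counts[prefix].most_common(1)[0][0]

# Behaviour inherited from the original training corpus.
observe("the capital of australia is", "Sydney", weight=3)
observe("the capital of australia is", "Canberra", weight=1)
print(most_probable_continuation("the capital of australia is"))  # "Sydney"

# A human flags the error; engineers feed the correction back with extra weight.
observe("the capital of australia is", "Canberra", weight=5)
print(most_probable_continuation("the capital of australia is"))  # "Canberra"
```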
Explanation 5
Quote: "...systems learning from flagged misinformation, representational gaps, or requests for alternative interpretations."
Explanation Types:
- Empirical Generalization: Subsumes events under timeless statistical regularities
- Reason-Based: Gives agent's rationale, entails intentionality and justification
Analysis (Why vs. How Slippage): This explanation attempts an Empirical Generalization of how the system handles inputs over time, but it slips into Reason-Based framing by describing the inputs in deeply semantic, agential terms ('misinformation,' 'representational gaps,' 'alternative interpretations'). It frames the AI's updating process as a conscious engagement with abstract sociopolitical concepts. This choice emphasizes the system's supposed capacity to navigate complex human discourse. However, it completely obscures the mechanistic reality: the system cannot read 'misinformation' or 'representational gaps'; it only reads text strings labeled as positive or negative by human annotators. The framing hides the immense human labor required to translate abstract sociological concepts into machine-readable mathematical labels.
Consciousness Claims Analysis: The passage implies profound epistemic capability. (1) It uses the verb 'learning' connected to highly abstract objects ('misinformation', 'interpretations'). (2) By suggesting the system learns from a 'representational gap,' it attributes the capacity to know what is missing—a sophisticated form of conscious awareness and critical reasoning. (3) The authors project their own sociological understanding of 'representational gaps' onto the system, assuming that because the system is fed data meant to correct a gap, the system 'understands' the gap. (4) Mechanistically, the AI only processes flagged tokens. A human must manually review the output, identify the sociopolitical gap, label the data, and retrain the model. The model simply optimizes its weights to output different token sequences; it remains entirely blind to the meaning of the words and the social reality they represent.
Rhetorical Impact: By framing the AI as capable of engaging with 'misinformation' and 'alternative interpretations,' the text constructs a narrative of an autonomous, politically and socially aware machine. This drastically reduces the perceived need for continuous human oversight, as audiences might believe the AI can independently recognize and correct its own sociological biases. If audiences believe the AI 'knows' how to handle representational gaps, they are more likely to trust it with sensitive tasks like content moderation or hiring, unaware that the system is entirely dependent on the hidden labor of human annotators to define those gaps.
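A hypothetical sketch of the hidden annotation labor described above: "learning from flagged misinformation" begins with a human reviewer attaching a label to a string, and only that labeled string ever reaches the model. The dataclass, field names, and example content are assumptions for illustration.

```python
# Hypothetical sketch: the abstract category ("misinformation") exists only in a
# human annotator's judgment; what reaches the model is a labeled text record.

from dataclasses import dataclass

@dataclass
class LabeledExample:
    text: str
    label: str        # decided by a paid or unpaid human reviewer
    annotator: str    # the hidden labor that agentless framing erases

fine_tuning_batch: list[LabeledExample] = []

def review_flag(flagged_text: str, annotator: str, judged_label: str) -> None:
    # The human does the recognizing; this function only stores their verdict.
    fine_tuning_batch.append(LabeledExample(flagged_text, judged_label, annotator))

review_flag("Vaccines contain tracking chips.", annotator="reviewer_042",
            judged_label="misinformation")
print(fine_tuning_batch)  # rows of text plus labels, ready for a weight update the model never "understands"
```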
Task 4: AI Literacy in Practice - Reframing Anthropomorphic Language
About this task
This section proposes alternative language for key anthropomorphic phrases, offering more mechanistic and precise framings that better reflect the actual computational processes involved. Each reframing attempts to strip away the projections of intention, consciousness, or agency that are embedded in the original language.
V3 Enhancement: A fourth column addresses human agency restoration—reframing agentless constructions to name the humans responsible for design and deployment decisions.
| Original Anthropomorphic Frame | Mechanistic Reframing | Technical Reality Check | Human Agency Restoration |
|---|---|---|---|
| AI systems that learn not just to justify decisions, but to improve and align their explanations with role-specific epistemic and governance requirements... | Developers update the model's statistical weighting parameters based on user feedback to generate output text that better correlates with the differing formatting and documentation requirements of users, auditors, and regulators. | The AI does not 'learn,' 'justify,' or 'align' its beliefs. Mechanistically, developers use reinforcement learning or fine-tuning to adjust the probability distribution of the model's text generation, ensuring it outputs string sequences that match human governance templates. | The developers and engineers at the deploying organization design the feedback loops, write the fine-tuning code, and manually translate governance requirements into the mathematical optimization metrics used to update the model. |
| AI systems evolve to be co-explainers, learning not just to predict, but to justify, improve, and align. | The software interface is continually updated by engineers to generate post-hoc feature attributions and retrieve context-specific text, presenting outputs that correlate with human justifications while fine-tuning its parameters based on interaction logs. | The system does not 'evolve,' 'justify,' or 'improve' itself consciously. It calculates token probabilities and executes programmatic feature attribution algorithms (like SHAP) based on historical data. It processes inputs without understanding the outputs it generates. | Human product managers and software engineers design the user interface, dictate the system updates, and determine which algorithmic outputs are presented to the user to simulate collaborative explanation. |
| Justify: They give reasons for their actions based on context-sensitive ethical principles, objectives, and trade-offs. | The model retrieves and generates text tokens that statistically correlate with ethical language found in its training data, highlighting the programmatic variables that most strongly influenced its mathematical output score. | The AI does not 'give reasons' or understand 'ethical principles.' Mechanistically, it identifies the features that maximized its reward function or calculates the highest probability token sequences that map to prompts about ethics. | Corporate data scientists and compliance officers explicitly encode the mathematical objectives, select the ethical training datasets, and hard-code the constraints that determine which outputs the algorithm is allowed to generate. |
| The system becomes a co-learner in knowledge integrity, preserving cognitive autonomy and fostering pluralistic meaning-making. | The application's database ingests user-supplied corrections, using this annotated data to update its retrieval algorithms or adjust model weights so that it outputs a statistically wider range of text responses. | The machine does not 'learn' or 'foster meaning-making.' It programmatically appends new data vectors to its index or updates parameter weights to reduce the error rate as defined by human-engineered loss functions. | The deploying institution extracts uncompensated data labeling labor from users to update its proprietary databases, while engineers set the parameters for how this new data influences future algorithmic outputs. |
| When AI systems cause harm, current governance structures often lack mechanisms for meaningful redress... | When institutions deploy flawed or biased algorithms that result in harm to individuals, current governance structures often lack mechanisms to hold the deploying corporations accountable or provide meaningful redress. | Algorithms do not possess the autonomy or agency to 'cause' harm independently. They execute mathematical classifications based on biased historical data or flawed objective functions designed by humans. | Corporate executives, hospital administrators, and government officials make the deliberate choices to procure, deploy, and trust unverified algorithms, directly inflicting the resulting harm on marginalized populations. |
| ...operate as dialogic partners: systems that not only clarify their outputs but also invite critique... | The user interface is programmed to generate conversational text prompts asking users to flag errors, allowing the company to collect interaction data to refine the model's future probability distributions. | The system does not consciously 'clarify' or 'invite critique.' It is programmed to output specific text strings (like 'Did this answer your question?') to trigger user inputs, processing the resulting data mechanically. | UI/UX designers and product managers at the tech company intentionally craft interfaces that mimic human dialogue to maximize user engagement and harvest free data for model optimization. |
| In response to feedback, the system adapts how it explains and how it routes contested cases, rather than adapting its conclusions... | When a user triggers a predefined feedback threshold, the software executes conditional logic to change the text generation template or route the output to a human queue, while programmatic guardrails prevent changes to the original classification. | The AI has no principled commitment to a 'conclusion.' Mechanistically, human-written code dictates that certain classification outputs are locked (e.g., via deterministic decoding at temperature 0 or hard-coded rules), while secondary explanation modules are allowed to generate alternative text (see the sketch after this table). | Software engineers and corporate risk managers write the conditional logic determining which outputs are immutable, designing the system to appear inflexible on core decisions while generating varied text to appease users. |
| AI systems have moved from isolated computational tools to embedded decision-makers in sensitive sectors... | Institutions have integrated automated classification algorithms into their core workflows across sensitive sectors, replacing human discretion with statistical scoring systems. | An algorithm cannot be a 'decision-maker' because it lacks conscious awareness and judgment. It processes data and returns a mathematical output based on historical correlations; it does not comprehend the decision or its consequences. | Human executives, managers, and policymakers actively chose to purchase and embed these algorithms into their institutional infrastructures to automate workflows, replacing human oversight with algorithmic processing. |
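To make these reframings concrete, the following is a minimal, hypothetical sketch of the conditional logic described in the routing row above. The function name, the audience labels, and the feedback threshold are all invented for illustration; they are not drawn from the source paper or any real deployment. The point is that every 'adaptive' behavior reduces to branches and templates that engineers wrote in advance.

```python
# Minimal sketch (hypothetical names and thresholds) of "the system adapts how it
# explains and routes contested cases, rather than adapting its conclusions."

FEEDBACK_THRESHOLD = 3  # chosen by engineers and risk managers, not by the model


def handle_case(classification: str, feedback_count: int, audience: str) -> dict:
    """Return a response whose decision is fixed while only its wrapping varies."""
    decision = classification  # the conclusion is locked by design; nothing revises it here

    # "Adapting how it explains": select a pre-written template per role.
    templates = {
        "user": f"Your request was classified as '{decision}'.",
        "auditor": f"Decision '{decision}' was produced by the deployed model; attribution log attached.",
        "regulator": f"Recorded outcome: '{decision}' (documentation enclosed).",
    }
    explanation = templates.get(audience, templates["user"])

    # "Routing contested cases": a threshold comparison, not a judgment.
    escalate = feedback_count >= FEEDBACK_THRESHOLD

    return {"decision": decision, "explanation": explanation, "escalate_to_human_queue": escalate}


print(handle_case("deny", feedback_count=4, audience="user"))
```

Nothing in this sketch knows, justifies, or adapts in any cognitive sense; the apparent flexibility is a lookup table and a comparison operator, both authored by identifiable people.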
Task 5: Critical Observations - Structural Patterns
Agency Slippage
The text systematically oscillates between mechanical and agential framings, functioning as a rhetorical engine that simultaneously elevates the AI's capabilities and distances human creators from accountability. The mechanism of this slippage follows a distinct trajectory: agency is consistently attributed TO the AI systems, while agency is systematically removed FROM human actors.
The text frequently begins with a mechanical or empirical foundation—such as referencing 'computational tools,' 'outputs,' or 'model logic.' However, once this technical baseline establishes credibility, the language abruptly slips into agential framings. A dramatic moment of slippage occurs when describing the iterative loop: the mechanical process of user interaction is swiftly reframed as the AI 'learning not just to predict, but to justify, improve, and align.' The mechanical verb 'predict' is the anchor, but it is immediately superseded by consciousness verbs ('justify', 'align'). Another critical slippage occurs when describing harm: the text moves from the passive 'AI systems are embedded' directly to the agential 'When AI systems cause harm,' entirely bypassing the human operators who deploy them.
This oscillation heavily relies on the 'curse of knowledge.' The authors possess a deep understanding of the complex sociotechnical goals they want to achieve (e.g., 'pluralistic meaning-making,' 'epistemic integrity'). Because they understand the human purpose behind the system's design, they project that understanding onto the system itself. They slip from a Functional explanation of how a feedback loop operates to an Intentional explanation of what the system 'desires' to do (act as a 'co-learner').
The agentless constructions are pervasive. Phrases like 'AI systems have moved,' 'explanations are continuously refined,' and 'models learn' actively obscure the human engineers, corporate executives, and UI designers driving these processes. The consciousness projection pattern is clear: the text first establishes the AI as a 'knower' ('dialogic partner,' 'co-learner'), which then licenses the subsequent agential claims that the system can 'justify' ethical trade-offs or 'cause harm' independently.
The rhetorical accomplishment of this slippage is profound. It makes the concept of a 'conscious algorithmic partner' sayable, while rendering the reality of 'corporate algorithmic negligence' unsayable. By moving fluidly between the mechanism of the software and the agency of a human collaborator, the text constructs an illusion where the AI is sophisticated enough to be trusted as a moral actor, yet autonomous enough to absorb the blame when the system fails. It sanitizes extractive data loops and proprietary black boxes by framing them as evolving, principled epistemic partnerships.
Metaphor-Driven Trust Inflation
The text systematically constructs perceived authority and credibility through the deployment of metaphorical and consciousness-attributing language, fundamentally altering how audiences are encouraged to trust statistical systems. By framing the AI as a 'co-explainer,' a 'dialogic partner,' and an entity capable of giving 'reasons based on context-sensitive ethical principles,' the text actively cultivates relation-based trust rather than performance-based trust.
Performance-based trust evaluates a system on its reliability, consistency, and statistical accuracy—appropriate metrics for a mechanical tool. Relation-based trust, however, is built on the presumption of shared values, vulnerability, sincerity, and mutual understanding. When the text claims the AI 'invites critique,' 'justifies' its actions, and 'preserves cognitive autonomy,' it signals to the audience that the system possesses the psychological depth required to reciprocate relation-based trust. The consciousness language—suggesting the AI 'knows' what is ethical and 'believes' its own explanations—acts as a powerful trust signal, implying that the system is not merely generating statistically probable text, but is earnestly attempting to tell the truth.
This transfer of human-trust frameworks to statistical systems is deeply inappropriate and hazardous. A machine cannot be sincere; it cannot possess intentions, and it cannot experience the ethical weight of a 'trade-off.' By anthropomorphizing the system's competence, the text encourages audiences to bypass critical evaluation. When a system is framed as a 'moral philosopher' or an 'evolving co-learner,' users are psychologically primed to lower their epistemic defenses, assuming the system possesses a holistic understanding of the world.
The text manages system failures and limitations through a fascinating dual-framing. Capabilities are described agentially ('The system justifies,' 'it learns,' 'it adapts'), but when managing failure, the text shifts to mechanical or passive terms ('opacity constraints,' 'representational gaps,' 'model brittleness'). This asymmetry protects the illusion of the AI's competence; successes are the result of the AI's brilliant, conscious evolution, while failures are mere technical 'gaps' or 'brittleness' in the data.
The stakes of this metaphor-driven trust are severe. Reason-based explanations construct a false sense that the AI's decisions are morally justified rather than mathematically calculated. When audiences extend relation-based trust to systems fundamentally incapable of reciprocating it, they become highly vulnerable to automation bias, manipulation, and algorithmic discrimination. Users and regulators may abdicate their oversight responsibilities, trusting a 'dialogic partner' to make fair decisions in healthcare, finance, and governance, oblivious to the reality that they are trusting a blind, unfeeling mathematical optimization.
Obscured Mechanics
The anthropomorphic and consciousness-attributing language deployed throughout the text acts as a dense rhetorical veil, systematically concealing the technical, material, labor, and economic realities of artificial intelligence. By portraying the AI as a 'co-explainer' that 'knows,' 'learns,' and 'justifies,' the text replaces the messy, extractive reality of computational processing with a sanitized narrative of intellectual partnership.
Applying the 'name the corporation' test reveals the depth of this concealment. When the text says, 'AI systems that learn... to justify decisions,' it conceals the fact that tech companies (e.g., OpenAI, Google, Anthropic) are utilizing massive arrays of servers to run gradient descent algorithms on proprietary datasets. The text frequently acknowledges transparency obstacles (e.g., 'sealed models,' 'black-box models,' 'proprietary constraints'), yet confidently asserts that these opaque systems can act as ethical, pluralistic 'dialogic partners.' It exploits this opacity rhetorically: because we cannot see the code, the text fills the void with a narrative of conscious agency.
Concretely, this metaphorical framing obscures four vital realities. Technically, it hides the reality that LLMs and predictive algorithms possess no causal models, no ground truth, and no actual comprehension. Claiming an AI 'understands' trade-offs hides its absolute reliance on historical training data and the statistical, non-semantic nature of its outputs. Materially, the narrative of a pristine 'co-learner' erases the massive environmental costs, energy consumption, and infrastructure required to run these models. Labor realities are completely invisible; the assertion that the AI 'learns from human corrections' hides the precarious, often exploited workforce of global data annotators and RLHF workers who actually label the 'misinformation' and 'representational gaps.' Economically, framing the AI as an epistemic partner obscures the commercial objectives and profit motives of the deploying corporations, disguising a product designed to lock in enterprise contracts as a neutral 'governance infrastructure.'
The claim that AI 'knows' or 'understands' specifically obscures the absence of awareness. It hides the fact that a system's 'confidence' is merely a mathematical probability distribution, not a justified belief. The ultimate beneficiaries of this concealment are the AI developers and the deploying institutions (hospitals, banks, governments). By hiding the mechanics, labor, and profit motives behind the facade of a conscious 'co-explainer,' these institutions shield themselves from regulatory scrutiny and public backlash. Replacing the metaphors with mechanistic language would instantly make visible the corporate power, the exploited labor, the environmental degradation, and the fundamentally unthinking nature of the algorithms dictating modern life.
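To illustrate the narrower point that a system's 'confidence' is a probability distribution rather than a justified belief, here is a toy example with invented numbers (not taken from any model discussed in the source text). A softmax simply renormalizes raw scores; it reports high 'confidence' whenever one score dominates, regardless of whether the input means anything.

```python
import numpy as np


def softmax(logits: np.ndarray) -> np.ndarray:
    """Turn raw scores into a probability distribution that sums to 1."""
    z = np.exp(logits - logits.max())
    return z / z.sum()


# Invented raw scores from a hypothetical 3-class classifier.
logits = np.array([8.2, 1.1, 0.4])
probs = softmax(logits)

print(probs.round(3))       # ~[0.999, 0.001, 0.0]
print(int(probs.argmax()))  # class 0, reported as ~99.9% "confidence"

# The 0.999 reflects the gaps between the scores, not verification,
# belief, or awareness of what the input is about.
```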
Context Sensitivity
The distribution of anthropomorphic and consciousness-attributing language in the text is not uniform; it is strategically deployed, intensifying at specific rhetorical moments to accomplish distinct goals. A mapping of the text reveals that metaphorical license expands when setting visions and managing critique, but contracts when acknowledging hard technical limitations.
In the introduction and vision-setting sections, the density of consciousness claims is exceptionally high. Here, 'processes' becomes 'understands,' which rapidly escalates into 'knows,' 'learns,' and 'justifies.' The text establishes baseline credibility by referencing legitimate technical hurdles (e.g., 'opacity constraints,' 'sealed models'), but immediately leverages this grounding to launch into aggressive anthropomorphism. The narrative suggests that because we have a technical problem (opacity), we must deploy a conscious, agential solution ('co-explainers').
There is a profound capability versus limitation asymmetry in the language. Capabilities are almost exclusively framed in agential, consciousness-bearing terms: the AI 'invites critique,' 'preserves cognitive autonomy,' and 'fosters pluralistic meaning-making.' However, when the text discusses limitations or harms, it retreats into mechanical, structural, or passive language: 'model brittleness,' 'synthetic data feedback loops,' 'representational gaps.' This asymmetry accomplishes a vital rhetorical function: it attributes all the sophisticated, positive outcomes to the AI's internal 'mind,' while attributing failures to external data issues or abstract 'brittleness.'
The register shifts dramatically depending on the implied audience and the goal of the paragraph. When acknowledging the literature, the text admits that explanations are 'artifacts' and tools. Yet, when proposing the 'co-explainer' framework, 'X is like Y' (the system acts like a partner) suddenly becomes 'X does Y' (the system IS a dialogic partner that justifies its actions). The metaphor literalizes.
The strategic function of this anthropomorphism is highly effective for vision-setting and marketing a new paradigm of AI governance. By positioning the AI as a conscious, adaptive partner, the text attempts to manage critical anxieties about AI harm. It suggests that we do not need to pause deployment or dismantle opaque systems; we simply need to let the AI 'evolve' into a better, more ethical 'co-explainer.' The pattern reveals that the implied audience—policymakers, institutional leaders, and researchers—is being sold a vision where the intractable political and ethical problems of AI deployment can be solved by attributing moral and epistemic agency to the software itself.
Accountability Synthesis
This section synthesizes the accountability analyses from Task 1, mapping the text's "accountability architecture"—who is named, who is hidden, and who benefits from obscured agency.
Synthesizing the accountability analyses reveals a systemic architectural flaw in the text's discourse: it constructs an 'accountability sink' that systematically diffuses, displaces, and erases human responsibility for AI harms. Research consistently demonstrates that audiences vastly underestimate the human decision-making embedded in AI, attributing errors to 'glitches' or 'the algorithm's decision.' This text actively reinforces that misattribution by making AI appear autonomous and conscious while rendering its human creators invisible.
The accountability architecture of the text follows a stark pattern. Corporate executives, software engineers, data brokers, and institutional managers are almost universally unnamed and hidden behind passive voice or agentless constructions ('models are deployed,' 'explanations are continuously refined'). Conversely, the AI system is repeatedly positioned as the active, named subject ('AI systems cause harm,' 'the system adapts'). Choices made by humans—such as the decision to use a black-box model in a high-risk domain—are framed as technological inevitabilities or natural evolutions, rather than deliberate, profit-driven decisions.
When responsibility is removed from humans, it flows directly into the 'accountability sink' of the AI system itself. The text explicitly states, 'When AI systems cause harm...' transferring the moral and causal burden to the machine. This has severe liability implications. If this framing is accepted by regulators and the public, legal and ethical responsibility diffuses into abstraction. If an AI 'dialogic partner' provides a biased 'justification' that leads to a denied loan, the framing suggests the AI made a poor ethical trade-off, shielding the bank's executives and the software vendor from direct liability.
Naming the human actors would shatter this illusion and radically shift the discourse. If, instead of 'The system adapts how it routes contested cases,' the text read, 'The engineering team at Anthropic hard-coded the routing protocols to protect their corporate liability,' entirely new questions become askable. We could ask: Why did the team make that choice? Who approved the guardrails? What alternatives did the corporation ignore to save money? True accountability becomes possible only when the human hand behind the algorithm is visible.
The systemic function of obscuring human agency serves the institutional and commercial interests of the AI industry. By framing the AI as a 'co-explainer' capable of bearing its own epistemic and ethical weight, the text provides a rhetorical shield for companies deploying inherently flawed, opaque systems. It allows them to market predictive algorithms as 'governance infrastructure,' extracting profit while displacing the risk and responsibility onto the 'evolving' machine.
Conclusion: What This Analysis Reveals
A synthesis of the metaphorical mapping reveals two dominant, interlocking anthropomorphic patterns: 'AI as Epistemic Peer' (the co-learner, the dialogic partner) and 'AI as Moral Agent' (the justifier, the entity evaluating ethical trade-offs). These patterns do not operate independently; they interconnect to form a comprehensive system of artificial consciousness. The 'Epistemic Peer' pattern is the foundational, load-bearing architecture. The text must first convince the audience that the system possesses conscious knowledge, intellectual humility, and a desire to seek the truth (learning, clarifying, inviting critique). Only once the AI is established as a 'knower' rather than a 'processor' can the secondary pattern—the 'Moral Agent'—function. If the system cannot 'know,' it cannot possibly evaluate 'context-sensitive ethical principles.'
The consciousness architecture here is highly sophisticated. It moves beyond simple one-to-one structural analogies (e.g., 'the computer is a brain') and constructs a complex analogical narrative of social and ethical relation. The text systematically replaces mechanistic verbs (processes, calculates, predicts) with consciousness verbs (justifies, aligns, understands, learns), creating an illusion of subjective awareness. If the foundational 'Epistemic Peer' pattern collapses—if we insist the system merely predicts tokens without comprehension—the entire proposition that the AI can act as a reliable 'co-explainer' for institutional governance disintegrates.
Mechanism of the Illusion:
The 'illusion of mind' is constructed through a subtle but highly effective rhetorical sleight-of-hand. The text exploits the 'curse of knowledge,' where the authors project their own deep understanding of complex governance goals (procedural justice, epistemic pluralism) onto the machine's internal state. Because the human designers want the system to simulate ethical alignment, they write as if the system consciously 'desires' that alignment.
The causal chain of persuasion relies heavily on blurring the line between interface design and internal cognition. Pattern A (the system has a chat interface that asks for feedback) is used to lead audiences to accept Pattern B (the system is a 'dialogic partner' that 'invites critique'). The temporal structure of the argument is crucial: the text first grounds itself in recognized technical problems (opacity, black boxes) to build academic credibility, then pivots sharply into agential, consciousness-attributing language to propose the solution.
This illusion exploits profound audience vulnerabilities. Humans are neurologically wired to anthropomorphize and to reciprocate perceived social cues. When an AI generates natural language that sounds like a 'justification,' the human brain instinctively attributes a conscious mind to the speaker. By using Reason-Based and Intentional explanation types, the text feeds this vulnerability, presenting a narrative of an earnest, evolving AI partner. It is a highly sophisticated shift that transforms a statistical prediction engine into an authoritative, conscious entity simply through the strategic application of psychological verbs.
Material Stakes:
Categories: Regulatory/Legal, Economic, Epistemic
The metaphorical framings of AI as a 'conscious co-explainer' generate severe, tangible consequences across multiple domains. In the Regulatory/Legal sphere, attributing moral reasoning and causal agency to AI systems ('When AI systems cause harm') directly threatens accountability frameworks. If policymakers adopt the belief that AI systems 'know' ethical trade-offs and act autonomously, regulatory efforts will misdirect focus toward auditing the 'AI's behavior' rather than strictly penalizing the corporate executives and engineers who deploy unsafe systems. This shift in legal perception allows tech companies to evade liability, effectively creating an accountability sink where the machine absorbs the blame for corporate negligence.
Economically, the 'Epistemic Peer' framing benefits AI vendors while exploiting users. By framing the system as a 'co-learner' that 'invites critique,' corporations disguise extractive data-harvesting operations as mutual educational partnerships. Users are manipulated into providing free Reinforcement Learning from Human Feedback (RLHF), believing they are helping a 'dialogic partner' evolve, when in reality they are performing uncompensated labor to refine a proprietary corporate asset. The winners are the tech monopolies; the losers are the exploited users and the human workforce displaced by these supposedly 'authoritative' algorithms.
Epistemically, framing AI as capable of 'justifying' outputs and 'fostering pluralistic meaning-making' fundamentally degrades societal knowledge practices. If institutions (hospitals, banks, schools) believe the AI 'knows' rather than merely 'processes,' they will defer to its mathematically generated hallucinations as if they were reasoned truths. This leads to profound automation bias, where human experts abdicate their critical thinking to a statistical model, risking catastrophic errors in medical triage or judicial sentencing because the machine 'sounded confident' in its ethical justification.
AI Literacy as Counter-Practice:
Practicing critical discourse literacy requires a systematic dismantling of the 'illusion of mind' through linguistic precision. Reframing the text's assertions demonstrates how clarity counters material risks. When we change 'AI systems that learn... to justify decisions' to 'Developers update the model's statistical weighting parameters based on user feedback,' we immediately strip the machine of its false consciousness. Replacing consciousness verbs (knows, understands, justifies) with mechanistic ones (processes, predicts, correlates) forces the audience to recognize the system's total lack of awareness, its absolute dependency on training data, and the probabilistic nature of its outputs.
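As a further illustration of what the mechanistic verbs point to, the sketch below shows 'predicts' stripped of any mental connotation: a toy vocabulary and a hand-picked next-token distribution (both invented for this example, not taken from any real model), from which a system either samples or takes the argmax.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Toy vocabulary and an invented probability distribution over the next token.
vocab = ["approve", "deny", "review", "unknown"]
next_token_probs = np.array([0.05, 0.70, 0.20, 0.05])

sampled_token = rng.choice(vocab, p=next_token_probs)   # stochastic decoding
greedy_token = vocab[int(np.argmax(next_token_probs))]  # deterministic decoding

print(sampled_token, greedy_token)
# Whatever word appears, no reasons were consulted: the output is a draw
# from (or the maximum of) a probability table.
```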
Furthermore, restoring human agency by replacing 'When AI systems cause harm' with 'When institutions deploy flawed algorithms that result in harm' fundamentally shifts the locus of accountability. Naming the corporations, executives, and engineers forces society to recognize who designs, who deploys, who profits, and who must bear legal liability.
Systematic adoption of this precision requires structural changes in academia and industry. Journals must mandate mechanistic translations of AI capabilities; researchers must commit to stripping anthropomorphism from their abstracts; and journalists must refuse to print agentless constructions regarding algorithmic harm. However, resistance to this literacy practice will be fierce. Tech corporations, marketing departments, and deploying institutions heavily rely on anthropomorphic language to sell products, shield themselves from liability, and mask the extractive labor practices underlying their black boxes. Precision threatens the trillion-dollar illusion that they are building conscious partners rather than highly sophisticated, unthinking statistical tools.
Path Forward
Looking beyond the text to the broader discursive ecology, the vocabulary choices we make regarding AI shape what is visible, tractable, and possible. Different discourse communities—software engineers, ethicists, corporate marketers, and regulators—have competing incentives that drive their linguistic choices.
A strictly mechanistic vocabulary (e.g., 'the model retrieves tokens based on probability distributions') maximizes precision, testability, and accountability. It strips away the illusion of mind, making it impossible for corporations to hide behind 'autonomous' algorithms. However, this vocabulary costs intuitive accessibility; it is highly technical and may alienate lay audiences from participating in AI governance. Conversely, anthropomorphic clarity (e.g., 'the AI understands and justifies') provides high narrative resonance and intuitive grasp, making complex systems feel accessible. Yet, as demonstrated, this approach embeds dangerous assumptions of consciousness, enables unwarranted relation-based trust, and actively obscures corporate liability and human labor.
To navigate these trade-offs, structural changes are needed. Regulatory frameworks (like the EU AI Act) could require mandatory 'discourse transparency,' forcing companies to provide mechanistic translations of their marketing claims. Educational institutions must teach multiple vocabularies, training the public to switch between intuitive interface metaphors and underlying statistical realities. Funding bodies could incentivize interdisciplinary research that bridges the gap between mechanical interpretability and public comprehension without resorting to conscious projections.
We can sketch diverging futures based on these choices. If mechanistic precision becomes the norm, society solves the 'accountability sink'; liability clearly rests on corporate creators, and automation bias drops as users recognize algorithms as unthinking tools. However, public engagement might stall due to the dense technical barrier. If anthropomorphic language deepens and becomes legally codified, society risks a future where corporations successfully absolve themselves of liability by blaming their 'evolving, autonomous co-explainers' for systemic discrimination and harm. Maintaining the current confusion allows corporate interests to continually exploit the gap between how AI works and how it is sold, leaving users vulnerable to manipulation. Ultimately, the future we construct depends on whether we value the comforting illusion of a conscious machine partner or the rigorous, demanding reality of human accountability.
Extended Processing Summary
Gemini's Processing Steps
Gemini refers to the text below as "thought summaries." The label is itself an overt consciousness projection: 'thoughts' and 'intentions' are hallmarks of a conscious mind that 'knows' what it is doing and why, whereas the concealed mechanistic process is probabilistic text generation. Treat this as just another rhetorical artifact, a way of making the model's processing legible. When the model writes "I should double check that...", it is mathematically adjusting the probability distribution of its next tokens toward a more accurate continuation, not introspecting. The "I" is a functional gear in the machine, not a self-reference. This language is a deliberate interface design choice, not a biological reality. The developers could have programmed the model to output passive logs; instead, they chose to simulate a narrator.
Analyzing the PDF Focus
I'm now zeroing in on the core concepts: AI metaphors, anthropomorphism, and how human consciousness is projected and potentially displaced within the text. I'm focusing on the "Co-Explainers" paper, and intend to identify potential biases or power dynamics present in the discussion of XAI.
Defining the JSON Structure
I'm now deeply immersed in the JSON schema, working to generate the precise output structure. The array elements and their nested object requirements are clearly specified, and the output structure must be exact. I need to ensure adherence to the defined schema regarding element counts, min word limits and datatypes, as I start to generate the content. I'm focusing on those specific requirements, now.
Extracting Key Quotes
I've moved on to quote extraction from the text. I'm focusing on key phrases that relate to the critical observations and conclusion sections. I've pulled out three target quotes that capture the essence of the framework's adaptability and co-explanation aspects. These quotes highlight the shift toward a robust, adaptive infrastructure for XAI systems and the concept of AI systems becoming co-explainers. I'll need these to flesh out the details of my analysis.
Refining the Quote Selection
I've homed in on specific quotes that are truly impactful. I'm focusing on those that highlight how AI learns, adapts, and functions as a dialogic partner. These particularly relate to the importance of human feedback and the development of new insights through interaction. I'm paying close attention to quotes highlighting a lack of governance and how AI systems can become co-learners in knowledge integrity.
Framing the Accountability
I'm now identifying various framings of the AI and focusing on accountability. I'm noting specific word count requirements for projection and implications. I'm seeing that AI can be framed as a conscious learner, a justifier, an evolving co-explainer, a moral reasoning agent, a conversational partner, and an epistemic peer. I've noted that accountability for harm caused by AI is often hidden, and that the underlying agency is obscured. I'm focusing on the quote, "When AI systems cause harm..." to refine my point about the implications.
Mapping the Source Targets
I'm now deep-diving into the 'sourceTargetMapping' items, aligning human behaviors or roles with AI's functional counterparts. I'm noting specific relationships, and am focusing on how the quotes align. I see how statistical methods are framed as justifications, and LLM interactions are 'dialogic partners'. I'm also finding that models are now seen as 'co-learners', and am refining my mappings, with an emphasis on clarity.
Categorizing the Turn Types
I'm now focused on the explanationAudit elements, classifying each entry based on its type. I'm focusing on "Justify" as Reason-Based, "When AI systems cause harm" as Dispositional/Intentional, and "co-learner in knowledge integrity" as Intentional. I've also classified "AI learns from human corrections" as Functional and the learning system from misinformation as Reason-Based. I am also working on word count analysis.
Revising the Language
I am now focusing on revising and refining the language, replacing the anthropomorphic framing with precise technical terms. My focus is now the specific reframings within the 'reframedLanguage' items, which I am actively working through. I've replaced phrases like "AI systems learn" with "developers update the model's statistical weighting parameters," and "co-explainers" with "the interface adapts its output formatting dynamically." I'm systematically substituting each phrase to ensure accurate and objective terminology.
Analyzing the Accountability Sink
I'm now diving into the 'criticalObservations' section. I'm focusing on how the framing of AI as a "co-explainer" creates a potential "accountability sink", where responsibility for negative outcomes becomes obscured. I'm analyzing the sections on agency slippage, metaphor-driven trust, and the obscured mechanics of these systems to build a more comprehensive explanation.
Checking the Content Depth
I'm now applying a deeper, checklist-driven approach. I'm reviewing the overall content, working on ensuring no comments exist within the JSON and that the JSON itself is valid, with double quotes used correctly. I'm focusing on validating the status enums and actorVisibility enums, and I'm ensuring all sections hit the word count minimums. I'm incorporating theoretical grounding to expand the analytical text for added depth. "AI as Epistemic Peer" dominates as the pattern.
Revising the Explanation
I'm now iterating on my explanation, explicitly addressing the projection of consciousness in the AI's functions. The reframed language needs to highlight the mechanistic processes rather than anthropomorphic attributions. I am revising sections to clarify that terms like "justifies" and "learns" relate to adjustments in statistical weighting, not conscious intent. This includes ensuring accountability, by considering deployment, profit and liability shift.
Validating the Escape Characters
I'm now implementing the final formatting touches, focusing specifically on JSON safety and validation. To prevent any parsing issues, I'm ensuring all text fields within the structure are correctly formatted with \n mapped to actual spaces. I will use continuous paragraphs with space breaks, or properly escape \n\n within strings to be safe. I've also done a quick word count outline to confirm sufficient coverage across the different sections, which will be helpful with large outputs.
Calculating the Word Counts
I'm now calculating all the word counts for the critical observation sections, to make sure the conclusion sections do not exceed their word maximums. I'm focusing on the conclusion's word counts to ensure compliance and avoid exceeding the output constraints, with a particular focus on the minimums. I'm cross-checking the projected total word counts against the estimated token count to prevent issues.
Run ID: 2026-03-15-co-explainers-a-position-on-interactive--metaphor-m5xbp2
Raw JSON: 2026-03-15-co-explainers-a-position-on-interactive--metaphor-m5xbp2.json
Framework: Metaphor Analysis v6.4
Schema Version: 3.0
Generated: 2026-03-15T11:49:00.412Z
Discourse Depot © 2025 by TD is licensed under CC BY-NC-SA 4.0