LLM Inference Journey
The Alchemy Switch
A single toggle shifts between two framings of the same process:
- Narrative Mode describes inference as a conversation—the AI "reads" your question, "thinks" about it, "chooses" words carefully, and "remembers" context.
- Mechanistic Mode reveals the actual process: string-to-integer conversion, matrix multiplication, cached tensor retrieval, probability sampling.
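The mechanistic pipeline above can be sketched in a few lines. Everything here is invented for illustration (a four-word vocabulary, random weights, no attention or KV cache); a real model has tens of thousands of tokens and billions of parameters, but the steps are the same: string-to-integer conversion, matrix multiplication, probability sampling.

```python
import math
import random

# Toy vocabulary: string -> integer conversion (tokenization).
vocab = {"the": 0, "cat": 1, "sat": 2, "mat": 3}
inv_vocab = {i: s for s, i in vocab.items()}

def tokenize(text):
    return [vocab[w] for w in text.split()]

random.seed(0)
V, D = len(vocab), 4  # vocabulary size, hidden dimension (both invented)
embed = [[random.gauss(0, 1) for _ in range(D)] for _ in range(V)]
out_proj = [[random.gauss(0, 1) for _ in range(V)] for _ in range(D)]

def matmul_vec(vec, mat):
    # One dense matrix multiplication: (1 x D) @ (D x V) -> (1 x V).
    return [sum(vec[d] * mat[d][v] for d in range(len(vec)))
            for v in range(len(mat[0]))]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

ids = tokenize("the cat sat")          # [0, 1, 2]
hidden = embed[ids[-1]]                # stand-in for the whole transformer stack
logits = matmul_vec(hidden, out_proj)  # pure linear algebra, no "thinking"
probs = softmax(logits)                # probability distribution over the vocabulary
next_id = max(range(V), key=lambda i: probs[i])  # greedy pick instead of sampling
print(inv_vocab[next_id])
```

The greedy `max` stands in for probability sampling to keep the sketch deterministic; real decoders typically sample from `probs` (possibly after temperature scaling or top-k filtering).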
The Slippage Audit
Anthropomorphic terms are flagged throughout. When the narrative says the model is "thinking hard," the tooltip clarifies: "No thinking occurs. The GPU is performing dense matrix-matrix multiplications at maximum throughput—pure linear algebra." When it mentions "conversation," learners see: "'Conversation' implies mutual understanding and exchange. The model processes a sequence of tokens—your input concatenated with its previous outputs. There's no dialogue, just an ever-growing input array."
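The "ever-growing input array" in that tooltip can be made concrete. The model function here is a hypothetical stand-in (a real model runs a forward pass over the full sequence), but the loop structure is faithful: each generated token is concatenated onto the input before the next pass.

```python
def fake_model(token_ids):
    # Hypothetical stand-in for a forward pass: deterministically
    # derives a "next token" id from the sequence so far.
    return (sum(token_ids) * 31 + len(token_ids)) % 1000

context = [101, 57, 892]           # the user's tokenized prompt (invented ids)
for _ in range(5):                 # generate five "reply" tokens
    next_token = fake_model(context)
    context.append(next_token)     # output is concatenated onto the input

# No dialogue occurred: just five passes over one growing list.
print(len(context))                # 3 prompt tokens + 5 generated = 8
```

This is also why long chats slow down and eventually hit a context limit: every turn of the "conversation" is replayed as part of the next input.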
Critical Literacy Function
This tool targets the everyday language we use when interacting with AI assistants. Every time we say "I asked ChatGPT and it told me..." we reinforce a conversational frame that implies understanding and intention. By making the slippage visible, learners develop the capacity to use these systems without unconsciously attributing to them minds they don't have.
Attribution
- LLM Inference Journey draws on Arpit Bhayani's "How LLM Inference Works"
- Inspired by one of the key obstacles (AI as Mystery) in Communicating About the Social Implications of AI: A FrameWorks Strategic Brief
- Created by Troy Davis | W&M Libraries