LLM Inference Journey
The Alchemy Switch
The same toggle mechanism shifts between:
- Narrative Mode describes inference as a conversation—the AI "reads" your question, "thinks" about it, "chooses" words carefully, and "remembers" context.
- Mechanistic Mode reveals the actual process: string-to-integer conversion, matrix multiplication, cached tensor retrieval, probability sampling.
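To ground the mechanistic framing, here is a minimal NumPy sketch of three of those steps (string-to-integer conversion, a matrix multiplication, probability sampling) using a toy vocabulary and random weights. Every name and size below is invented for illustration; this is not the tool's code, and it omits attention, the KV cache, and everything else a real transformer does.

```python
import numpy as np

# Toy "mechanistic mode": every step is plain data manipulation.
# Vocabulary, weights, and sizes are made up; a real model has tens of
# thousands of tokens and billions of parameters.

vocab = {"<unk>": 0, "hello": 1, "world": 2, "how": 3, "are": 4, "you": 5}
d_model = 8  # embedding width (toy size)

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(len(vocab), d_model))   # lookup table
output_proj = rng.normal(size=(d_model, len(vocab)))  # projects back to vocab scores

def tokenize(text: str) -> list[int]:
    """String-to-integer conversion: words become row indices."""
    return [vocab.get(w, vocab["<unk>"]) for w in text.lower().split()]

def forward(token_ids: list[int]) -> np.ndarray:
    """One forward pass: embedding lookup, a matrix multiply, softmax."""
    hidden = embeddings[token_ids]          # (seq_len, d_model)
    logits = hidden[-1] @ output_proj       # scores for the next token
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()                  # probability distribution over the vocab

def sample(probs: np.ndarray) -> int:
    """Probability sampling: draw the next token id from the distribution."""
    return int(rng.choice(len(probs), p=probs))

ids = tokenize("hello how are you")
print(ids, "->", sample(forward(ids)))
```

The point is only that each step is ordinary data manipulation, which is what Mechanistic Mode foregrounds.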
The Slippage Audit
Anthropomorphic terms are flagged throughout. When the narrative says the model is "thinking hard," the tooltip clarifies: "No thinking occurs. The GPU is performing dense matrix-matrix multiplications at maximum throughput—pure linear algebra." When it mentions "conversation," learners see: "'Conversation' implies mutual understanding and exchange. The model processes a sequence of tokens—your input concatenated with its previous outputs. There's no dialogue, just an ever-growing input array."
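The "ever-growing input array" can be shown in a few lines. This is a schematic sketch, not the tool's implementation: the hypothetical next_token function stands in for the real forward pass and sampling step, and its random output is a placeholder. What matters is that the prompt and every generated token accumulate in a single list that is re-fed on each step, and that list is the entire "conversation."

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB_SIZE = 6  # toy size, purely illustrative

def next_token(context: list[int]) -> int:
    """Placeholder for the real forward pass + sampling step."""
    return int(rng.integers(VOCAB_SIZE))

def generate(prompt_ids: list[int], n_tokens: int = 5) -> list[int]:
    context = list(prompt_ids)                # the whole "conversation" is this one array
    for _ in range(n_tokens):
        context.append(next_token(context))   # each new token is appended and re-fed
    return context

print(generate([1, 3, 4, 5]))  # prompt ids grow into prompt + model output, nothing else
```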
Critical Literacy Function
This tool targets the everyday language we use when interacting with AI assistants. Every time we say "I asked ChatGPT and it told me..." we reinforce a conversational frame that implies understanding and intention. Making the slippage visible helps learners develop the capacity to use these systems without unconsciously attributing to them minds they don't have.
Attribution
- LLM Inference Journey draws on Arpit Bhayani's "How LLM Inference Works"
- Inspired by one of the key obstacles (AI as Mystery) in Communicating About the Social Implications of AI: A FrameWorks Strategic Brief
- Created by Troy Davis | W&M Libraries