LLM Inference Journey
The Alchemy Switch
A single toggle shifts between two framings of the same process:
- Narrative Mode describes inference as a conversation—the AI "reads" your question, "thinks" about it, "chooses" words carefully, and "remembers" context.
- Mechanistic Mode reveals the actual process: string-to-integer conversion, matrix multiplication, cached tensor retrieval, probability sampling.
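The mechanistic pipeline above can be sketched in a few lines. Everything here is invented for illustration (a four-word vocabulary, random weights, no attention or KV cache); a real model has tens of thousands of tokens and billions of parameters, but the steps are the same: string-to-integer conversion, matrix multiplication, probability sampling.

```python
import math
import random

# Toy vocabulary: string -> integer conversion (tokenization).
vocab = {"the": 0, "cat": 1, "sat": 2, "mat": 3}
inv_vocab = {i: s for s, i in vocab.items()}

def tokenize(text):
    return [vocab[w] for w in text.split()]

random.seed(0)
V, D = len(vocab), 4  # vocabulary size, hidden dimension (both invented)
embed = [[random.gauss(0, 1) for _ in range(D)] for _ in range(V)]
out_proj = [[random.gauss(0, 1) for _ in range(V)] for _ in range(D)]

def matmul_vec(vec, mat):
    # One dense matrix multiplication: (1 x D) @ (D x V) -> (1 x V).
    return [sum(vec[d] * mat[d][v] for d in range(len(vec)))
            for v in range(len(mat[0]))]

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

ids = tokenize("the cat sat")          # [0, 1, 2]
hidden = embed[ids[-1]]                # stand-in for the whole transformer stack
logits = matmul_vec(hidden, out_proj)  # pure linear algebra, no "thinking"
probs = softmax(logits)                # probability distribution over the vocabulary
next_id = max(range(V), key=lambda i: probs[i])  # greedy pick instead of sampling
print(inv_vocab[next_id])
```

The greedy `max` stands in for probability sampling to keep the sketch deterministic; real decoders typically sample from `probs` (possibly after temperature scaling or top-k filtering).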
The Slippage Audit
Anthropomorphic terms are flagged throughout. When the narrative says the model is "thinking hard," the tooltip clarifies: "No thinking occurs. The GPU is performing dense matrix-matrix multiplications at maximum throughput—pure linear algebra." When it mentions "conversation," learners see: "'Conversation' implies mutual understanding and exchange. The model processes a sequence of tokens—your input concatenated with its previous outputs. There's no dialogue, just an ever-growing input array."
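The "ever-growing input array" in that tooltip can be made concrete. The model function here is a hypothetical stand-in (a real model runs a forward pass over the full sequence), but the loop structure is faithful: each generated token is concatenated onto the input before the next pass.

```python
def fake_model(token_ids):
    # Hypothetical stand-in for a forward pass: deterministically
    # derives a "next token" id from the sequence so far.
    return (sum(token_ids) * 31 + len(token_ids)) % 1000

context = [101, 57, 892]           # the user's tokenized prompt (invented ids)
for _ in range(5):                 # generate five "reply" tokens
    next_token = fake_model(context)
    context.append(next_token)     # output is concatenated onto the input

# No dialogue occurred: just five passes over one growing list.
print(len(context))                # 3 prompt tokens + 5 generated = 8
```

This is also why long chats slow down and eventually hit a context limit: every turn of the "conversation" is replayed as part of the next input.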
Critical Literacy Function
This tool targets the everyday language we use when interacting with AI assistants. Every time we say "I asked ChatGPT and it told me..." we reinforce a conversational frame that implies understanding and intention. By making the slippage visible, learners develop the capacity to use these systems without unconsciously attributing to them minds they don't have.
Attribution
- LLM Inference Journey draws on Arpit Bhayani's "How LLM Inference Works"
- Inspired by one of the key obstacles (AI as Mystery) in Communicating About the Social Implications of AI: A FrameWorks Strategic Brief
- Created by Troy Davis | W&M Libraries