
About Discourse Depot

What this is

A public workshop for experiments in using large language models as potential research instruments, specifically for analyzing how language frames and constructs meaning in popular, technical, academic, and political discourse. I wrote a blog post about it.

Why it exists

This started with a simple concern: the relentless anthropomorphism in AI discourse. Words like "thinks," "understands," and "learns" started, for me, to obscure more than they hoped to explain. As I dug into metaphor theory, framing analysis, and discourse studies, a question emerged: Could an LLM be configured to systematically apply these same analytical frameworks to texts?

The answer is this site, so decide for yourself.

Rooted In Librarian-ish Preoccupation with Literacies

My approach to and interest in AI literacy has focused less on developing skills in “using AI” (whatever that means) and more on other questions. While it is interesting to say that generative AI is “just another technological disruption,” I’m not sure, at this moment, that’s quite right…there’s a bit more going on. Where it is different, for me, is this: when it comes to generative AI, especially right now when it seems to be blowing up the world, it is a good time to take a step back, take a deep breath, and reflect on this idea:

Before we can teach what AI is, we may want to do a refresher on what it means to explain anything.

Generative AI is different in that, by trying to do via our explanations what every good communicator does (make something abstract feel intuitive), we are actually changing (or narrating into existence) the very kind of thing we think we are explaining. That’s the difference.

While reading articles and blog posts on generative AI, I started to notice something interesting and common that kept triggering some analytical resistance: often, in mid-sentence, a rhetorical drift would happen between how-questions and why-questions, signaling, to me as a reader, a slippage between “grammars” of explanation. For example, a sentence would start in a mechanical register: “softmax converts logits to probabilities,” and end in a psychological/anthropomorphic one: “the model chooses the most likely word.” So the “aboutness” of AI that began in the mechanical register often slid into the anthropomorphic register, while altogether skipping over the human-system register (who designed or profits from this framing). I started to think: whatever we call AI Literacy, a significant literacy practice must be about how to detect these types of shifts.
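To make that slippage concrete, here is a minimal TypeScript sketch (with an invented vocabulary and invented logit values, not anything pulled from a real model) of what the mechanical register actually describes. The “choosing” is softmax followed by picking the largest number: arithmetic, with no deciding subject anywhere.

```typescript
// Illustrative sketch only: the "mechanical register" as arithmetic.
// The vocabulary and logit values below are invented for demonstration.

const vocabulary = ["the", "cat", "sat", "mat"];
const logits = [2.1, 0.3, -1.2, 0.8]; // raw scores a model might assign to next tokens

// Softmax converts logits to probabilities: exponentiate, then normalize.
function softmax(xs: number[]): number[] {
  const max = Math.max(...xs); // subtract the max for numerical stability
  const exps = xs.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const probs = softmax(logits); // ≈ [0.68, 0.11, 0.03, 0.18]

// "The model chooses the most likely word" = take the index of the largest number.
const nextToken = vocabulary[probs.indexOf(Math.max(...probs))];

console.log(nextToken); // "the": nothing in this code decides, wants, or understands
```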

Before AI Literacy: Explanation Literacy

How is generative AI different from other so-called disruptive technologies? Can we really say we’ve been here before? I’m not so sure. I can’t think of any disruptive technology in higher education or academic librarianship that has prompted me to start asking: can we make sense of this thing, really, if we haven’t grappled with what we mean by sense-making itself? I never heard these types of concerns about Wikipedia or Google. Why? Because we never talked about those “technologies” via metaphors that carried with them a theory of mind, or deployed explanations of them that carried implicit assignments of agency.

So, I’m starting upstream. I think AI literacy may just begin with considering the conditions necessary for explanations to be meaningful. AI literacy starts with explanation literacy. I’m putting aside whether or not it is essential for students or readers to become experts in understanding “backpropagation” or to develop advanced skills in manipulating chatbots. I’m starting with a focus on the interpretive skills of recognizing when discourse shifts from one explanatory framework to another, and how such transitions alter perceptions of agency, risk, and moral responsibility.

I think, more than ever, we might want to get a bit preoccupied with helping students develop foundational interpretive skills for an AI-saturated world where they can recognize:

  • What counts as an explanation in different discourses (scientific, political, narrative, moral).
  • What conditions make an explanation trustworthy or meaningful.
  • How metaphor and anthropomorphism function as referential shortcuts: not errors per se, but performative choices that need interrogation.

This is not a “no metaphors, no anthropomorphism” campaign but a “know what kind of world each metaphor builds” campaign.

LLM-generated explanations, and much of the language we use to describe them, perform the surface features of explanation without grounding in the world. They reproduce the rhetorical shape of understanding (causal syntax, explanatory tone, connective logic) but without the causal, experiential, or accountable ties that make explanation meaningful.

But here’s the crux. The apparent “breach” of the explanatory contract doesn’t occur in the model; rather it occurs in the discourse between us, the human interpreters, and the machine’s simulated speech acts. Recognizing that distinction is the first act of what I’m referring to as explanation literacy: learning to detect when language stops explaining and starts performing the appearance of explanation.

Common examples we read all the time now

  • "The model chooses a word" → Imports intentionality
  • "The model decides what to write" → Imports agency
  • "The model understands context" → Imports consciousness

Discourse Depot started there. At the arrow. What fables are we constructing in the “→”?

So instead of focusing on technical comprehension, I’m starting with AI literacy as a bundle of practices that focus on the recognition of framework drift and the ability to detect when an explanation stops being explanatory: to recognize when explanation is drifting from mechanism into mind, and to ask who benefits from that drift. What kind of worlds do these framings build? And for whom?

And I don’t think this is really just about finding better metaphors. It is a bit deeper. It’s about the conditions of explanation itself and the threshold where an explanation means something.

The Bundle of Practices

So my experiments in using an LLM to critique discourse about LLMs were forensic in nature, asking what kind of system instructions could, when unleashed on a text:

  • Articulate the narrative or story about the “thing” (how people talk about AI)
  • Explain the mechanism (what is actually happening)
  • Map the gains (why narrative is useful)
  • Identify the costs (what gets obscured)
  • Recognize rhetorical how/why slippage (where language breaks)
  • Ask critical questions (who benefits? what alternatives exist?)

The internal scaffolding baked into each prompt (sketched roughly after the list below) attempts to:

  • Audit specific texts for anthropomorphic and metaphorical frames
  • Analyze what explanation types are deployed
  • Recognize power dynamics in discourse
  • Propose alternative framings
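For the curious, here is a compressed, hypothetical sketch (in TypeScript, since the processing pipeline lives in Node.js) of how tasks like these get packed into a system instruction string. It is nothing like the actual 4,000-word prompts; the task names simply mirror the bullets above.

```typescript
// Compressed, hypothetical sketch of a system instruction. The real prompts
// run to thousands of words; the task names mirror the bullets in this section.

export const metaphorAuditInstructions = `
You are performing a metaphor and framing audit of the text supplied by the user.
Work through the following tasks and respond in the requested structure:

1. NARRATIVE: Articulate the story the text tells about "AI".
2. MECHANISM: Restate, in mechanical terms, what is actually happening.
3. GAINS: Explain what the narrative framing makes easier to grasp.
4. COSTS: Identify what the framing obscures (labor, design choices, profit).
5. SLIPPAGE: Quote sentences where the register shifts from mechanical to
   psychological ("converts" becomes "chooses", "computes" becomes "understands").
6. QUESTIONS: Pose critical questions: who benefits from this framing, and
   what alternative framings exist?
`.trim();
```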

Importantly, I'm not suggesting learners need to know how to talk about linear algebra to competently talk about generative AI, but I am suggesting that we pay some front-loaded attention to how it is talked about at this moment. Yes, narrativity facilitates a more intuitive interaction with AI systems, but that doesn't have to come at the cost of attributing intentionality or agency to them.

A Note on the Probabilistic Minefield

So we can say that LLMs are mechanical systems that contain randomness and that mechanical systems can also be probabilistic. The banana peel slip happens when we then say that probabilistic implies or is “like” intentionality. Probabilistic ≠ Intentional. Randomness can also be fully mechanical.

This whole project started with observing what happens when you write 4,000-word system instructions, strap the model into a JSON schema straightjacket, and then watch a probabilistic text generator perform its illusion of coherence.

What an LLM actually does

  • It generates text by predicting the next likely token based on patterns in its training data.
  • It does not read, understand, or reason about documents in a human sense.
  • It produces linguistic performances shaped by probability.

However, when we say "the model generated a response," that language is a type of framing that, while useful, might also suggest something we want to interrogate: conscious choice.

Why the output sometimes feels confident

  • Even though I am writing the system instructions, the model also operates under built-in instructions I do not see, which shape its language in response to patterns and prompts
  • Those built-in instructions come from the tool developer and are not visible to users of the model

Where variation comes from

  • Running the same prompt twice may yield different phrasings or emphasis.
  • This variation is a feature of probabilistic generation.
  • It cannot be fully removed, even when systems appear deterministic (a toy sketch follows).
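A toy sketch of where that variation enters (simplified on purpose; real decoding also involves temperature, top-p sampling, and other machinery): the same probabilities, run through an ordinary pseudo-random number generator, yield different "word choices" on different runs. Nothing in the code changes its mind.

```typescript
// Toy illustration of probabilistic variation: same distribution, different draws.
// The vocabulary and probabilities are invented for demonstration.

const options = ["analyze", "examine", "interrogate", "read"];
const probs = [0.4, 0.3, 0.2, 0.1]; // hypothetical next-token probabilities

// Sample one token from a categorical distribution.
function sampleToken(tokens: string[], p: number[]): string {
  let r = Math.random(); // mechanical randomness, no intention required
  for (let i = 0; i < tokens.length; i++) {
    r -= p[i];
    if (r <= 0) return tokens[i];
  }
  return tokens[tokens.length - 1];
}

// "Running the same prompt twice": two draws from the identical distribution.
console.log(sampleToken(options, probs)); // perhaps "analyze"
console.log(sampleToken(options, probs)); // perhaps "interrogate"
```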

So yes, using an LLM for discourse analysis is stepping into a probabilistic minefield. But that’s the interesting part. When the model produces structure that almost fits the schema, or weirdly fails to fit it, that friction is some kind of evidence. It shows where predictive text tries (and fails) to inhabit conceptual distinctions it cannot hang onto.

Again, full transparency. I’m creating elaborate prompts. Testing them out. Putting the results here. Thinking about how I might make the process into a syllabus for a course on AI literacy.

This site is just the archive (depot) of those attempts, displayed as readable markdown versions of an LLM’s JSON outputs. It is just what happens when I attempt to operationalize some theory, constrain the output with schemas, and let a generative model perform inside those boundaries. Some of the outputs seem insightful and are really interesting to read, but that’s not ultimately the point. The point is to see what the performance reveals.

Addressing Potential Critiques

"Isn't This Hypocritical?"

Using an LLM while saying LLMs “know” nothing? I guess this would only be hypocritical if I claimed an LLM "understood" my critique or "agreed with" my analysis. I make no such claims. I’m using a computational tool to execute procedures I designed, and I’m transparent about exactly what that tool is doing.

The project actually demonstrates its own thesis: we can use LLMs effectively when we understand what they actually do (process patterns, generate probable text) rather than what anthropomorphic language suggests they do (understand meaning, know truths).

"Are the Outputs Reliable?"

The outputs are reliable in the same sense that any structured analysis following explicit procedures, then handed off to a probabilistic language generator, is reliable. They are not reliable as definitive interpretations or as insight into authorial intention, and I don't claim they are.

This is why each framework developed emphasizes:

  • Multiple analytical perspectives
  • Explicit methodological choices (the analysis reveals what I asked it to look for)
  • Human evaluation (outputs are starting points for discussion, not endpoints)

What's here

Outputs from my experiments. Each analysis represents a different configuration: different prompts, different theoretical frameworks, different texts. I'm sharing them publicly because:

  • It's easier than sending individual examples to colleagues via Teams (lol)
  • Transparency about the process matters when we're teaching AI literacy
  • Seeing the full range of outputs—successes and failures—is more pedagogically useful than polished case studies
  • Many of them are actually interesting “texts” to read in their own right

The bigger picture

These experiments feed directly into a syllabus I'm developing called "Prompt as Interpretive Instrument." The core pedagogical move: students engineer AI systems by translating theoretical research into executable analytical instructions. The prompt becomes the artifact of assessment, a schema becomes an argument, the iteration becomes visible metacognition. This is part of ongoing AI literacy initiatives at William & Mary Libraries.

The frameworks I’m working with demonstrate that different theories require different data structures, but they share a common pedagogical core:

  1. Formalization as learning: Learners try to translate theory into executable logic
  2. Schema as argument: Data structures encode ontological commitments
  3. Iteration as metacognition: Debugging the prompt reveals gaps in understanding
  4. Provenance as scholarship: Every analysis is auditable and queryable

The method of operationalization remains constant: deep engagement with theory → translation to operational primitives → schema design → prompt engineering → iterative refinement.

This is the innovation: not so much the frameworks themselves, but the practice of building them.

About the Outputs

Structure and consistency

Each analysis follows the structure defined by its prompt, with standardized headings that map directly to the analytical tasks. This makes the outputs somewhat auditable. I can generally trace which instruction produced which section of the analysis.

Two forms of output

  1. Prose analyses (what you see in the output examples): These are human-readable analyses generated by the model following the prompt's instructions. I use a framework-specific Node.js processor system to take the JSON output and turn it into readable markdown for this site (a simplified sketch follows this list).
  2. Structured data schemas (used in the web application): Each framework also has a corresponding JSON schema that enforces structured output via the API configuration. These schemas are designed to be normalized, auditable, and scalable—not just for generating a single analysis, but for building a corpus that can be queried systematically.
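Here is a heavily simplified sketch of that JSON-to-Markdown step. The real processors are framework-specific classes with more structure; the field names below are invented stand-ins, not the project's actual schema.

```typescript
// Heavily simplified sketch of the JSON -> Markdown step.
// Field names are invented stand-ins, not the project's actual schema.

import { readFileSync, writeFileSync } from "node:fs";

interface MetaphorFinding {
  quote: string;      // the sentence containing the frame
  frame: string;      // e.g. "AI as mind"
  commentary: string; // the model-generated analysis of that frame
}

interface AnalysisOutput {
  sourceTitle: string;
  findings: MetaphorFinding[];
}

function toMarkdown(analysis: AnalysisOutput): string {
  const sections = analysis.findings.map(
    (f) => `### ${f.frame}\n\n> ${f.quote}\n\n${f.commentary}`
  );
  return [`# Metaphor Audit: ${analysis.sourceTitle}`, ...sections].join("\n\n");
}

// Usage: read a raw JSON output from the model, write a markdown page for the site.
const raw: AnalysisOutput = JSON.parse(readFileSync("run-output.json", "utf8"));
writeFileSync("metaphor-audit.md", toMarkdown(raw));
```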

Reading LLM Outputs Critically

I’m treating all LLM-generated analysis as just more rhetorical artifacts out there in the world. The analytical outputs are themselves texts requiring critical examination:

  • They reflect training data patterns: The LLM's analysis will reproduce interpretive moves it has encountered in its training data, which may include the very anthropomorphic patterns I'm critiquing.
  • They are probabilistically generated: Every phrase is selected based on statistical likelihood, not any semantic understanding.
  • They embody my prompt design choices: The outputs reveal what my instructions actually communicated, which may differ from what I intended.
  • They are one possible analysis: Different prompt formulations, different temperature settings, even different runs with identical inputs will produce variations.
  • There is no guarantee of factual accuracy
  • This is not about locating authorial intent in the texts analyzed. (Even though you'll notice that the outputs sometimes make just that type of claim.) Notable exception, probably: the “spicy” CDA prompt is a bit heavy-handed in its “language is never neutral” and “every text is an exercise of power” framing. 😎

This critical stance toward outputs is essential pedagogy. Part of the AI literacy thrust is having students learn that:

  • AI-generated text is not "objective" or "authoritative"
  • The same analytical standards apply to all text, human or machine-generated
  • Evaluation requires understanding of how the text was produced
  • Using AI tools responsibly means scrutinizing their outputs

The Methodological Commitment

By using LLMs to critique AI discourse while maintaining this critical, transparent stance about what the LLM is doing, the project models the methodological commitment it teaches:

  • Precision in language (I try (and often fail) to describe the LLM's operations accurately, not anthropomorphically)
  • Transparency about process (I try to explain exactly what role the LLM plays and what it doesn't do)
  • Critical evaluation (I treat outputs as artifacts requiring interpretation, not facts)
  • Appropriate attribution (I acknowledge the LLM as a weirdly worlded tool while taking responsibility for its use)

I’m not interested in telling students that "AI is bad, don't use it." I prefer to show them that AI is a tool with specific capabilities and limitations; here's one way to use it responsibly while maintaining your own critical awareness.


Why schemas (kind of) matter

The structured output approach transforms the tool from "text-in, text-out" to a genuine data analysis pipeline:

User Text → Gemini API (with Schema) → Structured JSON → Database

The idea is that this creates the foundation for comparative analysis across multiple texts: tracking metaphor patterns over time, comparing framing strategies across political speeches, or querying how agency is distributed across a corpus of policy documents.
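As a rough sketch of that pipeline step in code, assuming the @google/generative-ai Node SDK (the model name, schema fields, and prompt text here are placeholders rather than the project's actual frameworks):

```typescript
// Sketch of "User Text -> Gemini API (with Schema) -> Structured JSON".
// Assumes the @google/generative-ai Node SDK; all names below are placeholders.

import { GoogleGenerativeAI, SchemaType } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);

const model = genAI.getGenerativeModel({
  model: "gemini-1.5-pro",
  generationConfig: {
    responseMimeType: "application/json",
    responseSchema: {
      type: SchemaType.OBJECT,
      properties: {
        frames: {
          type: SchemaType.ARRAY,
          items: {
            type: SchemaType.OBJECT,
            properties: {
              quote: { type: SchemaType.STRING },
              frameLabel: { type: SchemaType.STRING },
              agencyAttributed: { type: SchemaType.BOOLEAN },
            },
            required: ["quote", "frameLabel"],
          },
        },
      },
      required: ["frames"],
    },
  },
});

async function analyze(userText: string) {
  const instructions = "Audit the following text for metaphorical frames."; // abbreviated
  const result = await model.generateContent(`${instructions}\n\n---\n\n${userText}`);
  // The response is schema-constrained JSON, ready for a database or dashboard.
  return JSON.parse(result.response.text());
}

analyze("The model understands your question and decides how to answer.")
  .then((data) => console.log(data.frames));
```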

By constraining the model to output JSON shaped by a schema, I can also build things on top of that structured data. Since something like React treats the JSON like any other API response, the plan is to explore lightweight React components to render, sort, filter, and interact with the outputs.
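A hypothetical React sketch of what that might look like, using the same invented field names as the schema sketch above (not a component that exists on this site yet):

```tsx
// Hypothetical React sketch: rendering and filtering structured frame data.
// Field names mirror the illustrative schema above, not a finalized format.

import { useState } from "react";

interface Frame {
  quote: string;
  frameLabel: string;
  agencyAttributed?: boolean;
}

export function FrameTable({ frames }: { frames: Frame[] }) {
  const [onlyAgency, setOnlyAgency] = useState(false);
  const visible = onlyAgency ? frames.filter((f) => f.agencyAttributed) : frames;

  return (
    <div>
      <label>
        <input
          type="checkbox"
          checked={onlyAgency}
          onChange={(e) => setOnlyAgency(e.target.checked)}
        />{" "}
        Show only frames that attribute agency
      </label>
      <ul>
        {visible.map((f, i) => (
          <li key={i}>
            <strong>{f.frameLabel}</strong>: "{f.quote}"
          </li>
        ))}
      </ul>
    </div>
  );
}
```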

The schema design process is itself pedagogically valuable: it forces students to answer other questions. What are the essential, irreducible components of a metaphorical frame? What data type is "agency"—a string, a boolean, a relation to another object?
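To make that concrete, here are two hypothetical ways a schema might encode "agency." Neither is the project's actual answer; the point is that each commits the analysis to a different ontology.

```typescript
// Two hypothetical encodings of "agency." Neither is "correct";
// each commits the analysis to a different ontology.

// Option A: agency as a yes/no attribute of a quoted sentence.
interface FrameA {
  quote: string;
  agencyAttributed: boolean;
}

// Option B: agency as a relation, which keeps the attributor in the picture:
// someone attributes agency to an entity via a specific verb.
interface AgencyAttribution {
  agent: string;        // e.g. "the model"
  verb: string;         // e.g. "decides"
  attributedBy: string; // e.g. "press release", "journalist", "API docs"
}

interface FrameB {
  quote: string;
  attributions: AgencyAttribution[];
}
```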

The Larger Argument

We're at a moment where AI literacy is still being defined in librarianship and higher education. Many of our approaches focus on:

  • Technical skills (how to write prompts, use tools)
  • Ethical warnings (don't plagiarize, cite AI use)
  • Skepticism (AI is unreliable, always verify)

These are all valuable, but (I think) they miss something foundational: how we conceptualize these systems will shape everything else we do.

My hope is that if students understand generative AI as statistical pattern processors and probabilistic language machines, they'll:

  • Calibrate trust appropriately
  • Design better prompts
  • Evaluate outputs critically
  • Use tools responsibly
  • Engage in informed policy discussions

If students understand generative AI as quasi-conscious "partners," they'll tend to:

  • Over-trust outputs
  • Outsource intellectual work inappropriately
  • Miss system limitations
  • Form parasocial relationships with it
  • Struggle with accountability questions

The project addresses this foundational issue by making language itself the object of study.

This moves beyond "teaching students not to be fooled" to teaching students to be active participants in shaping how we collectively understand and talk about AI.

Behind The Scenes Workflow

The discourse analysis workflow uses a framework-specific Node.js processor system that transforms raw outputs from Google AI Studio or Vertex AI into structured, publication-ready formats. The architecture uses modular, reusable processor classes for each analytical framework (Metaphor Audit, CDA-Soft, CDA-Spicy, Political Framing).

For the corpus analysis and data science workflow, this system helps me maintain provenance and data integrity. It takes disparate experimental artifacts and consolidates them into a master provenance record, identified by a unique run ID, which includes all input parameters, metadata, and outputs. The process generates three key artifacts for each run: a human-readable JSON file for reproducibility and re-use (for example, spinning off basic apps or dashboards via React); a single-line JSONL entry appended to a master file for efficient corpus-level querying and database ingestion (e.g., into Postgres); and a presentation-ready Markdown file for documentation and sharing. By automating these steps, the script reduces a bit of the manual effort, prevents data inconsistency, and prepares the raw analytical outputs for both qualitative review and quantitative downstream analysis.
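A simplified sketch of that consolidation step (directory layout, field names, and the run-ID format are invented for illustration; the real processors carry more metadata):

```typescript
// Simplified sketch of the consolidation step. File names, fields, and the
// run-ID format are invented; the real processors record more metadata.

import { randomUUID } from "node:crypto";
import { appendFileSync, writeFileSync } from "node:fs";

interface ProvenanceRecord {
  runId: string;
  framework: string; // e.g. "metaphor-audit"
  model: string;     // e.g. "gemini-1.5-pro"
  promptVersion: string;
  sourceTitle: string;
  createdAt: string;
  output: unknown;   // the schema-constrained JSON returned by the model
}

function consolidateRun(record: Omit<ProvenanceRecord, "runId" | "createdAt">): string {
  const full: ProvenanceRecord = {
    ...record,
    runId: randomUUID(),
    createdAt: new Date().toISOString(),
  };

  // 1. Human-readable JSON, one file per run.
  writeFileSync(`runs/${full.runId}.json`, JSON.stringify(full, null, 2));

  // 2. Single-line JSONL entry appended to the master corpus file.
  appendFileSync("corpus/master.jsonl", JSON.stringify(full) + "\n");

  // 3. Presentation-ready Markdown stub for the site.
  writeFileSync(
    `docs/${full.runId}.md`,
    `# ${full.sourceTitle}\n\nRun \`${full.runId}\` (${full.framework}, ${full.model})\n`
  );

  return full.runId;
}
```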

And I keep changing it; that’s why the outputs look different. I tweak the prompt, then tweak the schema, then tweak the markdown. I recently switched this site from VitePress to Docusaurus so I could experiment with some React components.

So please note that this project, including the analytical frameworks and the outputs displayed on this site, is an ongoing work in progress. Consequently, you may notice variations in formatting or structure between different analysis pages, especially over time. This reflects the iterative nature of the research and development process as I work to mess around with the prompts, schemas, and presentation in unison. Thanks for understanding!

Extended Processing Summaries Technical Note

Some outputs include an "Extended Processing Summary" section, typically at the end. These are the model's intermediate token generations before producing the final structured response—what Gemini's documentation, in semi-acknowledged metaphor framing, calls “thought summaries.”

Here's an interesting example of one of them:

So-called "Thought Summary"

Deconstructing the Critique

I'm now grappling with the inherent irony of this task. The text I'm analyzing already critiques the anthropomorphic tendencies I'm supposed to be identifying. This creates a fascinating meta-level challenge: how do I analyze a critique of anthropomorphism for anthropomorphism? My approach will need to be nuanced. I'll focus on the specific ways the authors avoid anthropomorphism and the potential implications of those choices.

The LLM struggle is real. 🎩🪄

Why I include them: These summaries could be diagnostically useful for evaluating prompt design. They show:

  • How the prompt instructions were parsed and executed
  • Which analytical tasks the model processed first or devoted more tokens to
  • Where the generation process produced uncertainty markers or reformulations
  • The computational sequence that led to the final output
  • How LLM creators make deliberate choices to maximally project an “inner life” onto their models

This makes them valuable for evaluating prompt design, and definitely not for understanding "what the AI was thinking." Since it wasn’t thinking and can’t, there’s no thought to summarize.

The AI literacy caveat

Why first-person in these “thought summaries”? Simply put, this language is a design choice by the LLM creators, not a technical requirement. The creators of the LLM could just as well have chosen to represent these intermediate outputs with passive voice ("Lexical units are being extracted") or structured logs ("Task: Metaphor identification, Tokens: 1,203") or gibberish or an image gallery of cute kittens.

But they didn’t. They chose to show us, the users, text written as if a subject were narrating its reasoning. It borrows all the baggage and all the grammar of explanation: first-person voice, temporal markers (“now,” “starting to”), worlded verbs (“noting,” “sniff out”), and evaluative stances (“subtle framing,” “underlying assumptions”). These are all linguistic cues we associate with conscious interpretation.

That shift performs an illusion of introspection and the language performs the form of explanation, while the conditions that make explanation meaningful (causality, legibility, accountability) are totally absent. The breach of the explanatory contract happens not in the model but in our interpretation. The output invites us human readers to imagine a mind at work; the interface, tone, and pronouns all reinforce that invitation. We bring our habits of reading narrative and self-reporting into a domain where no such subject exists.

In doing so, the discourse of AI explanation slides from description of process to performance of understanding. This is not “wrong” or “bad”; it is just a move that reshapes how agency, risk, and responsibility are understood, projected, attributed to the LLM, and distributed between human and machine.

Contact

Browse freely. Questions, feedback, and "hey, try analyzing this" suggestions welcome. TD | William & Mary Libraries | elusive-present.0e@icloud.com


Discourse Depot © 2025 by TD is licensed under CC BY-NC-SA 4.0