A Note on Using AI

Orchestrating Inquiry: How RTS Uses AI to Support Student Storytelling

A Longer Conversation Elsewhere

The conceptual and critical work behind how RTS thinks about AI language lives in a companion project: Discourse Depot. That site examines the metaphors, framings, and ideological work embedded in how we talk about generative AI systems. What you'll find here is the architectural and pedagogical rationale. What you'll find there is the critique.


What would it look like to use generative AI purely as an orchestration engine?

RTS explores that question. The model's job is to keep the inquiry moving toward its next step: generating the next question, reorganizing the student's own words, surfacing a pattern the student can read and push back on.


The Core Insight: Orchestration

RTS is a prototype web application that helps students transform open-ended research into podcast narratives. Generative AI operates here as a constrained text processing system inside a carefully designed pedagogical framework.

The model performs specific computational operations:

  • Generates prompts: Socratic questions based on student writing
  • Reorganizes text: Structured synthesis of student reflections
  • Surfaces patterns: Computational observations about student responses

The result is a choreography of learning that is, at least in design intent, observable and student-centered. Whether the constraint holds in practice is part of what this system is built to investigate.


Three Core Functions

Every AI interaction in RTS serves one of three specific purposes.

1. Generating Follow-Up Questions

After a student completes a reflection round, the AI generates 7 Socratic questions designed to deepen their thinking within specific pedagogical subcategories: "The Origin Scene," "The Emotional Core," "The Disciplinary Bridge," and others. These are rhetorical components I already teach in workshops. Defining them through system instructions for an LLM was the translation work.

How it's constrained:

  • First round: AI receives only the student's topic text
  • Subsequent rounds: AI receives a reflection trail built from the student's previous rounds
  • Questions must fit predefined subcategories aligned to learning outcomes, defined in the system instructions
  • Output is structured JSON following strict schemas enforced by the API
  • No new information is introduced: the model is generating questions about what the student wrote

This is also where the system is most vulnerable. The constraint depends entirely on what the system instructions can and cannot enforce. That gap is an open question.
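
One way to narrow that gap is to check the model's output in code after it returns, rather than relying on the instructions alone. The sketch below is a hypothetical post-processing step, not the RTS implementation: the `{ questions: [{ subcategory, question }] }` response shape, the `validateQuestions` name, and the allow-list are assumptions for illustration. Only the three subcategory names come from the text above.

```javascript
// Hypothetical allow-list; only these three names appear in the text above.
const ALLOWED_SUBCATEGORIES = [
  "The Origin Scene",
  "The Emotional Core",
  "The Disciplinary Bridge",
];

// Assumed response shape: { questions: [{ subcategory, question }] }.
// Returns { ok, problems } so a handler can decide whether to retry or reject.
function validateQuestions(rawJson, allowed = ALLOWED_SUBCATEGORIES, expectedCount = 7) {
  const problems = [];
  let parsed;
  try {
    parsed = JSON.parse(rawJson);
  } catch {
    return { ok: false, problems: ["response is not valid JSON"] };
  }

  const questions = Array.isArray(parsed?.questions) ? parsed.questions : [];
  if (questions.length !== expectedCount) {
    problems.push(`expected ${expectedCount} questions, got ${questions.length}`);
  }

  questions.forEach((q, i) => {
    if (!allowed.includes(q?.subcategory)) {
      problems.push(`question ${i + 1}: unknown subcategory "${q?.subcategory}"`);
    }
    if (typeof q?.question !== "string" || !q.question.trim().endsWith("?")) {
      problems.push(`question ${i + 1}: text is not phrased as a question`);
    }
  });

  return { ok: problems.length === 0, problems };
}
```

A check like this covers form only (count, category, phrasing). It cannot tell whether a well-formed question quietly brings in outside information, which is the harder part of the open question.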

So what: Students engage with computationally generated prompts built from their own thinking.

Technical pattern:

// Reflection trail builds context from the student's own previous work
let trail = "";
if (round_number > 1) {
  const { data: previousReflections, error } = await supabase
    .from("deep_dive_reflection_rounds")
    .select("round_number, user_reflection")
    .eq("session_id", session_id)
    .eq("user_id", user.id)
    .eq("category_selected", "Spark of Inquiry")
    .order("round_number", { ascending: true });

  // Fail loudly rather than silently sending the model an empty trail
  if (error) throw error;

  // Label each entry with its stored round number, not its array position
  trail = (previousReflections ?? [])
    .map((r) => `Round ${r.round_number}: ${r.user_reflection}`)
    .join("\n\n");
}

// AI receives student context, returns structured JSON questions
const result = await ai.models.generateContent({
  model: ACTIVE_MODEL,
  config: {
    responseMimeType: "application/json",
    responseSchema: SparkQuestionsSchema,
    systemInstruction: [{ text: SYSTEM_INSTRUCTION }],
    thinkingConfig: { thinkingBudget: -1, includeThoughts: true },
  },
  contents: [{ role: "user", parts: [{ text: userContext }] }],
});

Each question is saved to the database with its subcategory, model used, and full provenance: a queryable record of every computationally generated prompt.
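
As a rough illustration of what such a record might hold, the sketch below maps generated questions to database rows. The column names (`position`, `question_text`, `model_used`, `created_at`) and the `buildQuestionRows` helper are assumptions, not the RTS schema.

```javascript
// Hypothetical helper: one row per generated question, each carrying its provenance.
function buildQuestionRows(questions, { sessionId, userId, roundNumber, model }) {
  // One timestamp for the whole batch, so rows from the same call group together.
  const createdAt = new Date().toISOString();
  return questions.map((q, index) => ({
    session_id: sessionId,
    user_id: userId,
    round_number: roundNumber,
    position: index + 1, // order in which the model returned the question
    subcategory: q.subcategory,
    question_text: q.question,
    model_used: model,
    created_at: createdAt,
  }));
}
```

An array like this could be handed to a single Supabase `insert` call on whichever table stores the questions, so that every prompt a student sees can later be traced to a round, a model, and a time.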


2. Synthesizing Student Reflections

After completing 4 reflection rounds, the LLM reorganizes all the student's writing into a structured synthesis: a mildly sycophantic mentor letter highlighting patterns, tensions, and narrative threads specific to the deep dive category. The tone is a known artifact of how these models are trained. Students are encouraged to read it as one.

How it's constrained:

  • AI receives only the student's own words: their 4 rounds of reflections plus topic text
  • Output is pure markdown, formatted as narrative feedback
  • Zero new content is the goal: only reorganization and pattern recognition (enforcement is still a work in progress)
  • Synthesis is framed as computational reorganization, not expert judgment
  • Prompts are W&M-specific, referencing campus resources like W&M Libraries Research & Instruction services and the Reeder Media Center
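
The no-new-content constraint in the list above is hard to verify automatically. One crude way to probe it is to list the longer words in a synthesis that never appear in the student's own writing. The sketch below is a hypothetical heuristic, not part of RTS: `tokenize`, `findNovelWords`, and the six-character threshold are all assumptions, and the tokenizer only handles unaccented Latin letters.

```javascript
// Lowercase word tokens; apostrophes are kept so "hadn't" stays one word.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9']+/g) ?? [];
}

// Hypothetical heuristic: longer words in the synthesis that the student never wrote.
function findNovelWords(studentText, synthesisText, minLength = 6) {
  const studentWords = new Set(tokenize(studentText));
  const novel = new Set();
  for (const word of tokenize(synthesisText)) {
    if (word.length >= minLength && !studentWords.has(word)) {
      novel.add(word);
    }
  }
  return [...novel].sort();
}

const flagged = findNovelWords(
  "I grew up near the river and wondered about floods",
  "Your reflections on the river suggest hydrology and floods"
);
console.log(flagged);
```

The output will include the letter's own framing vocabulary alongside anything that is actually new, so a list like this is a prompt for human review, not evidence that the constraint held or failed.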

Students see their inquiry journey reorganized computationally, which sometimes surfaces patterns they hadn't named.

Technical pattern:

// Handler fetches the student's complete deep dive journey from the database
const reflectionTrail = reflections
  .map((r) => `### Round ${r.round_number}\n${r.user_reflection.trim()}`)
  .join("\n\n");

// AI generates markdown synthesis: no JSON schema, pure narrative
const result = await ai.models.generateContent({
  model: ACTIVE_MODEL,
  config: {
    systemInstruction: [{ text: SYNTHESIS_INSTRUCTION }],
    thinkingConfig: { thinkingBudget: -1, includeThoughts: true },
  },
  contents: [{ role: "user", parts: [{ text: userPrompt }] }],
});

// Multi-part response handling: separate thought artifacts from final synthesis
let thinkingContent = "";
let synthesisText = "";
for (const part of result.candidates?.[0]?.content?.parts ?? []) {
  if (!part.text) continue; // Skip parts that carry no text
  if (part.thought) {
    thinkingContent += part.text; // Intermediate computational artifact
  } else {
    synthesisText += part.text; // Final markdown synthesis
  }
}

The synthesis and all its metadata (token counts, “thought” summary, model used) are saved to the database, creating a basic audit trail.


3. AI Observability and Metadata Capture

Every AI operation generates observability data that is surfaced to students as a pedagogical instrument.

What's captured:

  • Token counts: Prompt tokens, response tokens, thinking tokens, total tokens
  • Model name: Exact model string (e.g., gemini-3-pro-preview)
  • "Thought" summaries: Intermediate computational artifacts extracted from multi-part responses
  • Timestamp: When the operation occurred

In the discourse surrounding generative AI, "transparency" is a loaded term. A large language model has no internal "mind" to be transparent about. What RTS can offer is limited observability: a look at the computational evidence behind the output, constrained by what is actually available through the API.

Token counts reveal the work. Tokens are the pieces of text the model processes. A high token count means more mathematical operations were performed. That number represents the computational cost of generating a statistically probable response. It says nothing about depth of understanding.
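
A small worked example may make the point concrete. The sketch below turns the four counts listed above into percentage shares; the `summarizeTokenUsage` name and the sample numbers are illustrative assumptions, not RTS code or real usage data.

```javascript
// Takes counts in the shape { prompt, response, thinking, total } and
// returns each component's share of the total as a whole-number percentage.
function summarizeTokenUsage({ prompt = 0, response = 0, thinking = 0, total = 0 }) {
  const share = (n) => (total > 0 ? Math.round((n / total) * 100) : 0);
  return {
    promptShare: share(prompt),
    responseShare: share(response),
    thinkingShare: share(thinking),
  };
}

// Illustrative numbers only.
console.log(summarizeTokenUsage({ prompt: 200, response: 300, thinking: 500, total: 1000 }));
// → { promptShare: 20, responseShare: 30, thinkingShare: 50 }
```

In this made-up case, half of the processing went to intermediate text the student never sees in the final output. That is a statement about cost, not about how well anything was understood.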

Thought summaries are artifacts. When the model produces intermediate text before its final response, RTS captures and surfaces this as "The Imitation of a Thought Process." The first-person narration in these artifacts creates a convincing illusion of a strategic, conscious agent. RTS nonetheless frames them explicitly as computational output, nothing more. The illusion does not disappear with the label. That tension is the point.

Technical pattern:

// Every AI interaction returns complete metadata for observability
const metadata = {
  token_usage: {
    prompt: usageMetadata?.promptTokenCount || 0,
    response: usageMetadata?.candidatesTokenCount || 0,
    thinking: usageMetadata?.thoughtsTokenCount || 0,
    total: usageMetadata?.totalTokenCount || 0,
  },
  thought_summary: thinkingContent,
  model_used: ACTIVE_MODEL,
};

// Saved to database alongside the question or synthesis it accompanies
// Surfaced to students in the UI and in exported reports with AI Literacy Lens annotations

AI as Constrained Text Processing

RTS treats AI as a mechanical operation.

What AI does in RTS:

  • Generates questions based on pattern matching in student text
  • Reorganizes student writing using computational text processing
  • Surfaces statistical probabilities as structured outputs

What the pipeline is designed to prevent:

  • Providing expertise on research topics
  • Generating original content or arguments
  • Making judgments about research quality
  • Locating sources

This constraint is enforced architecturally: handlers pass only student-authored text to the model, prompts are externalized and auditable, and outputs are saved with full provenance. The pipeline is designed to close off opportunities for the model to introduce outside knowledge. Whether that closure holds completely is, again, something the system is built to help investigate.
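
A minimal sketch of what "handlers pass only student-authored text" could look like in code follows. It is an assumption about one possible approach, not the RTS handlers themselves: `pickStudentText`, `buildUserContext`, the allow-list, and the `topic_text` field name are hypothetical, while `user_reflection` is the column used in the snippets above.

```javascript
// Hypothetical allow-list: the only fields ever forwarded to the model.
const STUDENT_AUTHORED_FIELDS = ["topic_text", "user_reflection"];

// Copy only allow-listed, non-empty string fields; everything else is dropped.
function pickStudentText(record) {
  const picked = {};
  for (const field of STUDENT_AUTHORED_FIELDS) {
    if (typeof record[field] === "string" && record[field].trim() !== "") {
      picked[field] = record[field].trim();
    }
  }
  return picked;
}

// Assemble the user-turn text from a topic record and reflection records.
function buildUserContext(topicRecord, reflectionRecords) {
  const { topic_text = "" } = pickStudentText(topicRecord);
  const rounds = reflectionRecords
    .map((r) => pickStudentText(r).user_reflection)
    .filter(Boolean)
    .map((text, i) => `Round ${i + 1}: ${text}`);
  return [`Topic: ${topic_text}`, ...rounds].join("\n\n");
}
```

Routing every model call through one builder like this would keep the input side of the constraint in a single auditable place. It does nothing about what the model adds on the output side, which is the part still under investigation.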


Why "Orchestration" Matters

Most educational AI positions an LLM as an expert, assistant, or tutor: a system that provides authoritative information, helps with writing tasks, or teaches concepts. RTS positions AI as:

  • Sequencer: moving students through structured reflection
  • Reorganizer: recognizing patterns in student text
  • Prompter: generating questions

The shift is architectural. The constraint is technically enforceable and, by design, pedagogically visible.


What happens when generative AI is used purely for prompt generation and text reorganization? Can these mechanical operations function as thinking tools without becoming thinking partners? That distinction may be harder to hold than the architecture suggests. Whether this constitutes successful orchestration or something more complicated remains an open investigation.