A Note on Using AI
Orchestrating Inquiry: How RTS Uses AI to Support Student Storytelling
Most educational AI tools fall into familiar categories: homework shortcuts, plagiarism risks, or surveillance systems. But what if AI could serve a different purpose entirely?
RTS explores a fourth possibility: AI as an orchestration engine, choreographing the thinking process itself.
The Core Insight: Orchestration, Not Automation
RTS is a prototype web application that helps students transform open-ended research into podcast narratives. It integrates generative AI as a constrained text processing system within a carefully designed pedagogical framework.
The model's job isn't to know anything, and it doesn't really "know" anything anyway. Its job is to keep the intellectual ball rolling toward whatever comes next.
What Makes This Different
Instead of asking AI to produce insights from nothing, RTS asks it to perform specific computational operations:
- Generate prompts — Socratic questions based on student writing
- Reorganize text — Structured synthesis of student reflections
- Surface patterns — Computational observations about student responses
The result is a choreography of learning that remains "kind of" observable and student-centered. At least that’s the idea.
How It Works: Three Core Functions
Every AI interaction in RTS serves one of three specific purposes:
1. Generating Follow-Up Questions
What happens: After a student completes a reflection round, the AI generates 7 Socratic questions designed to deepen their thinking within specific pedagogical subcategories (e.g., "The Origin Scene," "The Emotional Core," "The Disciplinary Bridge"). These are all rhetorical components I already "teach" and talk about in workshops; I just needed to define them through system instructions for an LLM.
How it's constrained:
- First round: AI receives only the student's topic text — no training data about the subject
- Subsequent rounds: AI receives a reflection trail built from the student's previous rounds
- Questions must fit predefined subcategories aligned to learning outcomes (defined in the system instructions)
- Output is structured JSON following strict schemas enforced by the API
- No new information is introduced (it is not searching the web), only questions about what the student wrote. Of course, this is where the system can possibly "break": the prompt withholds outside material, but the model's training data is still in play.
So What?: Students engage with computationally generated prompts based on their own thinking, not with AI's "knowledge" about their research topic.
Technical pattern:
// Reflection trail builds context from the student's own previous work
let trail = "";
if (round_number > 1) {
  const { data: previousReflections } = await supabase
    .from("deep_dive_reflection_rounds")
    .select("user_reflection")
    .eq("session_id", session_id)
    .eq("user_id", user.id)
    .eq("category_selected", "Spark of Inquiry")
    .order("round_number", { ascending: true });

  trail = (previousReflections || [])
    .map((r, i) => `Round ${i + 1}: ${r.user_reflection}`)
    .join("\n\n");
}

// AI receives student context, returns structured JSON questions
const result = await ai.models.generateContent({
  model: ACTIVE_MODEL,
  config: {
    responseMimeType: "application/json",
    responseSchema: SparkQuestionsSchema,
    systemInstruction: [{ text: SYSTEM_INSTRUCTION }],
    thinkingConfig: { thinkingBudget: -1, includeThoughts: true },
  },
  contents: [{ role: "user", parts: [{ text: userContext }] }],
});
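The `SparkQuestionsSchema` referenced in that call is defined elsewhere in the codebase. As a hypothetical sketch of what such a response schema could look like, here is a plain-object version in the OpenAPI-style format the Gemini API accepts; the actual field names and structure in RTS may differ:

```javascript
// Hypothetical sketch of a response schema like SparkQuestionsSchema.
// The Gemini API enforces this shape on the model's JSON output; the
// exact fields and subcategory list used by RTS are assumptions here.
const SparkQuestionsSchema = {
  type: "ARRAY",
  items: {
    type: "OBJECT",
    properties: {
      subcategory: {
        type: "STRING",
        description: "Pedagogical subcategory, e.g. 'The Origin Scene'",
      },
      question: {
        type: "STRING",
        description: "A Socratic question grounded in the student's own writing",
      },
    },
    required: ["subcategory", "question"],
  },
};
```

Because the schema is passed through `config.responseSchema`, malformed or extra fields are rejected at the API layer rather than cleaned up after the fact.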
Each question is saved to the database with its subcategory, model used, and full provenance — creating a queryable record of every computationally generated prompt.
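That save step might look something like the following sketch. The table and column names are assumptions modeled on the handler code above, not the actual RTS schema:

```javascript
// Sketch of persisting a generated question with full provenance.
// Table and column names are assumptions, not the actual RTS schema.
function buildQuestionRow(sessionId, userId, question, metadata) {
  return {
    session_id: sessionId,
    user_id: userId,
    subcategory: question.subcategory,
    question_text: question.question,
    model_used: metadata.model_used,
    token_usage: metadata.token_usage,
    created_at: new Date().toISOString(),
  };
}

// In the handler, each row would then be inserted via the Supabase client:
// await supabase.from("deep_dive_generated_questions").insert(row);
```

Storing the model string and token usage on every row is what makes the record queryable later, e.g. "show me every prompt this model generated for this session."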
2. Synthesizing Student Reflections
What happens: After completing 4 reflection rounds, the LLM reorganizes all the student's writing into a structured synthesis — a marginally sycophantic "mentor letter" highlighting patterns, tensions, and narrative threads specific to the deep dive category.
How it's constrained:
- AI receives only the student's own words (their 4 rounds of reflections) and topic text
- Output is pure markdown (not JSON-constrained), formatted as narrative feedback
- Zero new content is added — only reorganization and pattern recognition
- Synthesis is explicitly framed as computational reorganization, not expert judgment
- Prompts are W&M-specific, referencing campus resources like W&M Libraries Research & Instruction services and the Reeder Media Center
Why this matters: Students see their inquiry journey reorganized computationally, helping them recognize patterns they might have missed while maintaining ownership of all content.
Technical pattern:
// Handler fetches the student's complete deep dive journey from the database
const reflectionTrail = reflections
  .map((r) => `### Round ${r.round_number}\n${r.user_reflection.trim()}`)
  .join("\n\n");

// AI generates markdown synthesis — no JSON schema, pure narrative
const result = await ai.models.generateContent({
  model: ACTIVE_MODEL,
  config: {
    systemInstruction: [{ text: SYNTHESIS_INSTRUCTION }],
    thinkingConfig: { thinkingBudget: -1, includeThoughts: true },
  },
  contents: [{ role: "user", parts: [{ text: userPrompt }] }],
});

// Multi-part response handling: separate thought artifacts from final synthesis
for (const part of result.candidates[0].content.parts) {
  if (part.thought) {
    thinkingContent = part.text; // Intermediate computational artifact
  } else {
    synthesisText = part.text; // Final markdown synthesis
  }
}
The synthesis and all its metadata (token counts, thought summary, model used) are saved to the database, creating an audit trail.
3. AI Observability & Metadata Capture
What happens: Every AI operation generates some “observability” data that is surfaced to students as a pedagogical instrument.
What's captured:
- Token counts — Prompt tokens, response tokens, thinking tokens, total tokens
- Model name — Exact model string (e.g., gemini-3-pro-preview)
- "Thought" summaries — Intermediate computational artifacts extracted from multi-part responses
- Timestamp — When the operation occurred
Why this matters: In the discourse surrounding generative AI, "transparency" is a loaded term. A large language model has no internal "mind" to be transparent about. Instead, what I can offer is an attempt, albeit constrained by what is available through the API, at observability — a look at the computational evidence behind the output.
Token counts reveal the work: Tokens are the pieces of text the model processes. A high token count doesn't mean "deeper thought"; it means more mathematical operations were performed. This number represents the computational cost of generating a statistically probable response, not the depth of its understanding.
Thought summaries are artifacts, not reasoning: When the model produces intermediate text before its final response, RTS captures and surfaces this as "The Imitation of a Thought Process." The first-person narration in these artifacts creates the illusion of a strategic, conscious agent. RTS frames them explicitly as computational output instead of evidence of understanding.
Technical pattern:
// Every AI interaction returns complete metadata for observability
const metadata = {
  token_usage: {
    prompt: usageMetadata?.promptTokenCount || 0,
    response: usageMetadata?.candidatesTokenCount || 0,
    thinking: usageMetadata?.thoughtsTokenCount || 0,
    total: usageMetadata?.totalTokenCount || 0,
  },
  thought_summary: thinkingContent,
  model_used: ACTIVE_MODEL,
};

// Saved to database alongside the question or synthesis it accompanies
// Surfaced to students in the UI and in exported reports with AI Literacy Lens annotations
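How that metadata reaches a student is a UI concern, but a minimal formatter gives the idea. This is an illustrative sketch only; the exact wording and layout of the student-facing display, and the reuse of "The Imitation of a Thought Process" as a label, are assumptions:

```javascript
// Sketch: render observability metadata as student-facing text.
// Labels and layout are assumptions, not the actual RTS UI copy.
function formatObservability(metadata) {
  const t = metadata.token_usage;
  const lines = [
    `Model: ${metadata.model_used}`,
    `Tokens used: prompt=${t.prompt}, response=${t.response}, thinking=${t.thinking}, total=${t.total}`,
  ];
  if (metadata.thought_summary) {
    lines.push(`The Imitation of a Thought Process: ${metadata.thought_summary}`);
  }
  return lines.join("\n");
}
```

Keeping the formatter a pure function of the stored metadata means the UI and the exported reports can render the same audit record without recomputing anything.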
For more detail on how observability functions as pedagogy, see AI Observability. For the database tables that store this metadata, see Database & Living Ledger.
Positioning AI as Constrained Text Processing
RTS treats AI as a mechanical operation rather than an intelligent agent:
What AI does in RTS:
- Generates questions based on pattern matching in student text
- Reorganizes student writing using computational text processing
- Surfaces statistical probabilities as structured outputs
What AI does NOT do in RTS:
- Provide expertise on research topics
- Generate original content or arguments
- Make judgments about research quality
- Replace human mentorship or thinking
- Locate sources
This constraint is enforced architecturally: handlers pass only student-authored text to the model, prompts are externalized and auditable, and outputs are saved with full provenance. Even where the model could introduce outside knowledge, the pipeline is designed to constrain the opportunity as much as possible.
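One way to read "handlers only pass student-authored text" is that the user-facing context string is assembled exclusively from rows the student wrote. A minimal sketch, with function and field names assumed for illustration:

```javascript
// Sketch: the model's user context is built only from student-authored
// text (topic plus reflection rounds), never from external sources.
// Function and field names are assumptions for illustration.
function buildUserContext(topicText, reflections) {
  const trail = reflections
    .map((r, i) => `Round ${i + 1}: ${r.user_reflection}`)
    .join("\n\n");
  return [`Topic: ${topicText}`, trail].filter(Boolean).join("\n\n");
}
```

Because the function's only inputs are the student's own records, reviewing what the model was shown reduces to reviewing this one assembly step.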
Why "Orchestration" Matters
Traditional educational AI positions the model as:
- Expert — Providing authoritative information
- Assistant — Helping with tasks and writing
- Tutor — Teaching concepts and skills
RTS positions AI as:
- Sequencer — Moving students through structured reflection
- Reorganizer — Pattern recognition in student text
- Prompter — Generating questions, not answers
This shift is architectural: the system is designed to make the constraint technically enforceable and pedagogically visible.
The complete data infrastructure supporting this approach is documented in Database & Living Ledger. The course-centric architecture that scopes all student work is described there as the "walled garden" model.
This is an experiment in computational constraints. What happens when we try to use generative AI purely for prompt generation and text reorganization? Can these mechanical operations function as thinking tools without becoming thinking partners?
Whether this constitutes successful orchestration or something more complex remains an open investigation.