A Look at the Database & AI Ledger
The schema continues to evolve as pedagogical patterns are tested and refined. This document reflects a current working implementation.
The Research to Story (RTS) application is architected around a central principle: "everything is recorded." It functions as a system where a persistent database acts as a living ledger for a student's entire inquiry journey. This database-centric approach ensures that every step—from initial topic ideation to final synthesis and beyond—is captured and made available for subsequent computational operations.
Generative AI (currently Gemini 2.5 models) is integrated as a constrained text processing system that performs specific, bounded operations. Its outputs depend entirely on the structured data it receives from the database ledger for a given user and session.
Technical Infrastructure
Development Environment
The system uses Supabase (open source), which can run either as a hosted service or entirely locally via the Supabase CLI. Locally, the CLI handles all Docker orchestration and spins up a full development stack including Postgres, Auth, Storage, and the Studio dashboard. The CLI also supports schema development: changes can be made in Supabase Studio or via SQL, then captured as migration files.
System Architecture
Three main components work together:
- Frontend UI (Next.js & React): User interface that guides students through sequential RTS Movements, plus dashboards for instructors and administrators.
- Backend API Handlers (Next.js API Routes): Server-side functions that manage application logic, handle authentication, perform database operations, and process all interactions with the Gemini model.
- Database & Auth (Supabase): PostgreSQL database serving as the persistent ledger, with authentication and Row Level Security (RLS) for data access control.
User Roles and Data Access
The system manages three user roles through a custom profiles table linked to Supabase's auth.users:
- Student: Primary user who creates reflections and receives computational text processing within enrolled courses
- Instructor: Can create and manage courses and pedagogical materials (e.g., BBME prompts)
- Admin: Top-level access for managing users, roles, and courses system-wide
All data access is controlled through Row Level Security policies that ensure students can only access their own work within courses they're enrolled in.
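For illustration only, the role model maps naturally onto a TypeScript shape like the one below. The role values come from the list above; the column names are assumptions, not the actual schema:

```ts
// Role values mirror the three roles described above.
type UserRole = 'student' | 'instructor' | 'admin';

// Hypothetical shape of a profiles row linked to auth.users.
interface Profile {
  id: string;           // mirrors auth.users.id (uuid)
  role: UserRole;       // consulted by RLS policies and role-gated dashboards
  display_name: string; // assumed column, for illustration
}
```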
Core Database Schema: The Living Ledger
The schema is designed to capture every computational operation and student response in a normalized, course-centric structure.
I. Course Infrastructure ("Walled Garden")
courses - Container for all course-specific activity
- Links to instructor, contains course code, title, semester
- Each course is owned by an instructor and contains all student work
course_enrollments - Student enrollment management
- Join table connecting students to courses
- Controls which students can access which course data
II. Session Management
rts_sessions - Individual student research journeys
- Core container linking student, course, and research topic
- session_id is the primary key, referenced by all subsequent inquiry data
- Tracks current movement, topic text, start time, and model used
- Critical architecture note: Every RTS interaction requires three anchors:
  - user_id - who
  - course_id - educational context
  - session_id - which research journey
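As a concrete illustration, here is a minimal sketch of the three-anchor rule applied to a supabase-js write. The client setup and the reflection_text column name are placeholders, not the application's actual code:

```ts
import { createClient } from '@supabase/supabase-js';

// Placeholder environment variable names.
const supabase = createClient(process.env.SUPABASE_URL!, process.env.SUPABASE_SERVICE_KEY!);

// Every row of inquiry data carries all three anchors.
async function logReflection(userId: string, courseId: string, sessionId: string, text: string) {
  const { error } = await supabase.from('reflection_rounds').insert({
    user_id: userId,       // who
    course_id: courseId,   // educational context
    session_id: sessionId, // which research journey
    reflection_text: text, // hypothetical column name, for illustration
  });
  if (error) throw error;
}
```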
III. Movement 1: Core Reflection System
Movement 1 serves as the foundational testing architecture for the entire RTS framework. The patterns established here—4-round Socratic questioning with AI synthesis—provide the template for scaling to additional movements.
followup_questions_log - Complete record of AI-generated questions
- Every computationally generated question with full provenance
- Categorized according to pedagogical framework (5 categories in Movement 1)
- Traceable to specific AI model and generation timestamp
- Includes movement_number to support multi-movement architecture
reflection_rounds - Student responses to AI-generated questions
- Links student writing to specific AI-generated questions via answered_question_id
- Captures complete AI interaction metadata:
- Token usage breakdown (prompt, response, thinking, total)
- Model used (no hardcoded fallbacks—accurate model tracking)
- Thought summaries when available from model processing
- Interpretive summaries generated for each round
- Course-scoped via course_id for proper data boundaries
movement_synthesis - AI-generated text reorganizations
- Computational reorganization of student reflections after completing 4 rounds
- Markdown-formatted synthesis (not JSON-constrained)
- Complete metadata about computational costs and processing
- Includes cumulative token analytics across all rounds
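Pulling the bullets above together, a movement_synthesis row can be pictured as the following TypeScript shape. This is a sketch inferred from the fields described here; the exact column names are assumptions:

```ts
// Inferred shape of a movement_synthesis row; column names are assumptions.
interface MovementSynthesisRow {
  session_id: string;
  user_id: string;
  course_id: string;
  synthesis_markdown: string;     // markdown-formatted, not JSON-constrained
  model_used: string;             // e.g. "gemini-2.5-pro"
  prompt_token_count: number;     // cumulative analytics across all rounds
  response_token_count: number;
  thinking_token_count: number;
  total_token_count: number;
  thought_summary: string | null; // when available from model processing
  created_at: string;
}
```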
IV. Deep Dive Architecture
Deep Dives extend Movement 1 by allowing students to explore completed topics through focused categorical lenses. Currently implemented across 6 categories, each following identical technical patterns.
deep_dive_followup_questions_log - Category-specific question generation
- Similar structure to the core followup_questions_log but category-scoped
- Includes category field (e.g., "Spark of Inquiry", "Puzzles and Unknowns")
- Supports subcategory for granular question organization
- 7 questions per round, structured around pedagogical subcategories
deep_dive_reflection_rounds - Extended reflection responses
- Parallel to core reflection system but linked to specific deep dive categories
- category_selected field enables multi-category deep dives per session
- Same comprehensive AI metadata capture as core reflections
- Enables students to complete multiple deep dives on same topic
deep_dive_synthesis - Category-specific final synthesis
- Generated after completing 4 rounds within a specific category
- Maintains full AI transparency (model used, token costs, thought summaries)
- Allows comparative analysis across different categorical lenses on same topic
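The comparative analysis mentioned above could look like the sketch below, which fetches every completed synthesis for one session, one row per categorical lens. It reuses the supabase client from the earlier three-anchor sketch, and the selected column names are assumptions:

```ts
// Hedged sketch: compare syntheses across categorical lenses on the same topic.
// `supabase` is the client instance from the earlier sketch.
async function getDeepDiveSyntheses(sessionId: string) {
  const { data, error } = await supabase
    .from('deep_dive_synthesis')
    .select('category_selected, synthesis_markdown, model_used, total_token_count')
    .eq('session_id', sessionId);
  if (error) throw error;
  return data; // one row per completed category for this session
}
```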
V. Parallel Scaffolds
reflection_journal_entries - Metacognitive reflection companion
- Three-field system running parallel to main inquiry flow:
- What I Am Noticing
- What Feels Hard or Unsettled
- What I Want to Carry Forward
- Optional AI-generated mini-synthesis with full metadata
- Course-scoped and movement-aware
- Supports longitudinal metacognitive analysis
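A journal entry can be sketched as the shape below; the three prompt fields are taken from the list above, while the column names themselves are assumptions:

```ts
// Sketch of a reflection_journal_entries row; column names are assumptions.
interface ReflectionJournalEntry {
  user_id: string;
  course_id: string;             // course-scoped
  movement_number: number;       // movement-aware
  noticing: string;              // "What I Am Noticing"
  hard_or_unsettled: string;     // "What Feels Hard or Unsettled"
  carry_forward: string;         // "What I Want to Carry Forward"
  mini_synthesis: string | null; // optional AI-generated mini-synthesis
}
```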
Black Box Micro-Engagements (BBME) Tables:
- bbme_prompts - Instructor-defined technical reflection prompts
- bbme_course_availability - Course-specific BBME assignments
- bbme_submissions - Student technical reflections (8 structured fields covering tools used, frustrations, problem-solving, help-seeking, file organization)
- bbme_ai_synthesis - AI-generated synthesis of technical reflection process
- Designed to normalize technical friction and build tool literacy
VI. AI Tool Operation Logs
Dedicated tables for each computational tool operation:
- podcast_recommendations_log - Podcast discovery via ListenNotes API
- scholar_search_queries_log - Academic search query generation
- primo_search_queries_log - Library catalog query generation
Each maintains complete provenance: session_id, user_id, model_used, computational outputs, and timestamps.
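A log write for one of these tools might look like the sketch below, recording the provenance fields just listed. It reuses the earlier supabase client; the generated_queries column name is hypothetical:

```ts
// Hedged sketch: logging a scholar-search query generation with full provenance.
// `supabase` is the client instance from the earlier sketch.
async function logScholarQueries(sessionId: string, userId: string, queries: string[]) {
  const { error } = await supabase.from('scholar_search_queries_log').insert({
    session_id: sessionId,
    user_id: userId,
    model_used: 'gemini-2.5-pro',         // the model actually used
    generated_queries: queries,           // computational output (name assumed)
    created_at: new Date().toISOString(), // timestamp
  });
  if (error) throw error;
}
```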
AI Observability: What "Transparency" Actually Means
In the discourse surrounding generative AI, "transparency" is a loaded term. A large language model has no internal "mind" to be transparent about. Instead, what I offer is an attempt at observability—a look at the computational evidence behind the output.
Token Counts Reveal the Work
Tokens are the pieces of text the model processes. A high token count doesn't mean "deeper thought"; it means more mathematical operations were performed. This number represents the computational cost of generating a statistically probable response, not the depth of its understanding.
What We Capture and Why
For every AI interaction, we record:
- prompt_token_count - Input size (student text + system instructions)
- response_token_count - Output size (generated questions or synthesis)
- thinking_token_count - "Extended thinking" tokens (when model uses chain-of-thought)
- total_token_count - Complete computational cost
- thought_summary - The model's intermediate processing text (when available)
- model_used - Exact model version (e.g., "gemini-2.5-pro")
Pedagogical purpose: Students see that complexity correlates with computational operations, not intelligence. A topic requiring 15,000 tokens to process isn't "harder to understand for the AI"—it required more pattern-matching operations to generate statistically plausible text.
The "Thought Summary" Artifact​
When Gemini uses extended thinking mode (thinkingConfig: { includeThoughts: true }), it generates intermediate text before producing the final output. I capture this as thought_summary.
What it is: A trace of the model's internal “chain-of-thought” processing
What it is NOT: Evidence of actual understanding or reasoning
This artifact serves as an educational object—students can examine what pattern-matching "looks like" when rendered as text, without mistaking statistical coherence for comprehension.
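In the @google/genai JavaScript SDK, thought summaries come back as response parts flagged with thought: true. A minimal sketch of separating them from the final output (model name and prompt handling here are placeholders, not the application's actual handler code):

```ts
import { GoogleGenAI } from '@google/genai';

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

async function generateWithThoughts(prompt: string) {
  const response = await ai.models.generateContent({
    model: 'gemini-2.5-pro',
    contents: prompt,
    config: { thinkingConfig: { includeThoughts: true } },
  });
  const parts = response.candidates?.[0]?.content?.parts ?? [];
  // Parts flagged `thought` carry the intermediate processing text.
  const thoughtSummary = parts.filter((p) => p.thought && p.text).map((p) => p.text).join('\n');
  const output = parts.filter((p) => !p.thought && p.text).map((p) => p.text).join('\n');
  return { output, thoughtSummary };
}
```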
The Computational Flow
Session Initialization and Context Building
- Course-Scoped Session Creation: Student selects an enrolled course, creating a new rts_session with course_id linkage (see the sketch below)
- Topic Entry: Student provides a research topic, stored in the topic_text field
- Context Preparation: System prepares the computational context for AI operations
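A minimal sketch of the session-creation step, assuming the rts_sessions columns named above and reusing the supabase client from the earlier sketch:

```ts
// Creates the session row whose session_id anchors all subsequent inquiry data.
// `supabase` is the client instance from the earlier sketch.
async function createSession(userId: string, courseId: string, topic: string) {
  const { data, error } = await supabase
    .from('rts_sessions')
    .insert({ user_id: userId, course_id: courseId, topic_text: topic, current_movement: 1 })
    .select('session_id')
    .single();
  if (error) throw error;
  return data.session_id;
}
```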
AI Operation Cycle (4-Round Pattern)
- Question Generation:
```ts
// From API handler - context building for AI
const userContext = isFirstRound
  ? `Topic: ${sessionData.topic_text}`
  : `Topic: ${sessionData.topic_text}\n\nReflection Trail:\n${trail}`;
```
- AI generates structured questions based on topic and accumulated context
- All questions logged to followup_questions_log with full metadata
- Model tracking ensures accurate recording (no hardcoded fallbacks)
- Student Response: Student selects a question and writes a reflection, stored in reflection_rounds
- Metadata Capture: Complete computational observability:
```ts
token_usage: {
  prompt: usageMetadata?.promptTokenCount || 0,
  response: usageMetadata?.candidatesTokenCount || 0,
  thinking: usageMetadata?.thoughtsTokenCount || 0,
  total: usageMetadata?.totalTokenCount || 0
},
thought_summary: thinkingContent,
model_used: ACTIVE_MODEL // No fallbacks - actual model used
```
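Downstream, this metadata object is persisted with the student's reflection. A hedged sketch of that write, reusing the earlier supabase client (column names beyond those documented above are assumptions):

```ts
// Persist the round: student text plus complete computational metadata.
// `supabase` is the client instance from the earlier sketch.
async function saveReflectionRound(params: {
  sessionId: string; userId: string; courseId: string; questionId: string;
  text: string;
  tokenUsage: { prompt: number; response: number; thinking: number; total: number };
  thoughtSummary: string | null;
  model: string;
}) {
  const { error } = await supabase.from('reflection_rounds').insert({
    session_id: params.sessionId,
    user_id: params.userId,
    course_id: params.courseId,
    answered_question_id: params.questionId,
    reflection_text: params.text,   // assumed column name
    token_usage: params.tokenUsage, // breakdown from usageMetadata
    thought_summary: params.thoughtSummary,
    model_used: params.model,       // actual model, never a fallback
  });
  if (error) throw error;
}
```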
Synthesis and Tool Operations
- Text Reorganization: After completing 4 rounds, system sends all student reflections to AI for reorganization, stored in movement_synthesis
- Tool Operations: Additional computational operations become available:
- Query generation for academic databases
- Podcast recommendation processing
- Search term optimization
- Each operation logged with complete provenance
Parallel Processing
- Reflection Journal: Three-field metacognitive reflection system running parallel to main flow
- BBME Integration: Technical reflection assignments with AI synthesis of student process documentation
Data Integrity and Research Capabilities
Complete Auditability
Every computational operation creates permanent records:
- What was generated (exact text)
- When it was generated (timestamps)
- How it was generated (model used, token costs, context provided)
- Why it was generated (student action that triggered it)
Research Infrastructure
The ledger structure supports:
- Longitudinal analysis of student inquiry patterns
- Computational cost analysis of different AI interactions
- Pedagogical effectiveness research through complete interaction logs
- Model comparison studies via consistent metadata capture
- Cross-category analysis of deep dive engagement patterns
Course-Level Analytics
Instructors can analyze:
- Student engagement patterns within their courses
- Most effective question categories and AI interactions
- Computational resource usage across different topics
- Comparative analysis of student inquiry development across movements
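As one example of what this analysis could look like, the sketch below totals computational cost per student in a course by summing the token_usage breakdown captured on each reflection round. It reuses the earlier supabase client and assumes token_usage is stored as a JSON column:

```ts
// Hedged sketch: cumulative token cost per student within one course.
// `supabase` is the client instance from the earlier sketch.
async function courseTokenTotals(courseId: string) {
  const { data, error } = await supabase
    .from('reflection_rounds')
    .select('user_id, token_usage')
    .eq('course_id', courseId);
  if (error) throw error;
  const totals = new Map<string, number>();
  for (const row of data ?? []) {
    totals.set(row.user_id, (totals.get(row.user_id) ?? 0) + (row.token_usage?.total ?? 0));
  }
  return totals; // user_id -> total tokens across all rounds
}
```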
Technical Implementation Notes
AI Interaction Patterns
```ts
// Example: Constrained operation with full logging
const config = {
  thinkingConfig: {
    thinkingBudget: -1,    // Allow extended processing
    includeThoughts: true, // Capture processing traces
  },
  responseMimeType: 'application/json',
  responseSchema: FollowupQuestionsSchema, // Enforce structure
  systemInstruction: SYSTEM_INSTRUCTION    // Define operational boundaries
};
```
Two distinct AI configuration patterns:
- Structured JSON output (for questions): Schema-enforced to ensure predictable data structure
- Markdown output (for synthesis): No JSON constraints to preserve rich formatting
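For contrast with the schema-enforced config shown above, the synthesis pattern might look like the following sketch; only the thinkingConfig fields are grounded in this document, and the system-instruction constant is hypothetical:

```ts
// Hypothetical system instruction, for illustration only.
const SYNTHESIS_SYSTEM_INSTRUCTION = 'Reorganize the student reflections into a markdown synthesis.';

// Second pattern: markdown synthesis, deliberately not JSON-constrained.
const synthesisConfig = {
  thinkingConfig: {
    thinkingBudget: -1,    // allow extended processing
    includeThoughts: true, // capture processing traces
  },
  // No responseMimeType or responseSchema: output remains free-form markdown.
  systemInstruction: SYNTHESIS_SYSTEM_INSTRUCTION,
};
```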
Row Level Security
Data access controlled through Supabase RLS policies:
- Students access only their own data within enrolled courses
- Instructors access student data only for courses they manage
- Admins have system-wide access for management functions
Example policy pattern:
```sql
-- Students can only see their own reflections in enrolled courses
CREATE POLICY "Users can read own reflections" ON reflection_rounds
  FOR SELECT USING (auth.uid() = user_id);
```
Migration and Development
Schema changes managed through Supabase migration files, supporting:
- Version-controlled database evolution
- Local development environment parity
- Reproducible deployments across environments
Movement 1 as Architectural Foundation
The current implementation focuses on Movement 1 as a comprehensive testing ground for patterns that will scale to other movements in the RTS framework.
Movement 1 Components (Fully Implemented):
- 4-round Socratic questioning system
- 5 pedagogical question categories
- AI synthesis with markdown formatting
- Complete metadata tracking
- Deep dive extensions (6 categories)
- Export functionality
Scaling Pattern: The technical architecture established in Movement 1 (session management, AI interaction patterns, metadata capture, and export systems) provides the proven template for implementing later movements. Each subsequent movement will follow identical database patterns while varying in pedagogical content.
What the Ledger Enables
This architecture creates a system where:
- Every computational operation is traceable - no "black box" AI interactions
- Student work is always contextual - linked to courses, topics, and inquiry progressions
- Research questions can be investigated empirically - complete data about human-computer interaction patterns
- Pedagogical effectiveness can be measured - through detailed logs of student engagement and computational support
- AI observability is educational - students see computational costs and artifacts, building critical AI literacy
The ledger does more than store data: it creates the infrastructure for investigating whether constrained computational text processing can function as a scaffold for inquiry. By recording everything, I can study what actually happens when students interact with computationally generated prompts and text reorganizations, rather than making assumptions about effectiveness or pedagogical value.
#rts/docs/database