Alexander Dean
— — —
2024–2025 · Personal project

AI Learning Platform

Upload your notes, build question banks, study adaptively, learn with friends. A social learning platform with a multi-provider LLM router, spaced-repetition scoring, and collaborative sharing built on Supabase RLS.

Next.js · Supabase · TypeScript · Multi-LLM

The problem

Students accumulate lecture slides, notes, and readings across a term but rarely have a structured way to test themselves against that material. Flashcard apps require manual entry. Past papers cover the wrong syllabus. The gap between “I've read this” and “I know this” goes unmeasured until the exam.

The idea was simple: upload whatever material you have, specify how many questions and at what difficulty, and get a ready-to-use question bank back. Over a term you accumulate banks per module. Then the app helps you study — not just by testing you, but by noticing which areas you actually struggle with and pushing those back to the surface.

My role

Solo build from idea through to production. I designed the data model, built the document ingestion and LLM generation pipeline, wrote the adaptive scoring algorithm, and implemented the sharing system. The project was also a deliberate exercise in working with multiple LLM providers and building resilience at that layer rather than assuming any single provider is always available.

— Architecture

Frontend
Framework: Next.js 14 · TypeScript · App Router
UI: Tailwind CSS · Radix UI primitives
State: React Server Components + client islands
File upload: Supabase Storage (PDF, DOCX, TXT)
Auth: Supabase Auth (email, magic link, OAuth)

Backend
API: Next.js API routes · Edge-compatible
Database: Supabase PostgreSQL (RLS for sharing)
LLM routing: Custom router (GPT-4o, Claude, Gemini)
Queue: Supabase pg_cron (async generation jobs)
Parsing: Document text extraction pipeline

Adaptive engine
Signals: Response time + correct/incorrect history
Algorithm: Weighted error rate with time-decay factor
Resurfacing: Priority queue (weak questions requeued)
Persistence: Per-user, per-question stats in Postgres
Sharing: Row-level security (read/write per user)

— Key decisions

LLM router with automatic provider fallback
Context

Coupling to a single provider meant one rate limit or outage could break the entire generation pipeline. Different models also have different cost/quality trade-offs for different question types.

Outcome

A thin routing layer maps question-generation tasks to the most suitable available model. If the primary provider is slow or rate-limited, the router falls back to the next in the preference list transparently. No generation request surfaces a provider error to the user.

Supabase RLS for social sharing instead of a permissions service
Context

Users needed to share individual questions and entire banks with friends — read-only or editable. A separate permissions service would add significant infrastructure for what is fundamentally a data-access problem.

Outcome

Postgres row-level security policies enforce share permissions at the database layer. A shares table records (owner, recipient, resource_id, access_level). Every query is automatically scoped — no application-layer permission checks needed, no risk of forgetting one.

Adaptive selection based on time and accuracy, not just accuracy
Context

Tracking only right/wrong misses questions a user answers correctly but slowly — a sign they're not yet confident. Time alone is noisy (distractions, re-reads). Neither signal is sufficient on its own.

Outcome

A weighted score combines normalised response time with a rolling error rate. Questions above the threshold are promoted into a high-priority pool and resurface more frequently. The effect is that genuinely weak areas receive more repetitions without the user having to identify them.

Question banks as a first-class entity
Context

Early designs treated question banks as simple lists — a folder of questions. That made cross-bank quizzes, sharing, and collaborative editing awkward to model.

Outcome

Banks are their own database entity with many-to-many membership to questions. A question can live in multiple banks. Quiz sessions are composed from one or more banks. Sharing a bank grants access to its member questions without duplicating data.
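
The many-to-many shape described above can be sketched in a few lines. This is an illustration only: the type names, field names, and the questionsForSession helper are hypothetical and are not taken from the project's schema. It shows a quiz session being composed from several banks, with a question that belongs to more than one bank appearing once.

```typescript
// Hypothetical row shapes; the real tables and columns may differ.
type Question = { id: string; prompt: string };
type BankMembership = { bankId: string; questionId: string }; // join-table row

// Collect the questions for a session drawn from one or more banks.
// A question that is a member of several selected banks is returned once.
function questionsForSession(
  bankIds: string[],
  memberships: BankMembership[],
  questions: Map<string, Question>,
): Question[] {
  const wanted = new Set(bankIds);
  const seen = new Set<string>();
  const result: Question[] = [];
  for (const m of memberships) {
    if (!wanted.has(m.bankId) || seen.has(m.questionId)) continue;
    const q = questions.get(m.questionId);
    if (q) {
      seen.add(q.id);
      result.push(q);
    }
  }
  return result;
}
```

Because membership is stored as rows rather than copies, sharing a bank only needs to grant access to the bank and its membership rows; the questions themselves are never duplicated.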

— Technical depth

Document ingestion pipeline

Users upload PDFs, Word documents, or plain text. The pipeline extracts text, chunks it into context-window-safe segments, and passes each chunk to the LLM with a structured prompt that specifies question count, difficulty level, and output format. Questions are returned as structured JSON, validated against a schema, and written to Postgres in a single transaction — either the entire bank lands or nothing does.

01 Upload: PDF · DOCX · TXT
02 Extract: Raw text
03 Chunk: Context windows
04 Generate: LLM → JSON
05 Validate: Schema check
06 Store: Postgres txn
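
The validate step can be sketched as a type guard that runs over every chunk's output before anything is written. This is a minimal sketch under assumptions: the GeneratedQuestion fields, the three difficulty labels, and the parseBank helper are hypothetical, and the project may well use a schema library instead of a hand-written guard.

```typescript
// Hypothetical question shape; the project's real schema is not shown in the text.
type Difficulty = "easy" | "medium" | "hard";
type GeneratedQuestion = { prompt: string; answer: string; difficulty: Difficulty };

// Runtime check that an unknown value has the expected question shape.
function isGeneratedQuestion(value: unknown): value is GeneratedQuestion {
  if (typeof value !== "object" || value === null) return false;
  const q = value as Record<string, unknown>;
  return (
    typeof q.prompt === "string" &&
    q.prompt.trim().length > 0 &&
    typeof q.answer === "string" &&
    (q.difficulty === "easy" || q.difficulty === "medium" || q.difficulty === "hard")
  );
}

// Parse and validate the raw LLM output for every chunk.
// Throws on the first malformed chunk, so the caller never holds a partial bank.
function parseBank(rawChunkOutputs: string[]): GeneratedQuestion[] {
  const bank: GeneratedQuestion[] = [];
  for (const raw of rawChunkOutputs) {
    const parsed: unknown = JSON.parse(raw); // throws on invalid JSON
    if (!Array.isArray(parsed) || !parsed.every(isGeneratedQuestion)) {
      throw new Error("LLM output failed schema validation");
    }
    bank.push(...parsed);
  }
  return bank;
}
```

Validating the whole bank in memory first means the database transaction only ever receives a complete, well-formed set of questions, which matches the all-or-nothing behaviour described above.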


LLM routing

The router maintains a ranked list of providers with their current status. On each generation request it selects the highest-ranked available provider, sends the request, and monitors the response time. If the provider exceeds a latency threshold or returns a rate-limit error, the router marks it degraded and retries immediately against the next in the list. The caller never sees a provider-specific error — only success or a single unified failure if all providers are unavailable.

This also made it straightforward to route different task types to different models: cheaper, faster models for simple factual questions; stronger models for analytical or application-level difficulty.

Example state: all providers healthy

incoming request
01 GPT-4o: analytical · complex questions (340ms)
02 Claude Sonnet: factual · reasoning tasks (290ms)
03 Gemini Pro: broad coverage · fallback (410ms)
response to caller

The caller receives success or a single unified error — never a provider-specific message.
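
The fallback behaviour can be sketched as a loop over a ranked provider list. This is a sketch under assumptions, not the project's router: the Provider type, the generate and withTimeout helpers, the error message, and the 5000 ms default threshold are all hypothetical, and how a degraded provider is later marked healthy again is left out.

```typescript
// Hypothetical provider shape; `call` stands in for a real SDK request.
type Provider = {
  name: string;
  degraded: boolean;
  call: (prompt: string) => Promise<string>;
};

const UNIFIED_ERROR = "Question generation is temporarily unavailable";

// Reject if the promise has not settled within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error("latency threshold exceeded")), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Try each healthy provider in ranked order. Any failure (timeout, rate limit,
// other error) marks the provider degraded and moves on to the next one.
async function generate(
  providers: Provider[],
  prompt: string,
  maxLatencyMs = 5000, // assumed threshold, not from the source
): Promise<string> {
  for (const provider of providers) {
    if (provider.degraded) continue;
    try {
      return await withTimeout(provider.call(prompt), maxLatencyMs);
    } catch {
      provider.degraded = true;
    }
  }
  // The caller only ever sees this one provider-agnostic error.
  throw new Error(UNIFIED_ERROR);
}
```

Swallowing the provider-specific error inside the loop is what keeps the caller's contract to two outcomes: a result, or the single unified failure.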

Adaptive scoring

Each question session records two signals per attempt: whether the answer was correct, and the response time normalised against the user's median for that difficulty band. These combine into a confidence score:

score = (error_rate × 0.7) + (slow_rate × 0.3)

Questions above a score threshold are promoted into a high-priority pool. When the quiz engine selects the next question, it samples from the high-priority pool with a higher probability than from the general pool. This means weak questions resurface without any explicit scheduling — the distribution does the work.
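
The formula and the biased sampling can be sketched as follows. The 0.7 and 0.3 weights and the 0.40 threshold are the ones quoted in this section; everything else is an assumption for illustration: the QuestionStats shape, the pickNext helper, and in particular the 0.7 probability of drawing from the high-priority pool, which the text does not specify. Both rates are assumed to lie between 0 and 1.

```typescript
const ERROR_WEIGHT = 0.7;
const SLOW_WEIGHT = 0.3;
const PROMOTE_THRESHOLD = 0.4;

// Hypothetical per-user, per-question stats; both rates are assumed to be in [0, 1].
type QuestionStats = { id: string; errorRate: number; slowRate: number };

function score(s: QuestionStats): number {
  return s.errorRate * ERROR_WEIGHT + s.slowRate * SLOW_WEIGHT;
}

function isHighPriority(s: QuestionStats): boolean {
  return score(s) >= PROMOTE_THRESHOLD;
}

// Pick the next question: with probability `priorityBias` draw from the
// high-priority pool, otherwise from the general pool. If one pool is empty
// the other is used. `rng` is injectable so the behaviour can be tested.
function pickNext(
  stats: QuestionStats[],
  priorityBias = 0.7, // assumed value, not from the source
  rng: () => number = Math.random,
): QuestionStats | undefined {
  if (stats.length === 0) return undefined;
  const high = stats.filter(isHighPriority);
  const general = stats.filter((s) => !isHighPriority(s));
  const useHigh = high.length > 0 && (general.length === 0 || rng() < priorityBias);
  const pool = useHigh ? high : general;
  return pool[Math.floor(rng() * pool.length)];
}
```

No schedule is stored anywhere in this sketch; a weak question comes back more often simply because its pool is sampled more often.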

Questions scoring ≥ 0.40 are promoted to the high-priority pool.

Weak question (promoted): missed 3 of 4 attempts, consistently slow
error_rate × 0.7 = 0.50
slow_rate × 0.3 = 0.17
score = 0.68 (threshold 0.40)

Borderline (general pool): some errors, moderate response time
error_rate × 0.7 = 0.28
slow_rate × 0.3 = 0.09
score = 0.37 (threshold 0.40)

Strong question (general pool): rarely wrong, quick and confident
error_rate × 0.7 = 0.07
slow_rate × 0.3 = 0.04
score = 0.11 (threshold 0.40)

Sharing via RLS

Supabase's row-level security lets you attach policies directly to tables. A shares table records who shared what with whom and at what access level. The RLS policies on questions and question_banks check for a matching shares row before allowing a read or write. The database enforces this, not application code. As long as queries run under the signed-in user's credentials (a service-role key bypasses RLS), a leaked API route or a missing auth check can't accidentally expose another user's data, because the query simply returns nothing.
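
The real enforcement lives in Postgres policies, which are not reproduced here. As an illustration only, the rule those policies encode can be modelled as a TypeScript predicate. The type names and the canAccess helper are hypothetical, the field names follow the (owner, recipient, resource_id, access_level) tuple mentioned earlier, and the sketch assumes that owners always have full access and that write access implies read access.

```typescript
type AccessLevel = "read" | "write";

// Mirrors one row of the shares table described in the text.
type Share = {
  owner: string;
  recipient: string;
  resourceId: string;
  accessLevel: AccessLevel;
};

// Hypothetical stand-in for a question or a question bank.
type Resource = { id: string; ownerId: string };

// True when `userId` may perform `needed` on `resource`.
// Assumptions: the owner can do anything, and "write" also grants "read".
function canAccess(
  userId: string,
  resource: Resource,
  shares: Share[],
  needed: AccessLevel,
): boolean {
  if (resource.ownerId === userId) return true;
  return shares.some(
    (s) =>
      s.recipient === userId &&
      s.resourceId === resource.id &&
      (needed === "read" || s.accessLevel === "write"),
  );
}
```

In the database this check runs as part of every query, so a row that fails it is filtered out of the result rather than rejected by application code.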

— Outcomes

<3s
avg. bank generation time
— —
provider errors surfaced to users
— —
LLM providers in the router
RLS
enforces all share permissions

— What I'd do differently

The document chunking strategy was naive at first — fixed character windows with no regard for semantic boundaries. Chunks split mid-sentence confused the model and produced malformed questions. I'd start with paragraph-aware chunking and add overlap between chunks from the beginning rather than retrofitting it.
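
A minimal sketch of paragraph-aware chunking with overlap, under stated assumptions: the chunkByParagraph name and its parameters are hypothetical, paragraphs are taken to be separated by blank lines, and size is counted in characters rather than tokens. A single paragraph longer than maxChars is kept whole rather than split, so the limit is approximate.

```typescript
// Split text into chunks that end on paragraph boundaries. Each new chunk
// starts with the last `overlapParagraphs` paragraphs of the previous one.
function chunkByParagraph(
  text: string,
  maxChars: number,
  overlapParagraphs = 1,
): string[] {
  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current: string[] = [];
  let size = 0; // characters in `current`, ignoring separators

  for (const p of paragraphs) {
    if (current.length > 0 && size + p.length > maxChars) {
      chunks.push(current.join("\n\n"));
      // Carry the tail of the finished chunk forward as overlap.
      current = overlapParagraphs > 0 ? current.slice(-overlapParagraphs) : [];
      size = current.reduce((n, q) => n + q.length, 0);
    }
    current.push(p);
    size += p.length;
  }
  if (current.length > 0) chunks.push(current.join("\n\n"));
  return chunks;
}
```

For example, chunkByParagraph("A1\n\nB2\n\nC3", 4) yields two chunks, "A1\n\nB2" and "B2\n\nC3", with B2 repeated so that neither chunk starts without context.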

The adaptive algorithm weights are hand-tuned constants. They work well in practice but have no principled basis. With more usage data I'd run an offline evaluation against known learning curves to find weights that minimise time-to-mastery rather than guessing.
