Eval Labs is a separate role-based human evaluation platform that tests Lucia through the deployed Engine, stores review evidence in Supabase, and exposes scoped testing, run-history, review, Team Review, and Global Analysis surfaces.
High-level architecture
- Employee / Reviewer
- Eval Labs web app
- Clerk role and Supabase RLS scope
- Lucia Engine /admin/operator-focus
- Lucia response
- Eval Labs Review Queue
- Suggested selections plus human review
- Quick Review / Human Guidance Evaluation
- Lifecycle finalization
- Supabase persistence
- Run History / Team Review / Global Analysis / Exports
Runtime responsibility split
Eval Labs owns:
- top app shell and route identity
- test launcher UX
- custom prompt suite UX
- auto-generated prompt tester UX
- guest-facing verification check and results UX
- controlled batch runner UX
- run orchestration
- Run History
- Team Review
- Global Analysis
- Single Run Analysis
- copy Session ID / copy Deep Link controls
- role-gated product access
- Clerk public metadata role behavior
- Supabase RLS role-claim requirements
- Review Queue
- suggested review generation
- human ratings
- semantic scoring sliders
- Quick Review
- Human Guidance Evaluation
- review lifecycle and finalization
- dirty / completion state
- tester identity capture
- exports
- Supabase persistence for eval data
- staged hydration from run summaries to recent/deep evidence
- localStorage compaction for completed cloud-backed runs
Lucia Engine owns:
- actual Lucia behavior
- intent/routing
- response generation
- emotional containment
- operational prioritization
- model gateway behavior
Current Engine target
Eval Labs endpoint selection is environment-configured through VITE_LUCIA_EVAL_ENDPOINT.
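As a rough illustration of environment-configured endpoint selection, the sketch below resolves the Engine endpoint from an env-like record. Only the variable name VITE_LUCIA_EVAL_ENDPOINT comes from this document; the helper name, the validation rules, and the idea of passing the record in explicitly are assumptions made so the example stays self-contained.

```typescript
// Hypothetical helper, not the actual Eval Labs implementation.
// In a Vite app the record passed in would typically be import.meta.env.
type EnvRecord = Record<string, string | undefined>;

function resolveEvalEndpoint(env: EnvRecord): string {
  const raw = (env["VITE_LUCIA_EVAL_ENDPOINT"] ?? "").trim();
  if (raw.length === 0) {
    // Fail loudly instead of silently falling back to some default target.
    throw new Error("VITE_LUCIA_EVAL_ENDPOINT is not set");
  }
  if (!/^https?:\/\//i.test(raw)) {
    throw new Error("VITE_LUCIA_EVAL_ENDPOINT must be an http(s) URL");
  }
  // Drop trailing slashes so the value compares cleanly against the
  // request URL seen in the browser Network tab.
  return raw.replace(/\/+$/, "");
}
```

Failing at startup when the variable is missing or malformed is one possible design choice; it makes a misconfigured deployment visible before any run is recorded against the wrong Engine.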
The current Lucia v0.1.3.6 validation target for active dev refinement is:
Source of truth hierarchy
When debugging Eval Labs platform behavior:
- Browser Network request URL
- Current route and role state
- Supabase rows and counts
- Run History / Analysis UI truth
- localStorage diagnostics
- Render service environment
- Netlify environment variables
- Lucia Engine deployed commit
- Eval Labs deployed commit
- Exported run metadata
- Human memory
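One way to make the hierarchy above concrete is to encode it as an ordered list and pick the highest-ranked source when observations disagree. This is a minimal sketch that assumes the list is ordered from most to least authoritative; the Observation type and the mostTrusted helper are hypothetical names introduced for illustration.

```typescript
// Evidence sources in the order listed above (assumed: most trusted first).
const EVIDENCE_ORDER = [
  "Browser Network request URL",
  "Current route and role state",
  "Supabase rows and counts",
  "Run History / Analysis UI truth",
  "localStorage diagnostics",
  "Render service environment",
  "Netlify environment variables",
  "Lucia Engine deployed commit",
  "Eval Labs deployed commit",
  "Exported run metadata",
  "Human memory",
] as const;

type EvidenceSource = (typeof EVIDENCE_ORDER)[number];

// Hypothetical record of something observed while debugging.
interface Observation {
  source: EvidenceSource;
  note: string;
}

// Returns the observation whose source ranks highest, or undefined if none.
function mostTrusted(observations: Observation[]): Observation | undefined {
  return observations
    .slice() // copy so the caller's array is not reordered
    .sort(
      (a, b) =>
        EVIDENCE_ORDER.indexOf(a.source) - EVIDENCE_ORDER.indexOf(b.source)
    )[0];
}
```

For example, if someone remembers a run targeting one Engine but the Supabase rows say otherwise, mostTrusted would return the Supabase observation.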
Current route architecture
The current route map is documented in Product Surfaces and Route Map. Core canonical paths:
Current role architecture
Role gating is documented in Role and Access Model. Current supported Clerk public metadata values:
eval_labs_role so Supabase RLS can recognize privileged owner/admin access.
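The role-claim idea can be sketched as a small client-side check. Everything here beyond the eval_labs_role name is an assumption: the claims shape, the helper names, and the "owner" and "admin" values, which are inferred only from the phrase "privileged owner/admin access" and may not match the real value list. A check like this would only mirror access in the UI; the actual enforcement would remain with Supabase RLS.

```typescript
// Hypothetical claims bag, e.g. decoded token claims or session metadata.
type Claims = Record<string, unknown>;

// Assumed privileged values; the real supported list may differ.
const PRIVILEGED_ROLES: readonly string[] = ["owner", "admin"];

// Reads eval_labs_role, returning null when it is absent or not a string.
function readEvalLabsRole(claims: Claims | null | undefined): string | null {
  const role = claims ? claims["eval_labs_role"] : undefined;
  return typeof role === "string" && role.length > 0 ? role : null;
}

// True only when the claim is present and names a privileged role.
function isPrivileged(claims: Claims | null | undefined): boolean {
  const role = readEvalLabsRole(claims);
  return role !== null && PRIVILEGED_ROLES.indexOf(role) !== -1;
}
```

Treating a missing or malformed claim as unprivileged keeps the UI conservative when the claim has not been configured yet.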
Important design decision
The custom prompt feature did not require a separate database model because Eval Labs already had a general structure:
- Session
- Run items
- Lucia responses
- Human reviews
Custom prompts are a new run source, not a new evaluation universe.
That is good architecture.
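The reuse described above can be sketched with a few types. Only the four-level shape (session, run items, Lucia responses, human reviews) comes from this document; every type name, field name, and run-source value below is hypothetical and does not reflect the real Supabase schema.

```typescript
// Hypothetical run-source values; a new source is just one more tag.
type RunSource = "batch" | "custom";

interface HumanReview {
  reviewer: string;
  rating: number;
}

interface LuciaResponse {
  text: string;
  reviews: HumanReview[];
}

interface RunItem {
  prompt: string;
  response?: LuciaResponse; // absent until the Engine has answered
}

interface EvalSession {
  id: string;
  source: RunSource;
  items: RunItem[];
}

// A custom run reuses the same session shape; only the source tag differs.
function createCustomSession(id: string, prompts: string[]): EvalSession {
  return { id, source: "custom", items: prompts.map((prompt) => ({ prompt })) };
}

// Counts human reviews across all items, regardless of run source.
function countReviews(session: EvalSession): number {
  return session.items.reduce(
    (total, item) => total + (item.response ? item.response.reviews.length : 0),
    0
  );
}
```

Because countReviews never inspects the source field, the same downstream logic works for custom runs and any other run source.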
Current run source strategy
Custom runs use:
Review-layer architecture
Eval Labs now separates review responsibility into layers:
- Review Queue UI
- Suggested review values
- Employee Review fields
- Human Guidance Evaluation
- Review State / Escalation flags
- Lifecycle / dirty / completion state
- Adjudication metadata
- Exports / Analysis
The schema supports high-resolution analysis while the employee UI remains simple.
This is intentional.
The user-facing review experience should remain calm and guided even when the exported data is detailed.
