Human onboarding is role-based. The AI-reviewed platform readiness gate proves platform lifecycle readiness only. The first human cohorts still require role assignment, scoped access checks, persistence proof, and clearly stated active-hardening caveats.

Current readiness status

AI-reviewed platform readiness gate: PASSED
Human Lucia-quality approval: NOT PASSED / NOT CLAIMED
Human onboarding: ROLE-BASED / CONTROLLED
Owner/admin oversight: IMPLEMENTED
Tester lane: IMPLEMENTED
Evaluator workbench: IMPLEMENTED / ACTIVE HARDENING
Registry Diagnostics: OWNER/ADMIN DIAGNOSTIC SURFACE
Behavioral Observatory: OWNER/ADMIN LABEL SURFACE

Required before onboarding

Human employees/evaluators should not be onboarded until:
  1. The AI-reviewed platform readiness gate has passed.
  2. Clerk auth works for the cohort.
  3. Clerk public metadata has the correct eval_labs_role.
  4. The Clerk session token includes the role claim required by Supabase RLS.
  5. The user’s visible surfaces match the Eval Labs Roles and Access Matrix.
  6. Real runs persist to Supabase and reload from the correct scoped view.
  7. Owner/admin can inspect shared persisted evidence through Team Review or Global Analysis where oversight applies.
  8. The human review process is clearly distinguished from AI-reviewed platform testing.
  9. Behavioral Observatory persistence and access boundaries are verified before assigning saved-label work to employees.
  10. Any active-hardening caveats are named before work begins.

Status as of this Canon update:
  • Steps 1-4 are implemented.
  • Steps 5-7 must be verified for the specific role/cohort.
  • Steps 8-10 are onboarding responsibilities.
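
Steps 3 and 4 of the checklist above can be spot-checked for a single user with a small script. The following is a minimal sketch, not the platform's actual check. It assumes the session-token claim uses the same name as the metadata key (`eval_labs_role`) and that the role values are `tester`, `evaluator`, `admin`, and `owner`; neither assumption is confirmed by this document. It decodes the token payload without verifying the signature, so it is suitable only for inspecting a token you already trust.

```python
import base64
import json

# Assumptions for illustration only: the real claim name and role values may differ.
ROLE_CLAIM = "eval_labs_role"
KNOWN_ROLES = {"tester", "evaluator", "admin", "owner"}


def decode_jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a three-part JWT")
    payload_b64 = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload_b64 += "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(payload_b64))


def role_claim_problems(public_metadata: dict, session_token: str) -> list[str]:
    """Return a list of problems with checklist steps 3 and 4 (empty means OK)."""
    problems = []
    expected = public_metadata.get(ROLE_CLAIM)
    if expected not in KNOWN_ROLES:
        problems.append(f"public metadata role is missing or unknown: {expected!r}")
    claimed = decode_jwt_payload(session_token).get(ROLE_CLAIM)
    if claimed is None:
        problems.append("session token has no role claim")
    elif claimed != expected:
        problems.append(
            f"token role {claimed!r} does not match metadata role {expected!r}"
        )
    return problems
```

An empty result covers only steps 3 and 4 for that user. Steps 5-7 still have to be verified against the real surfaces and persisted runs for the specific role and cohort.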

Tester boundary

Tester is the entry-level prompt-testing role. Testers can use:
  • Custom Prompt Test
  • Auto-generated Prompt Test
Testers cannot use Verification Check, Verification Results, Controlled Batch Runner, Team Review, Global Analysis, Registry Diagnostics, Behavioral Observatory, or owner/admin tools.

Evaluator boundary

Evaluator is the full evaluator workbench role. Evaluators can use evaluator-safe test surfaces and their own run/review/history routes. Evaluators cannot use Team Review or Global Analysis, which are owner/admin oversight surfaces. Evaluator workspace polish remains under active hardening; do not describe the evaluator UX as final while that work continues.
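
The evaluator boundary combines two rules: oversight surfaces are reserved for owner/admin, and an evaluator's history is scoped to that evaluator's own runs. A minimal sketch of both rules follows. The `Run` shape, the role strings, and the function names are assumptions made for illustration; in the real system this scoping is expected to come from Supabase RLS rather than application code like this.

```python
from dataclasses import dataclass

OVERSIGHT_SURFACES = frozenset({"Team Review", "Global Analysis"})
OVERSIGHT_ROLES = frozenset({"owner", "admin"})  # assumed role strings


@dataclass(frozen=True)
class Run:
    """Hypothetical persisted run record."""
    run_id: str
    owner_id: str


def can_use_oversight_surface(role: str, surface: str) -> bool:
    """True only when an owner/admin opens Team Review or Global Analysis."""
    return surface in OVERSIGHT_SURFACES and role in OVERSIGHT_ROLES


def evaluator_history(user_id: str, runs: list[Run]) -> list[Run]:
    """Return only the runs that belong to this evaluator."""
    return [run for run in runs if run.owner_id == user_id]
```

A quick reload test in this spirit is to persist a run as one evaluator, sign in as a second evaluator, and confirm the run is absent from the second user's history.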

Do not overclaim

Do not describe Eval Labs as broadly production-mature or open-access. Do not describe Lucia as human-approved on the basis of AI-reviewed gate results. Correct language:
  • The platform readiness gate passed.
  • Human Lucia-quality approval remains pending.
  • Human onboarding is role-based and controlled.
  • Evaluator workspace polish remains active hardening.
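
Status copy can be screened for the overclaiming terms named above before it is published. This is a rough sketch under stated assumptions: the pattern list is hypothetical and covers only the three terms this section mentions. A match is a prompt for human review, not proof of an overclaim, since a sentence that correctly negates a term is flagged as well.

```python
import re

# Hypothetical patterns for the terms this section warns about.
OVERCLAIM_PATTERNS = {
    "production-mature": re.compile(r"\bproduction[- ]mature\b", re.IGNORECASE),
    "open-access": re.compile(r"\bopen[- ]access\b", re.IGNORECASE),
    "human-approved": re.compile(r"\bhuman[- ]approved\b", re.IGNORECASE),
}


def find_overclaims(text: str) -> list[str]:
    """Return the labels of every overclaim pattern found in the text."""
    return [
        label
        for label, pattern in OVERCLAIM_PATTERNS.items()
        if pattern.search(text)
    ]
```

The four approved statements in this section pass the screen with no matches.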