New employees should read these pages before running or reviewing Eval Labs sessions, but reading the Canon is not the same as being onboarded into the product.

Evaluator shortcut

Approved evaluators should start with the focused mini-Canon:
  1. START HERE - Evaluator Onboarding
  2. What Eval Labs Is
  3. Your Role and Access
  4. AI-Reviewed vs Human Review
  5. Running Your First Custom Eval
  6. Reviewing Lucia
  7. Good Feedback Examples
  8. What Not To Do
  9. First Assignment Checklist
  10. Getting Help
This shortcut does not grant access, start employee onboarding, or indicate that Lucia's quality has been approved by human review. Tester cohorts should start with the same access page, but their product lane is narrower: Custom Prompt Test and Auto-generated Prompt Test only.

Broader Canon reading

  1. START HERE — Eval Labs
  2. What Eval Labs Is
  3. Current System State
  4. Employee Onboarding Gate
  5. Role and Access Model
  6. Eval Labs Roles and Access Matrix
  7. Evaluation Philosophy
  8. Human Grading Is the Product
  9. Reviewer Cognitive Load Doctrine
  10. Review Architecture
  11. Employee Review Layer
  12. Dataset Registry and Registry Diagnostics
  13. Behavioral Observatory
  14. Behavioral Label Persistence
  15. Eval Labs Step-by-Step Operator Guide
  16. Review Workflow
  17. Quality Bar
  18. Team Usage Guidelines
The AI-reviewed platform readiness check has passed. That does not mean Lucia is human-approved, and it does not remove role-specific access boundaries.

Required before creating custom suites

  1. Custom Prompt Suites
  2. Designing Strong Prompt Suites
  3. Intent Layer Refinement Workflow
  4. Regression Suite Design
Evaluators and testers should only create or run prompt tests within their approved scope.

Required before debugging runs

  1. Live Wiring and Environment Truth
  2. Known Debug Playbook
  3. Release and Validation History
  4. AI-Reviewed Platform Readiness Gate
  5. Supabase and localStorage Verification
  6. Behavioral Label Persistence

Required before exporting results

  1. Exporting and Using Results
  2. Review Notes Template
  3. Run Summary Template

Employee rule

Do not review by instinct alone. Use the Canon language. Review suggested selections before saving, but do not treat them as truth. Use the semantic scoring sliders, Quick Review controls, and Human Guidance Evaluation to create structured signal. The goal is not to create lots of scores. The goal is to create reliable behavioral evidence.

Required before senior review or adjudication

  1. Adjudication Doctrine
  2. Structured Human Judgment Capture
  3. Senior Reviewer and Adjudicator SOP
  4. Exporting and Using Results
  5. Behavioral Observatory

Updated employee rule

Employees should not invent training language. Use the guided Quick Review controls. Add Human Guidance Evaluation scores when useful. Add short notes only when context is needed. Escalate when uncertain. Finalize only after the run review is complete.

Required before owner/admin readiness work

  1. Product Surfaces and Route Map
  2. Role and Access Model
  3. Controlled Batch Runner Protocol
  4. AI-Reviewed Platform Readiness Gate
  5. Analysis and Run Truth
  6. Dataset Registry and Registry Diagnostics
Do not use the Controlled Batch Runner for casual prompt testing. Tester users should not use the Controlled Batch Runner.

Required before Behavioral Observatory work

  1. Behavioral Observatory
  2. Behavioral Label Persistence
  3. Structured Human Judgment Capture
  4. Eval Labs Step-by-Step Operator Guide
Do not treat derived suggestions as saved labels.
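One way to keep this distinction visible is to carry an explicit status on every label. The sketch below is illustrative only and assumes nothing about the real Eval Labs data model: `LabelStatus`, `BehavioralLabel`, and `saved_labels` are hypothetical names, and the label names used in any example are placeholders.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class LabelStatus(Enum):
    SUGGESTED = "suggested"  # derived by the system, not confirmed by a reviewer
    SAVED = "saved"          # explicitly saved by a reviewer


@dataclass(frozen=True)
class BehavioralLabel:
    """Hypothetical record; the real schema is not described in this page."""
    name: str
    status: LabelStatus


def saved_labels(labels: Iterable[BehavioralLabel]) -> List[BehavioralLabel]:
    """Keep only labels a reviewer saved; derived suggestions are dropped."""
    return [label for label in labels if label.status is LabelStatus.SAVED]
```

Filtering on the status before any analysis or export means a suggestion cannot be counted as evidence by accident.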