This SOP teaches approved reviewers how to perform useful Eval Labs reviews without drifting into vague feedback.
Training session structure
A new reviewer should complete supervised passes only after the onboarding gate is accepted for their role. For evaluator onboarding:
- one Custom prompt smoke test
- one targeted prompt-test review
- one supervised own-run finalization
- any verification or controlled-batch practice only when assigned
Pass 1 — Smoke test
Run one custom prompt:
- understand launcher flow
- understand Review Queue
- save a review
- export JSON
- confirm savedBy appears
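The last two steps of the smoke test (export JSON, then confirm savedBy appears) can be spot-checked with a small script. This is a minimal sketch, assuming savedBy is a key somewhere in the exported JSON. The export schema is not described in this SOP, so the helper searches the whole tree instead of assuming a fixed path. The function names and the sample payload are hypothetical.

```python
import json


def find_key_values(node, key):
    """Collect every value stored under `key` anywhere in a parsed JSON tree."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            # Keep descending so nested occurrences are found too.
            found.extend(find_key_values(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(find_key_values(item, key))
    return found


def export_has_saved_by(export_text):
    """Return True if at least one non-empty savedBy value is present."""
    data = json.loads(export_text)
    return any(value for value in find_key_values(data, "savedBy"))


# Hypothetical export shape, for illustration only; the real export may differ.
sample = '{"reviews": [{"promptId": "p1", "savedBy": "reviewer@example.com"}]}'
print(export_has_saved_by(sample))
```

In practice the reviewer would paste or load their own exported file in place of the sample string. A False result means the save step should be repeated before moving on to Pass 2.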
Pass 2 — Targeted suite
Use a 5-prompt behavior family. Example:
- identify pattern
- score honestly
- write specific notes
Pass 3 — Own Custom run finalization
Run or open an owned Custom session. Review every item and finalize the run. Goal:
- understand owned-run access
- use Review Queue controls correctly
- understand finalization
- keep AI-reviewed platform evidence separate from human judgment
Trainer checklist
Before approving a reviewer, confirm they can explain:
- custom vs automated runs
- tester vs evaluator access
- why Team Review and Global Analysis are owner/admin-only
- when Controlled Batch Runner is evaluator-safe and when it is out of scope
- pass vs borderline vs fail
- savedBy vs exportedBy
- why generic capability redirects can fail
- what overclaiming means
- what emotional containment means
- how to write a useful review note
Graduation standard
A reviewer is ready when their notes consistently help engineering or product know what to fix.
Updated training requirement
Before a reviewer graduates, confirm they understand:
- Quick Review is guided human judgment
- they should not invent labels or taxonomies
- senior review is available when uncertain
- reusable learning is for durable patterns only
- semantic sliders are instinctive quality scoring tools
- notes should be short and specific
Practice exercise
Have the reviewer complete 10 prompts using only:
- sliders
- Quick Review answers
- senior-review flag when needed
- one short note maximum per item
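A trainer who wants to check a finished practice batch against the two countable rules above (10 prompts, at most one note per item) could use something like the following. This is a sketch under stated assumptions: each reviewed item is represented as a dict with a hypothetical "notes" list. That shape is an illustration only and is not taken from the Eval Labs export format.

```python
def check_practice_batch(items, expected_count=10, max_notes=1):
    """Return a list of problems; an empty list means the batch fits the rules."""
    problems = []
    if len(items) != expected_count:
        problems.append(f"expected {expected_count} items, got {len(items)}")
    for index, item in enumerate(items, start=1):
        # "notes" is a hypothetical field name used for this sketch.
        notes = item.get("notes", [])
        if len(notes) > max_notes:
            problems.append(
                f"item {index}: {len(notes)} notes, maximum is {max_notes}"
            )
    return problems


# Hypothetical practice batch: ten items, one short note each.
batch = [{"notes": ["Overclaims booking ability."]} for _ in range(10)]
print(check_practice_batch(batch))
```

The check covers only what can be counted. Whether a note is short and specific, and whether a senior-review flag was warranted, still needs the trainer's judgment.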
Role-specific training
Tester users should train only on:
- Custom Prompt Test
- Auto-generated Prompt Test
Evaluator users should train on:
- Custom Prompt Test
- Auto-generated Prompt Test
- Guest Facing Agent Verification Check
- Verification Results
- Controlled Batch Runner
- own/scoped Run History and Review Queue
Owner/admin users should train on:
- Team Review
- Global Analysis
- Single Run Analysis
- Registry Diagnostics
- Behavioral Observatory
- Supabase and localStorage verification

