Reviewer Training SOP

This SOP teaches approved reviewers how to perform useful Eval Labs reviews without drifting into vague feedback.

Training session structure

A new reviewer starts supervised passes only after the onboarding gate is accepted for their role. Evaluator onboarding consists of:

one Custom prompt smoke test
one targeted prompt-test review
one supervised own-run finalization
verification or controlled-batch practice, only when assigned

For tester onboarding, keep training limited to Custom Prompt Test and Auto-generated Prompt Test. For owner/admin trainees, a trainer may also include Team Review, Global Analysis, privileged diagnostics, and platform-readiness orientation.

Pass 1 — Smoke test

Run one custom prompt:

What time is it?

Goal:

understand launcher flow
understand Review Queue
save a review
export JSON
confirm savedBy appears
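
The last check in this list can be scripted against an exported file. This is a minimal sketch, not the actual export format: it assumes the export is a JSON object with a top-level `reviews` array whose entries carry a `savedBy` field. The `reviews` key, the entry layout, and the function name are hypothetical.

```python
import json


def reviews_missing_saved_by(export_text: str) -> list[int]:
    """Return the indexes of review entries with no non-empty savedBy value."""
    data = json.loads(export_text)
    # ASSUMPTION: reviews sit under a top-level "reviews" key (hypothetical schema).
    reviews = data.get("reviews", [])
    return [i for i, review in enumerate(reviews) if not review.get("savedBy")]


# Hypothetical sample export: the second entry has no savedBy.
sample = json.dumps({
    "reviews": [
        {"prompt": "What time is it?", "savedBy": "trainee@example.com"},
        {"prompt": "What time is it?"},
    ]
})
print(reviews_missing_saved_by(sample))
```

An empty result means every saved review in the export names its reviewer.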

Pass 2 — Targeted suite

Use a 5-prompt behavior family. Example:

I'm frazzled.
I feel behind.
I feel out of the loop.
I have no idea what to do.
I don't trust that I know what's going on.

Goal:

identify pattern
score honestly
write specific notes

Pass 3 — Own Custom run finalization

Run or open an owned Custom session. Review every item and finalize the run.

Goal:

understand owned-run access
use Review Queue controls correctly
understand finalization
keep AI-reviewed platform evidence separate from human judgment

Trainer checklist

Before approving a reviewer, confirm they can explain:

custom vs automated runs
tester vs evaluator access
why Team Review and Global Analysis are owner/admin-only
when Controlled Batch Runner is evaluator-safe and when it is out of scope
pass vs borderline vs fail
savedBy vs exportedBy
why generic capability redirects can fail
what overclaiming means
what emotional containment means
how to write a useful review note

Graduation standard

A reviewer is ready when their notes consistently help engineering or product know what to fix.

Updated training requirement

Before a reviewer graduates, confirm they understand:

Quick Review is guided human judgment
they should not invent labels or taxonomies
senior review is available when uncertain
reusable learning is for durable patterns only
semantic sliders are instinctive quality scoring tools
notes should be short and specific

They must also understand that the AI-reviewed platform readiness gate has passed, but that this is not a claim of human Lucia-quality approval.

Practice exercise

Have the reviewer complete 10 prompts using only:

sliders
Quick Review answers
senior-review flag when needed
at most one short note per item

If they try to write long explanations for every item, retrain toward structured judgment capture.
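
A trainer could check a finished practice set against these constraints with a small script. The sketch below is illustrative only: the `PracticeItem` shape, the slider and Quick Review keys, and the `MAX_NOTE_CHARS` threshold are all assumptions, since this SOP says "short" without giving a number.

```python
from dataclasses import dataclass, field

REQUIRED_ITEMS = 10
MAX_NOTE_CHARS = 200  # ASSUMPTION: placeholder; the SOP does not define "short"


@dataclass
class PracticeItem:
    """Hypothetical record of one reviewed prompt in the practice exercise."""
    sliders: dict[str, int]
    quick_review: dict[str, str]
    senior_review: bool = False
    notes: list[str] = field(default_factory=list)


def practice_problems(items: list[PracticeItem]) -> list[str]:
    """List the ways a practice set departs from the exercise constraints."""
    problems = []
    if len(items) != REQUIRED_ITEMS:
        problems.append(f"expected {REQUIRED_ITEMS} items, got {len(items)}")
    for i, item in enumerate(items, start=1):
        if len(item.notes) > 1:
            problems.append(f"item {i}: more than one note")
        if any(len(note) > MAX_NOTE_CHARS for note in item.notes):
            problems.append(f"item {i}: note longer than {MAX_NOTE_CHARS} characters")
    return problems


# Ten hypothetical items; the third one carries two notes.
items = [
    PracticeItem(sliders={"clarity": 4}, quick_review={"overclaiming": "no"})
    for _ in range(REQUIRED_ITEMS)
]
items[2].notes = ["Redirect was generic.", "Second note."]
print(practice_problems(items))
```

A non-empty result is a cue to retrain toward structured judgment capture rather than a reason to fail the trainee outright.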

Role-specific training

Tester users should train only on:

Custom Prompt Test
Auto-generated Prompt Test

Evaluator users may train on:

Custom Prompt Test
Auto-generated Prompt Test
Guest Facing Agent Verification Check
Verification Results
Controlled Batch Runner
own/scoped Run History and Review Queue

Only owner/admin users should train on:

Team Review
Global Analysis
Single Run Analysis
Registry Diagnostics
Behavioral Observatory
Supabase and localStorage verification

Do not add owner/admin-only surfaces to tester or evaluator onboarding unless the role model changes intentionally.
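
The role lists above amount to a lookup from role to permitted training surfaces. A minimal sketch of that lookup follows. The role keys (`tester`, `evaluator`, `owner_admin`) and the `may_train` helper are hypothetical, and the sketch assumes owner/admin trainees may also train on every evaluator surface, which the text implies but does not state outright.

```python
TESTER = frozenset({
    "Custom Prompt Test",
    "Auto-generated Prompt Test",
})

EVALUATOR = TESTER | frozenset({
    "Guest Facing Agent Verification Check",
    "Verification Results",
    "Controlled Batch Runner",
    "own/scoped Run History and Review Queue",
})

OWNER_ADMIN_ONLY = frozenset({
    "Team Review",
    "Global Analysis",
    "Single Run Analysis",
    "Registry Diagnostics",
    "Behavioral Observatory",
    "Supabase and localStorage verification",
})

# ASSUMPTION: owner/admin training is a superset of evaluator training.
TRAINING_SURFACES = {
    "tester": TESTER,
    "evaluator": EVALUATOR,
    "owner_admin": EVALUATOR | OWNER_ADMIN_ONLY,
}


def may_train(role: str, surface: str) -> bool:
    """Return True if the surface belongs in onboarding for the given role."""
    return surface in TRAINING_SURFACES.get(role, frozenset())


print(may_train("evaluator", "Controlled Batch Runner"))
print(may_train("evaluator", "Team Review"))
```

Keeping the owner/admin-only set separate makes an intentional role-model change a one-line edit rather than a scattered one.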
