Eval Labs must make reviewing Lucia feel simple, calm, and low-pressure because reviewer fatigue damages evaluation quality.

Doctrine

Reviewer cognitive load is not a UI detail. It is a data-quality risk. If employees feel confused, intimidated, or forced to think like AI experts, the review signal becomes weaker.

What reviewers should not have to do

Reviewers should not have to:
  • understand AI training theory
  • invent intent labels
  • decide taxonomy names
  • write long analytical notes
  • explain model behavior
  • know what adjudication metadata means

What reviewers should do

Reviewers should:
  • read the prompt
  • read Lucia’s response
  • use guided controls
  • answer quickly and honestly
  • flag for senior review when uncertain
  • add a short note only when it helps
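
As a rough illustration, the reviewer's whole output could fit in a small record like the one below. This is only a sketch under stated assumptions: the `Review` class, its field names, and the 200-character note limit are hypothetical and are not taken from any Eval Labs schema. The verdict labels are the three bands this page names under "Why semantic UI matters".

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cap on note length; the source only says notes should be short.
MAX_NOTE_CHARS = 200

# The three semantic bands named on this page; the reviewer picks one
# instead of inventing a label.
VERDICTS = ("weak / concerning", "uncertain / mixed", "strong / confident")


@dataclass(frozen=True)
class Review:
    """Hypothetical record of what a reviewer submits for one response."""

    verdict: str                       # one guided choice from VERDICTS
    needs_senior_review: bool = False  # set when the reviewer is uncertain
    note: Optional[str] = None         # optional, only when it helps

    def __post_init__(self) -> None:
        # Reject free-form labels so reviewers never have to name a taxonomy.
        if self.verdict not in VERDICTS:
            raise ValueError(f"verdict must be one of {VERDICTS}")
        # Keep notes short so reviewers are not pushed into long analysis.
        if self.note is not None and len(self.note) > MAX_NOTE_CHARS:
            raise ValueError(f"note must be at most {MAX_NOTE_CHARS} characters")
```

In this sketch an uncertain reviewer would submit something like `Review("uncertain / mixed", needs_senior_review=True)` and leave the note empty.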

Interface rule

Eval Labs should prefer:
  • guided choices over open-ended interpretation
  • visual cognition over form-filling

Why semantic UI matters

Semantic controls such as confidence sliders reduce translation burden: the effort of converting a judgment into a score. The reviewer should feel the difference between:
  • weak / concerning
  • uncertain / mixed
  • strong / confident
without needing to reread a scoring guide every time.
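
One way to picture such a slider is as a mapping from a position to one of the three bands above. The sketch below assumes a slider that reports a value from 0.0 to 1.0 and splits it into three roughly equal bands. The thresholds and the `band_label` function are hypothetical; the source names the bands but does not say where they begin or end.

```python
# Hypothetical upper bounds for each band on a 0.0 to 1.0 slider.
# The labels come from this page; the cut points are assumptions.
BANDS = (
    (0.33, "weak / concerning"),
    (0.66, "uncertain / mixed"),
    (1.00, "strong / confident"),
)


def band_label(position: float) -> str:
    """Return the semantic label for a slider position in [0.0, 1.0]."""
    if not 0.0 <= position <= 1.0:
        raise ValueError("slider position must be between 0.0 and 1.0")
    # Pick the first band whose upper bound covers the position.
    for upper_bound, label in BANDS:
        if position <= upper_bound:
            return label
    return BANDS[-1][1]
```

The interface would show the label as the reviewer drags the slider, so the reviewer reads words rather than numbers.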

Canon rule

A review interface that creates psychological paralysis will produce worse training signal, even if the schema is technically correct.