Platform readiness requires Supabase truth, UI truth, and compact client persistence to agree.
Supabase truth check
For the 60-run gate, Supabase returned:
- status: ready
- runs: 60
- expected prompts: 3000
- run items: 3000
- Lucia responses: 3000
- reviews: 3000
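The agreement between these counts can be checked mechanically. The following is a minimal sketch, not the project's actual gate check: the `GateSummary` shape, its field names, and the `checkGate` helper are all hypothetical and only mirror the values listed above.

```typescript
// Hypothetical shape of the gate summary; the field names are assumptions.
interface GateSummary {
  status: string;
  runs: number;
  expectedPrompts: number;
  runItems: number;
  luciaResponses: number;
  reviews: number;
}

// Returns a list of mismatches; an empty list means the counts agree.
function checkGate(summary: GateSummary, expectedRuns: number): string[] {
  const problems: string[] = [];
  if (summary.status !== "ready") {
    problems.push(`status is ${summary.status}`);
  }
  if (summary.runs !== expectedRuns) {
    problems.push(`runs is ${summary.runs}, expected ${expectedRuns}`);
  }
  // Every per-item count should equal the number of expected prompts.
  const perItem: Array<[string, number]> = [
    ["runItems", summary.runItems],
    ["luciaResponses", summary.luciaResponses],
    ["reviews", summary.reviews],
  ];
  for (const [name, count] of perItem) {
    if (count !== summary.expectedPrompts) {
      problems.push(`${name} is ${count}, expected ${summary.expectedPrompts}`);
    }
  }
  return problems;
}

// The values reported for the 60-run gate.
const reportedGate: GateSummary = {
  status: "ready",
  runs: 60,
  expectedPrompts: 3000,
  runItems: 3000,
  luciaResponses: 3000,
  reviews: 3000,
};
const gateProblems = checkGate(reportedGate, 60);
```

Returning a list of mismatches rather than a boolean keeps the failing count visible when the gate does not pass.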
localStorage compactness target
The target diagnostic is:
Final verified diagnostic
Interpretation
This proves the tested owner context did not persist full item-level payloads locally after the gate. It also proves no other owner or ownerless sessions were visible in the diagnostic for that tested context. It does not prove:
- all future accounts are safe
- backend/RLS authorization is complete
- external evaluator security boundaries are ready by themselves
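One way a compactness check of this kind could be written is sketched below. It is an illustration only and not the diagnostic used for the gate: the `StorageLike` interface, the `scanStorage` and `largestArray` helpers, and the idea of flagging entries by array length are all assumptions. `StorageLike` is a subset of the Web Storage API, so the browser's `localStorage` can be passed in directly.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface StorageLike {
  readonly length: number;
  key(index: number): string | null;
  getItem(key: string): string | null;
}

// Hypothetical report entry for a stored value that looks item-level.
interface StorageFinding {
  key: string;
  chars: number;
  largestArray: number;
}

// Length of the largest array found anywhere inside a parsed JSON value.
function largestArray(value: unknown): number {
  let largest = 0;
  if (Array.isArray(value)) {
    largest = value.length;
    for (const element of value) {
      largest = Math.max(largest, largestArray(element));
    }
  } else if (value !== null && typeof value === "object") {
    for (const child of Object.values(value as Record<string, unknown>)) {
      largest = Math.max(largest, largestArray(child));
    }
  }
  return largest;
}

// Flags entries whose JSON contains an array longer than maxArrayLength.
function scanStorage(storage: StorageLike, maxArrayLength: number): StorageFinding[] {
  const findings: StorageFinding[] = [];
  for (let i = 0; i < storage.length; i++) {
    const key = storage.key(i);
    if (key === null) continue;
    const raw = storage.getItem(key) ?? "";
    let parsed: unknown = null;
    try {
      parsed = JSON.parse(raw);
    } catch {
      // Non-JSON values cannot hold item arrays; treat them as compact.
    }
    const found = largestArray(parsed);
    if (found > maxArrayLength) {
      findings.push({ key, chars: raw.length, largestArray: found });
    }
  }
  return findings;
}
```

An empty result from a scan like this covers only the storage it was run against, which is the same limit on scope described above.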
Behavioral label persistence check
Behavioral Observatory labels use a separate table:
- the eval_behavioral_labels table exists
- RLS policies exist
- the one-label-per-reviewer-per-run-item constraint exists
- a saved label reloads after refresh
- failed saves are not counted as persisted labels
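The last two checks can be illustrated with a small in-memory sketch. It is not the application's code: `LabelTracker`, `BehavioralLabel`, and `SaveResult` are hypothetical names, the `save` callback stands in for the real write to eval_behavioral_labels, and replacing an existing label on a repeat save is an assumption (the real constraint might reject the second write instead).

```typescript
// Hypothetical label shape; the field names are assumptions.
interface BehavioralLabel {
  reviewerId: string;
  runItemId: string;
  label: string;
}

type SaveResult = { ok: true } | { ok: false; error: string };

class LabelTracker {
  private persisted = new Map<string, BehavioralLabel>();

  // `save` stands in for the real write to the labels table.
  record(label: BehavioralLabel, save: (l: BehavioralLabel) => SaveResult): SaveResult {
    const result = save(label);
    if (result.ok) {
      // One entry per reviewer and run item; a repeat save replaces the
      // earlier label (assumed upsert behavior).
      const key = JSON.stringify([label.reviewerId, label.runItemId]);
      this.persisted.set(key, label);
    }
    // A failed save leaves the persisted set untouched.
    return result;
  }

  get persistedCount(): number {
    return this.persisted.size;
  }

  get(reviewerId: string, runItemId: string): BehavioralLabel | undefined {
    return this.persisted.get(JSON.stringify([reviewerId, runItemId]));
  }
}
```

Counting only after the write reports success keeps the displayed total in line with what a refresh would reload.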

