Behavioral Observatory labels persist through public.eval_behavioral_labels. The current narrow contract stores one label per reviewer per run item and assumes the reviewer is the signed-in owner/user for this privileged label surface.

Status

Canonical migration: implemented in supabase/migrations
Human SQL helper: implemented in scripts/sql
Behavioral labels table: required for persisted labels
Saved label reload: implemented when Supabase persistence is available
Reviewer = signed-in owner/user assumption: current narrow contract
Evaluator reviewing owner run: future / policy changes required
Do not treat a helper SQL file as proof that the migration was applied to live Supabase.

Table

Behavioral Observatory labels are stored in:
public.eval_behavioral_labels
Purpose:
Store first-class Behavioral Observatory labels for Eval Labs run items.

Key fields

The table stores:
run_id
run_item_id
owner_user_id
reviewer_user_id
intent
guest_affect
response_strategy
humanness
notes
status
payload
created_at
updated_at
The important product distinction:
eval_item_reviews = Review Queue review evidence
eval_behavioral_labels = Behavioral Observatory label evidence
Do not collapse those two meanings.

Label values

Current intent values:
Booking Help
Check-In
Checkout
Billing
Noise
Room Issue
Concierge
Other
Current guest_affect values:
Neutral
Mildly Upset
Upset
Grateful
Current response_strategy values:
Acknowledge
Apology
Offer
Escalation
Current humanness range:
1 through 7
Current status values:
draft
saved
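The value lists above can be checked before a label is sent to Supabase. The following is a minimal sketch, not the project's actual validation code: the names BehavioralLabelInput and validateLabel are hypothetical, and it assumes every field is required and that humanness is a whole number, neither of which this page confirms.

```typescript
// Allowed values copied from the lists above. All identifiers here are illustrative.
const INTENTS = ["Booking Help", "Check-In", "Checkout", "Billing", "Noise", "Room Issue", "Concierge", "Other"];
const GUEST_AFFECTS = ["Neutral", "Mildly Upset", "Upset", "Grateful"];
const RESPONSE_STRATEGIES = ["Acknowledge", "Apology", "Offer", "Escalation"];
const STATUSES = ["draft", "saved"];

// Hypothetical input shape: only the fields that have documented value sets.
interface BehavioralLabelInput {
  intent: string;
  guest_affect: string;
  response_strategy: string;
  humanness: number;
  status: string;
}

// Returns a list of problems; an empty list means the label matches the documented values.
function validateLabel(label: BehavioralLabelInput): string[] {
  const errors: string[] = [];
  if (INTENTS.indexOf(label.intent) === -1) errors.push("unknown intent: " + label.intent);
  if (GUEST_AFFECTS.indexOf(label.guest_affect) === -1) errors.push("unknown guest_affect: " + label.guest_affect);
  if (RESPONSE_STRATEGIES.indexOf(label.response_strategy) === -1) errors.push("unknown response_strategy: " + label.response_strategy);
  if (STATUSES.indexOf(label.status) === -1) errors.push("unknown status: " + label.status);
  // Assumption: humanness is an integer. The source only states the range 1 through 7.
  const h = label.humanness;
  if (Math.floor(h) !== h || h < 1 || h > 7) errors.push("humanness must be a whole number from 1 through 7");
  return errors;
}
```

A client-side check like this only catches mistakes early. Any constraints in the canonical migration remain the source of truth.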

One-label rule

There is one label per reviewer per run item. The current uniqueness rule is:
unique (run_item_id, reviewer_user_id)
Saving again updates the same reviewer’s label for that run item instead of creating a second label.
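The save-again behavior can be illustrated with a small in-memory model keyed on the same pair of columns. This is a sketch of the rule only, not the real persistence code: LabelRow and LabelStore are hypothetical names, and the actual write goes through Supabase against public.eval_behavioral_labels.

```typescript
// Hypothetical, trimmed-down row: just the unique key plus one label field.
interface LabelRow {
  run_item_id: string;
  reviewer_user_id: string;
  intent: string;
}

// In-memory stand-in for the table, enforcing unique (run_item_id, reviewer_user_id).
class LabelStore {
  private rows: Record<string, LabelRow> = {};

  // JSON-encode the pair so ids containing separators cannot collide.
  private static key(runItemId: string, reviewerUserId: string): string {
    return JSON.stringify([runItemId, reviewerUserId]);
  }

  // Insert when the key is new, otherwise overwrite the existing row.
  upsert(row: LabelRow): "inserted" | "updated" {
    const k = LabelStore.key(row.run_item_id, row.reviewer_user_id);
    const existed = Object.prototype.hasOwnProperty.call(this.rows, k);
    this.rows[k] = row;
    return existed ? "updated" : "inserted";
  }

  get(runItemId: string, reviewerUserId: string): LabelRow | undefined {
    return this.rows[LabelStore.key(runItemId, reviewerUserId)];
  }

  count(): number {
    return Object.keys(this.rows).length;
  }
}
```

Saving twice for the same reviewer and run item leaves one row holding the latest values. A different reviewer on the same run item gets a separate row.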

RLS assumptions

Current RLS uses the Clerk JWT subject:
auth.jwt() ->> 'sub'
The current policies require the signed-in user to match:
owner_user_id
reviewer_user_id
eval_runs.created_by
This is intentionally narrow. The current narrow contract assumes:
reviewer = signed-in owner/user
That means the current policies do not yet support the broader evaluator-reviewing-owner-run workflow.
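The narrow contract can be expressed as a client-side pre-check. This is an illustrative sketch, not the RLS policy itself: it assumes all three identifiers must equal the JWT subject, and the names LabelOwnership, run_created_by, and passesNarrowContract are hypothetical. The real rules are the policies defined in the canonical migration.

```typescript
// Hypothetical shape: the three identifiers the current policies compare against the JWT subject.
interface LabelOwnership {
  owner_user_id: string;
  reviewer_user_id: string;
  run_created_by: string; // stands in for eval_runs.created_by on the parent run
}

// Assumption: every identifier must equal the signed-in user's Clerk subject.
function passesNarrowContract(jwtSub: string, row: LabelOwnership): boolean {
  return (
    row.owner_user_id === jwtSub &&
    row.reviewer_user_id === jwtSub &&
    row.run_created_by === jwtSub
  );
}
```

Under this assumption, an evaluator labeling another user's run fails the check, because reviewer_user_id would differ from owner_user_id and eval_runs.created_by.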

Current limitation

The current persistence model is sufficient for owner/user-owned run labeling on the privileged Behavioral Observatory surface. It is not yet sufficient for a future workflow where:
an evaluator reviews an owner/admin run
That future workflow requires policy and product changes so the reviewer can read/write the correct row without pretending to own the run. Do not claim this is already implemented.

Saved / unsaved / error states

The UI should preserve these truth states:
  • derived: suggested from existing run/review context, not saved
  • unsaved: reviewer changed values but did not save
  • saving: save is in progress
  • saved: Supabase confirmed the label
  • error: save/load failed
Only saved should be treated as persisted Behavioral Observatory label data.
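The five truth states can be modeled as a small state machine so the UI cannot reach saved without a confirmed write. This is a sketch under stated assumptions: the event names, the transition rules (for example, ignoring edits while a save is in flight), and the function names are hypothetical and not taken from the product code.

```typescript
// The five states listed above.
type LabelState = "derived" | "unsaved" | "saving" | "saved" | "error";

// Hypothetical events; the real UI may name or split these differently.
type LabelEvent = "edit" | "save_started" | "save_confirmed" | "save_failed" | "load_failed";

function nextState(state: LabelState, event: LabelEvent): LabelState {
  switch (event) {
    case "edit":
      // Assumption: edits are ignored while a save is in flight.
      return state === "saving" ? state : "unsaved";
    case "save_started":
      // Assumption: a save can start from derived, unsaved, or error (retry).
      return state === "derived" || state === "unsaved" || state === "error" ? "saving" : state;
    case "save_confirmed":
      // Only an in-flight save can become saved.
      return state === "saving" ? "saved" : state;
    case "save_failed":
      return state === "saving" ? "error" : state;
    case "load_failed":
      return "error";
    default:
      return state;
  }
}

// Only saved counts as persisted Behavioral Observatory label data.
function isPersisted(state: LabelState): boolean {
  return state === "saved";
}
```

Keeping derived and unsaved distinct from saved prevents suggested or edited values from being presented as persisted labels.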

SQL workflow

Canonical migrations live in:
supabase/migrations
Human copy/paste SQL helpers live in:
scripts/sql
scripts/sql exists so an operator can open a SQL file in Finder/CotEditor and copy/paste it into the Supabase SQL Editor. A file in scripts/sql is not proof that a migration was applied. After applying SQL, run the matching verify SQL and confirm the live schema, policies, and counts.

Current helper files

Apply helper:
scripts/sql/20260603_apply_eval_behavioral_labels.sql
Verify helper:
scripts/sql/20260603_verify_eval_behavioral_labels.sql
Canonical migration:
supabase/migrations/20260603000000_eval_behavioral_labels.sql

Future support

Future evaluator workflows may require:
  • owner/admin assignment records
  • evaluator-to-owner review grants
  • RLS policies that allow assigned reviewers to label owner runs
  • clearer reviewer/owner separation in UI copy
  • audit views for who saved which Behavioral Observatory label
Treat these as future/deferred until implemented and verified.