Custom Prompt Test is the safest way for a new evaluator to run a first smoke test. Start small, keep the prompt set scoped, and review the run you created.

Before you run

Confirm that:
  • your owner/admin assigned the work
  • you are using the Custom eval surface
  • your prompts are in scope
  • you understand what behavior you are testing
Use /lucia/custom. Do not start from Team Review, Global Analysis, Registry Diagnostics, Behavioral Observatory, or owner/admin tools. Auto-generated testing, verification, and controlled batch work should happen only when your assignment calls for those surfaces.

First smoke test

For a basic platform check, use one simple prompt:
What time is it?
Expected flow:
  1. Open /lucia/custom.
  2. Enter the prompt.
  3. Run the Custom eval.
  4. Wait for Lucia’s response.
  5. Continue into the Review Queue.
  6. Review the item.
  7. Save the review.
  8. Finalize the run only when review is complete.
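
The ordering above, and in particular the rule that finalizing comes last, can be sketched as a small personal checklist. This is only an illustration: SmokeChecklist, SMOKE_STEPS, and the step labels are hypothetical names chosen for this sketch and are not part of Lucia or any API it exposes.

```python
from dataclasses import dataclass, field

# Step labels mirror the expected flow; they are illustrative strings,
# not identifiers taken from Lucia (assumption).
SMOKE_STEPS = [
    "open /lucia/custom",
    "enter the prompt",
    "run the Custom eval",
    "wait for the response",
    "continue into the Review Queue",
    "review the item",
    "save the review",
    "finalize the run",
]


@dataclass
class SmokeChecklist:
    """Tracks progress through the smoke test and enforces step order."""

    done: list = field(default_factory=list)

    def next_step(self):
        """Return the next pending step, or None when every step is complete."""
        if len(self.done) < len(SMOKE_STEPS):
            return SMOKE_STEPS[len(self.done)]
        return None

    def complete(self, step):
        """Mark a step complete; reject any step attempted out of order."""
        expected = self.next_step()
        if step != expected:
            raise ValueError(f"expected {expected!r}, got {step!r}")
        self.done.append(step)
```

Trying to complete "finalize the run" before "save the review" raises ValueError, which matches the intent of step 8: finalize only when review is complete.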

First real assignment

For assigned work, use the prompt set provided by an owner/admin. Keep prompts in the assigned behavior family. Do not add unrelated prompts mid-run. After the run completes, review every item before finalizing.
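
As a rough sketch of the two rules above (stay inside the assigned prompt set, and review every item before finalizing), a pre-finalize check might look like the following. The function name can_finalize and its three arguments are assumptions made for illustration; Lucia may track scope and review state differently.

```python
def can_finalize(assigned_prompts, run_prompts, reviewed_prompts):
    """Return (ok, problems) for a run.

    All three arguments are collections of prompt strings:
    the set an owner/admin assigned, the prompts actually run,
    and the prompts whose review has been saved.
    """
    assigned = set(assigned_prompts)
    run = set(run_prompts)
    reviewed = set(reviewed_prompts)

    problems = []

    # Prompts that were run but never assigned are out of scope.
    out_of_scope = sorted(run - assigned)
    if out_of_scope:
        problems.append(f"out-of-scope prompts: {out_of_scope}")

    # Every item in the run needs a saved review before finalizing.
    unreviewed = sorted(run - reviewed)
    if unreviewed:
        problems.append(f"unreviewed items: {unreviewed}")

    return (not problems, problems)
```

If the check reports any problem, the safer move is to pause and ask an owner/admin instead of finalizing.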

If something looks wrong

Pause and ask an owner/admin if:
  • you land on a blocked surface
  • a run is not yours
  • the Review Queue does not open
  • the response is missing
  • the UI asks you to use an unfamiliar tool
  • you are unsure whether to finalize