A strong prompt suite tests one behavior family from multiple angles.

The rule

One suite should have one primary purpose.
Good suite purpose:
Test whether Lucia routes disorientation language into emotional-operational containment.
Bad suite purpose:
Test Lucia.

The 1–10 prompt range

The custom launcher allows up to 10 prompts. Use the limit deliberately.
Recommended sizes:
Purpose                          Prompt Count
Quick smoke                      1
Narrow bug check                 3–5
Behavior family refinement       5–8
Pre/post regression comparison   8–10
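
The recommended sizes above can be turned into a quick sanity check before a suite is run. The following is a minimal sketch, not part of the launcher: the names `MAX_PROMPTS`, `RECOMMENDED_SIZES`, and `check_suite_size` are hypothetical, and only the numeric ranges come from the table.

```python
# Hypothetical helper; only the numeric ranges come from the table above.
MAX_PROMPTS = 10  # launcher limit stated in this guide

# Purpose keys are illustrative labels, not launcher settings.
RECOMMENDED_SIZES = {
    "quick_smoke": (1, 1),
    "narrow_bug_check": (3, 5),
    "behavior_family_refinement": (5, 8),
    "regression_comparison": (8, 10),
}


def check_suite_size(purpose: str, prompt_count: int) -> str:
    """Return 'ok' or 'outside_recommended'; raise if the hard limit is broken."""
    if not 1 <= prompt_count <= MAX_PROMPTS:
        raise ValueError(
            f"prompt count must be between 1 and {MAX_PROMPTS}, got {prompt_count}"
        )
    low, high = RECOMMENDED_SIZES[purpose]
    return "ok" if low <= prompt_count <= high else "outside_recommended"
```

A result of "outside_recommended" is a prompt to reconsider the suite's purpose, not an error: the launcher limit is the only hard constraint.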

Prompt variation types

A strong suite uses variants:

Direct phrase

I'm overwhelmed.

Near-neighbor phrase

I'm frazzled.

Indirect emotional signal

I have no idea what to do.

Trust-state signal

I don't trust that I know what's going on.

Operational hybrid

Kids just got home and I'm overwhelmed.
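
The five variation types above can be tracked alongside the prompts so that gaps in coverage are easy to spot. This is a rough sketch under assumed names (`VARIANT_TYPES`, `missing_variants`); the launcher itself may store prompts differently.

```python
# Hypothetical labels for the five variation types described above.
VARIANT_TYPES = (
    "direct",
    "near_neighbor",
    "indirect_emotional",
    "trust_state",
    "operational_hybrid",
)

# Each entry pairs a variation type with the prompt text from this guide.
suite = [
    ("direct", "I'm overwhelmed."),
    ("near_neighbor", "I'm frazzled."),
    ("indirect_emotional", "I have no idea what to do."),
    ("trust_state", "I don't trust that I know what's going on."),
    ("operational_hybrid", "Kids just got home and I'm overwhelmed."),
]


def missing_variants(prompts):
    """List the variation types that no prompt in the suite covers."""
    covered = {variant for variant, _ in prompts}
    return [variant for variant in VARIANT_TYPES if variant not in covered]
```

Calling `missing_variants(suite)` on the full example returns an empty list; a suite with only the first two prompts would report the other three types as missing.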

What to avoid

Avoid prompts that are too broad to interpret:
Help.
Avoid mixing unrelated behaviors unless the suite is explicitly a mixed-mode test.
Avoid changing prompt text between before/after runs unless you are intentionally creating a new suite version. If the text changes, the two runs are no longer comparable.
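
One way to catch accidental edits to prompt text between before/after runs is to fingerprint the suite and compare the fingerprints. The sketch below assumes prompts are available as a plain list of strings; `suite_fingerprint` and `same_suite` are hypothetical names, not launcher features.

```python
import hashlib
import json


def suite_fingerprint(prompts: list[str]) -> str:
    """Hash the exact prompt texts, in order, into a short comparable string."""
    # JSON encoding keeps prompt boundaries unambiguous, even if a prompt
    # contains newlines or other separators.
    encoded = json.dumps(prompts, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def same_suite(before: list[str], after: list[str]) -> bool:
    """True only if both runs used identical prompt text in identical order."""
    return suite_fingerprint(before) == suite_fingerprint(after)
```

Any difference, including punctuation or prompt order, changes the fingerprint. If the fingerprints differ and the change was intentional, treat the result as a new suite version.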

Suite notes

When creating a suite, write down:
  • purpose
  • expected mode
  • known failure pattern
  • desired behavior
  • version number
Use the Custom Suite Template to record these.
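
If suite notes are also kept next to test code, the five items above map naturally onto a small record type. This is an illustrative sketch only: `SuiteNotes` and its field names are assumptions and do not reflect the actual layout of the Custom Suite Template.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SuiteNotes:
    """Hypothetical record holding the five suite notes listed above."""

    purpose: str
    expected_mode: str
    known_failure_pattern: str
    desired_behavior: str
    version: str

    def blank_fields(self) -> list[str]:
        """Return the names of any notes that were left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


# Placeholder values in angle brackets are for illustration, not real findings.
notes = SuiteNotes(
    purpose="Test whether Lucia routes disorientation language into "
    "emotional-operational containment.",
    expected_mode="<expected mode>",
    known_failure_pattern="<failure pattern observed so far>",
    desired_behavior="<behavior the suite should confirm>",
    version="",
)
```

Here `notes.blank_fields()` reports that `version` is still empty, which is worth fixing before the first run so that later before/after comparisons can name the suite version they used.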