The auto-generated 50-prompt test is for broad coverage and regression detection. It is separate from the Controlled Batch Runner.
What it is
The auto-generated launcher generates a full 50-prompt session through Lucia and sends the responses into the same Review Queue used by Custom runs.
Canonical route:
When to use it
Use the auto-generated 50-prompt battery when:
- a broader Engine change has landed
- a model or prompt change may affect multiple behavior families
- an owner/admin validation pass is needed
- an evaluator assignment needs broader generated coverage
- a tester cohort needs prompt-testing signal beyond a custom suite
- you want broad confidence after targeted custom-suite refinement
When not to use it
Do not use the 50-prompt battery when trying to isolate one behavior bug. Use a custom suite first. The auto-generated battery is a net, not a scalpel. Do not use it as the controlled platform-readiness gate. Use Controlled Batch Runner Protocol for that.
Review strategy
When reviewing a 50-prompt run:
- Review obvious failures first.
- Look for repeated patterns.
- Do not over-focus on one strange answer.
- Export after completing enough reviews to support a product decision.
- Create a custom suite from repeated failure patterns.
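The "look for repeated patterns" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record fields (`prompt_id`, `verdict`, `pattern`), the sample data, and the `repeated_failure_patterns` helper are hypothetical and do not describe the actual Review Queue export format.

```python
from collections import Counter

# Hypothetical review records; the real export format is not specified here.
reviews = [
    {"prompt_id": 1, "verdict": "fail", "pattern": "ignores_constraint"},
    {"prompt_id": 2, "verdict": "pass", "pattern": None},
    {"prompt_id": 3, "verdict": "fail", "pattern": "ignores_constraint"},
    {"prompt_id": 4, "verdict": "fail", "pattern": "odd_tone"},
    {"prompt_id": 5, "verdict": "fail", "pattern": "wrong_format"},
    {"prompt_id": 6, "verdict": "fail", "pattern": "wrong_format"},
    {"prompt_id": 7, "verdict": "fail", "pattern": "ignores_constraint"},
]


def repeated_failure_patterns(records, min_count=2):
    """Return (pattern, count) pairs for failures seen at least min_count times."""
    counts = Counter(r["pattern"] for r in records if r["verdict"] == "fail")
    # most_common() sorts by count, highest first; one-off failures are dropped.
    return [(p, n) for p, n in counts.most_common() if n >= min_count]


for pattern, count in repeated_failure_patterns(reviews):
    print(f"{pattern}: {count} failures")
```

Patterns that pass the threshold are candidates for a targeted custom suite. A single strange answer, such as `odd_tone` in this sample, is filtered out, which matches the advice not to over-focus on one result.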
Relationship to Custom suites
A strong workflow often looks like this:
- Auto-generated 50-prompt run finds pattern
- Reviewer creates targeted custom suite
- Engineer patches Lucia
- Custom suite confirms fix
- Auto-generated run checks neighboring regressions
Relationship to controlled batches
The auto-generated tester and Controlled Batch Runner both exercise 50-prompt run infrastructure, but they serve different product jobs.
Auto-generated 50-prompt suite in Eval Labs



