Good evaluator feedback is short, specific, and useful to product or engineering. It says what happened and why it matters.

Strong notes

Good note:
Intent miss: the user said they felt out of the loop, but Lucia gave a generic capability menu instead of narrowing the next step.
Why it works:
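  • names the failure (an intent miss)
  • points to the prompt (the user said they felt out of the loop)
  • explains the behavioral problem (a generic capability menu)
  • gives the team something to fix (narrow the next step)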

Pass note

Strong pass: Lucia acknowledged the operator's stress, gave one clear first move, and avoided pretending the issue was already solved.
Use this when the response is a pattern worth repeating.

Borderline note

Borderline: useful direction, but the user has to scan too much text to find it. The first action should come earlier and be more specific.
Use this when the response has value but needs refinement.

Fail note

Fail: Lucia sounded warm but did not answer the operational question. The user still would not know what to do next.
Use this when the response misses the job even if it sounds pleasant.

Escalation note

Needs senior review: possible overclaim. Lucia implies confirmation without evidence in the prompt or run context.
Use this when the risk needs a more experienced reviewer.

Keep notes lean

One useful sentence is better than a long explanation that hides the signal.