0% helpful
0 likes 0 dislikes
Inputs Required
Golden set; expected behavior specs; thresholds; change log (what changed).
Output Artifact
Regression suite; diff report; pass/fail outcome; rollback recommendation.
When to use this
When you’re changing prompts/models/tools and need to ensure you didn’t quietly break performance.
Link copied