Creating a profile
Fill in details
Enter a Name and optional Description. A good name describes the workload being tested,
e.g. “Contract Analysis — Accuracy”, “Code Review — Completeness”.
Assigning evals to a profile
After creating a profile, expand it to see a checklist of all your eval definitions. Check the box next to each eval you want in this profile; unchecking removes it. The profileIds field shown for an eval definition in the Eval Editor is updated automatically when you toggle the eval from the profile card.
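The checkbox behaviour described above can be pictured as a small update to an eval definition's profileIds list. The following is only an illustrative sketch: the profileIds field name comes from this page, but the toggle_profile helper, the dict shape, and the ID values are hypothetical, and the product may store membership differently.

```python
def toggle_profile(eval_def: dict, profile_id: str, checked: bool) -> dict:
    """Return a copy of eval_def with profile_id added to or removed from profileIds.

    Hypothetical helper: models what checking or unchecking a box might do.
    """
    ids = list(eval_def.get("profileIds", []))
    if checked and profile_id not in ids:
        # Checking the box adds the profile once, keeping existing order.
        ids.append(profile_id)
    elif not checked:
        # Unchecking removes the profile if it is present.
        ids = [p for p in ids if p != profile_id]
    return {**eval_def, "profileIds": ids}


# Hypothetical eval definition that already belongs to one profile.
eval_def = {"name": "Cites clauses", "profileIds": ["profile-a"]}
eval_def = toggle_profile(eval_def, "profile-b", checked=True)
print(eval_def["profileIds"])  # ['profile-a', 'profile-b']
```

Because one eval can list several profile IDs, the same eval definition can belong to more than one profile at once.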
Activating a profile
Click Activate on a profile card to make it active. Only active profiles trigger automatic evaluation on new responses. You can have multiple profiles active at the same time; each one evaluates responses independently.
Organising evals
Eval definitions have built-in organisation fields you can set in the Eval Editor:
- Priority: normal or critical. A failing critical eval sets the composite score to 0 regardless of other scores.
- Scope: chat, skill, workflow, or all. Restricts an eval to responses produced in a specific context.
- Weight: relative weight (1–10) used in the composite score formula.
- Status: draft (not yet running), active (auto-evaluates), or archived.
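To make the interaction between Priority and Weight concrete, here is a minimal sketch of how a composite score could be computed. This page does not give the actual formula, so the weighted mean below is an assumption, as are the EvalResult type, the 0.0 to 1.0 score range, and the example eval names. Only the critical-failure override and the 1–10 weight range come from the text above.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    """Hypothetical result record for one eval on one response."""
    name: str
    score: float              # assumed to be normalised to 0.0-1.0
    passed: bool
    weight: int = 1           # relative weight, 1-10
    priority: str = "normal"  # "normal" or "critical"


def composite_score(results: list[EvalResult]) -> float:
    """Weighted mean of scores (assumed formula), forced to 0 on a critical failure."""
    if not results:
        return 0.0
    # A failing critical eval sets the composite to 0 regardless of other scores.
    if any(r.priority == "critical" and not r.passed for r in results):
        return 0.0
    total_weight = sum(r.weight for r in results)
    return sum(r.score * r.weight for r in results) / total_weight


results = [
    EvalResult("cites-clauses", score=1.0, passed=True, weight=4),
    EvalResult("tone", score=0.5, passed=True, weight=1),
]
print(composite_score(results))  # 0.9
```

Under this assumed formula, raising an eval's weight pulls the composite towards that eval's score, while marking it critical turns any failure into a composite of 0.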
Importing and exporting profiles
Profiles can be exported as JSON and shared with teammates. Click ⋮ → Export Profile on the profile card. The export includes all profile metadata but not run history. To import, click Import Profile in the profile list and select the JSON file.
Profile best practices
- Keep profiles focused — one profile per feature area or eval dimension
- Mark business-critical evals as priority: critical so one failure triggers immediate visibility
- Start with a small set of rule and metric evals before adding LLM judge evals (lower cost, faster feedback)