The Eval Dashboard surfaces all automatically captured eval results so you can track quality over time and quickly identify regressions.
Opening the dashboard
Click the Flask icon in the sidebar to open the Eval Studio. The Dashboard view opens by default.
Dashboard tabs
Conversations
A card for each captured agent response. Each card shows:
- The conversation input (truncated)
- A pass/fail summary across all eval definitions that fired on that response
- A timestamp
Eval Health
Per-eval-definition pass rate trends over time. Use this tab to see whether a specific eval definition is consistently passing or regressing, independent of which conversation triggered it.
Compare
Side-by-side A/B view of results across two conversations or two time windows. See A/B Comparison for details.
Run detail view
Opening a conversation card shows a breakdown per eval definition:

| Field | Description |
|---|---|
| Eval name | The eval definition that ran |
| Type | rule, llm_judge, similarity, or metric |
| Score | Numeric score: 0–1 for llm_judge and similarity; 1 or 0 for rule; milliseconds or a count for metric |
| Pass / Fail | Whether the score met the eval’s pass threshold |
| Reasoning | For LLM judge evals, the judge model’s explanation |
| Human override | If a reviewer submitted a manual score, shown here |
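The fields in the table map naturally onto a small record type. The following is a minimal sketch, not the product's actual schema: the class name `EvalResult`, its attribute names, and the `threshold` field are assumptions chosen for illustration. It also assumes that higher scores are better and that a human override, when present, takes precedence over the automatic score; a latency-style metric would likely need the comparison inverted.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical record mirroring the run detail fields.
# All names here are illustrative assumptions, not a documented schema.
@dataclass
class EvalResult:
    eval_name: str
    eval_type: str   # "rule", "llm_judge", "similarity", or "metric"
    score: float
    threshold: float  # assumed pass threshold for this eval definition
    reasoning: Optional[str] = None         # judge explanation (llm_judge only)
    human_override: Optional[float] = None  # manual reviewer score, if any

    def effective_score(self) -> float:
        # Assumption: a reviewer's manual score replaces the automatic one.
        return self.score if self.human_override is None else self.human_override

    def passed(self) -> bool:
        # Assumption: higher is better, so pass when score >= threshold.
        return self.effective_score() >= self.threshold


judge = EvalResult("tone_check", "llm_judge", score=0.62, threshold=0.7,
                   reasoning="Too terse.")
rule = EvalResult("no_pii", "rule", score=1.0, threshold=1.0)
print(judge.passed(), rule.passed())  # False True
```

Keeping the threshold on the record makes the pass/fail decision reproducible from a single row, which is convenient when comparing results outside the dashboard.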
Exporting
Click Export in the dashboard to download results as CSV or JSON. See Exporting Results.
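Once results are exported, the per-definition pass rates that the Eval Health tab charts can be approximated offline. The sketch below assumes a JSON export shaped as a list of objects with `eval_name`, `passed`, and `timestamp` (ISO 8601) keys. Those key names are hypothetical, and the sample data is made up, so check both against a real export file before relying on this.

```python
import json
from collections import defaultdict
from datetime import datetime


def pass_rates_by_day(records):
    """Return {eval_name: {date: pass_rate}} from a list of result dicts.

    Assumed (hypothetical) keys per record: eval_name, passed, timestamp.
    """
    counts = defaultdict(lambda: [0, 0])  # (eval_name, day) -> [passes, total]
    for rec in records:
        day = datetime.fromisoformat(rec["timestamp"]).date().isoformat()
        bucket = counts[(rec["eval_name"], day)]
        if rec["passed"]:
            bucket[0] += 1
        bucket[1] += 1

    rates = defaultdict(dict)
    for (name, day), (passes, total) in sorted(counts.items()):
        rates[name][day] = passes / total
    return dict(rates)


# Made-up sample standing in for the contents of an exported JSON file.
sample = json.loads("""[
  {"eval_name": "tone_check", "passed": true,  "timestamp": "2024-05-01T09:00:00"},
  {"eval_name": "tone_check", "passed": false, "timestamp": "2024-05-01T15:30:00"},
  {"eval_name": "tone_check", "passed": true,  "timestamp": "2024-05-02T10:00:00"},
  {"eval_name": "no_pii",     "passed": true,  "timestamp": "2024-05-01T09:00:00"}
]""")

rates = pass_rates_by_day(sample)
print(rates["tone_check"])
```

Grouping by eval definition first, and only then by day, mirrors the tab's framing: the trend is tracked per definition, independent of which conversation produced each result.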