MIRA evals run automatically — there is no manual “run” button for profiles. Every time the active engine produces a response, MIRA captures that response and runs it through all activated eval definitions. Results are stored immediately and visible in the dashboard.

Enabling automatic evaluation

1. Open Settings
   Click the gear icon in the sidebar to open Settings, then select the Evals tab.
2. Toggle Enable Automatic Evaluation
   Turn on Enable Automatic Evaluation. When it is off, no evals fire, regardless of which profiles are active.
3. Activate eval profiles
   Open the Eval Studio (Flask icon), go to the Profiles tab, and click Activate on each profile you want running. When a profile is active, every eval definition assigned to it fires on each captured response.
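The relationship between the master toggle, active profiles, and the definitions that actually fire can be sketched as a small selection function. This is an illustrative model only; `EvalProfile` and `definitions_to_fire` are hypothetical names, not MIRA's internal API.

```python
from dataclasses import dataclass, field

@dataclass
class EvalProfile:
    name: str
    active: bool
    definition_ids: list[str] = field(default_factory=list)

def definitions_to_fire(profiles: list[EvalProfile], master_enabled: bool) -> set[str]:
    """Union of definition IDs across active profiles; empty when the master toggle is off."""
    if not master_enabled:
        return set()
    return {d for p in profiles if p.active for d in p.definition_ids}

profiles = [
    EvalProfile("safety", active=True, definition_ids=["rule_1", "judge_1"]),
    EvalProfile("quality", active=False, definition_ids=["metric_1"]),
]
print(definitions_to_fire(profiles, master_enabled=True))   # {'rule_1', 'judge_1'}
print(definitions_to_fire(profiles, master_enabled=False))  # set()
```

Note that the master toggle short-circuits everything: an active profile contributes nothing while Enable Automatic Evaluation is off.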

How evaluation works

Agent response captured → fast queue (rule/metric/similarity) → result stored
                        → LLM queue (llm_judge)               → result stored
  1. Every agent response triggers the eval capture hook.
  2. MIRA runs rule, similarity, and metric evals immediately — these are fast local computations.
  3. llm_judge evals are processed via a separate queue with configurable concurrency to avoid overwhelming the judge provider.
  4. All results are stored in the local database and accessible in the dashboard.
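The two-queue flow above can be sketched with `asyncio`: fast evals run unconstrained, while a semaphore caps simultaneous llm_judge calls, mirroring the LLM Concurrency setting. Function names and the placeholder bodies are assumptions for illustration, not MIRA internals.

```python
import asyncio

FAST_KINDS = {"rule", "similarity", "metric"}

async def run_fast(definition: str, response: str) -> dict:
    # Stand-in for a fast local computation (rule match, metric, similarity score).
    return {"definition": definition, "passed": True}

async def run_llm_judge(definition: str, response: str, sem: asyncio.Semaphore) -> dict:
    # The semaphore bounds concurrent judge calls so the provider is not overwhelmed.
    async with sem:
        await asyncio.sleep(0)  # stand-in for the remote judge call
        return {"definition": definition, "passed": True}

async def evaluate(response: str, definitions: dict[str, str], llm_concurrency: int = 2) -> list[dict]:
    sem = asyncio.Semaphore(llm_concurrency)
    tasks = []
    for name, kind in definitions.items():
        if kind in FAST_KINDS:
            tasks.append(run_fast(name, response))
        else:  # llm_judge
            tasks.append(run_llm_judge(name, response, sem))
    return await asyncio.gather(*tasks)

results = asyncio.run(evaluate("hello", {"len_check": "rule", "tone": "llm_judge"}))
print(len(results))  # 2
```

The key design point is that slow judge calls never block the fast local evals, so rule/metric/similarity results land in the database immediately.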

Eval settings

Configure evaluation behaviour in Settings → Evals:
| Setting | Description |
| --- | --- |
| Enable Automatic Evaluation | Master toggle; disables all eval capture when off |
| Local-Only Mode | Suspends llm_judge evals; only rule, similarity, and metric evals run |
| LLM Concurrency | Number of simultaneous LLM judge calls (1–4) |
| Data Retention | Retention period for stored results: 7 / 30 / 90 / 180 days, or Forever |
| Run Cleanup Now | Immediately purges results older than the retention window |

Monitoring results

After chatting with the engine, open the Eval Studio dashboard:
  • Conversations tab — one card per captured agent response, showing pass/fail summary across all active eval definitions
  • Eval Health tab — per-eval-definition pass rate trends over time
  • Compare tab — A/B comparison across two conversations or time windows
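The per-definition pass rate shown on the Eval Health tab is just the fraction of passing results grouped by definition. A minimal sketch, assuming results are dicts with `definition` and `passed` keys (a hypothetical shape, not MIRA's stored format):

```python
from collections import defaultdict

def pass_rates(results: list[dict]) -> dict[str, float]:
    """Fraction of passing results per eval definition."""
    totals: dict[str, list[int]] = defaultdict(lambda: [0, 0])
    for r in results:
        totals[r["definition"]][0] += int(r["passed"])
        totals[r["definition"]][1] += 1
    return {name: passed / total for name, (passed, total) in totals.items()}

results = [
    {"definition": "len_check", "passed": True},
    {"definition": "len_check", "passed": False},
    {"definition": "tone", "passed": True},
]
print(pass_rates(results))  # {'len_check': 0.5, 'tone': 1.0}
```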

Cancelling / pausing evaluation

To stop evals from firing temporarily, toggle Enable Automatic Evaluation off in Settings → Evals. Active profiles and eval definitions remain configured — re-enabling resumes capture from the next response onward.