How it works
MIRA computes TF-IDF bag-of-words cosine similarity entirely locally, so no external embedding model or API call is required. Both the output and the reference are converted into TF-IDF vectors, and their cosine similarity is computed in-process. This means:

- No embedding provider configuration needed
- No network latency or API cost
- Works fully offline / in Local-Only Mode
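The local pipeline described above can be sketched in plain Python. This is a simplified illustration only: the tokenizer, the smoothed IDF over the two-document "corpus", and all function names here are assumptions, and MIRA's actual tokenization and weighting may differ.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Simplified tokenizer: lowercase, split on non-word characters.
    return re.findall(r"[a-z0-9']+", text.lower())

def tfidf_vectors(doc_a, doc_b):
    # Build TF-IDF vectors over the two-document corpus formed by
    # the model output and the reference answer.
    tf_a, tf_b = Counter(tokenize(doc_a)), Counter(tokenize(doc_b))
    vocab = set(tf_a) | set(tf_b)

    def idf(term):
        # Smoothed IDF: terms appearing in both documents are
        # down-weighted relative to terms unique to one document.
        df = (term in tf_a) + (term in tf_b)
        return math.log((1 + 2) / (1 + df)) + 1

    vec_a = {t: tf_a[t] * idf(t) for t in vocab}
    vec_b = {t: tf_b[t] * idf(t) for t in vocab}
    return vec_a, vec_b

def cosine_similarity(vec_a, vec_b):
    # Standard cosine similarity between two sparse term-weight dicts.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a)
    norm_a = math.sqrt(sum(v * v for v in vec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

output = "The capital of France is Paris."
reference = "Paris is France's capital city."
score = cosine_similarity(*tfidf_vectors(output, reference))
```

Because everything runs in-process on two short strings, the computation is effectively instantaneous and works with no network access at all.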
When to use
- The correct answer can be expressed in many ways (synonyms, paraphrases with overlapping vocabulary)
- You have a gold-standard reference answer to compare against
- You want a continuous quality score (0–1) rather than binary pass/fail
Configuring a semantic similarity eval
Open the Eval Studio
Click the Flask icon in the sidebar, then click + New in the left panel.
Enter a name and reference answer
Give the eval a name. In the Reference Answer field, enter the ideal output. The reference does not need to match the expected output word for word; it just needs to state the correct content with overlapping key terms.
Set the pass threshold
Set the minimum cosine similarity score required for a run to pass. The default is 0.80.
| Threshold | Strictness |
|---|---|
| 0.90+ | Very strict — near-identical phrasing required |
| 0.80–0.89 | Strict — same content, different words acceptable |
| 0.70–0.79 | Moderate — paraphrasing and some omissions acceptable |
| < 0.70 | Lenient — general topic alignment |
Scoring
Each run produces a similarity score between 0 and 1:

- ≥ threshold → ✅ Pass
- < threshold → ❌ Fail
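The pass/fail rule above can be expressed as a one-line comparison. The `verdict` helper below is hypothetical (not part of MIRA's API); it only illustrates how the same score lands on different sides of the thresholds from the strictness table.

```python
def verdict(score: float, threshold: float = 0.80) -> str:
    # Score at or above the configured threshold passes; below it fails.
    return "Pass" if score >= threshold else "Fail"

score = 0.84  # e.g. same content expressed in different words
print(verdict(score, 0.90))  # → Fail (very strict threshold)
print(verdict(score, 0.80))  # → Pass (default threshold)
```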