Using Ollama? No API key or cloud account is required. Install Ollama, pull any model (ollama pull llama3), then point MIRA at http://localhost:11434. All inference runs fully on your machine — zero outbound network calls for queries.
Ollama lets you run open-source LLMs entirely on your machine. No API key required — your data never leaves your computer. Both the RLM and Native Agent engines support Ollama.

Prerequisites

  • macOS/Linux: Install Ollama from ollama.ai
  • Windows: Ollama is available as a preview for Windows — see ollama.ai/download
  • Minimum 8 GB RAM for 7B models; 16 GB recommended for 13B+ models
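
The RAM guidance above can be checked from a terminal before pulling a large model. The following is a minimal sketch, not part of Ollama or MIRA: the function names `total_ram_gb` and `model_size_hint` are hypothetical, and the thresholds are taken directly from the list above.

```shell
#!/bin/sh
# Sketch: compare installed RAM with the 8 GB / 16 GB guidance above.
# total_ram_gb and model_size_hint are hypothetical helper names.

# Print total physical memory in whole GB (macOS or Linux).
total_ram_gb() {
  case "$(uname -s)" in
    Darwin)
      # hw.memsize reports bytes
      echo $(( $(sysctl -n hw.memsize) / 1024 / 1024 / 1024 )) ;;
    Linux)
      # MemTotal is in kB and slightly below the nominal size, so round up
      awk '/^MemTotal:/ { gb = $2 / 1048576; printf "%d\n", (gb == int(gb)) ? gb : int(gb) + 1 }' /proc/meminfo ;;
    *)
      echo 0 ;;
  esac
}

# Translate a GB figure into the guidance from the prerequisites.
model_size_hint() {
  ram_gb=$1
  if [ "$ram_gb" -ge 16 ]; then
    echo "13B+ models are reasonable"
  elif [ "$ram_gb" -ge 8 ]; then
    echo "stick to 7B models"
  else
    echo "below the 8 GB minimum"
  fi
}

model_size_hint "$(total_ram_gb)"
```

The hint is only as precise as the thresholds it encodes; quantization and context size also affect how much memory a model actually needs.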

Install Ollama and pull a model

# Install (macOS with Homebrew)
brew install ollama

# Pull any model you want to use
ollama pull llama3.1
ollama pull mistral
ollama pull qwen2.5:14b     # 14B: better quality, needs 16 GB RAM

Verify Ollama is running:

ollama list
curl http://localhost:11434/api/tags
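
The two checks above can be wrapped in a small script that fails clearly when the server is down and prints only the model names. This is a rough sketch: `extract_model_names`, `list_ollama_models`, and the `OLLAMA_URL` variable are names assumed for illustration, and the name extraction uses plain-text matching on the `name` fields of the `/api/tags` response (a JSON-aware tool such as jq would be more robust).

```shell
#!/bin/sh
# Sketch: confirm the Ollama API answers and list the models it reports.
# OLLAMA_URL, extract_model_names and list_ollama_models are assumed names.

OLLAMA_URL="${OLLAMA_URL:-http://localhost:11434}"

# Read an /api/tags JSON response on stdin; print one model name per line.
extract_model_names() {
  grep -o '"name"[[:space:]]*:[[:space:]]*"[^"]*"' |
    sed 's/^"name"[[:space:]]*:[[:space:]]*"//; s/"$//'
}

# Query the running server; returns non-zero if it cannot be reached.
list_ollama_models() {
  response=$(curl --silent --fail --max-time 5 "$OLLAMA_URL/api/tags") || return 1
  printf '%s\n' "$response" | extract_model_names
}

# Usage:
#   list_ollama_models || echo "Ollama is not reachable at $OLLAMA_URL"
```

Capturing the response before parsing keeps the curl exit status visible; piping curl straight into grep would hide a failed connection behind the exit status of the last command in the pipeline.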

Connect MIRA to Ollama

1. Start Ollama

Ollama starts automatically at login after installation, or run ollama serve manually.

2. Select Ollama in the Engine tab

Press ⌘, (Command-comma), open the Engine tab, and in the Provider & Model section click the Ollama (local) button.

3. Enter a model name

Type the name of any model you have pulled with ollama pull into the Model ID field. MIRA shows a few suggestions as chips, but any model from ollama list works — just type the exact name.

Suggested model    RAM needed
llama3.2           8 GB
llama3.1           8 GB
mistral            8 GB
qwen2.5-coder      8 GB
deepseek-r1        8 GB

4. Set the Base URL (optional)

The Base URL field appears automatically for Ollama. Default: http://localhost:11434. Change it only if Ollama is running on a different host or port.

5. Save

Click Save & restart bridge. No API key is needed.
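
Before saving, the Model ID and Base URL values from the steps above can be sanity-checked in a terminal. The sketch below is illustrative only: `normalize_base_url` and `model_is_pulled` are hypothetical helpers, not MIRA or Ollama commands. It assumes the usual `ollama list` layout (a header row, then the model name in the first column) and relies on Ollama treating a name without a tag as `NAME:latest`.

```shell
#!/bin/sh
# Sketch: pre-flight checks for the values entered in the Engine tab.
# normalize_base_url and model_is_pulled are hypothetical helper names.

# Strip any trailing slashes so paths such as /api/tags can be appended.
normalize_base_url() {
  url=$1
  while [ "${url%/}" != "$url" ]; do
    url=${url%/}
  done
  printf '%s\n' "$url"
}

# Read `ollama list` output on stdin; succeed if the model is present.
# A name without a tag is compared as NAME:latest.
model_is_pulled() {
  wanted=$1
  case "$wanted" in
    *:*) ;;                          # tag given explicitly
    *)   wanted="$wanted:latest" ;;  # default tag
  esac
  # Skip the header row, then compare the first column exactly.
  awk -v wanted="$wanted" 'NR > 1 && $1 == wanted { found = 1 } END { exit !found }'
}

# Usage:
#   ollama list | model_is_pulled llama3.1 && echo "llama3.1 is available"
#   curl --silent --fail "$(normalize_base_url http://localhost:11434/)/api/tags"
```

The exact-match comparison is deliberate: a model pulled as `qwen2.5:14b` does not satisfy a request for plain `qwen2.5`, which mirrors the instruction to type the exact name.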

Performance tips

  • Use a Mac with Apple Silicon (M1/M2/M3/M4) — Ollama uses Metal GPU acceleration
  • Reduce Context Budget in the Engine → NAE settings when using smaller models (e.g. 20 000–40 000 for 7B models)
  • Keep other heavy applications closed when running 13B+ models

Troubleshooting

Error                          Fix
ECONNREFUSED localhost:11434   Ollama is not running. Run ollama serve.
Model returns garbled output   Pull the model again with ollama pull <name> to ensure a complete download.
Out of memory                  Use a smaller model or increase system swap.
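
For the connection error in the table above, a quick probe with curl can distinguish a stopped server from a wrong Base URL. This is a minimal sketch: `explain_curl_exit` is a hypothetical helper, and the numbers it handles are curl's documented exit codes (6 for an unresolved host, 7 for a failed connection, 28 for a timeout).

```shell
#!/bin/sh
# Sketch: turn the exit code of a curl probe into a troubleshooting hint.
# explain_curl_exit is a hypothetical helper name.

explain_curl_exit() {
  case "$1" in
    0)  echo "Ollama is reachable" ;;
    6)  echo "Host name could not be resolved: check the Base URL" ;;
    7)  echo "Could not connect: Ollama is probably not running, try 'ollama serve'" ;;
    28) echo "Timed out: check the host, port, and any firewall" ;;
    *)  echo "curl failed with exit code $1" ;;
  esac
}

# Usage:
#   curl --silent --output /dev/null --max-time 5 http://localhost:11434/api/tags
#   explain_curl_exit $?
```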