Ollama lets you run open-source LLMs entirely on your machine. No API key or cloud account is required — install Ollama, pull a model (ollama pull llama3), and point MIRA at http://localhost:11434. All inference runs locally, with zero outbound network calls for queries, so your data never leaves your computer. Both the RLM and Native Agent engines support Ollama.

Prerequisites

  • macOS/Linux: Install Ollama from ollama.ai
  • Windows: Ollama is available as a preview for Windows — see ollama.ai/download
  • Minimum 8 GB RAM for 7B models; 16 GB recommended for 13B+ models
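The RAM tiers above follow from weight size: a 4-bit-quantized model needs roughly half a byte per parameter, plus runtime overhead. A rough sketch of that arithmetic (the 20% overhead factor is an assumption, not a measured figure):

```python
def estimate_ram_gb(params_billion: float, bits_per_weight: int = 4,
                    overhead: float = 1.2) -> float:
    """Rough RAM estimate for a quantized model: weights plus ~20% overhead (assumed)."""
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8 bits = 1 GB
    return weight_gb * overhead

print(round(estimate_ram_gb(7), 1))   # a 7B model fits comfortably in the 8 GB tier
print(round(estimate_ram_gb(14), 1))  # a 14B model wants the 16 GB tier
```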

Install Ollama and pull a model

# Install (macOS with Homebrew)
brew install ollama

# Pull any model you want to use
ollama pull llama3.1
ollama pull mistral
ollama pull qwen2.5:14b     # 14B — better quality, needs 16 GB RAM
Verify Ollama is running:
ollama list
curl http://localhost:11434/api/tags
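The curl command above hits Ollama's /api/tags endpoint, which returns JSON listing the installed models. A minimal sketch of reading that response shape in Python — the payload here is a hand-written illustration, not live output:

```python
import json

# Hand-written example of the /api/tags response shape (illustrative values)
payload = json.loads("""
{"models": [
  {"name": "llama3.1:latest", "size": 4661224676},
  {"name": "qwen2.5:14b", "size": 8988124173}
]}
""")

# Each "name" is exactly what you would type into MIRA's Model ID field
installed = [m["name"] for m in payload["models"]]
print(installed)
```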

Connect MIRA to Ollama

1. Start Ollama

Ollama starts automatically at login after installation, or run ollama serve manually.
2. Select Ollama in the Engine tab

Press ⌘, → Engine tab → in the Provider & Model section, click the Ollama (local) button.
3. Enter a model name

Type the name of any model you have pulled with ollama pull into the Model ID field. MIRA shows a few suggestions as chips, but any model from ollama list works — just type the exact name.
| Suggested model | RAM needed |
| --- | --- |
| llama3.2 | 8 GB |
| llama3.1 | 8 GB |
| mistral | 8 GB |
| qwen2.5-coder | 8 GB |
| deepseek-r1 | 8 GB |
4. Set the Base URL (optional)

The Base URL field appears automatically for Ollama. Default: http://localhost:11434. Change it only if Ollama is running on a different host or port.
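If Ollama runs on another machine, start it bound to an external interface (OLLAMA_HOST=0.0.0.0 ollama serve) and enter that host in the Base URL field. A small sketch of normalizing a user-entered Base URL, assuming Ollama's default port 11434 when none is given — the helper name is hypothetical, not part of MIRA:

```python
from urllib.parse import urlparse

DEFAULT_PORT = 11434  # Ollama's default listening port

def normalize_base_url(url: str) -> str:
    """Add a scheme and the default Ollama port when the user omits them (hypothetical helper)."""
    if "://" not in url:
        url = "http://" + url
    parsed = urlparse(url)
    if parsed.port is None:
        url = f"{parsed.scheme}://{parsed.hostname}:{DEFAULT_PORT}"
    return url

print(normalize_base_url("192.168.1.20"))         # http://192.168.1.20:11434
print(normalize_base_url("http://gpu-box:8080"))  # explicit port kept unchanged
```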
5. Save

Click Save & restart bridge. No API key is needed.

Performance tips

  • Use a Mac with Apple Silicon (M1/M2/M3/M4) — Ollama uses Metal GPU acceleration
  • Reduce Context Budget in the Engine → NAE settings when using smaller models (e.g. 20,000–40,000 tokens for 7B models)
  • Keep other heavy applications closed when running 13B+ models
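The context budget matters because the KV cache grows linearly with context length. A rough estimate using typical 7B-class transformer dimensions (32 layers, 8 KV heads, head dimension 128, fp16) — these figures are illustrative assumptions, not measurements of any specific model:

```python
def kv_cache_mb(ctx_tokens: int, layers: int = 32, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_val: int = 2) -> float:
    """KV cache size: 2 (K and V) x layers x kv_heads x head_dim x tokens x bytes."""
    return 2 * layers * kv_heads * head_dim * ctx_tokens * bytes_per_val / 1e6

# With these assumed dimensions, each token costs ~128 KB of cache,
# so doubling the context budget doubles the memory on top of the weights.
print(round(kv_cache_mb(20_000)))
print(round(kv_cache_mb(40_000)))
```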

Troubleshooting

| Error | Fix |
| --- | --- |
| ECONNREFUSED localhost:11434 | Ollama is not running — run ollama serve |
| Model returns garbled output | Pull the model again (ollama pull <name>) to ensure a complete download |
| Out of memory | Use a smaller model or increase system swap |
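The ECONNREFUSED case can be checked directly: a TCP connect to port 11434 fails when nothing is listening there. A minimal sketch, with the host and port defaults taken from this page:

```python
import socket

def ollama_reachable(host: str = "localhost", port: int = 11434,
                     timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers ConnectionRefusedError and timeouts
        return False

if not ollama_reachable():
    print("Ollama is not running; start it with: ollama serve")
```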