How it works
Why code execution matters
When a language model answers a quantitative question without executing code, it generates a statistically likely answer, which may or may not be numerically correct. RLM removes that ambiguity: the Python interpreter computes the answer, not the model's internal statistics. What this means in practice:
- Calculations are computed, not inferred
- Cross-referenced data is matched by code, not by pattern matching
- Statistical results come from actual code calls, not approximations
- If the code raises an exception, the engine sees the actual error and corrects its approach
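To make the distinction concrete, here is the kind of calculation the engine would execute rather than infer. The numbers are arbitrary sample data invented for illustration, not taken from this document:

```python
# Illustrative only: a statistic the interpreter computes exactly,
# instead of the model producing a "statistically likely" number.
import statistics

revenues = [1204.50, 1187.25, 1432.80, 1390.10, 1511.45]

mean = statistics.mean(revenues)    # exact arithmetic, not a guess
stdev = statistics.stdev(revenues)  # sample standard deviation

print(f"mean={mean:.2f} stdev={stdev:.2f}")
```

Run or not, the mean of those five values is 1345.22; executed code returns that figure every time, while an unexecuted model may only approximate it.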
Iteration limit
RLM iterates until either:
- The answer is verified (the code runs successfully and produces a result the model judges correct)
- The max iterations limit is reached (default: 30)
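The loop described above can be sketched as follows. `generate_code`, `execute`, and `looks_correct` are hypothetical stand-ins for the model call, the interpreter, and the model's own verification judgement; only the iteration limit (default 30) comes from the document:

```python
MAX_ITERATIONS = 30  # the documented default

def run_rlm(question, generate_code, execute, looks_correct):
    """Minimal sketch of the verify-or-retry loop (not MIRA's actual code)."""
    feedback = None
    for attempt in range(1, MAX_ITERATIONS + 1):
        code = generate_code(question, feedback)
        try:
            result = execute(code)
        except Exception as exc:
            # The engine sees the actual error and adjusts its approach.
            feedback = f"error: {exc}"
            continue
        if looks_correct(question, result):
            return result  # verified: code ran and the result was judged correct
        feedback = f"unverified result: {result!r}"
    raise RuntimeError(f"no verified answer within {MAX_ITERATIONS} iterations")
```

The key design point is that a raised exception is treated as feedback for the next attempt, not as a terminal failure.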
Execution environment
Python code executes inside the bundled Python virtualenv (mira-venv/) on your local machine. There is no sandbox — the code runs with the same OS user permissions as the MIRA process:
- It can read and write any file your OS user can access
- It can make outbound network calls
- It can import any library installed in the virtualenv
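A few quick, runnable checks that make the points above concrete; the file written here is a throwaway temp file, and the interpreter-path comment describes what you would see inside an RLM session:

```python
import os
import sys
import tempfile

print(sys.executable)  # inside an RLM session, this points into mira-venv/
print(getattr(os, "getuid", lambda: "n/a")())  # same OS user as the MIRA process

# File access: anything your OS user can touch, with no sandbox in between.
with tempfile.NamedTemporaryFile(mode="w", suffix=".txt", delete=False) as f:
    f.write("written by RLM-executed code\n")
    path = f.name

with open(path) as fh:
    print(fh.read(), end="")

os.remove(path)
```

Treat generated code accordingly: review it before pointing RLM at sensitive directories or credentials.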
Provider support
| Provider | Models (examples) |
|---|---|
| AWS Bedrock | eu/us.anthropic.claude-sonnet-4, claude-3-5-sonnet-v2, claude-3-5-haiku |
| Anthropic | claude-3-5-sonnet, claude-3-5-haiku, claude-3-opus |
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo |
| Ollama | llama3.2, mistral, qwen2.5-coder, deepseek-r1, and any local model |
Bedrock model IDs carry a region prefix and version suffix, for example: eu.anthropic.claude-sonnet-4-20250514-v1:0
Concurrency
RLM supports up to 3 concurrent sessions by default (configurable). If you send a query while 3 sessions are already in flight, MIRA shows a concurrency-limit error and queues the request.
Configuration reference
See RLM Settings Reference for all configurable parameters with defaults and allowed ranges.
Best used for
- Complex data analysis — “analyse this CSV and find the top 3 factors driving churn”
- Multi-step calculations — financial modelling, statistical tests, simulations
- Cross-referencing — “find every row in this dataset that contradicts this policy document”
- Anything where being provably correct matters more than being fast
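As a toy version of the cross-referencing use case, the snippet below flags dataset rows that violate a policy rule. The data and the 20% discount rule are invented for illustration; in a real session RLM would generate equivalent code against your actual files:

```python
import csv
import io

# Hypothetical policy rule: discounts must not exceed 20%.
policy_max_discount = 0.20

# Inline stand-in for a CSV file on disk.
raw = io.StringIO(
    "order_id,discount\n"
    "1001,0.10\n"
    "1002,0.35\n"
    "1003,0.18\n"
    "1004,0.50\n"
)

# Violations are matched by code, not by the model's pattern matching.
violations = [
    row["order_id"]
    for row in csv.DictReader(raw)
    if float(row["discount"]) > policy_max_discount
]
print(violations)  # → ['1002', '1004']
```

Because the comparison runs in the interpreter, every flagged row is provably a violation rather than a plausible-looking guess.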