The RLM Engine (Recursive LLM) approaches problems the way a careful analyst does: by writing Python code to investigate, executing it, observing what actually happened, and refining until the result is verified.

How it works

Question received
        │
        ▼
Write Python code to investigate
        │
        ▼
Execute in a local Python REPL
        │
        ▼
Observe the actual output
        │
        ▼
Verified? ──No──► Refine reasoning, rewrite code, repeat
        │
       Yes
        │
        ▼
Return a confident, evidence-backed response
Every iteration streams to you in real time via the REPL Console. Every piece of code that ran, every output it produced — visible, auditable, honest.
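The loop above can be sketched in a few lines of Python. This is purely illustrative: `execute` and `rlm_answer` are hypothetical names, not MIRA's actual API, and real verification is a model judgment rather than a simple success check.

```python
# Minimal sketch of the RLM loop. All names here are hypothetical and only
# illustrate the control flow: draft code, run it, observe, refine or stop.

def execute(code):
    """Run a code draft in a fresh namespace; capture the result or the error."""
    env = {}
    try:
        exec(code, env)
        return {"ok": True, "result": env.get("result")}
    except Exception as exc:
        return {"ok": False, "error": repr(exc)}

def rlm_answer(drafts, max_iterations=30):
    """Try successive code drafts until one runs and produces a result."""
    history = []  # every code/output pair stays auditable, as in the REPL Console
    for code in drafts[:max_iterations]:
        outcome = execute(code)          # observe the actual output
        history.append((code, outcome))
        if outcome["ok"]:                # stand-in for "the model judges it correct"
            return outcome["result"], history
    return None, history                 # limit reached: best effort only

# The first draft raises; the loop sees the real error and the next draft succeeds.
answer, trace = rlm_answer(["result = 1 / 0", "result = sum(range(10))"])
```

In the real engine the next draft is informed by the previous error, which is the step this sketch elides.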

Why code execution matters

When a language model answers a quantitative question without executing code, it generates a statistically likely answer — which may or may not be numerically correct. RLM removes that ambiguity: the Python interpreter computes the answer, not the model’s internal statistics. What this means in practice:
  • Calculations are computed, not inferred
  • Data is cross-referenced by executed code, not by token-level pattern matching
  • Statistical results come from real library calls, not approximations
  • If the code raises an exception, the engine sees the actual error and corrects its approach
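The difference is concrete: the values below are computed by the interpreter, and even the failure case yields a real traceback the engine can react to. The numbers are arbitrary sample data, not anything MIRA-specific.

```python
# A quantitative claim settled by computation, not by a statistically likely guess.
import statistics

values = [12.5, 9.1, 14.3, 8.7, 11.0]   # arbitrary sample data
mean = statistics.mean(values)            # exact: 11.12, not an estimate
spread = statistics.stdev(values)         # exact sample standard deviation

# An exception is evidence too: the engine sees the actual error and adapts.
try:
    statistics.stdev([42])                # stdev needs at least two data points
except statistics.StatisticsError as exc:
    error_seen = str(exc)                 # the real message, not a guess at one
```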

Iteration limit

RLM iterates until either:
  • The answer is verified (the code runs successfully and produces a result the model judges correct)
  • The max iterations limit is reached (default: 30)
At the limit, RLM returns its best answer based on all iterations completed, along with a note that it did not reach a verified conclusion.
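A sketch of that limit behavior, assuming a hypothetical result shape (`MAX_ITERATIONS`, the dict keys, and `finish` are illustrative, not MIRA's actual settings or API):

```python
# Hypothetical sketch of how a run ends; only the two outcomes from the
# text are modeled: verified, or best effort with an explicit caveat.
MAX_ITERATIONS = 30  # default; configurable per the settings reference

def finish(verified, best_answer, iterations):
    if verified:
        return {"answer": best_answer, "verified": True}
    # Limit reached: still return the best answer, flagged as unverified
    return {
        "answer": best_answer,
        "verified": False,
        "note": f"stopped after {iterations} iterations without a verified conclusion",
    }
```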

Execution environment

Python code executes inside the bundled Python virtualenv (mira-venv/) on your local machine. There is no sandbox — the code runs with the same OS user permissions as the MIRA process:
  • It can read and write any file your OS user can access
  • It can make outbound network calls
  • It can import any library installed in the virtualenv
This is intentional: unrestricted execution is what allows RLM to work with real local data and external APIs. The tradeoff is that you should only run queries you trust — or review the generated code in the REPL Console before it executes.
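What "no sandbox" means in practice can be probed with ordinary Python: anything your OS user can do, generated code can do. This snippet only exercises a temp file and checks that the socket API is available; the paths and checks are illustrative.

```python
# Unsandboxed execution: generated code has the same reach as your OS user.
import os
import socket
import tempfile

# Filesystem: any file your user can write is writable from generated code.
with tempfile.NamedTemporaryFile("w", delete=False, suffix=".txt") as f:
    f.write("written by generated code")
    path = f.name
contents = open(path).read()
os.unlink(path)  # clean up the probe file

# Network: outbound calls are permitted (API availability checked, no request made).
can_resolve = callable(socket.getaddrinfo)
```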

Provider support

Provider       Models (examples)
AWS Bedrock    eu/us.anthropic.claude-sonnet-4, claude-3-5-sonnet-v2, claude-3-5-haiku
Anthropic      claude-3-5-sonnet, claude-3-5-haiku, claude-3-opus
OpenAI         gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo
Ollama         llama3.2, mistral, qwen2.5-coder, deepseek-r1, and any local model
Default: AWS Bedrock with eu.anthropic.claude-sonnet-4-20250514-v1:0

Concurrency

RLM supports up to 3 concurrent sessions by default (configurable). If you send a query while 3 sessions are already in flight, MIRA shows a concurrency-limit error and queues the request.
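The effect of the cap can be sketched with a semaphore; MIRA's actual queueing is internal, so `MAX_SESSIONS` and `try_start_session` here are illustrative only.

```python
# Sketch of a 3-session concurrency cap using a bounded semaphore.
# This models the limit's effect, not MIRA's real queueing implementation.
import threading

MAX_SESSIONS = 3  # default, configurable
slots = threading.BoundedSemaphore(MAX_SESSIONS)

def try_start_session():
    """Return True if a session slot is free, False if the query must wait."""
    return slots.acquire(blocking=False)

started = [try_start_session() for _ in range(4)]
# The first three queries start; the fourth hits the concurrency limit.
```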

Configuration reference

See RLM Settings Reference for all configurable parameters with defaults and allowed ranges.

Best used for

  • Complex data analysis — “analyse this CSV and find the top 3 factors driving churn”
  • Multi-step calculations — financial modelling, statistical tests, simulations
  • Cross-referencing — “find every row in this dataset that contradicts this policy document”
  • Anything where being provably correct matters more than being fast
If speed matters more than verifiability, switch to the Native Agent Engine — it reaches conclusions faster for open-ended questions where code execution isn’t needed.