Adding an Eval Type

The four built-in eval types (rule, similarity, llm_judge, metric) cover most use cases. If you need a custom scoring strategy, you can add a new executor function following the same pattern as the existing ones in src/main/eval-executors/.

1. Understand the existing pattern

Each eval type is a standalone async (or sync) function in src/main/eval-executors/. There is no shared class hierarchy — each executor receives a typed config and returns a typed result:

// src/main/eval-executors/rule-executor.ts (simplified)
export async function executeRule(
  config: RuleConfig,
  output: string
): Promise<RuleResult> { … }

// src/main/eval-executors/metric-executor.ts (simplified)
export function executeMetric(
  config: MetricConfig,
  metrics: MetricInput
): MetricResult { … }

2. Define the config and result types

Add your new config type to src/shared/eval-types.ts and extend the EvalType and EvalConfig unions. The result type is defined alongside the executor in step 3:

// src/shared/eval-types.ts — extend EvalType union
export type EvalType = 'llm_judge' | 'rule' | 'similarity' | 'metric' | 'my_custom'

// Add config interface
export interface MyCustomConfig {
  expectedFormat: string // example field
  caseSensitive: boolean
}

// EvalConfig union — add your config
export type EvalConfig =
  | RuleConfig
  | LlmJudgeConfig
  | SimilarityConfig
  | MetricConfig
  | MyCustomConfig

3. Implement the executor

Create a new file in src/main/eval-executors/:

// src/main/eval-executors/my-custom-executor.ts
import type { MyCustomConfig } from '../../shared/eval-types'
import log from 'electron-log/node'

export interface MyCustomResult {
  passed: boolean
  detail: string
  error?: string
}

export function executeMyCustom(config: MyCustomConfig, output: string): MyCustomResult {
  try {
    const target = config.caseSensitive
      ? config.expectedFormat
      : config.expectedFormat.toLowerCase()
    const actual = config.caseSensitive ? output : output.toLowerCase()

    const passed = actual.includes(target)
    return {
      passed,
      detail: passed
        ? `Output contains expected format "${config.expectedFormat}"`
        : `Output does not contain "${config.expectedFormat}"`,
    }
  } catch (err) {
    log.error('MyCustomExecutor error:', err)
    return { passed: false, detail: '', error: String(err) }
  }
}
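One edge case in the substring check above: String.prototype.includes returns true for an empty search string, so an empty expectedFormat would pass against any output. The following is a minimal sketch of a guard, assuming you would rather report an empty expectedFormat as a configuration error. The containsExpected helper and the ContainsResult name are hypothetical and only mirror the shape of MyCustomResult.

```typescript
// Hypothetical helper, not part of the codebase: same matching rule as
// executeMyCustom, but an empty expected string is reported as an error.
interface ContainsResult {
  passed: boolean
  detail: string
  error?: string
}

function containsExpected(
  output: string,
  expected: string,
  caseSensitive: boolean
): ContainsResult {
  // 'abc'.includes('') is true, so reject the empty string before matching.
  if (expected.length === 0) {
    return { passed: false, detail: '', error: 'expectedFormat must not be empty' }
  }
  const target = caseSensitive ? expected : expected.toLowerCase()
  const actual = caseSensitive ? output : output.toLowerCase()
  const passed = actual.includes(target)
  return {
    passed,
    detail: passed
      ? `Output contains expected format "${expected}"`
      : `Output does not contain "${expected}"`,
  }
}
```

Whether an empty expectedFormat should fail, pass, or be rejected in the config UI is a design choice; the sketch only shows one option.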

4. Wire it into the worker

The src/workers/eval-worker.ts file dispatches to each executor based on eval.type. Add a case for your new type:

// src/workers/eval-worker.ts — inside the switch on eval.type
// (MyCustomConfig must be imported as a type from '../shared/eval-types' at the top of the file)
case 'my_custom': {
  const { executeMyCustom } = await import('../main/eval-executors/my-custom-executor')
  const result = executeMyCustom(eval.config as MyCustomConfig, run.outputResponse)
  score = result.passed ? 1 : 0
  passed = result.passed
  detail = result.detail
  error = result.error
  break
}
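When a new member is added to the EvalType union, it is easy to miss a switch that should handle it. A common TypeScript technique is an exhaustiveness check with the never type, sketched below under assumptions: the EvalType union is re-declared locally to keep the example self-contained, and assertNever and describeEvalType are hypothetical names. Whether the real worker switch has a default branch is not stated here.

```typescript
// Assumption: mirrors the EvalType union in src/shared/eval-types.ts.
type EvalType = 'llm_judge' | 'rule' | 'similarity' | 'metric' | 'my_custom'

// Hypothetical helper: the parameter type `never` means any call with an
// unhandled union member is a compile-time error.
function assertNever(value: never): never {
  throw new Error(`Unhandled eval type: ${String(value)}`)
}

// Stand-in for the worker dispatch: returns a label instead of running an executor.
function describeEvalType(type: EvalType): string {
  switch (type) {
    case 'llm_judge':
      return 'LLM judge'
    case 'rule':
      return 'Rule'
    case 'similarity':
      return 'Similarity'
    case 'metric':
      return 'Metric'
    case 'my_custom':
      return 'My custom'
    default:
      // If a case above is removed, `type` is no longer `never` here
      // and this line stops type-checking.
      return assertNever(type)
  }
}
```

With this pattern, deleting the 'my_custom' case (or adding a sixth union member without a case) makes the compiler flag the default branch.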

5. Add the config UI

In src/renderer/src/components/evals/, add a settings component for your config fields and register it in the eval case form. Follow the pattern of the existing RuleConfig form component.

6. Add a DB migration (if needed)

The config is stored as JSON in the eval_definitions settings blob, so adding fields for your executor does not require a column migration. Just add the fields to the TypeScript interface.
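Because the config is persisted as JSON, the TypeScript interface is not enforced when a stored value is read back, and rows saved before a field was added will not contain it. A minimal sketch of a runtime type guard follows, assuming you want to validate the parsed value before passing it to the executor. The isMyCustomConfig function is hypothetical, and the interface is re-declared locally only to keep the example self-contained.

```typescript
// Assumption: mirrors MyCustomConfig from src/shared/eval-types.ts.
interface MyCustomConfig {
  expectedFormat: string
  caseSensitive: boolean
}

// Hypothetical type guard: narrows an unknown parsed value to MyCustomConfig
// by checking that both fields exist with the expected primitive types.
function isMyCustomConfig(value: unknown): value is MyCustomConfig {
  if (typeof value !== 'object' || value === null) return false
  const candidate = value as Record<string, unknown>
  return (
    typeof candidate['expectedFormat'] === 'string' &&
    typeof candidate['caseSensitive'] === 'boolean'
  )
}

// A complete config parses and passes the guard.
const complete: unknown = JSON.parse('{"expectedFormat":"Conclusion","caseSensitive":false}')

// A row saved without `caseSensitive` fails the guard.
const incomplete: unknown = JSON.parse('{"expectedFormat":"Conclusion"}')

const completeIsValid = isMyCustomConfig(complete)
const incompleteIsValid = isMyCustomConfig(incomplete)
```

How to handle a value that fails the guard (apply defaults, or surface it through the result's error field) is left open here.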

7. Write tests

// src/main/eval-executors/my-custom-executor.test.ts
import { executeMyCustom } from './my-custom-executor'

describe('executeMyCustom', () => {
  it('passes when output contains expected format', () => {
    const result = executeMyCustom(
      { expectedFormat: 'Conclusion', caseSensitive: false },
      'In Conclusion, the analysis shows…'
    )
    expect(result.passed).toBe(true)
  })

  it('fails when expected format is absent', () => {
    const result = executeMyCustom(
      { expectedFormat: 'Conclusion', caseSensitive: false },
      'The analysis shows mixed results.'
    )
    expect(result.passed).toBe(false)
  })
})

8. Update docs

Add your eval type to the Eval Framework section in docs/docs.json and create a new page at docs/eval-framework/my-custom.mdx following the pattern of the existing eval type pages.
