# Agent engines: Codex and Hermes

> Two engines power Agent Mode. Codex is the polished default. Hermes is the open, fully-local alternative. Here's when each one wins.

URL: <https://heyyykatchy.com/docs/agent-engines>

When you hand a long task to Katchy — clean up the Desktop, summarise this 60-page PDF, rename every screenshot by content — an agent engine spawns in the background, plans the work, executes step by step, and reports back on a small dock card. Katchy ships with two engines you can swap in **Settings → Agent → Engine**. Same UX, same hotkey, same permission model. Different brain underneath.

## Codex — the default

Codex is what runs the first time you open Katchy. It routes each agent task to a frontier OpenAI model for reasoning, and uses Katchy's own bundled runtime to handle the planning loop, tool calls, and dock-card lifecycle. The model picks itself based on the task — quick edits go to faster models, long-context work routes to the deeper ones.

- Polished out of the box — the engine the app was built around first.
- Talks to OpenAI's API; needs your own `OPENAI_API_KEY` pasted into Settings.
- Best on multi-step tasks that benefit from a top-tier reasoning model: research, refactors, long-document work.
- Pay-per-token to OpenAI — typically cents per task, depending on length.

## Hermes — the open alternative

Hermes wraps the `hermes-agent` CLI from Nous Research. Katchy ships the entire Python runtime (~360 MB) + the hermes-agent install **bundled inside Katchy.app** — there's no `pip install`, no Terminal, no setup wizard. Switch to Hermes in Settings and the next agent task runs through the local Python instead of the OpenAI API.

- Fully local agent loop — the planning, tool selection, and self-checking all happen on your Mac.
- Zero install. The runtime ships with the app and is signed + notarised as part of the build.
- No OpenAI key required for the agent itself; bring your own model endpoint if you want frontier-class reasoning, or run a smaller local model.
- Best for privacy-sensitive work, offline-friendly workflows, and anyone curious about where open agent tooling is headed.

> **Note.** Hermes here is the open-source agent loop from Nous Research — same project name, completely unrelated to Facebook's old JavaScript engine. The bundled CLI lives at `Katchy.app/Contents/Resources/HermesRuntime` if you ever want to inspect it.

## Side by side

Both engines obey the same agent contract — same hotkey, same permissions, same dock card, same destructive-action prompt, same Cmd-Z. What differs is everything below the UX layer.

- **Setup** — Codex needs an OpenAI key; Hermes works the moment Katchy launches.
- **Network** — Codex sends each task to OpenAI's servers; Hermes runs the loop locally and only reaches the network if you've wired it to a remote model.
- **Cost** — Codex is pay-per-token; Hermes has zero per-task cost (you pay once, in download size).
- **Speed** — Codex is usually faster on first-token because OpenAI's models are larger and warmer; Hermes is competitive for short tasks and unmatched for offline use.
- **Ceiling** — Codex inherits OpenAI's latest reasoning capability; Hermes inherits whatever open model you point it at.

## Which one should you pick

1. **Start with Codex.** It's the default for a reason — it's the engine the rest of Katchy was tuned against, and the one most likely to handle whatever you throw at it on the first try.
2. **Switch to Hermes** if any of the following hits: you don't want to paste an OpenAI key, you're working on something private enough that you'd rather not send it over the wire, you're on a flaky connection, or you're curious about open agent tooling and want to test-drive it without leaving the app.
3. **You can flip between them per session** — the choice lives in Settings → Agent → Engine and takes effect on the next agent task. In-flight tasks keep running on whichever engine started them.

## What stays the same either way

Same Control + Option hotkey starts and stops the agent. Same dock card surfaces progress. Same macOS permissions gate every file read, click, and shell command. The engine choice is plumbing — your habits don't change when you switch.

## Related pages

- [Agent Mode](https://heyyykatchy.com/docs/agent-mode) — the broader Agent Mode tour.
- [Privacy](https://heyyykatchy.com/docs/privacy) — what each engine sends, and where.
- [Keyboard shortcuts](https://heyyykatchy.com/docs/keyboard-shortcuts) — the single hotkey both engines respect.

Made by Rokas Rudzianskas and Vakarė.
