# Voice mode

> Push-to-talk audio in, friendly TTS reply out. The everyday loop.

Voice mode is the loop you'll spend most of your Katchy time in. Hold the hotkey, ask, listen. No chat window, no copy-paste, no tab-switching. Eyes stay on your work.

## The loop

1. Hold **Control + Option**.
2. Speak one sentence — a question, a task, a clarification.
3. Release. Katchy snapshots your screen, picks the right model, replies in your ear.
4. If a UI element matters to the answer, the cursor flies to it.
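
For readers who think in code, the four steps above can be pictured as a small state machine. This is only an illustrative sketch, not Katchy's actual implementation: `VoiceLoop`, `State`, and the log strings are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class State(Enum):
    IDLE = auto()
    LISTENING = auto()


@dataclass
class VoiceLoop:
    """Hypothetical model of the hold / speak / release loop."""

    state: State = State.IDLE
    log: List[str] = field(default_factory=list)

    def hotkey_down(self) -> None:
        # Step 1: holding the hotkey starts audio capture.
        if self.state is State.IDLE:
            self.state = State.LISTENING
            self.log.append("listening")

    def hotkey_up(self, transcript: str, highlight: Optional[str] = None) -> None:
        # Releasing without a prior hold does nothing.
        if self.state is not State.LISTENING:
            return
        # Step 3: snapshot the screen, choose a model, speak the reply.
        self.log.append("snapshot screen")
        self.log.append("pick model")
        self.log.append(f"speak reply to: {transcript}")
        # Step 4: move the cursor only when a UI element matters.
        if highlight is not None:
            self.log.append(f"move cursor to: {highlight}")
        self.state = State.IDLE


loop = VoiceLoop()
loop.hotkey_down()
loop.hotkey_up("what does this button do?", highlight="Export button")
```

The point of the sketch is that each press is a single, self-contained round trip: the loop always returns to idle after one reply.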

## Where it shines

- Quick "what's this" questions about whatever's on screen.
- Catching yourself before a typo or a wrong click.
- Reading something dense — "summarise this paragraph."
- Pair-programming style nudges — "is this the right hook to use?"
- Hands-free moments — kitchen, gym, sketchpad.

## When to switch to Agent Mode instead

Voice mode is conversational and synchronous: Katchy reads, thinks, and replies, and then the exchange is over. The moment you find yourself saying "and then" twice, you want [Agent Mode](/docs/agent-mode), which can run for minutes, touch files, open apps, and report back when it's done.

## Transcription choices

By default Katchy uses Apple's on-device speech recogniser, so no audio leaves the Mac. You can switch to a cloud provider (Deepgram or AssemblyAI) in Settings → Voice if you want faster transcription on longer questions; you'll need to supply your own API key.
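
As a rough mental model of that choice, the sketch below resolves which transcriber would be used for a given setting. Everything here is hypothetical: `VoiceSettings`, `resolve_provider`, and the provider strings are made up for illustration, and the fallback to on-device recognition when no key is set is an assumption, not documented Katchy behaviour.

```python
from dataclasses import dataclass
from typing import Optional

ON_DEVICE = "apple-on-device"
CLOUD_PROVIDERS = {"deepgram", "assemblyai"}


@dataclass(frozen=True)
class VoiceSettings:
    """Hypothetical stand-in for the Settings → Voice panel."""

    provider: str = ON_DEVICE
    api_key: Optional[str] = None


def resolve_provider(settings: VoiceSettings) -> str:
    # A cloud provider is only usable with a user-supplied key.
    # Assumption: without a key we fall back to on-device recognition.
    if settings.provider in CLOUD_PROVIDERS and settings.api_key:
        return settings.provider
    return ON_DEVICE


def audio_leaves_mac(settings: VoiceSettings) -> bool:
    # Audio stays local unless a cloud provider is actually in use.
    return resolve_provider(settings) != ON_DEVICE


default_settings = VoiceSettings()
cloud_settings = VoiceSettings(provider="deepgram", api_key="YOUR_KEY_HERE")
```

With `default_settings`, `audio_leaves_mac` returns `False`; with `cloud_settings`, it returns `True`, which is the trade-off to keep in mind when opting in to a cloud provider.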

---

- Next up: [Agent Mode](/docs/agent-mode)
- Full docs index: <https://heyyykatchy.com/docs>
