Shadowtype runs an actual language model on your Apple Silicon chip, not in the cloud. It reads what you’re typing and predicts the rest as inline ghost text in any app; Tab accepts a word, ⌥Tab a whole line. Inference happens locally with llama.cpp + Metal, so completions typically land within about 150 ms on an M2, and no keystroke is ever sent to a server.
Cloud autocomplete streams your keystrokes to a server, runs the model there, and sends predictions back. Local AI autocomplete flips that: the model is downloaded once and every prediction is computed on your own silicon. Here’s the loop.
As you write in any text field, Shadowtype reads the surrounding text locally through macOS accessibility — never by uploading it anywhere.
A GGUF model (Gemma 4 or Qwen3.5, your pick) runs through llama.cpp with Metal acceleration on your Apple Silicon GPU. No API call, no round-trip latency.
The prediction appears as dimmed inline ghost text right where you’re typing, typically within about 150 ms on an M2, fast enough to feel like part of the keyboard.
Tab accepts a word, ⌥Tab a whole line. Don’t like it? Just keep typing and it dissolves. You stay in control of every character.
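The accept-or-dismiss step can be pictured with a small sketch. This is illustrative only, not Shadowtype's implementation: the `GhostText` class and its method names are hypothetical, and the word and line boundaries are simple assumptions (whitespace ends a word, a newline ends a line).

```python
from dataclasses import dataclass


@dataclass
class GhostText:
    """Hypothetical holder for the dimmed suggestion shown after the cursor."""

    suggestion: str = ""

    def accept_word(self) -> str:
        """Tab: accept any leading whitespace plus the next word."""
        s = self.suggestion
        i = 0
        while i < len(s) and s[i].isspace():
            i += 1
        while i < len(s) and not s[i].isspace():
            i += 1
        accepted, self.suggestion = s[:i], s[i:]
        return accepted

    def accept_line(self) -> str:
        """Option-Tab: accept up to the end of the current line."""
        # If the suggestion starts at a line break, take that break too,
        # so repeated presses always make progress.
        head = "\n" if self.suggestion.startswith("\n") else ""
        line, sep, rest = self.suggestion[len(head):].partition("\n")
        self.suggestion = sep + rest
        return head + line

    def on_typed(self, ch: str) -> None:
        """Any typed character dismisses the pending suggestion."""
        self.suggestion = ""


ghost = GhostText(" world again\nnext")
first = ghost.accept_word()   # accepted text is inserted at the cursor
rest = ghost.accept_line()    # accepts the remainder of the current line
ghost.on_typed("x")           # typing anything else clears what is left
```

Whatever is accepted gets inserted at the cursor; whatever remains stays dimmed until the next keypress decides its fate.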
No network round-trip means no waiting on a server. Ghost text typically lands within about 150 ms on an M2 because the LLM runs right on your GPU via Metal. Lighter models go faster still.
Because nothing is sent for completion, your words can’t leak. Shadowtype is private by design with zero telemetry and no account — the model simply never talks to the internet.
There’s no inference API metering tokens behind the scenes, so there’s no per-use cost and no subscription to fund a server. You pay $79 once (Founders from $39) and own it.
Pick from a catalog of free GGUF models — Gemma 4 and Qwen3.5 variants. Choose a small model for instant ghost text or a larger one for richer completions. Swap any time.
Mail, Slack, Notes, your editor, the browser — continuous inline completion anywhere you can type, plus selection rewrite on ⌥⌘K when you want to reshape text you’ve already written.
Once the model is on disk, completion works offline — on a plane, behind a firewall, anywhere. There’s no server to be down and no account to expire.
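To make "a GGUF model on disk, run in-process" concrete, here is a minimal sketch using the open-source `llama-cpp-python` bindings for llama.cpp. It is an assumption for illustration: the page does not say how Shadowtype calls llama.cpp internally, and the model path, context size, and sampling settings below are placeholders rather than the app's real values.

```python
from pathlib import Path

# Hypothetical location of a downloaded GGUF file (placeholder, not Shadowtype's path).
MODEL_PATH = Path.home() / "models" / "example.gguf"


def completion_kwargs(context: str, max_tokens: int = 16) -> dict:
    """Arguments for one short, single-line completion (assumed settings)."""
    return {
        "prompt": context,
        "max_tokens": max_tokens,  # short predictions keep latency low
        "temperature": 0.2,
        "stop": ["\n"],            # stop at the end of the current line
    }


def load_model(model_path: Path = MODEL_PATH):
    """Load the model once; requires `pip install llama-cpp-python`."""
    from llama_cpp import Llama  # imported lazily so the helper above runs without it

    return Llama(
        model_path=str(model_path),
        n_gpu_layers=-1,  # offload all layers to the GPU (Metal on Apple Silicon builds)
        n_ctx=2048,
        verbose=False,
    )


def complete(llm, context: str) -> str:
    """Run one completion locally; no network call is involved."""
    out = llm(**completion_kwargs(context))
    return out["choices"][0]["text"]
```

With a model file in place, `complete(load_model(), "Thanks for the quick")` would return the predicted continuation. Nothing in this path opens a network connection, which is why the same call works offline.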
Plenty of apps say “AI” while quietly calling a cloud API. Shadowtype runs the whole model on your machine and personalizes to you — locally. You can add per-app instructions (a different voice for Mail than for Slack), and it adapts to your phrasing over time, all without sending a single byte off-device. If you’re weighing it against a hosted-completion tool, see how it compares or the full feature list.
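Per-app instructions can be thought of as a lookup that runs before the text reaches the model. The sketch below is a guess at the shape of that step, not Shadowtype's actual format: the bundle identifiers, the instruction wording, and the `build_prompt` helper are all hypothetical.

```python
DEFAULT_INSTRUCTION = "Continue the text in the user's own voice."

# Hypothetical per-app instructions keyed by app bundle identifier (illustrative keys).
APP_INSTRUCTIONS = {
    "com.apple.mail": "Write in a polite, complete-sentence email style.",
    "com.tinyspeck.slackmacgap": "Keep it short and casual.",
}


def build_prompt(bundle_id: str, context: str, max_context_chars: int = 2000) -> str:
    """Prefix the most recent text with the instruction for the active app."""
    instruction = APP_INSTRUCTIONS.get(bundle_id, DEFAULT_INSTRUCTION)
    tail = context[-max_context_chars:]  # keep only the text nearest the cursor
    return f"{instruction}\n\n{tail}"


mail_prompt = build_prompt("com.apple.mail", "Hi Sam,\nThanks for")
other_prompt = build_prompt("com.example.unknown", "Quick note:")
```

Because both the lookup table and the prompt are built on the machine, a different voice per app needs no server-side profile.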
Download Shadowtype free, accept your first word with Tab, and upgrade once — never monthly.