Does Shadowtype really work offline?

Yes, completely. Predictions run on a local model with llama.cpp and Metal on your own Apple Silicon. There is no inference network call — ever. You can turn off Wi-Fi, switch to Airplane Mode, or work air-gapped, and Shadowtype keeps finishing your sentences with no loss of quality or speed.

Is it good for flights and secure environments?

Yes. Because inference is fully local, Shadowtype is ideal for flights, trains, spotty connections, and air-gapped or secure environments where cloud autocomplete simply can't run. There is no per-API-call cost and no network latency — it's the same instant ghost text whether you're online or offline.

Do I need an account to use it?

No. Shadowtype has no account and no sign-in — it is free and open source, with no login at all and no activation. Zero telemetry, no analytics backend.

Offline text prediction · macOS

Offline text prediction for Mac:
autocomplete with no internet.

Shadowtype finishes your sentences as inline ghost text in any app — and it does it without ever calling a server. There is no inference network request, period. Pull the Wi-Fi, board the flight, work air-gapped: it keeps predicting at full speed because the model runs 100% on your own Apple Silicon with llama.cpp and Metal.

Download Shadowtype free View source

Works fully offline
No inference call ever
Zero telemetry
Free & open source

Why offline matters

The autocomplete that doesn’t need the cloud

Cloud autocomplete dies the moment your connection does — and it ships every keystroke to someone else’s server. Shadowtype runs the whole prediction loop on your Mac, so the network is never in the path between you and your next word.

No inference network call — ever

Predictions are computed locally with a downloaded model. There is no API request to OpenAI, to us, or to anyone, for any completion. Turn off Wi-Fi and watch the ghost text keep flowing — same words, same instant speed.

Built for flights & spotty Wi-Fi

Airplane Mode, a train tunnel, a hotel connection that drops every minute — none of it matters. Latency-free local inference means there’s no spinner, no “reconnecting,” no degraded suggestions when the bars disappear.

Safe for air-gapped & secure work

If your machine can’t talk to the internet by policy, Shadowtype still works. Nothing you type is transmitted because there’s nowhere for it to go — no telemetry backend, no analytics, no account. Set it up once, then disconnect forever.

No per-call cost, no rate limits

Cloud completion bills per token and throttles you under load. Local inference is free to run as hard as your Mac allows — type all day, every day, with no usage meter on the model itself.

Local API + MCP for developers

An OpenAI-compatible HTTP endpoint runs on 127.0.0.1 so your scripts, editors, and agents can call the same on-device model — no key, no cloud. A built-in MCP bridge plugs into Claude Code and other MCP clients, and the BYOM picker loads any GGUF.

What actually touches the network

Three optional connections. None of them your text.

Being honest about “offline” means naming every byte that could leave the Mac. Here is the complete list — and your keystrokes, prompts, and completions are on none of it.

One-time model download

Pick a GGUF model from the catalog (Qwen3 Base, Gemma 3, MoE variants — or bring your own) and it downloads once to your Mac. After that the file is local forever — re-launch on a plane and it loads straight from disk.

No account, no activation

There is no license and no activation. Shadowtype is free and open source — it runs offline indefinitely with no recheck and no phone-home, from the moment you install it.

Optional update check

Shadowtype can check for a new version periodically so you get fixes and new models. It’s the only recurring outbound traffic, it carries none of your content, and you can simply ignore it.

Want the why behind the architecture? See how the local on-device LLM runs entirely on your hardware, why that makes it private by construction, and the full feature list.

Same app, online or off

Everything works the same with the cable unplugged

Continuous ghost text in any app

Mail, Notes, Slack, your editor, a web form — Shadowtype predicts inline as you type. Tab accepts a word, ⌥Tab accepts the whole line. Offline changes none of it.

On-device selection rewrite

Select any text and press ⌥⌘K to rewrite it — shorter, clearer, a different tone — all computed locally. No connection required, your draft never leaves the machine.

Your choice of local model

Run a compact Qwen3 Base or Gemma 3 for speed, or an MoE variant for richer predictions — all downloaded once and executed on-device. Swap models anytime, or drop in your own GGUF; the inference path stays fully offline.

Questions

Offline autocomplete FAQ

Does it really work offline?

Completely. Predictions run on a local model with llama.cpp and Metal on your Apple Silicon — there is no inference network call, ever. Turn off Wi-Fi or go to Airplane Mode and the ghost text keeps coming at the same speed and quality. The connection is never in the path between your keystroke and your next word.

What network calls happen at all?

Exactly two, and neither carries your text: a one-time model download when you first choose a model, and an optional periodic update check you can ignore. Your keystrokes, prompts, and completions never leave the Mac — there’s no telemetry or analytics backend.

Is it good for flights or secure environments?

Yes — that’s where local inference shines. On a plane, a train, a flaky hotel connection, or an air-gapped secure machine, cloud autocomplete can’t run at all. Shadowtype doesn’t care: same instant suggestions, no spinner, no per-API-call cost, no latency. Set it up once, then disconnect for good.

Do I need an account?

No account, no sign-in. Shadowtype is free and open source — it needs no login and no activation, and runs offline indefinitely on as many of your Macs as you like. Requires macOS 14+ on Apple Silicon.

Can my own scripts, editors, and agents use the same local model?

Yes. Shadowtype exposes an OpenAI-compatible HTTP endpoint on 127.0.0.1 with no key required, so your terminal, editor plugins, and AI agents can call /v1/chat/completions against the same on-device model. A built-in MCP server plugs into Claude Code and other MCP clients, and the BYOM picker loads any GGUF you drop in. Everything is bound to localhost — nothing leaves your Mac.

Ready when you are

Autocomplete that works with the Wi-Fi off.

Download Shadowtype free, accept your first word with Tab, then pull the connection and watch it keep going.

Download for macOS View source

Works offline
No inference call
No account
Zero telemetry
Open source

Offline text prediction for Mac:autocomplete with no internet.