oi
local llms for daily coding
oi is the local-first way to use an LLM for everyday coding. Point it at a model running on your own machine, through Ollama or llama.cpp, and get chat, code generation, inline completions, and commit messages without a cloud account, an API key, or a per-token bill.
It ships in two halves that work together: oi-cli, which detects your local model and exposes it on a small localhost bridge, and a VS Code extension that picks that bridge up automatically and adds a docked chat panel, inline suggestions in the editor, and a one-click commit message from your staged diff.
The point is control. Your code never leaves the machine, the bill is zero, and the model is yours to swap. oi stays deliberately small and local: no telemetry, no server-side history, no lock-in. Start with codellama or qwen2.5-coder and switch models whenever you like.
how it works
- 01
install oi
Grab the CLI (npm i -g oi-cli) and the VS Code extension. Both are free.
- 02
link your local model
Run oi setup; it finds your Ollama install (or a llama.cpp server) and picks a coding model.
- 03
code locally
The extension auto-detects oi and gives you chat, completions, and commit messages, all on-device.
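For a feel of what "find a local model and pick a coding one" could look like, here is a minimal TypeScript sketch. It is not oi's actual code. It assumes Ollama is listening on its default port, 11434, and uses Ollama's GET /api/tags endpoint to list installed models. The function names and the preference order are hypothetical, chosen only for illustration.

```typescript
// Sketch only: oi's real detection logic is not shown on this page.
// Ollama's default address and its GET /api/tags endpoint are real;
// PREFERRED_FAMILIES and both function names are hypothetical.

const PREFERRED_FAMILIES = ["qwen2.5-coder", "codellama"];

interface OllamaTags {
  models: { name: string }[];
}

/** Return the first installed model whose name starts with a preferred family. */
function pickCodingModel(
  installed: string[],
  preferred: string[] = PREFERRED_FAMILIES,
): string | undefined {
  for (const family of preferred) {
    const hit = installed.find((name) => name.startsWith(family));
    if (hit) return hit;
  }
  return undefined; // no coding model installed yet
}

/** Ask a local Ollama server which models it has; [] if none is reachable. */
async function listOllamaModels(
  baseUrl = "http://localhost:11434",
): Promise<string[]> {
  try {
    const res = await fetch(`${baseUrl}/api/tags`);
    if (!res.ok) return [];
    const body = (await res.json()) as OllamaTags;
    return body.models.map((m) => m.name);
  } catch {
    return []; // connection refused: Ollama is probably not running
  }
}
```

Calling pickCodingModel(await listOllamaModels()) on Node 18 or later would then return a model tag such as qwen2.5-coder:7b, or undefined when nothing suitable is installed. Keeping the choice in a pure function makes the preference order easy to change and to test without a running server.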
a look inside
a few of the screens you'll actually use.
- cli + vs code extension
- runs on ollama or llama.cpp
- private · on-device · free
message
feat(auth): add slug validation + 422 on bad input
validate route params against the slug regex and reject malformed requests early instead of failing in the handler.
+ src/lib/validators.ts
~ src/routes/posts.ts
~ src/routes/users.ts
works fully offline · $0 per token
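The commit message above comes from a staged diff. A rough TypeScript sketch of that flow follows. It is an illustration under assumptions, not oi's implementation: it talks to Ollama directly through its POST /api/generate endpoint rather than through oi's bridge, and the prompt wording, the 8000-character cap, and all function names are hypothetical.

```typescript
import { execFileSync } from "node:child_process";

/** Build the prompt; very long diffs are clipped so they fit the context window. */
function buildCommitPrompt(diff: string, maxChars = 8000): string {
  const clipped =
    diff.length > maxChars
      ? diff.slice(0, maxChars) + "\n[diff truncated]"
      : diff;
  return [
    "Write a commit message for the staged changes below.",
    "Use a one-line summary in the form type(scope): subject, then a short body.",
    "",
    clipped,
  ].join("\n");
}

/** Read the staged diff from git in the current working directory. */
function stagedDiff(): string {
  return execFileSync("git", ["diff", "--cached"], { encoding: "utf8" });
}

/** Send the prompt to a local Ollama server and return the model's reply. */
async function suggestCommitMessage(
  model: string,
  baseUrl = "http://localhost:11434", // Ollama's default address (assumed)
): Promise<string> {
  const res = await fetch(`${baseUrl}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model,
      prompt: buildCommitPrompt(stagedDiff()),
      stream: false, // ask for one JSON object instead of a stream
    }),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const body = (await res.json()) as { response: string };
  return body.response.trim();
}
```

Nothing here leaves localhost: git runs locally and the only HTTP request goes to the model server on the same machine.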
oi guides
Ways to use oi, and how it compares.
- comparison
Local LLM for coding vs cloud models: where running your own beats Claude and GPT
Local LLM for coding vs cloud Claude/GPT: a plain comparison on privacy, per-token cost, control, and offline use, and where each one actually wins. oi runs local models in your editor.
- how to
How to run a local coding LLM in your editor, step by step
Run a local coding LLM with oi: install the CLI, run oi setup, point it at Ollama or llama.cpp, pull a coding model, and use it from the terminal and the VS Code extension. Free and offline.
- use case
The best local LLMs for coding, and how to run them with oi
Which open models are worth running locally for coding: DeepSeek-Coder, Qwen2.5-Coder, Code Llama, and general models like Llama 3. Sizes, strengths, and how oi runs them.
- use case
A private, offline AI coding assistant for code that can't leave the building
When your code can't go to a cloud vendor: a private, offline AI coding assistant. oi runs an open model locally so prompts and source never leave your machine, and it works air-gapped.