Coding with AI · Posted by Tyler Brooks

Setting Up a Local AI Coding Assistant with Ollama + Continue

If you want AI coding help without sending your code to the cloud, here’s how to set up a fully local AI assistant.

What you need:
– A decent GPU (16GB+ VRAM recommended) or Apple Silicon Mac (M1 Pro or better)
– Ollama (free, runs models locally)
– Continue extension for VS Code (free, open source)

Setup steps:
1. Install Ollama from ollama.ai
2. Pull a coding model: ollama pull codellama:34b or ollama pull deepseek-coder-v2
3. Install the Continue extension in VS Code
4. Configure Continue to use the Ollama provider and point it at your local Ollama instance (http://localhost:11434 by default)
5. Start coding with an assistant that runs entirely on your machine
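
For step 4, here is a minimal sketch of what "pointing Continue at Ollama" can look like. It assumes Ollama is listening on its default address (http://localhost:11434) and that your Continue version reads the JSON config format (`~/.continue/config.json`, with `models` and `tabAutocompleteModel` entries). Newer Continue releases use a YAML file instead, so check the docs for your version. The helper names `continue_config` and `installed_models` are made up for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default local address


def continue_config(chat_model, autocomplete_model, api_base=OLLAMA_URL):
    """Build a Continue-style config dict (hypothetical helper).

    Field names follow Continue's JSON config format as an assumption;
    verify them against the docs for the version you have installed.
    """
    return {
        "models": [
            {
                "title": f"Local chat ({chat_model})",
                "provider": "ollama",
                "model": chat_model,
                "apiBase": api_base,
            }
        ],
        "tabAutocompleteModel": {
            "title": f"Local autocomplete ({autocomplete_model})",
            "provider": "ollama",
            "model": autocomplete_model,
            "apiBase": api_base,
        },
    }


def installed_models(api_base=OLLAMA_URL, timeout=5):
    """Ask Ollama which models are pulled, via its /api/tags endpoint."""
    with urllib.request.urlopen(f"{api_base}/api/tags", timeout=timeout) as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]


# Print a config you can compare against (or merge into) your own file.
config_text = json.dumps(
    continue_config("qwen2.5-coder:32b", "deepseek-coder-v2:16b"), indent=2
)
print(config_text)
```

With Ollama running, calling `installed_models()` is a quick way to confirm that the model names in the config match what you actually pulled in step 2.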

Model recommendations:
– For autocomplete: deepseek-coder-v2:16b (fast, good quality)
– For chat/explanation: codellama:34b or qwen2.5-coder:32b
– If you have far more memory than a single consumer GPU offers: deepseek-v3 (amazing quality, but at 671B parameters it needs server-class hardware)
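
To sanity-check whether a model from the list above will fit your hardware, a back-of-the-envelope estimate helps: weight size is roughly parameter count times bits per weight, divided by 8. The sketch below assumes 4-bit quantization and a flat 2 GB allowance for context and runtime overhead. Both numbers are my assumptions, and real quantized files often use slightly more than 4 bits per weight, so treat the results as a lower bound.

```python
def weights_gb(params_billion, bits_per_weight=4):
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion, memory_gb, bits_per_weight=4, overhead_gb=2.0):
    """Rough check: weights plus an assumed allowance for context/KV cache."""
    needed = weights_gb(params_billion, bits_per_weight) + overhead_gb
    return needed <= memory_gb


# Model sizes taken from the tags mentioned in this post.
for name, size_b in [("16b", 16), ("32b", 32), ("34b", 34)]:
    print(
        name,
        weights_gb(size_b), "GB of weights at 4-bit;",
        "fits in 16 GB:", fits(size_b, 16),
        "| fits in 24 GB:", fits(size_b, 24),
    )
```

By this estimate a 16B model is comfortable on a 16 GB card, while the 32B and 34B models want something closer to 24 GB or a Mac with enough unified memory.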

Performance reality check: Local models are good but not as good as GPT-4o or Claude Opus for complex reasoning. For routine coding tasks, they’re perfectly fine. For architecture decisions or complex debugging, you might still want a cloud model.

Biggest benefit: Complete privacy. Your proprietary code never leaves your machine. Essential for anyone working on sensitive projects.

Anyone running a local setup? What model are you using?

5 replies

PSA: don't just copy-paste AI-generated code without understanding it. I had a coworker introduce a security vulnerability because they didn't review what Copilot suggested.

Hot take: AI coding tools make good devs great, but they make bad devs dangerous. You still need to understand what the code is doing.

Running qwen2.5-coder:32b on a 3090 right now. Context window handling is way better than codellama's was six months ago. Worth the upgrade if you haven't tried it.

Can someone explain how Continue actually connects to Ollama? Like, do you just put localhost:11434 in the config, or is there more to it than that?

The 16GB VRAM requirement is a real barrier for a lot of people. Worth mentioning that unified memory on Apple M2/M3 machines makes this way more accessible for most folks than buying a discrete GPU.