When you need AI assistance for development but find yourself offline (whether you're on a flight, out camping, or facing the inevitable zombie apocalypse), you'll appreciate having local LLMs ready in your workflow!
Figure: Will local LLMs catch up with cloud-based?
Running LLMs locally unlocks ultimate freedom: privacy, offline use, and control.
Local LLMs keep your data on your machine, deliver consistent performance without an internet dependency, and offer cost savings for high-volume usage.
| | ⭐️ Ollama | ⭐️ LM Studio | Microsoft Foundry Local |
|---|---|---|---|
| Open Source? | Yes 👍 | No 👎 | No 👎 |
| UI | Simple chat and model management interface | Full desktop UI | CLI |
| Models | Large open-source library (Llama, Mistral, Qwen, etc.) | Supports most models from Hugging Face | Microsoft-curated selection |
| Endpoint/API (OpenAI schema?) | Yes | Yes | Yes |
| Cost | Free | Free | Free; enterprise licensing may apply |
| Best for | Simple and lightweight, great for backends | Polished UX, great for experimentation | Enterprise integration in the .NET ecosystem |
Figure: Chat interfaces in LM Studio (left), and Ollama (right)
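Because all three tools expose an OpenAI-compatible endpoint, any OpenAI-style client can talk to them, which is what makes swapping between cloud and local models so painless. Here's a minimal sketch using the official `openai` Python package pointed at a local server; the base URL, placeholder API key, and model name are assumptions based on Ollama's usual defaults rather than anything specific to your setup.

```python
# Requires the official OpenAI Python package (pip install openai).
# Assumed defaults: Ollama's OpenAI-compatible server usually listens on
# localhost:11434; for LM Studio, swap in its local server URL (typically
# localhost:1234). The model name below is just an example.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="not-needed",  # local servers generally accept any placeholder key
)

response = client.chat.completions.create(
    model="llama3:8b",
    messages=[{"role": "user", "content": "Explain what a local LLM is in one sentence."}],
)

print(response.choices[0].message.content)
```

The same pattern should work for LM Studio and Foundry Local: change the base URL and model name, and the rest of your code stays untouched.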
Local LLMs can be used for code completion and assistance. This is especially handy when you want AI-assisted development without an internet connection.
⭐️ Cline is an open-source VS Code extension that adds AI-enhanced workflows to your IDE, with comprehensive support for various model providers. You can hook it up to LM Studio or Ollama without any complex configuration, simply by clicking a button.
Animated GIF: Using Cline locally with Qwen-3
Other extensions worth a look:

- GitHub Copilot - you've probably heard of it.
- Continue - open-source VS Code and JetBrains extension.
- Tabby - self-hosted AI coding assistant.
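Editor extensions aside, the local endpoint is just an HTTP API, so you can script against it too. Below is a rough sketch of a code-completion helper, assuming an Ollama-style OpenAI-compatible server on its default port and a code-tuned model you've already pulled (the port and model name are only examples).

```python
# A bare-bones code-completion helper that talks to a local OpenAI-compatible
# endpoint over plain HTTP. The port and model name are assumptions (Ollama's
# default port and an example code-tuned model); adjust them for your setup.
import requests

BASE_URL = "http://localhost:11434/v1"
MODEL = "codellama:7b"

def complete_code(snippet: str) -> str:
    """Ask the local model to continue a code snippet and return its reply."""
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        json={
            "model": MODEL,
            "messages": [
                {
                    "role": "system",
                    "content": "You are a coding assistant. Continue the given code. Reply with code only.",
                },
                {"role": "user", "content": snippet},
            ],
            "temperature": 0.2,  # keep completions fairly deterministic
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(complete_code("def fibonacci(n: int) -> int:\n    "))
```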
The open-source model landscape evolves rapidly, with new models released weekly that often surpass current leaders.
Rather than recommending specific models that may become outdated, consider these resources for current information:
Model size (B = billion parameters) directly impacts hardware requirements:
Note: Without a powerful GPU, locally run models may not produce code quality suitable for development work.
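A handy back-of-the-envelope rule: memory needed is roughly the parameter count times the bytes per parameter at your chosen quantization, plus some overhead for the KV cache and runtime. The sketch below bakes in an assumed ~20% overhead factor (a rough guess, not a precise figure), but it's enough to show why a 7B model at 4-bit quantization fits on an 8 GB GPU while a 70B model won't.

```python
# Rough memory estimate for running a quantized model locally.
# The 20% overhead factor is an assumption covering the KV cache and
# runtime buffers; real usage varies by runtime and context length.
BYTES_PER_PARAM = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}

def estimated_gb(params_billion: float, quant: str, overhead: float = 1.2) -> float:
    """Approximate memory footprint in GB for a model of the given size."""
    return params_billion * 1e9 * BYTES_PER_PARAM[quant] * overhead / 1e9

for size, quant in [(7, "q4"), (13, "q4"), (70, "q4"), (7, "fp16")]:
    print(f"{size}B @ {quant}: ~{estimated_gb(size, quant):.1f} GB")
```

By this estimate, a 7B model at 4-bit quantization needs roughly 4 GB, while a 70B model at the same quantization needs around 40 GB.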
Start with codellama:7b or llama3:8b to establish baseline performance, then upgrade or downgrade to hit that performance sweet spot.

Running LLMs locally provides developers with powerful AI capabilities while maintaining control over their data and environment.
Local LLMs have only recently become able to compete with closed-source, cloud-hosted ones.
Whether you choose Ollama for simplicity or Foundry Local for enterprise features, local LLMs provide ultimate freedom, and they're just getting started.
What excites you about local LLMs?