Local AI Models Explained: Which One Should You Run?
Choosing a local AI model shouldn't require a PhD. This guide explains the most popular free models in plain English — what each one is good at, how much RAM you need, and which one fits your computer right now.
About These Models
All models on this page are free to download and have openly published weights, though licence terms vary from model to model. You download them once and run them locally: no subscription, no data sent to a company, and they keep working offline. They run through tools like Ollama or LM Studio, which handle the technical setup for you.
The biggest factor in which model you can run is RAM (your computer's working memory). More RAM = bigger, smarter models. A GPU helps a lot with speed but is not required — all models here work on CPU-only hardware.
Numbers like "7B" or "3.8B" mean billions of parameters, essentially the size of the model's brain. More parameters = more capable, but also more RAM required. At the compressed (4-bit quantized) sizes that Ollama and LM Studio typically download, a 7B model needs roughly 6–8 GB RAM; a 70B model needs 40 GB+. Start with what your hardware supports.
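The arithmetic behind those figures can be sketched as follows. This is a rough rule of thumb, not an exact formula: the 4.5 bits per weight and the 1.5 GB allowance for the runtime and conversation memory are assumptions chosen for illustration, and the function name is made up for this example.

```python
def estimate_ram_gb(params_billions, bits_per_weight=4.5, overhead_gb=1.5):
    """Rough RAM estimate for loading a model.

    bits_per_weight and overhead_gb are illustrative assumptions,
    not values published by any tool or model vendor.
    """
    # Billions of parameters x bits per weight / 8 bits per byte = GB of weights.
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + overhead_gb


if __name__ == "__main__":
    for size in (3.8, 7, 14, 70):
        print(f"{size}B model: about {estimate_ram_gb(size):.1f} GB")
```

Under these assumptions a 7B model comes out at about 5.4 GB and a 70B model at about 40.9 GB, before counting what the operating system and other programs use. That extra usage is why the practical guidance for a 7B model is 6–8 GB.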
Check Your RAM
Not sure what you can run?
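Check how much RAM your computer has first. On Windows, open Task Manager and look under Performance > Memory; on a Mac, open About This Mac. Then compare that number with the RAM figure listed for each model.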
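The same check can be sketched in Python: read the total RAM, then compare it with the minimums quoted on this page. This is an illustrative sketch rather than this site's detector. The table and function names are made up for the example, `psutil` is an optional third-party package, and the `os.sysconf` fallback only works on systems that expose those values (typically Linux).

```python
import os

# Minimum RAM (GB) quoted on this page for each model size.
MODEL_MIN_RAM_GB = [
    ("Llama 3.2 3B", 4),
    ("Phi-3 Mini (3.8B)", 4),
    ("Mistral 7B", 8),
    ("Gemma 2 9B", 8),
    ("Hermes 3 8B", 8),
    ("Qwen 2.5 14B", 16),
    ("70B-class models", 40),
]


def total_ram_gb():
    """Return installed RAM in GB, or None if it cannot be detected."""
    try:
        import psutil  # optional third-party package, works on Windows too
        return psutil.virtual_memory().total / 1024**3
    except ImportError:
        pass
    try:
        # Standard-library fallback; os.sysconf does not exist on Windows.
        return os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 1024**3
    except (AttributeError, ValueError, OSError):
        return None


def models_that_fit(ram_gb):
    """List the models whose quoted minimum RAM is within ram_gb."""
    # Round first: a "16 GB" machine usually reports slightly less than 16.
    usable = round(ram_gb)
    return [name for name, need in MODEL_MIN_RAM_GB if usable >= need]


if __name__ == "__main__":
    ram = total_ram_gb()
    if ram is None:
        print("Could not detect RAM; check it manually.")
    else:
        print(f"Detected about {ram:.1f} GB RAM")
        for name in models_that_fit(ram):
            print(" -", name)
```

The list shows every model that fits, not a single recommendation, so the descriptions of each model still decide which one to pick.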
Llama 3.2
Best all-around beginner model — versatile, well-supported, and one of the most popular models in the entire local AI community.
- Free to download, with a massive community: thousands of guides and tutorials exist
- Excellent for general chat, writing, and summarising documents
- The 3B version runs on just 4 GB RAM — works well on older or budget laptops
- Weaker at coding than Mistral or Qwen 2.5
- Llama 3.2's text models stop at 3B; the 70B models elsewhere in the Llama family (Llama 3.1 and 3.3) need 40 GB+ RAM
Mistral 7B
Punches well above its size — fast, efficient, and surprisingly capable at coding and following complex instructions.
- Remarkably capable for a 7B model: it beat the larger Llama 2 13B on the benchmarks Mistral published at release
- Fast responses even on CPU-only hardware with 8 GB RAM
- Good at writing, instruction-following, and light-to-medium coding tasks
- Smaller knowledge base than 13B+ models — less depth on niche topics
- Can lose track in very long documents or complex multi-step reasoning chains
Phi-3 Mini
The best choice for low-end PCs — Microsoft engineered this 3.8B model specifically to squeeze maximum capability out of minimal hardware.
- Runs comfortably on 4 GB RAM — good for older laptops and budget PCs
- Very fast on CPU-only hardware — responses feel snappy compared to larger models
- Designed by Microsoft specifically for efficiency, not just shrunk down from a larger model
- Less capable on complex multi-step reasoning compared to 7B+ models
- The standard variant has a small 4K-token context window, so it can lose track in long conversations (a 128K variant also exists)
Gemma 2
Google's polished open model — strong at research, analysis, and structured tasks like document summarisation and report writing.
- Strong reasoning and document analysis — great for research and summarisation tasks
- Well-tested and refined by Google — consistent and reliable output quality
- Good at producing structured, well-organised responses
- The capable 9B version needs 8 GB+ RAM — the smaller 2B version is more limited
- Coding is weaker than Qwen 2.5 or Mistral
Hermes 3
Built specifically for AI agents and task automation — the natural choice if you're running Hermes Agent on your Windows PC.
- Purpose-built for agentic tasks — excels at multi-step planning, tool use, and following complex workflows
- Exceptional at following multi-part instructions without losing track of context
- Designed to pair with Hermes Agent — they work together out of the box
- Overkill for simple chat — lighter models are faster for basic Q&A
- Needs 8 GB+ RAM for the 8B version; performs best with 16 GB+
Qwen 2.5
Top-tier coding model in the local AI space — exceptional at programming, math, and logic with strong multilingual support built in.
- Best-in-class coding performance among local models — especially the Coder variant
- Excellent at math, logic, and structured problem solving
- Strong multilingual support — one of the best options for non-English use cases
- Newer than Llama or Mistral — smaller community and fewer third-party guides
- Most capable variants (14B+) need 16 GB+ RAM
How to Install These Models
All models on this page can be installed through Ollama or LM Studio — free tools that handle the technical setup for you. No code required.
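For readers who do want to script a model after installing it, here is a minimal sketch that sends one prompt to Ollama's local HTTP API. It assumes Ollama is installed and running at its default address (port 11434) and that the model has already been downloaded, for example with `ollama pull llama3.2`. The helper names are made up for this example.

```python
import json
import urllib.request

# Ollama's default local address; adjust if your setup differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model, prompt):
    """Build a POST request asking Ollama for one complete (non-streamed) reply."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )


def ask(model, prompt):
    """Send the prompt and return the model's text reply."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as resp:
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Requires a running Ollama with the llama3.2 model pulled.
    print(ask("llama3.2", "Explain RAM in one sentence."))
```

Swapping the model name (for example `mistral`, `phi3`, `gemma2`, `hermes3` or `qwen2.5`) targets a different installed model. Everything stays on your own machine, since the request goes to localhost.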
Running Hermes Agent? Read the Hermes Windows install guide — it covers how to connect Hermes 3 as your local model.