Table of Contents

About These Models
Check Your RAM
Llama 3.2 — Meta
Mistral 7B
Phi-3 Mini — Microsoft
Gemma 2 — Google
Hermes 3 — Nous Research
Qwen 2.5 — Alibaba
How to Install

Local AI Models Explained: Which One Should You Run?

Updated May 2026 · 6 models compared · Beginner friendly

Choosing a local AI model shouldn't require a PhD. This guide explains the most popular free models in plain English — what each one is good at, how much RAM you need, and which one fits your computer right now.

About These Models

All models on this page are free to download, and their weights are openly published, although licence terms vary from model to model. You download them once and run them locally: no subscription, no data sent to a company, and they keep working offline. They run through tools like Ollama or LM Studio, which handle the technical setup for you.

The biggest factor in which model you can run is RAM (your computer's working memory). More RAM = bigger, smarter models. A GPU helps a lot with speed but is not required — all models here work on CPU-only hardware.

What does "7B" mean?

Numbers like "7B" or "3.8B" give the model's parameter count in billions, which is roughly the size of the model's brain. More parameters generally mean a more capable model, but also more RAM. At the compressed (4-bit quantised) sizes that tools like Ollama typically download by default, a 7B model needs roughly 6–8 GB RAM and a 70B model needs 40 GB+. Start with what your hardware supports.
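As a rough illustration of the arithmetic behind those figures, the sketch below estimates memory from parameter count. The function name `estimate_ram_gb`, the 4-bit default, and the flat 2 GB allowance for the runtime and conversation context are assumptions made for this example, not values published by any model vendor or tool. Real usage varies with the quantisation format and context length.

```python
def estimate_ram_gb(params_billions: float,
                    bits_per_weight: float = 4.0,
                    overhead_gb: float = 2.0) -> float:
    """Rough memory estimate for running a model locally.

    Assumptions (illustrative only):
    - weights are stored at `bits_per_weight` bits each (4-bit by default)
    - `overhead_gb` is a flat allowance for the runtime and context
    """
    # One billion parameters at 8 bits each is about 1 GB of weights.
    weights_gb = params_billions * bits_per_weight / 8
    return weights_gb + overhead_gb


if __name__ == "__main__":
    for size in (3.8, 7, 70):
        print(f"{size}B model: about {estimate_ram_gb(size):.1f} GB")
```

The estimates land a little under the figures quoted above because your operating system and other apps also need memory.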

Check Your RAM

Not sure what you can run?

Compare your installed RAM with the "Min RAM" figure listed for each model below to see which ones are a good fit.
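If you would rather check from a script, here is a minimal sketch that reads total RAM using only the Python standard library and compares it with the minimum RAM figures from the model cards on this page. The helper names `total_ram_gb` and `models_that_fit` are made up for this example, and rounding to the nearest whole gigabyte is an assumption to allow for machines that report slightly less than their nominal RAM.

```python
import ctypes
import os
import sys

# Minimum RAM (GB) copied from the model cards on this page.
MIN_RAM_GB = {
    "Llama 3.2": 4,
    "Phi-3 Mini": 4,
    "Mistral 7B": 8,
    "Gemma 2": 8,
    "Hermes 3": 8,
    "Qwen 2.5": 8,
}


def total_ram_gb() -> float:
    """Return installed physical RAM in GiB (Windows, Linux, macOS)."""
    if sys.platform == "win32":
        # Layout of the Win32 MEMORYSTATUSEX structure.
        class MemoryStatusEx(ctypes.Structure):
            _fields_ = [
                ("dwLength", ctypes.c_ulong),
                ("dwMemoryLoad", ctypes.c_ulong),
                ("ullTotalPhys", ctypes.c_ulonglong),
                ("ullAvailPhys", ctypes.c_ulonglong),
                ("ullTotalPageFile", ctypes.c_ulonglong),
                ("ullAvailPageFile", ctypes.c_ulonglong),
                ("ullTotalVirtual", ctypes.c_ulonglong),
                ("ullAvailVirtual", ctypes.c_ulonglong),
                ("ullAvailExtendedVirtual", ctypes.c_ulonglong),
            ]

        status = MemoryStatusEx()
        status.dwLength = ctypes.sizeof(MemoryStatusEx)
        ctypes.windll.kernel32.GlobalMemoryStatusEx(ctypes.byref(status))
        total_bytes = status.ullTotalPhys
    else:
        total_bytes = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return total_bytes / 1024**3


def models_that_fit(ram_gb: float) -> list[str]:
    """List the models whose minimum RAM is at or below `ram_gb`."""
    return sorted(name for name, need in MIN_RAM_GB.items() if need <= ram_gb)


if __name__ == "__main__":
    installed = round(total_ram_gb())  # e.g. 7.7 GiB reported -> 8 GB nominal
    print(f"Detected about {installed} GB RAM")
    print("Models that meet the minimum:", models_that_fit(installed))
```

Meeting the minimum only means the smallest listed version should load. The larger variants mentioned in each card need more.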

Meta

Llama 3.2

Min 4 GB RAM

Best all-around beginner model — versatile, well-supported, and one of the most popular models in the entire local AI community.

Strengths

✓ Free to download, with a massive community: thousands of guides and tutorials exist
✓ Excellent for general chat, writing, and summarising documents
✓ The 3B version runs on just 4 GB RAM and works well on older or budget laptops

Weaknesses

✗ Weaker at coding than Mistral or Qwen 2.5
✗ Llama 3.2's text models stop at 3B; the most capable 70B models in the Llama family (Llama 3.1 and 3.3) need 40 GB+ RAM

Best for: Writing, General Chat

Mistral AI

Mistral 7B

Min 8 GB RAM

Punches well above its size — fast, efficient, and surprisingly capable at coding and following complex instructions.

Strengths

✓ Remarkably capable for a 7B model: outperforms older 13B models on many tasks
✓ Fast responses even on CPU-only hardware with 8 GB RAM
✓ Good at writing, instruction-following, and light-to-medium coding tasks

Weaknesses

✗ Smaller knowledge base than 13B+ models, so less depth on niche topics
✗ Can lose track in very long documents or complex multi-step reasoning chains

Best for: Coding, Writing

Microsoft

Phi-3 Mini

Min 4 GB RAM

The best choice for low-end PCs — Microsoft engineered this 3.8B model specifically to squeeze maximum capability out of minimal hardware.

Strengths

✓ Runs comfortably on 4 GB RAM, so it suits older laptops and budget PCs
✓ Very fast on CPU-only hardware: responses feel snappy compared to larger models
✓ Designed by Microsoft specifically for efficiency, not just shrunk down from a larger model

Weaknesses

✗ Less capable at complex multi-step reasoning than 7B+ models
✗ Smaller context window, so it can lose track in long conversations

Best for: Low RAM, General Chat

Google

Gemma 2

Min 8 GB RAM

Google's polished open model — strong at research, analysis, and structured tasks like document summarisation and report writing.

Strengths

✓ Strong reasoning and document analysis, which makes it great for research and summarisation tasks
✓ Well-tested and refined by Google, with consistent and reliable output quality
✓ Good at producing structured, well-organised responses

Weaknesses

✗ The capable 9B version needs 8 GB+ RAM; the smaller 2B version is more limited
✗ Coding is weaker than Qwen 2.5 or Mistral

Best for: Research, Writing

Nous Research

Hermes 3

Min 8 GB RAM

Built specifically for AI agents and task automation — the natural choice if you're running Hermes Agent on your Windows PC.

Strengths

✓ Purpose-built for agentic tasks: excels at multi-step planning, tool use, and following complex workflows
✓ Exceptional at following multi-part instructions without losing track of context
✓ Designed to pair with Hermes Agent, so they work together out of the box

Weaknesses

✗ Overkill for simple chat, where lighter models answer basic Q&A faster
✗ Needs 8 GB+ for the 8B version; best performance at 16 GB+

Best for: General Chat, Research

Alibaba

Qwen 2.5

Min 8 GB RAM

Top-tier coding model in the local AI space — exceptional at programming, math, and logic with strong multilingual support built in.

Strengths

✓ Best-in-class coding performance among local models, especially the Coder variant
✓ Excellent at math, logic, and structured problem solving
✓ Strong multilingual support, making it one of the best options for non-English use cases

Weaknesses

✗ Newer than Llama or Mistral, with a smaller community and fewer third-party guides
✗ Most capable variants (14B+) need 16 GB+ RAM

Best for: Coding, Research

How to Install These Models

All models on this page can be installed through Ollama or LM Studio — free tools that handle the technical setup for you. No code required.

Ollama (Recommended)

A command-line tool that downloads and runs local models with a single command. Works on Windows, Mac, and Linux. Our guide walks you through every step.
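For readers comfortable with a little Python, that single command can also be wrapped in an optional script. This is a sketch, not part of Ollama itself: `pull_command` and `pull_model` are hypothetical helpers, and the model tags in `OLLAMA_TAGS` are assumed from the Ollama model library, so check them there before use. The only Ollama feature relied on is the `ollama pull <model>` command.

```python
import shutil
import subprocess

# Assumed Ollama library tags for the models on this page.
# Verify each tag in the Ollama model library before relying on it.
OLLAMA_TAGS = {
    "Llama 3.2": "llama3.2",
    "Mistral 7B": "mistral",
    "Phi-3 Mini": "phi3",
    "Gemma 2": "gemma2",
    "Hermes 3": "hermes3",
    "Qwen 2.5": "qwen2.5",
}


def pull_command(model_name: str) -> list[str]:
    """Build the command that downloads the given model."""
    return ["ollama", "pull", OLLAMA_TAGS[model_name]]


def pull_model(model_name: str) -> None:
    """Download a model, failing early if Ollama is not installed."""
    if shutil.which("ollama") is None:
        raise RuntimeError("Ollama is not installed or not on your PATH")
    # check=True raises CalledProcessError if the download fails.
    subprocess.run(pull_command(model_name), check=True)
```

Calling `pull_model("Phi-3 Mini")` downloads the model. You can then chat with it from a terminal using `ollama run phi3`.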

LM Studio (No Terminal)

A desktop app with a graphical interface — find and download a model by clicking, no terminal ever needed. Best option if you want to avoid the command line entirely.

Running Hermes Agent? Read the Hermes Windows install guide — it covers how to connect Hermes 3 as your local model.
