How I Set Up Ollama With n8n and Brought My AI API Costs to Zero


$50/month in OpenAI API charges was eating into my automation ROI. I switched to Ollama for local AI and my costs dropped to zero. Here's the full setup and honest tradeoffs.

2026-04-05
10 min read

How I Set Up Ollama With n8n and Brought My AI API Costs to Zero#

In December I opened my OpenAI billing dashboard and saw $54. Not because I'd built something impressive — because my n8n automations had been quietly calling the API thousands of times across several workflows, and the costs had crept up while I wasn't paying attention.

The automations were worth it. But $54/month purely for API calls to classify emails and summarize text felt wrong when I knew local models existed that could do the same thing.

I spent a weekend switching to Ollama. My January OpenAI bill was $0.

This post is the honest version of how that went — what worked immediately, what tripped me up, what I had to compromise on, and whether the quality difference is actually noticeable.


What I Was Using AI For in My Workflows#

Before explaining the switch, it helps to know what I was actually using OpenAI for. My n8n workflows were calling GPT-3.5-turbo for:

  1. Email classification: Is this email urgent, routine, or can I ignore it? (200-300 tokens per email, runs 30-40 times daily)
  2. Document type detection: Is this a contract, invoice, receipt, or proposal? (runs on every PDF I receive)
  3. Lead scoring summaries: Given this form submission, write a 2-sentence summary and suggest a follow-up priority
  4. Weekly report drafting: Turn a list of completed tasks into natural language paragraphs

None of these required GPT-4. I was using GPT-3.5-turbo for everything, which is cheap — but cheap multiplied by hundreds of daily calls adds up.

The math: roughly 800 API calls per day × 500 average tokens × $0.002 per 1K tokens = ~$0.80/day = ~$24/month. Add some larger calls and occasional GPT-4 tests and you land at $50-60/month.
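That estimate is easy to sanity-check in a few lines (the numbers are the rough averages from above, not exact billing data):

```python
# Sanity check of the cost estimate above (rough averages, not exact billing).
calls_per_day = 800
avg_tokens_per_call = 500
price_per_1k_tokens = 0.002   # GPT-3.5-turbo era pricing, blended in/out

daily_cost = calls_per_day * avg_tokens_per_call / 1000 * price_per_1k_tokens
monthly_cost = daily_cost * 30

print(f"${daily_cost:.2f}/day -> ${monthly_cost:.0f}/month")  # $0.80/day -> $24/month
```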


Installing Ollama#

I have two machines I work from: a MacBook Pro M2 (32GB RAM) and a Windows desktop with an RTX 3080. I installed Ollama on both.

MacBook (Apple Silicon):

# Download from ollama.com or via Homebrew
brew install ollama

# Pull models
ollama pull llama3.1
ollama pull phi3

# Start the service
ollama serve

Ollama automatically uses the GPU (via Metal) on M-series chips. Performance is excellent — Llama 3.1 8B generates around 30-40 tokens per second on M2. Fast enough to feel instantaneous in a workflow.

Windows with NVIDIA GPU: Download the installer from ollama.com. It detects your GPU automatically and uses CUDA. Same experience, similar performance.

After installation, test it:

# Chat directly
ollama run llama3.1

# Test the API
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.1", "prompt": "Classify this email as urgent or routine: Meeting at 3pm tomorrow?", "stream": false}'

If you see a response, you're ready.
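If you'd rather script against the API than use curl, the same request looks like this in Python. A standard-library-only sketch — the endpoint and fields match Ollama's /api/generate API, and the model name assumes you pulled llama3.1:

```python
import json
import urllib.request

def build_payload(prompt: str, model: str = "llama3.1") -> bytes:
    """Build the JSON body for a non-streaming /api/generate request."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,  # one JSON object back instead of a token stream
    }).encode()

def generate(prompt: str, base_url: str = "http://localhost:11434") -> str:
    """Send the request to a running Ollama server and return the reply text."""
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=build_payload(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```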


The Docker Networking Problem#

My n8n instance runs in Docker. When I set up the Ollama credential in n8n with http://localhost:11434, every AI node failed with a connection error.

The issue is obvious once you know it, but it cost me an hour: inside a Docker container, localhost refers to the container's own loopback address, not the host machine. The Ollama server running on my Mac was invisible to the n8n container.

Fix on Mac/Windows:

Docker Desktop automatically creates a special hostname host.docker.internal that resolves to the host machine's IP. Change your Ollama credential in n8n from:

http://localhost:11434

to:

http://host.docker.internal:11434

That's it. Saved and tested — the connection worked immediately.

Fix on Linux (where host.docker.internal isn't automatic):

Add this to your docker-compose.yml:

services:
  n8n:
    image: n8nio/n8n
    extra_hosts:
      - "host.docker.internal:host-gateway"
    # ... rest of config

Or find your host's Docker bridge IP (172.17.0.1 on most systems) and use that directly.
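A quick way to confirm the fix is to probe Ollama's /api/tags endpoint from inside the container. This is my own helper sketch, not part of n8n — the same check works as a one-line curl:

```python
import urllib.request

def ollama_reachable(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if an Ollama server answers on /api/tags at base_url."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout):
            return True
    except OSError:  # connection refused, DNS failure, timeout
        return False

# From inside the n8n container, this should be True once the fix is in:
# ollama_reachable("http://host.docker.internal:11434")
```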


Setting Up the n8n Credential#

In n8n, go to Settings → Credentials → New. Search for "Ollama".

Fields:

  • Base URL: http://host.docker.internal:11434
  • Name it: "Local Ollama"

Save. Then in any AI node, select "Ollama" as the provider and "Local Ollama" as the credential.


Replacing Each Workflow — What Happened#

Email Classification#

Before (OpenAI):

Model: gpt-3.5-turbo
Prompt: "Classify this email as urgent/routine/ignore. Return JSON: {category, reason}"
Average response time: 800ms
Cost: ~$0.001 per call

After (Ollama / Llama 3.1 8B):

Model: llama3.1
Same prompt
Average response time: 1.2 seconds (local, M2 Mac)
Cost: $0

Quality difference: Essentially none for this task. Llama 3.1 8B classifies emails correctly about 94-96% of the time in my testing, compared to GPT-3.5-turbo at around 96-97%. The 2% difference means maybe one misclassified email per week. Completely acceptable.

Document Type Detection#

After switch quality: Very good. The model correctly identifies invoice/contract/receipt/proposal 97%+ of the time when I give it the first page of text. Better than I expected, honestly.

One thing I had to change: my prompt for OpenAI could be relatively loose. Local models respond better to more explicit, structured prompts.

OpenAI prompt (loose):

What type of document is this? Invoice, contract, receipt, or proposal?
[document text]

Ollama prompt (structured):

You are a document classification assistant. Analyze the document text below and 
identify its type.

IMPORTANT: Respond with ONLY a valid JSON object. No explanation. No markdown.

Format: {"type": "invoice|contract|receipt|proposal|other", "confidence": "high|medium|low"}

Document text:
[document text]

The more explicit instruction to return valid JSON was necessary. GPT-3.5 would usually return JSON without being told. Llama 3.1 8B needed the instruction reinforced. Once I updated my prompts, the outputs were reliable.

Lead Scoring Summaries#

This one was trickier. The task involves some nuanced judgment — reading a form submission and writing a professional 2-sentence summary that captures the key details and suggests a follow-up priority.

My honest assessment: GPT-3.5 wrote better summaries. The language was more natural, the summaries were more insightful, and it made better judgment calls about priority.

Llama 3.1 8B was fine — the summaries were accurate and useful — but they had a slightly more mechanical feel. For internal workflow use where I'm the only one reading them, it's completely adequate. If these summaries were going to clients, I'd use GPT-4.

I kept this workflow on Ollama but added a "review" flag for any lead scored above a certain threshold, where I personally review the AI summary before using it.

Weekly Report Drafting#

This was the biggest quality gap. Report writing requires fluent prose and the ability to weave a coherent narrative from a list of completed tasks. Llama 3.1 8B produced grammatically correct text that covered all the facts, but it lacked the natural flow that GPT-4 could produce.

For this specific workflow, I switched from Ollama to a compromise: I use Ollama for the data aggregation and structuring step (which is mechanical), and GPT-4 for the final prose generation step (which benefits from the quality difference). This reduced my OpenAI costs by about 80% for this workflow while keeping the output quality high.
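The routing itself is trivial. A sketch of the idea with illustrative task names (the actual split lives in my n8n workflow, not in code):

```python
# Mechanical, high-volume steps stay local; low-volume prose goes to OpenAI.
LOCAL_TASKS = {"classify_email", "detect_doc_type", "extract_fields", "aggregate_tasks"}
CLOUD_TASKS = {"write_report_prose"}

def pick_provider(task: str) -> str:
    """Map a workflow step to a model. Defaults to free local inference."""
    if task in CLOUD_TASKS:
        return "openai/gpt-4"
    return "ollama/llama3.1"
```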


My Current Model Setup#

After a month of testing, here's what I actually use:

| Task | Model | Reason |
|---|---|---|
| Email classification | Phi-3 Mini | Fast, accurate enough, very low resource |
| Document detection | Llama 3.1 8B | Good accuracy, handles longer text |
| Data extraction from docs | Llama 3.1 8B | Good JSON output with proper prompts |
| Lead summaries | Llama 3.1 8B | Adequate quality for internal use |
| Report prose writing | GPT-4 (via API) | Quality matters, small volume |
| Research summaries | Llama 3.1 70B (server) | Better reasoning, justifies the resource use |

The 70B model runs on a rented GPU server (Lambda Labs, used only for batch jobs), not my laptop. For occasional heavy tasks it's cost-effective — a few dollars for an hour of GPU time is far less than equivalent GPT-4 API calls.


What My Costs Look Like Now#

Before switch:

  • OpenAI API: ~$54/month
  • n8n VPS: €3.79/month
  • Total: ~$59/month

After switch:

  • OpenAI API: ~$4/month (only report prose writing, low volume)
  • n8n VPS: €3.79/month
  • Lambda Labs (occasional): ~$2/month average
  • Total: ~$10/month

Monthly savings: ~$49

Annual savings: ~$588.


Would I Recommend This?#

Yes, with these caveats:

Do it if:

  • Your AI workflows run high volumes (100+ calls per day)
  • You're processing sensitive data that shouldn't leave your network
  • Your tasks are classification, extraction, or structured output generation
  • You have decent hardware (16GB RAM minimum, GPU helps a lot)

Think carefully if:

  • You need GPT-4-level prose quality in client-facing outputs
  • Your hardware is old or underpowered (CPU-only inference is slow)
  • You're doing tasks that require broad world knowledge or nuanced reasoning

The middle path (what I do): Use Ollama for the bulk of high-volume, mechanical tasks. Keep a minimal OpenAI subscription for the small percentage of tasks where quality genuinely matters. You get most of the cost savings with minimal quality compromise.


Frequently Asked Questions#

Does Ollama work as well as OpenAI GPT-4 for n8n automations?#

For most automation tasks — classification, summarization, data extraction — Llama 3.1 8B performs at roughly GPT-3.5 quality, sufficient for the vast majority of workflow automation. For complex reasoning or nuanced writing, GPT-4 still has an edge. Many users run Ollama for 90% of tasks and keep OpenAI as a fallback for the other 10%.

Can I run Ollama on a VPS server without a GPU?#

Yes. On a Hetzner CX31 (4 vCPU, 8GB RAM, €8/month), Llama 3.1 8B runs at about 3-5 tokens per second — slow but functional for background automations. CPU-only is fine for overnight batch jobs. For real-time workflows, GPU or Apple M-series is much better.
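To put those numbers in perspective, here is the rough latency math for a typical classification reply at CPU-only speeds:

```python
tokens_per_second = 4    # midpoint of the 3-5 tok/s range quoted above
reply_tokens = 100       # a short classification answer with a JSON object

seconds_per_call = reply_tokens / tokens_per_second
print(seconds_per_call)  # 25.0 -- fine for overnight batches, too slow for real time
```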

How do I connect n8n in Docker to Ollama on the host machine?#

Use host.docker.internal instead of localhost. In your n8n Ollama credential, set Base URL to http://host.docker.internal:11434. On Linux, add --add-host=host.docker.internal:host-gateway to your Docker run command.

Which Ollama model should I start with for n8n automation?#

Start with Llama 3.1 8B (ollama pull llama3.1). It handles the majority of automation tasks well. If too slow, try Phi-3 Mini. If quality is insufficient for specific tasks, try Mistral 7B or Llama 3.1 70B if your hardware supports it.
