n8n AI Agent Workflows with Ollama: Run Local AI Automations in 2026#

Q: What is Ollama and why use it with n8n?

Ollama is an open-source tool that lets you run large language models (LLMs) locally on your computer. Pairing it with n8n means your AI automation workflows run entirely on your own hardware — no API costs, no data sent to external services, and full privacy. It's ideal for businesses handling sensitive data or developers who want to experiment without per-token charges.

Q: Which Ollama models work best with n8n AI agents?

For general-purpose AI agents in n8n, Llama 3.1 8B is an excellent balance of speed and capability on most hardware. If you have a powerful GPU, Llama 3.1 70B or Mistral Large delivers near-GPT-4 quality. For coding tasks, Deepseek Coder V2 performs exceptionally well. Phi-3 Mini is a great option on limited hardware.

Q: Do I need a GPU to run Ollama?

No. Ollama runs on CPU as well, though it's significantly slower. For practical use, a modern CPU with 16GB RAM can run smaller models like Phi-3 Mini or Llama 3.1 8B at usable speeds (a few tokens per second). A GPU with 8GB+ VRAM (like an NVIDIA RTX 3060 or Apple M-series chip) dramatically improves performance.

Q: How do I connect Ollama to n8n?

n8n has a native Ollama credential and integration since version 1.28. Install Ollama on your machine, pull a model, then in n8n go to Credentials → New → Ollama, enter your Ollama base URL (http://localhost:11434 for local). Then use the "Ollama Model" node or the AI Agent node with Ollama as the language model provider.

Q: What can n8n AI agents actually do?

n8n AI agents can use "tools" — which are other n8n nodes or external APIs. This means an agent can search the web, read and write files, query databases, send emails, update CRM records, and more — all autonomously based on instructions you give it. The agent decides which tools to use and in what order to complete the task.

Q: Is n8n + Ollama production-ready for businesses?

Yes, with caveats. For internal business workflows, document processing, and data enrichment, n8n + Ollama is very production-ready in 2026. For customer-facing AI features requiring highest accuracy, you may still want OpenAI as a fallback. The combination shines for privacy-sensitive automations where cloud AI is not acceptable.

Here's something that felt like science fiction three years ago: my laptop is running an AI that reads my emails, decides which ones need urgent replies, drafts responses in my writing style, and flags anything that looks like a potential client lead — all automatically, all locally, without a single request leaving my network.

The monthly AI bill? Zero.

This is what n8n AI agents powered by Ollama look like in 2026. If you've been watching the explosion of local AI tools and wondering how to actually plug them into real workflows, this guide is your answer.

We'll cover how to install Ollama, connect it to n8n, understand how AI agents work, and build three complete agent workflows you can run today. I'm assuming you have basic familiarity with n8n — if not, read the n8n Complete Beginner Guide first.

Why Local AI + Automation Is the Setup to Have in 2026#

Before we get technical, let me explain why this combination matters.

The Problem with Cloud AI in Automation#

When you connect Zapier or Make to OpenAI, every piece of data you process gets sent to OpenAI's servers. For most personal projects, that's fine. But consider:

Processing client contracts or invoices (sensitive business data)
Analyzing customer support tickets (potentially personal information)
Working with health records, legal documents, or financial data
Anything covered by GDPR, HIPAA, or similar regulations

Cloud AI in these contexts is either legally problematic or creates business risk. Your automation tool shouldn't be sending client data to a third-party AI service without explicit consent.

Local AI solves this entirely. With Ollama, the language model runs on your hardware. Data stays in your network.

The Problem with Cost#

ChatGPT's API is not expensive for occasional use. But in automation workflows that run hundreds or thousands of times per day, token costs stack up fast. A workflow that processes 500 documents per day at 1,000 tokens each costs real money at commercial API rates.

With Ollama, the same workflow costs the electricity to run your server.

The Quality Has Caught Up#

This point deserves emphasis. In 2024, local models were noticeably worse than GPT-4 for complex reasoning. In 2026, that gap has shrunk dramatically. Llama 3.1 70B matches GPT-4 on many benchmarks. Llama 3.1 8B — which runs comfortably on a laptop — is roughly GPT-3.5-turbo quality.

For most business automation tasks (classification, summarization, extraction, drafting), the 8B models are entirely sufficient.

Setting Up Ollama#

Installation#

Ollama installs in seconds. Go to ollama.com and download the installer for your OS.

After installation, open your terminal:

bash

# Verify installation
ollama --version

# Pull your first model (Llama 3.1 8B — about 5GB)
ollama pull llama3.1

# Or the smaller, faster Phi-3 Mini (about 2GB)
ollama pull phi3

# For coding tasks, Deepseek Coder V2
ollama pull deepseek-coder-v2

Models download to ~/.ollama/models. The first download takes a few minutes depending on your connection.

Running Ollama#

After installation, Ollama runs as a background service automatically. It exposes an API at http://localhost:11434.

Test it works:

bash

# Chat with a model directly in the terminal
ollama run llama3.1

# Test the API directly
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "Summarize this in one sentence: n8n is a workflow automation tool.",
  "stream": false
}'

Choosing the Right Model for Your Hardware#

| Model | Size | RAM Needed | Best For | |---|---|---|---| | Phi-3 Mini | 2.3GB | 8GB RAM | Fast responses, limited hardware | | Llama 3.1 8B | 4.9GB | 16GB RAM | General purpose, good balance | | Mistral 7B | 4.1GB | 16GB RAM | Instruction following | | Llama 3.1 70B | 40GB | GPU with 48GB+ | Near-GPT-4 quality | | Deepseek Coder V2 | 8.9GB | 16GB RAM | Code generation | | Gemma 2 9B | 5.4GB | 16GB RAM | Summarization, Q&A |

For most n8n automation tasks, Llama 3.1 8B is the sweet spot in 2026. It's fast enough for real-time workflows and capable enough for complex instructions.

Connecting Ollama to n8n#

n8n added native Ollama support in version 1.28. Here's how to set it up:

Step 1: Create an Ollama Credential#

In n8n, go to Settings → Credentials → New
Search for "Ollama"
Set the Base URL: http://localhost:11434 (or your server's IP if n8n is on a different machine)
Save the credential as "Local Ollama"

If n8n is running in Docker and Ollama is on the host machine, use http://host.docker.internal:11434 instead of localhost.

Step 2: Test the Connection#

Create a new workflow. Add an Ollama Model node. Select your "Local Ollama" credential and choose llama3.1 as the model.

Run a quick test: connect a Manual Trigger → Basic LLM Chain (set to use your Ollama model) → run it with a test prompt. If you get a response, you're connected.

Understanding n8n AI Agents#

Before building, let's understand what an "AI agent" actually means in n8n. This is different from just sending text to an LLM.

The Difference Between LLM Chains and Agents#

An LLM Chain is a simple pipeline: you send a prompt, you get a response. The model doesn't take any actions. It just generates text.

[Input] → [LLM] → [Output text]

An AI Agent can use tools. It receives a goal, decides what tools to use, executes them, observes results, and continues until the goal is achieved.

[Goal] → [Agent: thinks about what to do]
              ↓ [uses tool: web search]
              ↓ [reads result]
              ↓ [uses tool: send email]
              ↓ [task complete]

The agent loop is: Think → Act → Observe → Repeat.

How n8n Implements Agents#

n8n's AI Agent node implements this loop natively. You give it:

A language model (our Ollama model)
A system prompt (what the agent's job is)
Tools (other nodes it can call: search, database queries, HTTP requests, etc.)
Memory (optional: conversation history so it can maintain context)

The agent runs the LLM, sees if it wants to use a tool, calls that tool, feeds the result back to the LLM, and repeats until it has a final answer.

Tools in n8n Agents#

The real power of n8n agents is that any node can be a tool. You can give your agent access to:

HTTP Request tool: Call any external API
SerpAPI / Brave Search tool: Search the web
Calculator tool: Do math
Code tool: Execute JavaScript
Supabase / Postgres tool: Query your database
Gmail tool: Read and send emails
Custom Function tool: Any n8n logic you define

Building AI Agent Workflow #1: Email Triage Agent#

This agent reads your inbox, categorizes emails by priority and type, drafts replies for routine messages, and creates a daily digest. This is the workflow I use personally.

What It Does#

Every morning at 7 AM:

Fetches emails from the last 24 hours via Gmail
For each email, the AI agent classifies it (urgent/normal/newsletter/spam)
Drafts a reply for emails marked "normal" that follow patterns (meeting confirmations, quote requests)
Creates a prioritized summary in Notion
Sends you a Slack message with urgent items only

Building It#

Trigger: Schedule Trigger (daily at 7 AM)

Gmail Node:

Operation: Get Many
Filters: After yesterday's date, unread only
Return limit: 50

Loop Over Items Node: Process each email individually

AI Agent Node (the core):

Model: Ollama / llama3.1
System prompt:

You are an email triage assistant. For each email, you must:
1. Classify priority: "urgent", "normal", "newsletter", "spam"
2. Classify type: "client", "meeting", "invoice", "newsletter", "personal", "other"
3. Write a brief 2-sentence summary
4. If type is "meeting" or "client" and priority is "normal", draft a polite reply

Always respond in valid JSON format:
{
  "priority": "urgent|normal|newsletter|spam",
  "type": "client|meeting|invoice|newsletter|personal|other",
  "summary": "...",
  "draft_reply": "..." or null
}

Input: From: {{ $json.from }}\nSubject: {{ $json.subject }}\nBody: {{ $json.snippet }}

IF Node: Split by priority

Urgent path → format for Slack notification
Normal with draft → save draft to Gmail drafts
Newsletter/spam → archive automatically

Notion Node (create page):

Database: "Email Digest"
Properties: date, priority, type, summary, original sender

Slack Node: Send urgent items summary

The Result#

You wake up to a Slack message with only the things that actually need your attention today. Everything else is already sorted, archived, or has a draft reply waiting for your review. This saves 20-30 minutes of inbox management every morning.

Building AI Agent Workflow #2: Document Intelligence Agent#

This agent processes documents (PDFs, contracts, invoices) dropped into a folder, extracts structured data, and populates your database — automatically.

What It Does#

When you drop a PDF into a designated Google Drive folder:

n8n detects the new file
Extracts the text content
AI agent identifies the document type (invoice, contract, receipt, proposal)
Extracts structured data based on document type (vendor, amount, dates, parties, terms)
Creates a record in your Airtable or Notion database
Tags the original file and moves it to the right subfolder

Building It#

Trigger: Google Drive Trigger (file created in specific folder)

Google Drive Node: Download the file content

Extract from File Node: Extract text from PDF (n8n has a built-in "Extract from File" node for PDFs, docx, etc.)

AI Agent Node:

Model: Ollama / llama3.1
System prompt:

You are a document processing specialist. Extract structured data from business documents.

First, identify the document type: invoice, contract, receipt, proposal, or other.

Based on the type, extract relevant fields:
- Invoice: vendor_name, invoice_number, date, due_date, total_amount, line_items (array), currency
- Contract: parties (array), effective_date, expiration_date, key_terms (array), contract_value
- Receipt: vendor, date, total_amount, items (array), payment_method
- Proposal: client_name, project_title, total_value, validity_date, key_services (array)

Return ONLY valid JSON. If a field is not found, use null.

Input: {{ $json.text }} (the extracted PDF text)
Tools: Calculator (for validating totals)

Switch Node: Route based on document type returned by agent

Airtable/Notion Node: Create record with extracted fields

Google Drive Node: Move file to appropriate subfolder and update metadata

Why This Matters#

If you process 20+ documents per week (invoices, contracts, client proposals), manually entering data takes significant time and introduces errors. This workflow does it in seconds, with no per-page AI costs, on your own hardware.

Building AI Agent Workflow #3: Research and Briefing Agent#

This agent conducts research on a topic, synthesizes information from multiple sources, and produces a structured briefing document — triggered by a simple form submission.

What It Does#

You submit a topic via an n8n Form. The agent:

Searches the web for recent, relevant information (using SerpAPI or Brave Search)
Fetches and reads the top 5 articles
Identifies key themes, contrasting viewpoints, and notable data points
Writes a structured 500-word briefing with sources
Saves to Notion and emails it to you

The Multi-Step Agent Reasoning#

What makes this impressive is watching the agent chain its reasoning:

Goal: "Research the current state of solid-state batteries for EVs"

Agent thinks: I should search for recent news first
→ Tool: Web Search("solid-state batteries EV 2026")
→ Gets: 5 search results with URLs and snippets

Agent thinks: I should read the most relevant articles
→ Tool: HTTP Request (fetch article 1)
→ Tool: HTTP Request (fetch article 2)
→ Tool: HTTP Request (fetch article 3)

Agent thinks: I have enough information to write the briefing
→ Produces: Structured 500-word document

Building It#

Trigger: n8n Form (with a single "Topic" text field)

AI Agent Node (with tools):

Model: Ollama / llama3.1 (or llama3.1:70b if you have the hardware, for better research quality)
System prompt:

You are a research analyst. When given a topic, follow these steps:
1. Search for recent information using the search tool
2. Read the 3 most relevant articles using the fetch tool
3. Synthesize the information into a structured briefing

Your briefing must include:
- Executive Summary (2-3 sentences)
- Key Findings (5 bullet points)
- Notable Data/Statistics
- Key Players/Companies
- Outlook and Implications
- Sources (list the URLs you read)

Be factual, cite specific data where available, and note any conflicting information.

Tools:
- SerpAPI Search Tool (for web searches)
- HTTP Request Tool (for fetching article content)
- Code Tool (for text cleaning/formatting)

Notion Node: Create new page in "Research Briefings" database

Gmail Node: Send briefing to your email

A Note on Agent Reliability#

Local models are impressive but not perfect. For research tasks, Llama 3.1 8B sometimes hallucinates sources or invents data. Two mitigation strategies:

Use a larger model: Llama 3.1 70B or Mistral Large makes significantly fewer errors on research tasks
Validate with grounding: After the agent produces its briefing, run a second step that cross-checks the cited URLs actually exist (HTTP check) and flags any that don't

Advanced Patterns: Memory and Multi-Agent Systems#

Giving Your Agent Memory#

By default, n8n AI agents don't remember previous runs. Add memory to create agents that learn from past interactions:

In-workflow memory (within one execution): Use the Window Buffer Memory node. The agent remembers the last N conversation turns. Good for multi-step conversations.

Cross-workflow memory (persists between runs): Store relevant facts in a Supabase or Postgres database. At the start of each run, fetch relevant memories and inject them into the system prompt.

Example: an email agent that remembers "User prefers replies under 50 words" after you tell it once.

javascript

// Fetch memories from database
const memories = await queryDatabase(`
  SELECT content FROM agent_memories 
  WHERE category = 'email_preferences'
  ORDER BY created_at DESC 
  LIMIT 5
`);

// Inject into system prompt
const systemPrompt = `
You are an email assistant.

User preferences from memory:
${memories.map(m => `- ${m.content}`).join('\n')}

Current task: ...
`;

Multi-Agent Orchestration#

Complex tasks benefit from multiple specialized agents working together. n8n supports this through the Execute Workflow node.

Example: a content creation system:

Orchestrator Agent
├── Research Agent (fetches information)
├── Outline Agent (structures the article)
├── Writing Agent (writes each section)
└── Editor Agent (reviews and improves)

Each sub-agent is a separate n8n workflow. The orchestrator calls them in sequence, passing outputs as inputs. This mirrors how professional teams work — and produces significantly better results than a single "write me an article" prompt.

Performance Optimization Tips#

Running local AI in production workflows requires some care:

Use the right model size: Don't use 70B when 8B works. The 8B model is 5-10x faster. Profile your tasks — most classification and extraction tasks work perfectly with smaller models.

Enable GPU acceleration: On NVIDIA GPUs, set OLLAMA_NUM_GPU=1 in Ollama's environment. On Apple Silicon, GPU acceleration is automatic.

Batch small tasks: Instead of running the agent once per email, batch 10 emails into one prompt when the tasks are independent. Fewer LLM calls = faster overall execution.

Cache common prompts: For repeated classification tasks (same type of document, same type of email), cache results in a simple key-value store. Many inputs are near-identical.

Set timeouts: n8n AI agent nodes have a timeout setting. Set it to 60-120 seconds for complex tasks. Without a timeout, a stuck model can block your entire workflow.

Monitor with execution logs: n8n's execution history shows you exactly how long each node takes. Use this to identify bottlenecks.

Connecting Ollama to Cloud AI as a Fallback#

For mission-critical workflows, consider a fallback setup: try Ollama first, fall back to OpenAI if it fails or takes too long.

In n8n, implement this with:

Try/Catch: Wrap your Ollama node in error handling
IF node: Check if response quality meets threshold (e.g., response length > 50 chars)
OpenAI node: Used as fallback

This gives you the cost savings of local AI 95% of the time while maintaining reliability for the 5% of edge cases where the local model struggles.

Security Considerations#

Running AI locally is generally more secure than cloud, but some hygiene matters:

Prompt injection: If user-submitted data goes directly into your agent's prompt, a malicious user could manipulate the agent's behavior. Sanitize inputs and use structured prompt formats that separate instructions from data.

API exposure: Never expose your Ollama port (11434) to the public internet. It has no authentication by default. Keep it on a private network or behind a reverse proxy with auth.

Sensitive data in logs: n8n's execution logs store all data that flows through nodes, including AI inputs/outputs. If you're processing sensitive data, either disable execution log storage or use n8n's data masking features.

Model updates: Keep your Ollama models updated. Run ollama pull llama3.1 periodically to get security and quality improvements.

Frequently Asked Questions#

What is Ollama and why use it with n8n?#

Ollama is an open-source tool that runs large language models locally on your computer. Pairing it with n8n means your AI workflows run entirely on your own hardware — no API costs, no data sent to external services, full privacy. It's ideal for businesses handling sensitive data or developers who want to experiment without per-token charges.

Which Ollama models work best with n8n AI agents?#

For general-purpose agents, Llama 3.1 8B is an excellent balance of speed and capability. If you have a powerful GPU, Llama 3.1 70B or Mistral Large delivers near-GPT-4 quality. For coding tasks, Deepseek Coder V2 performs exceptionally well. Phi-3 Mini is great on limited hardware.

Do I need a GPU to run Ollama?#

No. Ollama runs on CPU, though it's slower. A modern CPU with 16GB RAM can run smaller models like Phi-3 Mini or Llama 3.1 8B at usable speeds. A GPU with 8GB+ VRAM dramatically improves performance.

How do I connect Ollama to n8n?#

n8n has a native Ollama credential since version 1.28. Install Ollama, pull a model, then in n8n go to Credentials → New → Ollama and enter your Ollama base URL (http://localhost:11434 for local). Use the AI Agent node with Ollama as the language model provider.

What can n8n AI agents actually do?#

n8n AI agents can use tools — other n8n nodes or external APIs. This means an agent can search the web, read files, query databases, send emails, update CRM records, and more — all autonomously based on your instructions. The agent decides which tools to use and in what order to complete the task.

Is n8n + Ollama production-ready for businesses?#

Yes, with caveats. For internal workflows, document processing, and data enrichment, n8n + Ollama is very production-ready in 2026. For customer-facing AI requiring highest accuracy, you may want OpenAI as a fallback. The combination shines for privacy-sensitive automations where cloud AI isn't acceptable.

Where to Go from Here#

Local AI + n8n automation is still early enough that being competent with it puts you genuinely ahead. Most businesses haven't figured out how to deploy AI in their workflows at all, let alone privately and cost-effectively.

The three workflows in this guide — email triage, document intelligence, and research briefing — are real systems that save real hours. Start with one that matches your biggest pain point. Get it working, run it for a week, and watch what happens to your workload.

Once you have the pattern down (trigger → process data → AI agent → action), you'll start seeing automation opportunities everywhere.

Next steps: Explore n8n workflow templates for freelancers and solopreneurs for ready-to-deploy automations, or return to the n8n beginner's guide to solidify the fundamentals.

n8n AI Agent Workflows with Ollama: Run Local AI Automations in 2026

n8n AI Agent Workflows with Ollama: Run Local AI Automations in 2026#

Why Local AI + Automation Is the Setup to Have in 2026#

The Problem with Cloud AI in Automation#

The Problem with Cost#

The Quality Has Caught Up#

Setting Up Ollama#

Installation#

Running Ollama#

Choosing the Right Model for Your Hardware#

Connecting Ollama to n8n#

Step 1: Create an Ollama Credential#

Step 2: Test the Connection#

Understanding n8n AI Agents#

The Difference Between LLM Chains and Agents#

How n8n Implements Agents#

Tools in n8n Agents#

Building AI Agent Workflow #1: Email Triage Agent#

What It Does#

Building It#

The Result#

Building AI Agent Workflow #2: Document Intelligence Agent#

What It Does#

Building It#

Why This Matters#

Building AI Agent Workflow #3: Research and Briefing Agent#

What It Does#

The Multi-Step Agent Reasoning#

Building It#

A Note on Agent Reliability#

Advanced Patterns: Memory and Multi-Agent Systems#

Giving Your Agent Memory#

Multi-Agent Orchestration#

Performance Optimization Tips#

Connecting Ollama to Cloud AI as a Fallback#

Security Considerations#

Frequently Asked Questions#

What is Ollama and why use it with n8n?#

Which Ollama models work best with n8n AI agents?#

Do I need a GPU to run Ollama?#

How do I connect Ollama to n8n?#

What can n8n AI agents actually do?#

Is n8n + Ollama production-ready for businesses?#

Where to Go from Here#

Frequently Asked Questions

What is Ollama and why use it with n8n?

Which Ollama models work best with n8n AI agents?

Do I need a GPU to run Ollama?

How do I connect Ollama to n8n?

What can n8n AI agents actually do?

Is n8n + Ollama production-ready for businesses?

Related Guides

n8n Complete Beginner Guide 2026: Build Your First Automation Workflow

18 n8n Workflow Templates for Freelancers and Solopreneurs (2026)

AI Integration for Next.js + Supabase Applications