Why Run DeepSeek Locally for n8n AI Workflows
Using n8n for AI workflows with a locally running model is one of the best infrastructure decisions you can make in 2026. I've been running n8n self-hosted for over a year, and the moment you connect it to a local LLM via Ollama, the whole stack becomes both cheaper and more private. No API tokens, no per-request costs, no data leaving your server. You pay once for the VPS — the inference runs free.
The alternative (the cloud DeepSeek API) costs ~$0.014–$0.55 per million tokens and sends your prompts to an external server. For sensitive data pipelines or high-volume automation, that adds up fast and creates compliance risk. This tutorial covers the self-hosted path: Ollama + DeepSeek on your VPS, wired into n8n's native AI Agent — no code, no community plugins, no API billing.
Prerequisites
- A VPS with at least 8 GB RAM for DeepSeek-R1 7B (16 GB for the 14B model)
- Ubuntu 22.04 or Debian 12 — the commands below assume a Debian-based system
- Root or sudo SSH access
- Docker installed (we'll use it for n8n; Ollama installs natively)
- n8n v1.22 or later — the AI Agent nodes require at least this version
VPS recommendation: Hetzner's General Purpose tier (dedicated vCPUs, from €16.49/mo) handles AI inference without throttling. On a tighter budget, Contabo's Cloud VPS 10 (4 vCPU, 8 GB RAM, $4.32/mo) runs DeepSeek-R1 7B fine — the slower CPU matters less for inference than available RAM.
What We're Building
By the end of this tutorial you'll have a working n8n AI Agent workflow that sends prompts to a locally running DeepSeek model and returns structured responses — entirely on your own VPS. Use it as a base for email summarization, webhook payload classification, content generation, or any LLM-in-the-loop automation you can wire through n8n.
Step 1: Install Ollama on Your VPS
SSH into your VPS and run the official Ollama install script. It registers Ollama as a systemd service and starts it automatically:
curl -fsSL https://ollama.com/install.sh | sh
Verify Ollama is running and accessible on its default port:
systemctl status ollama
curl http://localhost:11434
You should see "Ollama is running" in the curl response. If the service isn't active, start it with systemctl start ollama.
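If curl can't connect, check what Ollama is actually bound to and confirm the daemon version. A quick sanity check, assuming iproute2's ss is available (it is by default on Ubuntu and Debian):

# confirm something is listening on the default Ollama port
ss -ltn | grep 11434
# the version endpoint returns JSON like {"version":"..."}
curl http://localhost:11434/api/version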
Step 2: Pull the DeepSeek Model
Pull DeepSeek-R1. Start with the 7B variant if you have 8 GB RAM — it's ~4.7 GB to download:
ollama pull deepseek-r1:7b
Once the pull completes, do a quick sanity check in the terminal before touching n8n:
ollama run deepseek-r1:7b "Summarize what n8n is in one sentence."
If the command hangs or the process is killed, the VPS is out of free RAM. Check with free -h and stop anything consuming memory before retrying. For the 14B model, use deepseek-r1:14b and a 16 GB RAM instance.
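Optionally, test Ollama's REST API directly, since this HTTP interface is what n8n will talk to. A minimal request against the default port, with streaming disabled so you get a single JSON response (note that DeepSeek-R1 includes its reasoning in <think> tags in the raw output):

curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:7b",
  "prompt": "Reply with one word: ready",
  "stream": false
}'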
Step 3: Start n8n with Docker
If n8n isn't running yet, the fastest path is a single Docker command. This mounts a persistent volume so your workflows survive container restarts:
docker run -d \
--name n8n \
--restart unless-stopped \
-p 5678:5678 \
-v ~/.n8n:/home/node/.n8n \
docker.n8n.io/n8nio/n8n
Open http://YOUR_VPS_IP:5678 in a browser and complete the initial setup. I run n8n behind an Nginx reverse proxy with HTTPS — worth the extra 10 minutes of setup for any production workflow.
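If you want the same reverse-proxy setup, here's a minimal sketch for the Debian/Ubuntu Nginx layout. The domain is a placeholder, TLS is left to certbot, and the Upgrade headers matter because the n8n editor uses websockets:

# write a minimal server block (n8n.example.com is a placeholder)
sudo tee /etc/nginx/sites-available/n8n >/dev/null <<'EOF'
server {
    listen 80;
    server_name n8n.example.com;

    location / {
        proxy_pass http://127.0.0.1:5678;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
EOF
sudo ln -s /etc/nginx/sites-available/n8n /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx

Run certbot --nginx -d n8n.example.com afterwards to add HTTPS.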
Networking note: Because n8n runs inside Docker, localhost resolves to the container, not your VPS host. To reach Ollama from inside Docker, use http://172.17.0.1:11434 (the default Docker bridge gateway IP on Linux) instead of localhost:11434.
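On Docker 20.10+ you can avoid hard-coding the bridge IP by mapping the host to a stable hostname at container start. A sketch of the same docker run with one flag added:

docker run -d \
  --name n8n \
  --restart unless-stopped \
  -p 5678:5678 \
  --add-host=host.docker.internal:host-gateway \
  -v ~/.n8n:/home/node/.n8n \
  docker.n8n.io/n8nio/n8n

With this flag, the Ollama base URL becomes http://host.docker.internal:11434, and it keeps working even if you later move n8n to a custom Docker network.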
Step 4: Configure the Ollama Credential in n8n
In the n8n UI, navigate to Settings → Credentials → Add Credential. Search for Ollama and select it. Set the Base URL to the correct address for your setup:
- n8n in Docker (Linux): http://172.17.0.1:11434
- n8n running natively (not Docker): http://localhost:11434
Save the credential. n8n will not test the connection until you use it in a workflow — that's expected behavior.
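You can still verify reachability before building anything by calling Ollama from inside the n8n container. This assumes the Alpine-based n8n image ships busybox wget (curl isn't guaranteed to be present):

# should print "Ollama is running"
docker exec n8n wget -qO- http://172.17.0.1:11434

If this times out, Ollama is likely bound to 127.0.0.1 only — see the Tips section below.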
Step 5: Build an AI Agent Workflow
Create a new workflow in n8n. Add a Manual Trigger node as the starting point (swap this for a Webhook or Schedule Trigger once you're happy with the output).
Add an AI Agent node and connect it to the trigger. Inside the Agent node:
- Click Add Chat Model → select Chat Model (Ollama)
- Select the Ollama credential you just created
- Set Model to deepseek-r1:7b (must match the model name you pulled)
- In the Agent's System Message field, enter your instruction: e.g., "You are a concise assistant. Summarize the input text in 3 bullet points."
- Map the trigger's output to the Agent's Text input field
Connect the AI Agent output to a Set node or directly to a downstream action (Slack, email, database write — whatever your workflow needs).
Step 6: Test the Workflow
Click Test Workflow in the n8n editor. The first run will be slower as the model loads into memory — typically 5–15 seconds. Subsequent runs in the same session are much faster once the model is resident in RAM.
Check the output panel for the AI Agent node. You should see a text field containing DeepSeek's response. If the node shows a connection error, double-check the Ollama base URL in your credential — the Docker bridge IP is the most common gotcha.
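On the host, you can confirm the model is resident and see its memory footprint. Ollama unloads idle models after roughly five minutes by default (tunable via the OLLAMA_KEEP_ALIVE environment variable on the service), which is why the first run after a quiet spell is slow again:

# lists loaded models, their size in memory, and time until unload
ollama ps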
Tips and Common Mistakes
- Wrong Ollama URL in Docker: localhost inside Docker is the container, not the host. Always use 172.17.0.1 on Linux unless you've configured a custom Docker network.
- Model not found error: The model name in n8n must exactly match what Ollama has pulled. Run ollama list on the host to see pulled models and their exact names.
- Out of RAM: Ollama will silently fail or produce garbled output if the model can't fit in RAM. Run free -h before testing — you need ~6 GB free for the 7B model.
- Ollama listening on 127.0.0.1 only: By default, Ollama binds to 127.0.0.1. To expose it to Docker, set OLLAMA_HOST=0.0.0.0 in the service's environment and reload: systemctl daemon-reload && systemctl restart ollama (see the drop-in sketch after this list).
- Keep the system message short: DeepSeek-R1 follows instructions well — a one-sentence system prompt is usually enough. Lengthy system prompts increase latency noticeably on smaller instances.
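For that OLLAMA_HOST change, a systemd drop-in is safer than editing the unit file the installer wrote, since an Ollama upgrade can overwrite it:

sudo mkdir -p /etc/systemd/system/ollama.service.d
sudo tee /etc/systemd/system/ollama.service.d/override.conf >/dev/null <<'EOF'
[Service]
Environment="OLLAMA_HOST=0.0.0.0"
EOF
sudo systemctl daemon-reload && sudo systemctl restart ollama

Binding to 0.0.0.0 exposes port 11434 on every interface, so make sure your firewall blocks it from the public internet and allows only local traffic such as Docker's bridge.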
Next Steps
With the base workflow running, the most useful extensions for a DevOps stack:
- Add a Webhook trigger — point any external service at your n8n endpoint and process incoming payloads with DeepSeek in real time (see the curl example after this list)
- Chain multiple AI nodes — run DeepSeek for classification, then route branches based on its output using n8n's Switch node
- Connect to a vector store — n8n supports Qdrant, Pinecone, and Supabase pgvector natively for RAG workflows; pair one with DeepSeek for document Q&A over your own data
- Schedule summarization jobs — trigger the workflow on a cron to summarize daily logs, Slack digests, or monitoring alerts before they hit your inbox
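For the webhook extension above, the external call looks like this once a Webhook trigger replaces the Manual Trigger. The summarize path and the text field here are hypothetical; use whatever path you set in the Webhook node and whatever field you map into the Agent's Text input:

curl -X POST http://YOUR_VPS_IP:5678/webhook/summarize \
  -H "Content-Type: application/json" \
  -d '{"text": "Paste the log excerpt or email body to summarize."}'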
This setup — open-source workflow automation self-hosted on a single VPS — replaces what would otherwise cost $50–200/mo in LLM API fees for a moderately busy automation pipeline. The n8n self-hosted footprint on top of Ollama is under 1 GB of RAM at idle, leaving the rest for the model itself.