
    Local Voice Assistant: Whisper + Ollama + Piper on the Raspberry Pi

    Image: Microphone – symbol of local voice processing (Photo: Holger.Ellgaard, Wikimedia Commons, CC BY-SA 3.0)

    Voice Control Without the Cloud: Is That Really Possible?

    Alexa, Google Home, Siri — they all work well, but share one common denominator: your voice commands end up on someone else’s servers. In the smart home context, where commands like “turn off the bedroom light” or “unlock the front door” are the norm, that’s unnecessary data sharing.

    The good news: in 2026, a fully local voice assistant is no longer a DIY project requiring hours of configuration. With Home Assistant’s Wyoming protocol, Whisper for speech recognition, Ollama as the LLM backend, and Piper for text-to-speech, you have all the ingredients — and the Raspberry Pi 5 is powerful enough to process everything locally.


    The Architecture: Four Building Blocks, One System

    Microphone → [STT] Whisper → [LLM] Ollama → [TTS] Piper → Speaker
                     ↕                ↕               ↕
                 Wyoming Protocol ← Home Assistant Assist

    The Wyoming protocol is the glue: it defines how Home Assistant communicates with external STT, TTS, and wake word services. All components run as Docker containers on the home server and are automatically discovered by HA.
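
    To get a feel for the protocol: Wyoming events are single JSON lines sent over a plain TCP socket, so you can probe a running service straight from the shell. A minimal sketch, assuming the Whisper container from the setup below is listening on port 10300 (the describe/info handshake is part of the Wyoming spec; the -q flag depends on your netcat build):

    # Ask a Wyoming service to describe itself; it replies with a single
    # "info" event (one JSON object per line) listing its capabilities.
    # -q 1 keeps the connection open briefly so the reply can arrive.
    echo '{"type": "describe"}' | nc -q 1 localhost 10300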

    Image: Illustration of the speech-to-text pipeline (Wikimedia Commons, LGPL)

    Block 1: Whisper (Speech-to-Text)

    wyoming-faster-whisper is the recommended STT component for HA. It’s built on faster-whisper — a CTranslate2 reimplementation that runs up to 4x faster than the original PyTorch model on CPU.

    Model recommendations for Pi 5:

    Model            RAM Required   Quality
    tiny             ~1 GB          for testing
    small            ~2 GB          good balance
    large-v3-turbo   ~3 GB          recommended (almost as good as large-v3)

    The large-v3-turbo model is OpenAI's pruned variant of large-v3, cut from 32 to 4 decoder layers, with nearly the same recognition accuracy and significantly less RAM. For non-English languages it's clearly the first choice.

    Block 2: Ollama (Language Model)

    Ollama runs as a Docker container and exposes an OpenAI-compatible API. Home Assistant has had a native Ollama integration since 2024, which uses the LLM directly as a conversation agent in the Assist pipeline.
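
    Because the API is OpenAI-compatible, you can test the conversation agent outside of HA with a plain curl call. A quick sketch against the default port, assuming the llama3.2:3b model has already been pulled (see the setup section below):

    # Chat completion via Ollama's OpenAI-compatible endpoint
    curl http://localhost:11434/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
            "model": "llama3.2:3b",
            "messages": [
              {"role": "system", "content": "You are a smart home assistant."},
              {"role": "user", "content": "Turn off the bedroom light."}
            ]
          }'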

    Model recommendations for voice assistants:

    • Pi 5, 8 GB RAM: llama3.2:3b (~2 GB) or gemma3:1b (~800 MB)
    • Pi 5, 16 GB RAM: llama3.1:8b (~5 GB) for better language comprehension quality
    • Specifically for HA control: fixt/home-3b-v3 — a model fine-tuned for home automation that takes device states in its prompt and returns control actions as function calls

    Let's be honest about response times: llama3.2:3b takes 5–15 seconds on the Pi 5 for a short reply. That's borderline for a voice assistant, but acceptable, especially when the server is dedicated and has no other load spikes.

    Block 3: Piper (Text-to-Speech)

    Piper is the Open Home Foundation’s neural TTS system, optimized for embedded hardware. A typical sentence is synthesized on the Pi 5 in under one second. Many languages are supported with multiple voices at different quality levels:

    • en_US-lessac-high — high quality American English male voice
    • en_US-amy-medium — medium quality female voice
    • en_GB-alan-medium — British English male voice
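
    To audition a voice before wiring it into HA, the standalone piper CLI can synthesize straight to a WAV file. A sketch assuming piper is installed locally and the voice model file has already been downloaded (file names are illustrative):

    # Synthesize a test sentence with the Lessac voice (model path is an example)
    echo "The living room lights are now off." | \
      piper --model en_US-lessac-high.onnx --output_file test.wav
    aplay test.wav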

    Block 4: wyoming-satellite (optional, but useful)

    Don’t want to plug a microphone directly into the home server, but want voice input and output in multiple rooms? wyoming-satellite turns an inexpensive Raspberry Pi Zero 2W with a USB microphone into a distributed voice satellite. Audio streams go over LAN to the server; processing stays centralized.
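
    On the satellite Pi itself, the project's run script wires microphone and speaker to a Wyoming port. Roughly like this, following the wyoming-satellite README (satellite name and audio settings are placeholders to adapt):

    # On the Pi Zero 2W: stream mic audio to the server, play TTS replies locally
    script/run \
      --name 'hallway-satellite' \
      --uri 'tcp://0.0.0.0:10700' \
      --mic-command 'arecord -r 16000 -c 1 -f S16_LE -t raw' \
      --snd-command 'aplay -r 22050 -c 1 -f S16_LE -t raw'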


    Setup: Docker Compose in 30 Minutes

    All components can be set up as a Docker stack. Here’s a complete Compose file for the home server:

    version: "3.8"
    
    services:
      wyoming-whisper:
        image: rhasspy/wyoming-whisper
        command: --model large-v3-turbo --language en
        ports:
          - "10300:10300"
        volumes:
          - whisper-data:/data
        restart: unless-stopped
    
      ollama:
        image: ollama/ollama
        ports:
          - "11434:11434"
        volumes:
          - ollama-data:/root/.ollama
        restart: unless-stopped
    
      wyoming-piper:
        image: rhasspy/wyoming-piper
        command: --voice en_US-lessac-high
        ports:
          - "10200:10200"
        volumes:
          - piper-data:/data
        restart: unless-stopped
    
    volumes:
      whisper-data:
      ollama-data:
      piper-data:

    After startup:

    # Pull the Ollama model
    docker exec ollama ollama pull llama3.2:3b
    # or for HA-specific control:
    docker exec ollama ollama pull fixt/home-3b-v3
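
    A quick sanity check that the stack came up and the model is in place (ports as defined in the Compose file above):

    # Containers running?
    docker compose ps
    # Ollama reachable and the model listed?
    curl http://localhost:11434/api/tags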

    Configuring Home Assistant

    1. Add Wyoming integration: Settings → Integrations → Wyoming Protocol

    – STT: <server-ip>:10300

    – TTS: <server-ip>:10200

    2. Add Ollama integration: Settings → Integrations → Ollama

    – URL: http://<server-ip>:11434

    – Model: fixt/home-3b-v3 or llama3.2:3b

    3. Create Assist pipeline: Settings → Voice → Assist

    – Speech recognition: Wyoming (wyoming-faster-whisper)

    – Conversation agent: Ollama

    – Text-to-speech: Wyoming (wyoming-piper)

    After that, any HA device with a microphone — or a satellite Pi in the hallway — can accept voice commands.


    Realistic Performance Numbers

    On a Raspberry Pi 5 with 16 GB RAM and SSD:

    Phase                                  Duration
    Wake word detection (openwakeword)     <100 ms
    STT (large-v3-turbo, short sentence)   1–3 s
    LLM response (llama3.2:3b)             5–15 s
    TTS (Piper, one sentence)              <1 s
    Total                                  ~7–20 s

    That’s not Alexa-fast. Users wanting shorter latency can drop to smaller models (gemma3:1b + small Whisper) and accept slightly lower quality — or use Speaches as a combined STT/TTS server that mimics the OpenAI API and integrates with n8n and other tools.


    When Is It Worth It?

    A local voice assistant is worth it primarily when:

    • Privacy is a priority — not a single voice fragment leaves the home network
    • Reliability without internet matters — no outage during cloud disruptions
    • HA integration is the focus — no other assistant knows your HA entities as well as a directly integrated local LLM
    • You already run a home server — the additional load is manageable

    Those who primarily want to play music, set timers, or check the weather will find a cloud solution more convenient day-to-day. The local assistant shines with complex HA commands: “Turn all lights on the ground floor to 30 percent and close the blinds” — a sentence that purpose-fine-tuned local models like home-3b-v3 can directly translate into HA actions.


    AI Agents in 2026: From Chatbots to Autonomous Systems at Home

    From Concept to Reality: AI Agents in Everyday Life

    Just two years ago, autonomous AI agents were mostly a topic for research labs and tech giants. In 2026, that has fundamentally changed. Models run locally on a Raspberry Pi 5, Home Assistant talks directly to a self-hosted LLM, and n8n workflows use agents that make decisions independently.

    But what exactly is an AI agent, and why is now the right time to start exploring them?


    What Makes an AI Agent?

    A classic language model answers a question — and that’s it. An AI agent, on the other hand, can:

    • Pursue goals, not just respond to prompts
    • Use tools (call APIs, read files, execute code)
    • Plan multiple steps and pass results between them
    • Collaborate with other agents

    The key paradigm shift: instead of specifying *how* to do something, you tell the agent *what* the goal is. Intent-based computing instead of instruction-based computing.


    The Most Important Trends in 2026

    1. Multi-Agent Systems Become the Standard

    Single agents hit limits quickly. The answer: teams of specialized agents that solve complex tasks together. Frameworks like CrewAI (44,000+ GitHub stars) and Microsoft Research’s AutoGen (54,000+ GitHub stars) make it possible to coordinate agents with clearly defined roles — researcher, writer, reviewer — into a coherent workflow.

    For home users, this is especially interesting: these systems can run entirely locally, with no dependency on cloud APIs.

    2. Local LLMs Have Reached Maturity

    Ollama has established itself as the de facto standard for local model management. A single command is enough to start models like Llama 3.2, Mistral 7B, or DeepSeek-R1 — with an OpenAI-compatible API that works with virtually every tool.
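
    That single command in practice (the model is downloaded automatically on first use; the name follows the Ollama registry):

    # Pulls the model on first run, then drops into an interactive chat
    ollama run llama3.2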

    Hardware requirements in 2026 are manageable:

    Model            RAM Required   Best For
    Llama 3.2 3B     4 GB           Simple tasks, fast responses
    Mistral 7B       8 GB           Good all-round model
    Llama 3.1 8B     8–10 GB        More complex reasoning
    Qwen 2.5 Coder   8 GB           Code generation

    A Raspberry Pi 5 with 8 GB RAM can run Llama 3.2 3B without issue — not lightning fast, but completely adequate for many home automation tasks.

    3. Home Assistant Becomes the AI Control Center

    Home Assistant has evolved into the natural integration platform for local AI agents. Since the September 2025 blog post, HA has supported AI Task entities, tool calling, and agentic loops.

    The home-llm integration goes even further: a local model gets access to all HA entities and can control devices autonomously, without having to explicitly program every command. The model understands context — “it’s getting cold” can result in the heat being turned up and the blinds closing.

    Practical Example: Local Voice Assistant with Whisper + Ollama

    Microphone → Whisper (Speech-to-Text, local)
               → Ollama Llama 3.2 (Intent + Tool-Calling)
               → Home Assistant REST API
               → Device is controlled

    Latency: a few seconds end to end on a Raspberry Pi 5 with 16 GB RAM (see the realistic performance numbers in the voice assistant post above). Completely offline — no data leaves the home network.
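
    The last hop in that pipeline is an ordinary Home Assistant REST call. A sketch of what the tool-calling step ultimately boils down to (host, token, and entity ID are placeholders):

    # What the agent's tool call resolves to: a standard HA service call
    curl -X POST http://homeassistant.local:8123/api/services/light/turn_off \
      -H "Authorization: Bearer $HA_TOKEN" \
      -H "Content-Type: application/json" \
      -d '{"entity_id": "light.bedroom"}'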

    4. Raspberry Pi AI HAT+ 2: Dedicated AI Hardware for the Pi

    Since January 2026, the Raspberry Pi AI HAT+ 2 has been available. The Hailo-10H accelerator brings up to 40 TOPS (INT4) and 8 GB of its own LPDDR4X memory. This significantly offloads the main processor and enables faster inference at noticeably lower power consumption.

    For home automation, this means: a Pi 5 with AI HAT+ 2 can continuously evaluate sensor data, detect anomalies, and act proactively — without noticeable performance overhead for other tasks.

    5. n8n + Ollama: Visual Agent Workflows Without Coding

    n8n has established itself as the ideal platform for agentic workflows that don’t require programming. Combined with a local Ollama server, powerful automations emerge:

    • Energy reporting: Sensor data from Home Assistant → Ollama analyzes → WhatsApp summary
    • Smart alerts: Anomaly in consumption data → Agent evaluates context → Push notification only when truly relevant
    • Shopping assistant: Inventory sensor drops below threshold → Agent checks calendar and prices → Shopping list in Notion

    A simple n8n setup for local AI (abridged; node connections omitted):

    {
      "nodes": [
        { "type": "n8n-nodes-base.scheduleTrigger" },
        { "type": "@n8n/n8n-nodes-langchain.lmOllama",
          "parameters": { "model": "llama3.2", "baseUrl": "http://localhost:11434" }
        },
        { "type": "@n8n/n8n-nodes-langchain.agent" }
      ]
    }

    Getting Started: Practical Recommendations

    Level 1 — Experiment locally (doable right now):

    1. Install Ollama: curl -fsSL https://ollama.com/install.sh | sh
    2. Pull a model: ollama pull llama3.2
    3. Start Open WebUI as a chat interface via Docker

    Level 2 — Connect to Home Assistant:

    1. Install home-llm Custom Integration via HACS
    2. Configure the Ollama endpoint in HA
    3. Set Assist to use your local LLM as the conversation agent

    Level 3 — Build your own agents:

    1. Connect n8n with the Ollama node
    2. Build first agentic workflows for energy reporting or notifications
    3. Optional: CrewAI or LangGraph for more complex multi-agent scenarios

    What’s Coming Next?

    Development continues to accelerate. A few trends taking shape in 2026:

    • Embedded agents: Smaller, specialized models running directly on microcontrollers — first experiments with ESP32 and Cortex-M are underway
    • Persistent memory: Agents that learn across sessions and permanently store personal preferences
    • Local computer use: Agents that operate the desktop — currently cloud-only, but first local implementations are on the horizon

    Conclusion

    2026 is the year AI agents found their way from data centers into the living room. The combination of powerful local hardware (Raspberry Pi 5, AI HAT+), mature frameworks (Ollama, Home Assistant, n8n), and improved models makes it possible to run real agent systems without any cloud dependency.

    The barrier to entry has never been lower — and full control over your own data stays entirely with you.

