Project Gambit
JARVIS-inspired personal AI assistant. Named after Remy LeBeau.
Architecture Overview
Two-machine orchestration — desktop and homelab working in concert.
Colossus
Drew's DesktopHomelab Server
Integration HubData Flow
Voice Pipeline Deep Dive
End-to-end voice processing — from wake word to spoken response.
OpenWakeWord
Wake Word Detection
Always-on listener using ONNX Runtime. Detects custom wake phrase with configurable sensitivity threshold. Minimal CPU footprint — runs inference on small audio chunks.
Whisper
Speech-to-Text
OpenAI Whisper model for accurate transcription. Handles ambient noise, varied speaking speeds, and technical vocabulary. Streams audio chunks for low-latency results.
Claude API
LLM Processing
Anthropic Claude with the Remy personality system injected via system prompt. Context-aware — knows about running services, system state, and conversation history.
Kokoro TTS
Text-to-Speech · am_onyx voice
High-quality neural TTS with the am_onyx voice model. Includes a sanitization layer that strips markdown, code fences, and formatting that causes phonemizer issues.
Pipeline States
The Dashboard
Widget-based UI — every piece of state visible at a glance.
Personality System
Remy LeBeau isn't a gimmick — it's a tunable character engine.
Tunable Parameters
How often Cajun French phrases appear. 0 = standard English, 1 = heavy Cajun dialect.
Controls wit density. Low = serious professional, High = playful banter with card metaphors.
Response length preference. Concise for voice, detailed for complex technical topics.
Register control. Casual for daily use, formal when guests or professional context detected.
Generated Behavior Preview
You are Remy — Gambit's voice personality.
Inspired by Remy LeBeau from X-Men.
Behavioral guidelines (auto-generated):
- Use Cajun French phrases in ~65% of responses
("mon ami", "cher", "laissez les bons temps rouler")
- Humor: HIGH — include card/poker metaphors,
playful observations, light teasing
- Keep responses concise for voice delivery
- Casual register — Drew is a friend, not a boss
- When discussing tech, stay sharp and precise
but wrap it in personality
TTS sanitization rules:
- Strip markdown bold/italic markers
- Convert code fences to verbal descriptions
- Remove URLs (spell out domain if needed)
- Replace emoji with spoken equivalents Tech Stack
Three languages, two machines, one cohesive system.
Frontend
- Electron — desktop shell
- Svelte 5 — reactive UI with runes
- Tailwind CSS — utility-first styling
- Vite — build tooling + HMR
Backend
- Go — sidecar process (:9300)
- Python — voice pipeline runtime
- WebSocket — real-time bridge
AI / ML
- Claude API — LLM reasoning
- OpenWakeWord — wake detection
- Whisper — speech-to-text
- Kokoro TTS — neural voice synthesis
- ONNX Runtime — model inference
Testing
- 330 voice pipeline tests
- 104 gateway integration tests
- 21 Go sidecar tests
- 455 total — all passing
Infrastructure
- Docker — containerized services
- systemd — process management
- Tailscale — mesh networking
- SSE push — event streaming
By the Numbers
Hard metrics from a solo-built system.