Active Development · 100+ PRs · 455 Tests

Project Gambit

JARVIS-inspired personal AI assistant. Named after Remy LeBeau.

Electron Go Python Svelte 5 Claude API Voice Pipeline
01

Architecture Overview

Two-machine orchestration — desktop and homelab working in concert.

🖥

Colossus

Drew's Desktop
Electron App Svelte 5 + Tailwind
Go Sidecar :9300 · WebSocket bridge
Python Voice Runtime Wake word · STT · TTS
Tailscale · SSE · WebSocket
🏠

Homelab Server

Integration Hub
Integration Gateway Adapter registry · routing
Docker Management Container orchestration
Media Stack Awareness Plex · Sonarr · Radarr

Data Flow

🎤
Wake Word
🔊
STT
🧠
LLM
🗣
TTS
🔈
Speaker
02

Voice Pipeline Deep Dive

End-to-end voice processing — from wake word to spoken response.

01

OpenWakeWord

Wake Word Detection

Always-on listener using ONNX Runtime. Detects custom wake phrase with configurable sensitivity threshold. Minimal CPU footprint — runs inference on small audio chunks.

02

Whisper

Speech-to-Text

OpenAI Whisper model for accurate transcription. Handles ambient noise, varied speaking speeds, and technical vocabulary. Streams audio chunks for low-latency results.

03

Claude API

LLM Processing

Anthropic Claude with the Remy personality system injected via system prompt. Context-aware — knows about running services, system state, and conversation history.

04

Kokoro TTS

Text-to-Speech · am_onyx voice

High-quality neural TTS with the am_onyx voice model. Includes a sanitization layer that strips markdown, code fences, and formatting that causes phonemizer issues.

Pipeline States

IDLE Waiting for wake word. Orb pulses slowly.
LISTENING Recording audio. Orb expands with volume.
PROCESSING STT + LLM inference. Orb spins amber.
SPEAKING TTS playback. Orb ripples teal.
ERROR Failure recovery. Orb flashes red.
03

The Dashboard

Widget-based UI — every piece of state visible at a glance.

StatusOrb

Multi-layered animated orb. Four render layers — core, inner glow, aura, and ripple — each responding to pipeline state. Color shifts between violet, blue, amber, teal.

Voice Activity

Real-time volume meter, wake word detection events, and live STT transcription preview as words are recognized.

Personality Controls

cajun_freq
0.65
humor
0.80
verbose
off

Cajun frequency slider, humor level, verbosity toggle — tune how Remy speaks in real time.

System Status

CPU
34%
RAM
62%
GPU
15%
Disk
48%

Live system metrics — CPU, RAM, GPU utilization, and disk usage.

Integration Health

Gateway Connection
Media Adapter
Home Adapter
Voice Runtime

Gateway connection status and individual adapter health indicators. Green for connected, amber for degraded.

Recent Conversations

14:32 "Hey Gambit, what's playing on Plex?"
14:32 "Looks like Arcane Season 2, mon ami. Episode 6."
14:28 "Check the Docker containers."

Scrollable conversation history with timestamps — review what Gambit has been up to.

04

Personality System

Remy LeBeau isn't a gimmick — it's a tunable character engine.

Tunable Parameters

cajun_frequency 0.65

How often Cajun French phrases appear. 0 = standard English, 1 = heavy Cajun dialect.

humor_level 0.80

Controls wit density. Low = serious professional, High = playful banter with card metaphors.

verbosity concise

Response length preference. Concise for voice, detailed for complex technical topics.

formality casual

Register control. Casual for daily use, formal when guests or professional context detected.

Generated Behavior Preview

remy.md — system prompt
You are Remy — Gambit's voice personality.
Inspired by Remy LeBeau from X-Men.

Behavioral guidelines (auto-generated):
- Use Cajun French phrases in ~65% of responses
  ("mon ami", "cher", "laissez les bons temps rouler")
- Humor: HIGH — include card/poker metaphors,
  playful observations, light teasing
- Keep responses concise for voice delivery
- Casual register — Drew is a friend, not a boss
- When discussing tech, stay sharp and precise
  but wrap it in personality

TTS sanitization rules:
- Strip markdown bold/italic markers
- Convert code fences to verbal descriptions
- Remove URLs (spell out domain if needed)
- Replace emoji with spoken equivalents
05

Tech Stack

Three languages, two machines, one cohesive system.

Frontend

  • Electron — desktop shell
  • Svelte 5 — reactive UI with runes
  • Tailwind CSS — utility-first styling
  • Vite — build tooling + HMR
⚙️

Backend

  • Go — sidecar process (:9300)
  • Python — voice pipeline runtime
  • WebSocket — real-time bridge
🧠

AI / ML

  • Claude API — LLM reasoning
  • OpenWakeWord — wake detection
  • Whisper — speech-to-text
  • Kokoro TTS — neural voice synthesis
  • ONNX Runtime — model inference
🧪

Testing

  • 330 voice pipeline tests
  • 104 gateway integration tests
  • 21 Go sidecar tests
  • 455 total — all passing
🏗

Infrastructure

  • Docker — containerized services
  • systemd — process management
  • Tailscale — mesh networking
  • SSE push — event streaming
06

By the Numbers

Hard metrics from a solo-built system.

100+ PRs Merged
455 Tests Passing
3 Languages Go · Python · TS/Svelte
6 Gap Analysis Cycles
5 Pipeline States
1 Solo Developer