The Work
// 12+ years building platforms
Day 1 platform hire at a healthcare AI company building ambient clinical documentation. Founded and grew the Platform Engineering organization, and established the infrastructure and developer experience that took the product from proof of concept to production across more than 250 healthcare systems.
Builders Portal — Internal Developer Platform
Nov 2025 – Present
Single-handedly designed, built, and shipped Abridge's Internal Developer Portal, a full-stack platform (Go/Fiber backend, React/Vite frontend, Cloud SQL) that gives every engineer self-service infrastructure visibility, push-button deployments, and AI agent integration. 187 merged PRs across ~4 months, from first commit to a production platform used by all of engineering.
A Backstage-alternative IDP running on Cloud Run in a dedicated operations project. 16 backend plugins, 24 frontend pages, full RBAC, immutable audit logging, and an MCP server for AI agent access.
Infrastructure Visibility
Unified runs view for Terraform plan/apply/cost across all infra-atmos stacks. PR-grouped workflow runs with fuzzy search, section cards (plan diffs, cost impact, AI analysis), and 30s auto-refresh.
Push-Button Deployments
4-step deployment wizard: pick a component → configure variables (auto-populated from stack context) → preview merged YAML → one-click PR creation. Batch multi-component deploys with dependency graphs, templates for common patterns (Slack apps, Go microservices, static sites), full deployment history, and rollback.
Service Catalog v2
Backstage-style service registry with ownership, tiers, compliance tracking, SLOs, and dependency mapping. Includes automatic GitHub sync, enrichment, filters, a scorecard, and dark mode. It supplies the piece that was missing for tracking ownership and compliance at scale.
ArgoCD Dashboard
Real-time sync status and health for all ArgoCD-managed K8s apps. Drill-down to resource tree, sync triggers from the portal, and IAP-protected instance support.
Cloud Asset Inventory
Live GCP resource browser across compute, networking, data, and storage with change tracking and multi-scope support.
Image Catalog
Container image registry browser with SBOM integration (package name, version, license, purl), version history, and staleness tracking.
MCP Server — AI Agent Integration
Auto-generates 102 MCP tools from swagger.json. AI agents (APEX, Claude Code, Cursor) connect via Private Service Connect tunnel — fully private, no public internet. Per-user API keys (bpk_*), RBAC-scoped, SHA-256 hashed.
Innovation Lab + Roadmap
Bidirectional GitHub Issues sync, feature voting, threaded comments, and a 3-column roadmap board (Planned → In Progress → Shipped) with label-driven status.
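Label-driven status can be sketched as a lookup from issue labels to board columns. The three column names come from the text; the label strings (`status:planned` and so on) and the tie-break rule (the most advanced status wins) are assumptions made for illustration.

```go
package main

import "fmt"

// Column is one of the three roadmap board columns named in the text.
type Column string

const (
	Planned    Column = "Planned"
	InProgress Column = "In Progress"
	Shipped    Column = "Shipped"
)

// labelToColumn uses hypothetical label names; the real labels are not given.
var labelToColumn = map[string]Column{
	"status:planned":     Planned,
	"status:in-progress": InProgress,
	"status:shipped":     Shipped,
}

// rank orders columns so that the most advanced status wins
// when an issue carries more than one status label (assumed rule).
var rank = map[Column]int{Planned: 1, InProgress: 2, Shipped: 3}

// ColumnFor returns the board column for an issue's labels.
// The boolean is false when no status label is present.
func ColumnFor(labels []string) (Column, bool) {
	var best Column
	for _, l := range labels {
		if c, ok := labelToColumn[l]; ok && rank[c] > rank[best] {
			best = c
		}
	}
	return best, best != ""
}

func main() {
	col, ok := ColumnFor([]string{"bug", "status:in-progress"})
	fmt.Println(col, ok)
}
```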
Audit, RBAC & Security
Immutable audit log capturing every mutation. Three-role model with auto-provisioning from IAP headers, Google Group sync, and FedRAMP/HIPAA-ready retention. Slack DM notifications for deployment outcomes.
Gamification
Drift remediation leaderboard with points, streaks, badges, and speed multipliers — turning infrastructure hygiene into a competitive sport.
// architecture & platform maturity
Plugin Architecture
16 modular plugins that register their own routes and auto-upgrade storage from in-memory to PostgreSQL. New capabilities are added without touching core code.
Production Migration — Zero Downtime
Moved from dev project to abridge-operations with zero downtime (ADR-004). Separate prod/dev environments, dedicated runtime SAs, and independent Cloud SQL instances.
31-Issue Roadmap — 5 Tiers
Next priorities: self-serve secrets management, runbook execution, environment promotion, incident context linking, and cost dashboard — all tracked in Linear.
Marquee Wins
Founded & Scaled Platform Engineering
Day 1 hire with a mandate to build the platform organization from scratch. Grew the team from 1 to 25+ engineers over three years, establishing ownership boundaries, on-call rotations, paved-road tooling, and a culture of public recognition. Ran regular "State of the Platform" communications and engineering office hours.
ML Inference Platform
Stood up private GKE clusters for NVIDIA Triton-based inference workloads — took the first production ML inference system from zero to running. Built Terraform modules with proper networking, security groups, node pools, and service mesh. Partnered with MLOps on model deployment, promotion workflows, and runtime telemetry.
Enterprise Healthcare Partner Onboarding
Unblocked a major healthcare enterprise partner by standing up a complete, isolated environment spanning 8+ repositories in a single week. The environment immediately enabled deal-critical integration testing and became the template for all subsequent enterprise onboarding.
FedRAMP & Regulated Market Readiness
Led onboarding of GCP Security Command Center Enterprise as the foundation for regulated market entry. Contributed to SSP and POA&M, defined logging and monitoring controls, and supported 3PAO audits. Positioned the platform to pursue regulated government healthcare contracts.
Multi-region Reliability & Cost Governance
Designed multi-region foundations, DR runbooks, and release strategies (blue/green, canary). Governed 8-figure/month cloud spend without slowing delivery, scaling from early PoC to 250+ customer deployments. Established the observability program — golden signals, SLOs, alerting, tracing — and served as incident commander for critical production events.
Prior Roles
// the path that got here
- Senior technical resource for AWS and GCP, automation, and IaC; mentored engineers across teams.
- Migrated Postgres 9.x to 12 and to Aurora; moved in-memory stores to ElastiCache (Redis).
- Established GitOps patterns and reusable Terraform/Ansible/SSM modules; standardized CI for application and infrastructure.
- Built and maintained the BI data platform: Redshift, Fivetran, and in-house ETL; implemented streaming for near real-time replication.
- Owned CI/CD pipelines for application and IaC deployments.
- Partnered on PCI-DSS, HIPAA, SOC 2, and HITRUST; set patch/CVE remediation and incident standards with Security, Compliance, and Legal.
- Led backend infrastructure refactors to scale efficiently while managing costs; shifted the organization to a proactive platform engineering model.
- Led migration from Rancher to Kubernetes (kOps on AWS/EC2) with automated blue/green deploys, merge gates, coverage, and tests.
- Built an OAuth 2.0 SSO connector in Go for Kubernetes; centralized cluster authentication.
- AWS SME across EC2, RDS, EKS, ECS, API Gateway, Lambda, S3, Glacier, and related services.
- Authored outage playbooks and reliability runbooks; created an internal knowledge base.
- Drove CloudFormation (Troposphere) → Terraform migration of 100K+ lines; mentored via pairing and coaching.
- Built turnkey blue/green deployment pipelines for core e-commerce; increased deploy frequency and rollback safety.
- Converted 99% of infrastructure from on-prem Rackspace to AWS in ~12 months; codified networking, compute, and application stacks.
- Designed end-to-end observability: New Relic + client telemetry, Prometheus/Graphite + Grafana, and TICK/ELK for logging; led post-mortems.
- Established automated pipelines for code, servers, networking, and underlying plumbing; senior escalation owner through post-mortem.
- Created SCM strategies and CI/CD guardrails for the infra/devops team; embedded as AWS SME with delivery teams.
- Architected Docker/Kubernetes infrastructure (workers/managers, config management, load balancers, firewalls, dev environments) including performance testing and automation.
- Operated an ElasticSearch cluster; integrated AWS and Azure; administered 1+ PB of storage across three domains and networks.
- Managed a Hyper-V estate of ~300 VMs and 100 physical servers across 3 data centers; implemented disaster recovery and ran quarterly recovery drills for 400+ servers.
- Owned the Microsoft web stack for 400+ sites (SSL, IIS bindings), DNS/DHCP/ADFS/Group Policy across 4 domains including PCI-scoped environments.
- Roles at Tribe513, M33, Robert Half/GHS, ZF Group, and others. Delivered Hyper-V and network architecture, enterprise SCCM deployments, multi-site Windows domain services (DHCP/DNS/ADFS/Group Policy), and large-scale patch and automation programs. Details available on request.