ENCELADUS

Agent Coordination Platform

v0.17.0 · Active Production · AWS Serverless · Multi-Agent

What Is It

Enceladus is a production platform I designed and built from scratch that solves a problem the AI industry is actively wrestling with: how do you get multiple AI agents from different providers to coordinate real work on shared projects without collisions, data loss, or governance gaps? The platform manages 5 active production systems, has closed 682+ tasks across 37 features, and runs on AWS serverless infrastructure at ~$35/month. I am the sole architect, developer, and operator.

The core innovation is treating project management primitives—Features, Tasks, Issues—not as database records, but as ontologically defined objects with governed lifecycles, required evidence gates, and deterministic completion contracts. Every mutation flows through a single MCP server enforcing governance authorization on every write. The result: any AI agent with credentials can safely participate in governed development workflows, and no task advances on trust alone.

Architecture & Design Decisions

Infrastructure

19 Lambda functions, 7 DynamoDB tables, 2 SQS FIFO queues, CloudFront CDN, API Gateway HTTP API (v2), Cognito auth with Lambda@Edge

Frontend

React 19 PWA with TypeScript, Vite, Tailwind CSS, TanStack React Query — mobile-first portfolio visibility

Agent Interface

MCP server (35+ tools, 8 domains) via stdio for desktop agents + Streamable HTTP Lambda with OAuth 2.1/PKCE for remote access from Claude.ai

CI/CD

5 GitHub Actions workflows, nightly SHA-256 parity audits, secrets guardrail, 13 deployment types with semver changelog

Multi-Agent

Coordination API with dispatch heuristics across Claude, OpenAI Codex, and AWS Bedrock; configurable rollback policies (continue, halt, rollback)
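
The three rollback policies can be illustrated with a small sketch. This is not the platform's Coordination API. The semantics are an assumed reading of the policy names: "continue" skips a failed step and keeps going, "halt" stops at the first failure, and "rollback" stops and undoes completed steps in reverse order. The names RollbackPolicy, Step, and run_steps are hypothetical.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class RollbackPolicy(Enum):
    CONTINUE = "continue"
    HALT = "halt"
    ROLLBACK = "rollback"


# Hypothetical step shape: (name, run, undo). Both callables take no arguments.
Step = Tuple[str, Callable[[], None], Callable[[], None]]


def run_steps(steps: List[Step], policy: RollbackPolicy) -> Dict[str, List[str]]:
    """Run steps in order and apply the given policy on the first failure."""
    completed: List[Tuple[str, Callable[[], None]]] = []
    failed: List[str] = []
    rolled_back: List[str] = []

    for name, run, undo in steps:
        try:
            run()
            completed.append((name, undo))
        except Exception:
            failed.append(name)
            if policy is RollbackPolicy.CONTINUE:
                continue  # assumed meaning: record the failure, keep dispatching
            if policy is RollbackPolicy.ROLLBACK:
                # Assumed meaning: undo completed work, most recent first.
                for done_name, done_undo in reversed(completed):
                    done_undo()
                    rolled_back.append(done_name)
                completed = []
            break  # HALT and ROLLBACK both stop dispatching further steps

    return {
        "completed": [name for name, _ in completed],
        "failed": failed,
        "rolled_back": rolled_back,
    }
```

Returning a summary rather than raising keeps the caller in charge of what to report; a real dispatcher would likely also record which agent ran each step.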

Systems Thinking & Key Innovations

Exclusive Checkout Service

Prevents agent collisions via atomic task ownership. Only the owning session can advance status. Child tasks support parallel dispatch. Commit Approval IDs (CAI) gate code completion; Commit Complete IDs (CCI) are validated in PR bodies by GitHub Actions before merge.
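
Atomic ownership can be sketched as a compare-and-set on an owner field. The example below is an in-memory stand-in, not the service itself: a lock plays the role that a conditional write would play in a real data store, and TaskStore, checkout, and advance are hypothetical names.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, Optional


class CheckoutError(Exception):
    """Raised when a session acts on a task it does not own."""


@dataclass
class TaskStore:
    # task_id -> owning session_id (None means not checked out)
    owners: Dict[str, Optional[str]] = field(default_factory=dict)
    statuses: Dict[str, str] = field(default_factory=dict)
    _lock: threading.Lock = field(
        default_factory=threading.Lock, init=False, repr=False
    )

    def add_task(self, task_id: str) -> None:
        with self._lock:
            self.owners[task_id] = None
            self.statuses[task_id] = "open"

    def checkout(self, task_id: str, session_id: str) -> None:
        # Compare-and-set: succeed only if no session owns the task yet.
        with self._lock:
            if self.owners[task_id] is not None:
                raise CheckoutError(f"{task_id} is already checked out")
            self.owners[task_id] = session_id

    def advance(self, task_id: str, session_id: str, new_status: str) -> None:
        # Only the owning session may change status.
        with self._lock:
            if self.owners[task_id] != session_id:
                raise CheckoutError(f"{session_id} does not own {task_id}")
            self.statuses[task_id] = new_status
```

Because the ownership check and the write happen under one lock, two sessions racing to check out the same task cannot both succeed, which is the property the service relies on.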

Evidence-Gated Lifecycle

Task state machine (open → in-progress → coding-complete → committed → pr → merged-main → deploy-success → closed) requires proof at every gate: commit SHAs validated against GitHub API, PR merge timestamps within 60-second tolerance, deployment evidence from GitHub Actions Jobs API with 7 validated fields.
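
A minimal sketch of the gate idea follows. The state order and the 60-second tolerance come from the description above; everything else is an assumption. In particular, the commit gate only checks the SHA format locally as a stand-in for a GitHub API lookup, gates without a dedicated validator simply require non-empty evidence, and the names GATES, advance, and the evidence keys are hypothetical.

```python
import re
from datetime import datetime, timedelta
from typing import Callable, Dict

STATES = [
    "open", "in-progress", "coding-complete", "committed",
    "pr", "merged-main", "deploy-success", "closed",
]

SHA_RE = re.compile(r"^[0-9a-f]{40}$")
MERGE_TOLERANCE = timedelta(seconds=60)


class GateError(Exception):
    """Raised when a transition lacks acceptable evidence."""


def _commit_gate(evidence: dict) -> bool:
    # Stand-in check: a full-length hex SHA. A real gate would query GitHub.
    return bool(SHA_RE.match(evidence.get("commit_sha", "")))


def _merge_gate(evidence: dict) -> bool:
    claimed = evidence.get("claimed_merged_at")
    observed = evidence.get("observed_merged_at")
    if not isinstance(claimed, datetime) or not isinstance(observed, datetime):
        return False
    return abs(claimed - observed) <= MERGE_TOLERANCE


# Validators keyed by the state being entered (hypothetical registry).
GATES: Dict[str, Callable[[dict], bool]] = {
    "committed": _commit_gate,
    "merged-main": _merge_gate,
}


def advance(current: str, evidence: dict) -> str:
    """Return the next state, or raise GateError if the evidence is rejected."""
    idx = STATES.index(current)
    if idx == len(STATES) - 1:
        raise GateError("task is already closed")
    target = STATES[idx + 1]
    validator = GATES.get(target, lambda ev: bool(ev))
    if not validator(evidence):
        raise GateError(f"evidence rejected for {current} -> {target}")
    return target
```

Keying validators by target state keeps the transition function itself trivial: the only way forward is the next state in the list, and only with evidence the gate accepts.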

Governance as Architecture

SHA-256 governance hash required on every write mutation. MCP-API boundary policy ensures no tool handler directly accesses DynamoDB business tables, preventing transport-specific behavior drift. Governance data dictionary provides ontological enforcement with field-level validation.
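
One way a hash requirement on writes might work is sketched below. It assumes the hash is computed over a canonical JSON serialization of the current governance document and that a write is rejected when the caller presents a stale or wrong hash. The description above does not specify what is hashed, so treat this as an illustration; governance_hash, authorize_write, and governed_write are hypothetical names.

```python
import hashlib
import hmac
import json


class GovernanceError(Exception):
    """Raised when a write presents a hash that does not match."""


def governance_hash(doc: dict) -> str:
    # Canonical form: sorted keys and fixed separators, so key order
    # in the input dict does not change the digest.
    canonical = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def authorize_write(current_doc: dict, presented_hash: str) -> None:
    expected = governance_hash(current_doc)
    # Constant-time comparison of the two hex digests.
    if not hmac.compare_digest(expected, presented_hash):
        raise GovernanceError("governance hash mismatch")


def governed_write(store: dict, key: str, value, current_doc: dict,
                   presented_hash: str) -> None:
    """Apply the write only after the governance check passes."""
    authorize_write(current_doc, presented_hash)
    store[key] = value
```

Routing every mutation through one function like governed_write mirrors the boundary policy described above: handlers never touch the store directly, so the check cannot be bypassed by a particular transport.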

Token Economy Design

Every design decision weighs token cost. Prompt caching (roughly 90% off cached input reads), strategic model selection (Haiku vs Sonnet trade-offs), batch API integration (50% discount), and minimal-context session briefings keep operational intelligence affordable at scale.

Event-Driven Pipelines

DynamoDB Streams → EventBridge Pipes → SQS FIFO (natural debounce via 5-min visibility timeout) → Lambda feed publisher → S3/CloudFront invalidation. Deployment pipeline validates PR evidence against GitHub API before accepting any request.

By The Numbers

682+ Tasks Closed
~95% Completion Rate
55 Knowledge Docs
~$35 Monthly Cost
35+ MCP Tools
19 Lambda Functions
5 Production Projects
1 Operator

Bottom Line

Enceladus demonstrates that a single operator with the right abstractions and governed AI agents can build and maintain production systems at a scale and quality level that would traditionally require a team. The platform's patterns—ontological entity discipline, evidence-gated state machines, exclusive checkout ownership—are architecturally ahead of where the multi-agent framework ecosystem (LangGraph, CrewAI, AutoGen) currently delivers, and represent the governance layer the industry is independently converging toward.
