ai-stack-deployer/CLAUDE.md
2026-01-10 14:17:31 +01:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨

YOU ARE FORCED TO FOLLOW THESE PRINCIPLE RULES

  • MUST USE SKILL TODO
  • MUST FOLLOW YOUR TODO
  • MUST USE DOCUMENTATION/REPOSITORIES AFTER 3 TRIES
  • MUST PROPERLY TEST WHAT YOU ARE DOING
  • NEVER NEVER ASSUME
  • MUST BE SURE
  • MUST DOCUMENT YOUR FINDINGS FOR THE NEXT TIME
  • MUST CLEAN UP PROPERLY
  • MUST USE/UPDATE YOUR TEST DOCUMENT

🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨

Project Overview

AI Stack Deployer is a self-service portal that deploys personal OpenCode AI coding assistant stacks. Users enter a name, and the system provisions containers via Dokploy to create a fully functional AI stack at {name}.ai.flexinit.nl. Wildcard DNS and SSL are pre-configured, so deployments only need to create the Dokploy project and application.

Core Architecture

Deployment Flow

The system orchestrates deployments through Dokploy, leveraging pre-configured infrastructure:

  1. Dokploy API - Manages projects, applications, and container deployments
  2. Traefik - Handles SSL termination and routing (pre-configured wildcard DNS and SSL)

Each deployment creates:

  • Dokploy project: ai-stack-{name}
  • Application with OpenCode server + ttyd terminal
  • Domain configuration with automatic HTTPS (Traefik handles SSL via wildcard cert)

Note: DNS is pre-configured with the wildcard record *.ai.flexinit.nl → 144.76.116.169. Individual DNS records are NOT created per deployment; Traefik routes based on hostname matching.

Two Runtime Modes

  1. HTTP Server (src/index.ts) - Hono-based API for web portal

    • Fully implemented production-ready web application
    • REST API endpoints for deployment management
    • Server-Sent Events (SSE) for real-time progress streaming
    • Static frontend serving (HTML/CSS/JS)
    • In-memory deployment state tracking
    • CORS and logging middleware
  2. MCP Server (src/mcp-server.ts) - Model Context Protocol server

    • Development tool for Claude Code integration
    • Exposes deployment tools via stdio transport
    • Same deployment logic as HTTP server
    • Useful for testing and automation

API Clients

HetznerDNSClient (src/api/hetzner.ts):

  • Available for DNS management via Hetzner Cloud API
  • NOT used in deployment flow (wildcard DNS already configured)
  • Key methods: createARecord(), recordExists(), findRRSetByName()
  • Could be used for manual DNS operations or testing

DokployClient (src/api/dokploy.ts):

  • Orchestrates container deployments (primary deployment mechanism)
  • Key flow: createProject() → createApplication() → createDomain() → deployApplication()
  • Communicates with internal Dokploy at http://10.100.0.20:3000
  • Traefik on Dokploy automatically handles SSL via pre-configured wildcard certificate
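
The key flow above can be sketched roughly as follows. The method names come from the flow listed for DokployClient, but the `DokployLike` interface, its parameters, and its return shapes are assumptions made for illustration; the real client in src/api/dokploy.ts may differ.

```typescript
// Hypothetical client shape: method names follow the documented flow,
// parameters and return types are assumptions.
interface DokployLike {
  createProject(name: string): Promise<{ projectId: string }>;
  createApplication(projectId: string, name: string): Promise<{ applicationId: string }>;
  createDomain(applicationId: string, host: string): Promise<void>;
  deployApplication(applicationId: string): Promise<void>;
}

// Runs the four steps in order and returns the identifiers plus the public URL.
async function deployStack(
  client: DokployLike,
  name: string,
  domainSuffix = "ai.flexinit.nl",
): Promise<{ projectId: string; applicationId: string; url: string }> {
  const host = `${name}.${domainSuffix}`;
  const { projectId } = await client.createProject(`ai-stack-${name}`);
  const { applicationId } = await client.createApplication(projectId, name);
  await client.createDomain(applicationId, host);
  await client.deployApplication(applicationId);
  return { projectId, applicationId, url: `https://${host}` };
}
```

Passing the client in as a parameter keeps the sketch testable with a fake that records calls, without touching a real Dokploy instance.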

HTTP API Endpoints

The HTTP server exposes the following endpoints:

Health Check:

  • GET /health - Returns service health status

Deployment:

  • POST /api/deploy - Start a new deployment
    • Body: { "name": "stack-name" }
    • Returns: { deploymentId, url, statusEndpoint }
  • GET /api/status/:deploymentId - SSE stream for deployment progress
    • Events: progress, complete, error
  • GET /api/check/:name - Check if a name is available
    • Returns: { available, valid, error? }
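
For reference, each event on the status stream is sent in the standard text/event-stream wire format: an `event:` line, a `data:` line, and a blank line. The helper below is a minimal sketch of that serialization; the function name and the payload fields are assumptions, not the code in src/index.ts.

```typescript
// Event names taken from the endpoint description above.
type DeploymentEvent = "progress" | "complete" | "error";

// Serializes one Server-Sent Event. JSON.stringify never emits raw newlines,
// so the payload always fits on a single "data:" line.
function formatSseEvent(event: DeploymentEvent, data: Record<string, unknown>): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}
```

A browser client would typically consume this with `EventSource` and one listener per event name.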

Frontend:

  • GET / - Serves the web UI (src/frontend/index.html)
  • GET /static/* - Serves static assets (CSS, JS)

State Management

Both servers track deployments in-memory using a Map:

interface DeploymentState {
  id: string;              // dep_{timestamp}_{random}
  name: string;            // normalized username
  status: 'initializing' | 'creating_project' | 'creating_application' |
          'deploying' | 'completed' | 'failed';
  url?: string;            // https://{name}.ai.flexinit.nl
  error?: string;
  projectId?: string;
  applicationId?: string;
  progress: number;        // 0-100 (HTTP server only)
  currentStep: string;     // Human-readable step (HTTP server only)
}

Note: State is in-memory only and lost on server restart. For production with persistence, implement database storage.
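
A minimal sketch of such a Map-based store is shown below. The helper names (`createDeployment`, `updateDeployment`), the initial `currentStep` text, and the way the random suffix is generated are assumptions; only the state shape and the `dep_{timestamp}_{random}` ID format come from the interface above.

```typescript
type DeploymentStatus =
  | "initializing" | "creating_project" | "creating_application"
  | "deploying" | "completed" | "failed";

interface DeploymentState {
  id: string;
  name: string;
  status: DeploymentStatus;
  url?: string;
  error?: string;
  projectId?: string;
  applicationId?: string;
  progress: number;
  currentStep: string;
}

// In-memory store: contents are lost when the process restarts.
const deployments = new Map<string, DeploymentState>();

function createDeployment(name: string, now: number = Date.now()): DeploymentState {
  const random = Math.random().toString(36).slice(2, 8);
  const state: DeploymentState = {
    id: `dep_${now}_${random}`,
    name,
    status: "initializing",
    progress: 0,
    currentStep: "Initializing",
  };
  deployments.set(state.id, state);
  return state;
}

// Merges a partial update into an existing entry; unknown IDs are an error.
function updateDeployment(
  id: string,
  patch: Partial<Omit<DeploymentState, "id">>,
): DeploymentState {
  const current = deployments.get(id);
  if (!current) throw new Error(`Unknown deployment: ${id}`);
  const next: DeploymentState = { ...current, ...patch };
  deployments.set(id, next);
  return next;
}
```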

Name Validation

Stack names must be:

  • 3-20 characters
  • Lowercase alphanumeric with hyphens
  • Cannot start/end with hyphen
  • Not in reserved list (admin, api, www, root, system, test, demo, portal)
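
The rules above can be expressed as a small validator. This is a sketch rather than the project's actual implementation: the function name, the error messages, and the decision to allow consecutive hyphens are assumptions.

```typescript
const RESERVED_NAMES: ReadonlySet<string> = new Set([
  "admin", "api", "www", "root", "system", "test", "demo", "portal",
]);

// First and last characters must be alphanumeric; the 1-18 characters in
// between may also be hyphens, giving a total length of 3-20.
const NAME_PATTERN = /^[a-z0-9][a-z0-9-]{1,18}[a-z0-9]$/;

interface ValidationResult {
  valid: boolean;
  error?: string;
}

function validateStackName(
  name: string,
  reserved: ReadonlySet<string> = RESERVED_NAMES,
): ValidationResult {
  if (name.length < 3 || name.length > 20) {
    return { valid: false, error: "Name must be 3-20 characters long" };
  }
  if (!NAME_PATTERN.test(name)) {
    return {
      valid: false,
      error: "Use lowercase letters, digits, and hyphens; no leading or trailing hyphen",
    };
  }
  if (reserved.has(name)) {
    return { valid: false, error: `"${name}" is a reserved name` };
  }
  return { valid: true };
}
```

Checking length before the pattern keeps the error messages specific, which is useful when the same function backs both client-side and server-side validation.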

Frontend Architecture

Location: src/frontend/

  • index.html - Main UI with state machine (form, progress, success, error)
  • style.css - Modern gradient design with animations
  • app.js - Vanilla JavaScript with SSE client and real-time validation

Features:

  • Real-time name availability checking
  • Client-side validation with server verification
  • SSE-powered live deployment progress tracking
  • State machine: Form → Progress → Success/Error
  • Responsive design (mobile-friendly)
  • No framework dependencies (vanilla JS)
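
The Form → Progress → Success/Error state machine can be modeled as a transition table. The sketch below is written in TypeScript for consistency with the server code (app.js itself is vanilla JavaScript); the event names and the `reset` transition back to the form are assumptions.

```typescript
type UiState = "form" | "progress" | "success" | "error";
type UiEvent = "submit" | "complete" | "error" | "reset";

// Allowed transitions per state; anything not listed is ignored.
const transitions: Record<UiState, Partial<Record<UiEvent, UiState>>> = {
  form: { submit: "progress" },
  progress: { complete: "success", error: "error" },
  success: { reset: "form" },
  error: { reset: "form" },
};

// Returns the next state, or the current state if the event does not apply.
function nextState(state: UiState, event: UiEvent): UiState {
  return transitions[state][event] ?? state;
}
```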

Development Commands

# Development server (HTTP API with hot reload)
bun run dev

# Production server (HTTP API)
bun run start

# MCP server (for Claude Code integration)
bun run mcp

# Type checking
bun run typecheck

# Build for production
bun run build

# Test API clients (requires valid credentials)
bun run src/test-clients.ts

# Docker commands
docker build -t ai-stack-deployer .
docker-compose up -d
docker-compose logs -f
docker-compose down

Session Management

The project supports two types of Claude Code sessions:

🤖 Built-in Sessions (Automatic)

  • Created automatically by Claude Code for every conversation
  • Stored in ~/.claude/projects/.../
  • Resume with: claude --session-id {uuid} or claude --continue

📁 Custom Sessions (Optional, for organization)

  • Created explicitly via ./scripts/claude-start.sh {name}
  • Enable named sessions and Graphiti Memory auto-integration
  • Best for: feature development, bug fixes, multi-day work

Commands

# List ALL sessions (both built-in and custom)
bash scripts/claude-session.sh list

# Create/resume custom named session
./scripts/claude-start.sh feature-http-api

# Delete a custom session
bash scripts/claude-session.sh delete feature-http-api

# Override permission mode (default: bypassPermissions)
CLAUDE_PERMISSION_MODE=prompt ./scripts/claude-start.sh feature-name

Custom Session Benefits

Automatic Configuration:

  • Permission mode: bypassPermissions (no permission prompts for file operations)
  • Session ID: Persistent UUID throughout work session
  • Environment variables: Auto-set for Graphiti Memory integration

Environment Variables (Set Automatically):

CLAUDE_SESSION_ID=550e8400-e29b-41d4-a716-446655440000
CLAUDE_SESSION_NAME=feature-http-api
CLAUDE_SESSION_START=2026-01-09 20:16:00
CLAUDE_SESSION_PROJECT=ai-stack-deployer
CLAUDE_SESSION_MCP_GROUP=project_ai_stack_deployer

Graphiti Memory Integration:

// At session end, store learnings
graphiti-memory_add_memory({
  name: "Session: feature-http-api - 2026-01-09",
  episode_body: "Session ID: 550e8400. Implemented HTTP server endpoints for deploy API. Added SSE for progress updates. Tests passing.",
  group_id: "project_ai_stack_deployer"  // Auto-set from CLAUDE_SESSION_MCP_GROUP
})

Storage:

  • Custom sessions: $HOME/.claude/sessions/ai-stack-deployer/*.session
  • Built-in sessions: ~/.claude/projects/-home-odouhou-locale-projects-ai-stack-deployer/*.jsonl

Environment Variables

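Required for deployment operations:

  • DOKPLOY_URL - URL of the internal Dokploy instance (http://10.100.0.20:3000)
  • DOKPLOY_API_TOKEN - Dokploy API token

Optional configuration: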

  • PORT - HTTP server port (default: 3000)
  • HOST - HTTP server bind address (default: 0.0.0.0)
  • STACK_DOMAIN_SUFFIX - Domain suffix for stacks (default: ai.flexinit.nl)
  • STACK_IMAGE - Docker image for user stacks
  • RESERVED_NAMES - Comma-separated list of forbidden names

Not used in deployment (available for testing/manual operations):

  • HETZNER_API_TOKEN - Hetzner Cloud API token
  • HETZNER_ZONE_ID - DNS zone ID (343733 for flexinit.nl)
  • TRAEFIK_IP - Public IP (144.76.116.169) - only for reference

See .env.example for complete configuration template.
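
As an illustration, the optional settings and their documented defaults could be read like this. The `AppConfig` shape and `loadConfig` name are hypothetical, and because no default is stated for RESERVED_NAMES or STACK_IMAGE, the sketch leaves them empty or undefined when unset.

```typescript
interface AppConfig {
  port: number;
  host: string;
  stackDomainSuffix: string;
  stackImage: string | undefined;
  reservedNames: string[];
}

// Takes the environment as a plain object so it can be called with
// process.env or Bun.env, or with a literal in tests.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const rawPort = env["PORT"] ?? "3000";
  const port = Number(rawPort);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${rawPort}`);
  }
  return {
    port,
    host: env["HOST"] ?? "0.0.0.0",
    stackDomainSuffix: env["STACK_DOMAIN_SUFFIX"] ?? "ai.flexinit.nl",
    stackImage: env["STACK_IMAGE"],
    // Comma-separated list; blank entries are dropped.
    reservedNames: (env["RESERVED_NAMES"] ?? "")
      .split(",")
      .map((entry) => entry.trim().toLowerCase())
      .filter((entry) => entry.length > 0),
  };
}
```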

MCP Server Integration

The MCP server is configured in .mcp.json and provides these tools:

  • deploy_stack - Deploys a new AI stack (Dokploy orchestration only, no DNS creation)
  • check_deployment_status - Query deployment progress by ID
  • list_deployments - List all deployments in current session
  • check_name_availability - Validate name before deployment
  • test_api_connections - Verify Hetzner and Dokploy connectivity (both clients available for testing)

To test MCP functionality:

# Start MCP server
bun run mcp

# Test API connections
bun run src/test-clients.ts

Key Implementation Details

Error Handling

Both API clients throw errors on failure. The MCP server catches these and returns structured error responses. No automatic retry logic exists yet.
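
The catch-and-wrap behavior might look roughly like the sketch below. The `runTool` helper and the `{ success: false, error }` payload are assumptions; the `content` array follows the response shape described under "Adding a New MCP Tool", and `isError` is the flag MCP tool results use to signal failure.

```typescript
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Runs a tool body and converts any thrown error into a structured result
// instead of letting it propagate to the transport.
async function runTool(body: () => Promise<unknown>): Promise<ToolResult> {
  try {
    const result = await body();
    return { content: [{ type: "text", text: JSON.stringify(result) }] };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return {
      content: [{ type: "text", text: JSON.stringify({ success: false, error: message }) }],
      isError: true,
    };
  }
}
```

Since there is no retry logic, a wrapper like this is also the natural place to add one later.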

Deployment Idempotency

  • Dokploy projects: Searches for existing project by name before creating
  • Creates only if not found
  • No automatic cleanup on partial failures
  • DNS is wildcard-based, so no per-deployment DNS operations needed
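
The search-before-create behavior is a find-or-create pattern, sketched below. The `ProjectApi` interface, including `listProjects()`, is hypothetical and stands in for whatever lookup the real DokployClient performs.

```typescript
interface Project {
  projectId: string;
  name: string;
}

// Hypothetical minimal API surface needed for the pattern.
interface ProjectApi {
  listProjects(): Promise<Project[]>;
  createProject(name: string): Promise<Project>;
}

// Returns the existing project with this name, or creates one if none exists.
async function findOrCreateProject(
  api: ProjectApi,
  name: string,
): Promise<{ project: Project; created: boolean }> {
  const existing = (await api.listProjects()).find((p) => p.name === name);
  if (existing) return { project: existing, created: false };
  return { project: await api.createProject(name), created: true };
}
```

Note that this check is not atomic: two concurrent requests for the same name could both miss the lookup and both attempt to create the project.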

Concurrency

The MCP server handles one request at a time per invocation. No rate limiting or queue management exists yet.

Security Notes

  • All tokens in environment variables (never in code)
  • Dokploy URL is internal-only (10.100.0.x network)
  • No authentication on HTTP endpoints (portal will need auth)
  • Name validation prevents injection attacks

Testing Strategy

Currently implemented:

  • src/test-clients.ts - Manual testing of Hetzner and Dokploy clients
  • Requires real API credentials in .env
  • Note: Only Dokploy client is used in actual deployments

Missing (needs implementation):

  • Unit tests for validation logic
  • Integration tests for deployment flow
  • Mock API clients for testing without credentials
  • Health check monitoring
  • Rollback on failures

Common Patterns

Adding a New MCP Tool

  1. Define tool schema in tools array (src/mcp-server.ts:178)
  2. Add case to switch statement in CallToolRequestSchema handler (src/mcp-server.ts:249)
  3. Extract typed arguments: const { arg } = args as { arg: Type }
  4. Return structured response with content: [{ type: 'text', text: JSON.stringify(...) }]
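
Steps 2 to 4 might look like the following sketch. It deliberately avoids the MCP SDK so that it stays self-contained: `handleToolCall` stands in for the body of the CallToolRequestSchema handler, and `normalize_name` is a hypothetical tool used only for illustration.

```typescript
interface ToolResult {
  content: { type: "text"; text: string }[];
}

function textResult(payload: unknown): ToolResult {
  return { content: [{ type: "text", text: JSON.stringify(payload) }] };
}

// Dispatches on the tool name, as the switch statement in the handler does.
async function handleToolCall(
  toolName: string,
  args: Record<string, unknown>,
): Promise<ToolResult> {
  switch (toolName) {
    case "normalize_name": { // hypothetical tool
      // Step 3: extract typed arguments.
      const { name } = args as { name: string };
      // Step 4: return a structured text response.
      return textResult({ name: name.trim().toLowerCase() });
    }
    default:
      throw new Error(`Unknown tool: ${toolName}`);
  }
}
```

The `args as { name: string }` cast performs no runtime check, so a real tool should validate required fields before using them.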

Adding HTTP Endpoints

  1. Add route to Hono app in src/index.ts
  2. Use API clients from src/api/ directory
  3. Return JSON with consistent error format
  4. Consider adding SSE for long-running operations

Extending API Clients

  • Keep TypeScript interfaces at top of file
  • Use satisfies for type-safe request bodies
  • Throw descriptive errors (include API status codes)
  • Add methods to client class, use private request() helper

Production Deployment

Docker Build and Run

# Build the Docker image
docker build -t ai-stack-deployer:latest .

# Run with docker-compose (recommended)
docker-compose up -d

# Or run manually
docker run -d \
  --name ai-stack-deployer \
  -p 3000:3000 \
  --env-file .env \
  ai-stack-deployer:latest

Deploying to Dokploy

  1. Prepare Environment:

    • Ensure .env file has valid DOKPLOY_API_TOKEN
    • Verify DOKPLOY_URL points to internal Dokploy instance
  2. Build and Push Image (if using custom registry):

    docker build -t your-registry/ai-stack-deployer:latest .
    docker push your-registry/ai-stack-deployer:latest
    
  3. Deploy via Dokploy UI:

    • Create new project: ai-stack-deployer-portal
    • Create application from Docker image
    • Configure domain (e.g., portal.ai.flexinit.nl)
    • Set environment variables from .env
    • Deploy
  4. Verify Deployment:

    curl https://portal.ai.flexinit.nl/health
    

Health Monitoring

The application includes a /health endpoint that returns:

{
  "status": "healthy",
  "timestamp": "2026-01-09T...",
  "version": "0.1.0",
  "service": "ai-stack-deployer",
  "activeDeployments": 0
}

The Docker health check runs every 30 seconds and restarts the container if it is unhealthy.

Infrastructure Dependencies

  • Wildcard DNS - *.ai.flexinit.nl → 144.76.116.169 (pre-configured in Hetzner DNS)
  • Traefik at 144.76.116.169 - Pre-configured wildcard SSL certificate for *.ai.flexinit.nl
  • Dokploy at 10.100.0.20:3000 - Container orchestration platform (handles all deployments)
  • Docker image - oh-my-opencode-free (OpenCode + ttyd terminal)

Key Point: Individual DNS records are NOT created per deployment. The wildcard DNS and SSL are already configured, so Traefik automatically routes {name}.ai.flexinit.nl to the correct container based on hostname matching.

Logging Infrastructure

AI Stack logging integrates with the existing monitoring stack at logs.intra.flexinit.nl.

Components

| Component | Location | Purpose |
|-----------|----------|---------|
| Log-ingest | http://ai-stack-log-ingest:3000 (dokploy-network) | Receives events from AI stacks, pushes to Loki |
| Loki | monitor-grafanaloki-qkj16i-loki-1 | Log storage |
| Grafana | https://logs.intra.flexinit.nl | Visualization |
| Dashboard | /d/ai-stack-overview | AI Stack metrics and logs |

Datasource UIDs (Grafana)

  • Loki: af9a823s6iku8b
  • Prometheus: cf9r1fmfw9xxcf

Configuration

AI stacks send logs via environment variable:

LOG_INGEST_URL=http://ai-stack-log-ingest:3000/ingest

Local Development

The logging-stack/ directory contains a standalone docker-compose for local testing:

cd logging-stack && docker-compose up -d

Credentials

Grafana service account token stored in BWS:

  • Key: GRAFANA_OPENCODE_ACCESS_TOKEN
  • BWS ID: c77e58e3-fb34-41dc-9824-b3ce00da18a0

Project Status

Completed:

  • HTTP Server with REST API and SSE streaming
  • Frontend UI with real-time deployment tracking
  • MCP Server for Claude Code integration
  • Docker configuration for production deployment
  • Full deployment orchestration via Dokploy API
  • Name validation and availability checking
  • Error handling and progress reporting

Ready for Production Deployment