ai-stack-deployer/CLAUDE.md
Oussama Douhou 55378f74e0 fix: Docker build AVX issue with Node.js/Bun hybrid strategy
- Switch build stage from Bun to Node.js to avoid AVX CPU requirement
- Use Node.js 20 Alpine for building React client (Vite)
- Keep Bun runtime for API server (no AVX needed for runtime)
- Update README.md with build strategy and troubleshooting
- Update CLAUDE.md with Docker architecture documentation
- Add comprehensive docs/DOCKER_BUILD_FIX.md with technical details

Fixes #14 - Docker build crashes with "CPU lacks AVX support"

Tested:
- Docker build: SUCCESS
- Container runtime: SUCCESS
- Health check: PASS
- React client serving: PASS
2026-01-13 11:42:15 +01:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨 🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨

YOU ARE FORCED TO FOLLOW THESE PRINCIPLE RULES

  • MUST USE SKILL TODO
  • MUST FOLLOW YOUR TODO
  • MUST USE DOCUMENTATION/REPOSITORIES AFTER 3 TRIES
  • MUST PROPERLY TEST WHAT YOU ARE DOING
  • NEVER NEVER ASSUME
  • MUST BE SURE
  • MUST DOCUMENT YOUR FINDINGS FOR THE NEXT TIME
  • MUST CLEAN UP PROPERLY
  • MUST USE/UPDATE YOUR TEST DOCUMENT

🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨

Project Overview

AI Stack Deployer is a self-service portal that deploys personal OpenCode AI coding assistant stacks. Users enter a name, and the system provisions containers via Dokploy to create a fully functional AI stack at {name}.ai.flexinit.nl. Wildcard DNS and SSL are pre-configured, so deployments only need to create the Dokploy project and application.

Core Architecture

Deployment Flow

The system orchestrates deployments through Dokploy, leveraging pre-configured infrastructure:

  1. Dokploy API - Manages projects, applications, and container deployments
  2. Traefik - Handles SSL termination and routing (pre-configured wildcard DNS and SSL)

Each deployment creates:

  • Dokploy project: ai-stack-{name}
  • Application with OpenCode server + ttyd terminal
  • Domain configuration with automatic HTTPS (Traefik handles SSL via wildcard cert)

Note: DNS is pre-configured with wildcard *.ai.flexinit.nl → 144.76.116.169. Individual DNS records are NOT created per deployment - Traefik routes based on hostname matching.

Two Runtime Modes

  1. HTTP Server (src/index.ts) - Hono-based API for web portal

    • Fully implemented production-ready web application
    • REST API endpoints for deployment management
    • Server-Sent Events (SSE) for real-time progress streaming
    • Static frontend serving (HTML/CSS/JS)
    • In-memory deployment state tracking
    • CORS and logging middleware
  2. MCP Server (src/mcp-server.ts) - Model Context Protocol server

    • Development tool for Claude Code integration
    • Exposes deployment tools via stdio transport
    • Same deployment logic as HTTP server
    • Useful for testing and automation

API Clients

HetznerDNSClient (src/api/hetzner.ts):

  • Available for DNS management via Hetzner Cloud API
  • NOT used in deployment flow (wildcard DNS already configured)
  • Key methods: createARecord(), recordExists(), findRRSetByName()
  • Could be used for manual DNS operations or testing

DokployClient (src/api/dokploy.ts):

  • Orchestrates container deployments (primary deployment mechanism)
  • Key flow: createProject() → createApplication() → createDomain() → deployApplication()
  • Communicates with internal Dokploy at http://10.100.0.20:3000
  • Traefik on Dokploy automatically handles SSL via pre-configured wildcard certificate
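
The key flow above can be sketched as a single orchestration function. This is a minimal illustration under stated assumptions: the `DokployLike` interface, its parameter lists, and the `deployStack` name are hypothetical, and the real signatures in src/api/dokploy.ts may differ. Only the order of the four calls and the naming patterns come from this document.

```typescript
// Assumed minimal client shape; the real signatures in src/api/dokploy.ts may differ.
interface DokployLike {
  createProject(name: string): Promise<{ projectId: string }>;
  createApplication(projectId: string, name: string): Promise<{ applicationId: string }>;
  createDomain(applicationId: string, host: string): Promise<void>;
  deployApplication(applicationId: string): Promise<void>;
}

interface DeployResult {
  projectId: string;
  applicationId: string;
  url: string;
}

// Runs the four steps in order; a step that throws aborts the remaining steps.
async function deployStack(
  client: DokployLike,
  stackName: string,
  domainSuffix: string = "ai.flexinit.nl",
): Promise<DeployResult> {
  const { projectId } = await client.createProject(`ai-stack-${stackName}`);
  const { applicationId } = await client.createApplication(projectId, stackName);
  const host = `${stackName}.${domainSuffix}`;
  await client.createDomain(applicationId, host);
  await client.deployApplication(applicationId);
  return { projectId, applicationId, url: `https://${host}` };
}
```

Passing the client as a parameter keeps the flow testable with a fake client, which matters here because the real client needs live Dokploy credentials.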

HTTP API Endpoints

The HTTP server exposes the following endpoints:

Health Check:

  • GET /health - Returns service health status

Deployment:

  • POST /api/deploy - Start a new deployment
    • Body: { "name": "stack-name" }
    • Returns: { deploymentId, url, statusEndpoint }
  • GET /api/status/:deploymentId - SSE stream for deployment progress
    • Events: progress, complete, error
  • GET /api/check/:name - Check if a name is available
    • Returns: { available, valid, error? }
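
As an illustration of what the status stream carries, the sketch below formats one Server-Sent Events frame for each of the three event names listed above. The `formatSseEvent` helper and the payload fields are assumptions for illustration; the actual payloads emitted by src/index.ts are not specified here.

```typescript
// Event names taken from the endpoint description above.
type DeploymentEvent = "progress" | "complete" | "error";

// Formats a single SSE frame: an "event:" line, a "data:" line, and a blank line.
// JSON.stringify never emits raw newlines, so the payload fits on one data line.
function formatSseEvent(event: DeploymentEvent, data: unknown): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Hypothetical payloads; field names are illustrative only.
const progressFrame = formatSseEvent("progress", { progress: 50, currentStep: "Deploying" });
const errorFrame = formatSseEvent("error", { error: "Deployment failed" });
```

A browser client would subscribe with `new EventSource("/api/status/" + deploymentId)` and register one listener per event name.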

Frontend:

  • GET / - Serves the web UI (src/frontend/index.html)
  • GET /static/* - Serves static assets (CSS, JS)

State Management

Both servers track deployments in-memory using a Map:

interface DeploymentState {
  id: string;              // dep_{timestamp}_{random}
  name: string;            // normalized username
  status: 'initializing' | 'creating_project' | 'creating_application' |
          'deploying' | 'completed' | 'failed';
  url?: string;            // https://{name}.ai.flexinit.nl
  error?: string;
  projectId?: string;
  applicationId?: string;
  progress: number;        // 0-100 (HTTP server only)
  currentStep: string;     // Human-readable step (HTTP server only)
}

Note: State is in-memory only and lost on server restart. For production with persistence, implement database storage.
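
A minimal sketch of Map-based tracking with the `dep_{timestamp}_{random}` ID format follows. The function names `createDeploymentId`, `startDeployment`, and `updateDeployment` are hypothetical, and the random suffix length is an assumption; the interface is a trimmed copy of the one above.

```typescript
type DeploymentStatus =
  | "initializing" | "creating_project" | "creating_application"
  | "deploying" | "completed" | "failed";

interface DeploymentState {
  id: string;
  name: string;
  status: DeploymentStatus;
  url?: string;
  error?: string;
  progress: number;
  currentStep: string;
}

// In-memory store: contents are lost when the process restarts.
const deployments = new Map<string, DeploymentState>();

// Builds an ID of the form dep_{timestamp}_{random} (6 base-36 characters, assumed length).
function createDeploymentId(): string {
  const random = (Math.random().toString(36) + "00000000").slice(2, 8);
  return `dep_${Date.now()}_${random}`;
}

function startDeployment(stackName: string): DeploymentState {
  const state: DeploymentState = {
    id: createDeploymentId(),
    name: stackName,
    status: "initializing",
    progress: 0,
    currentStep: "Initializing",
  };
  deployments.set(state.id, state);
  return state;
}

// Merges a partial update into the stored state; throws for unknown IDs.
function updateDeployment(id: string, patch: Partial<DeploymentState>): DeploymentState {
  const current = deployments.get(id);
  if (!current) throw new Error(`Unknown deployment: ${id}`);
  const next = { ...current, ...patch, id };
  deployments.set(id, next);
  return next;
}
```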

Name Validation

Stack names must be:

  • 3-20 characters
  • Lowercase alphanumeric with hyphens
  • Cannot start/end with hyphen
  • Not in reserved list (admin, api, www, root, system, test, demo, portal)
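
The rules above translate into a small validator. This is a sketch, not the project's implementation: the `validateStackName` name, the result shape, and the error messages are assumptions, and the input is checked as given without normalization.

```typescript
// Reserved names as listed above.
const DEFAULT_RESERVED_NAMES: readonly string[] = [
  "admin", "api", "www", "root", "system", "test", "demo", "portal",
];

interface NameValidation {
  valid: boolean;
  error?: string;
}

function validateStackName(
  candidate: string,
  reserved: readonly string[] = DEFAULT_RESERVED_NAMES,
): NameValidation {
  if (candidate.length < 3 || candidate.length > 20) {
    return { valid: false, error: "Name must be 3-20 characters" };
  }
  if (!/^[a-z0-9-]+$/.test(candidate)) {
    return { valid: false, error: "Name may only contain lowercase letters, digits, and hyphens" };
  }
  if (candidate[0] === "-" || candidate[candidate.length - 1] === "-") {
    return { valid: false, error: "Name cannot start or end with a hyphen" };
  }
  if (reserved.indexOf(candidate) !== -1) {
    return { valid: false, error: "Name is reserved" };
  }
  return { valid: true };
}
```

For example, `validateStackName("my-stack")` passes, while `"ab"`, `"-abc"`, `"My-Stack"`, and `"admin"` are each rejected by a different rule.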

Frontend Architecture

Location: src/frontend/

  • index.html - Main UI with state machine (form, progress, success, error)
  • style.css - Modern gradient design with animations
  • app.js - Vanilla JavaScript with SSE client and real-time validation

Features:

  • Real-time name availability checking
  • Client-side validation with server verification
  • SSE-powered live deployment progress tracking
  • State machine: Form → Progress → Success/Error
  • Responsive design (mobile-friendly)
  • No framework dependencies (vanilla JS)

Development Commands

# Development server (HTTP API with hot reload)
bun run dev

# Production server (HTTP API)
bun run start

# MCP server (for Claude Code integration)
bun run mcp

# Type checking
bun run typecheck

# Build for production
bun run build

# Test API clients (requires valid credentials)
bun run src/test-clients.ts

# Docker commands
docker build -t ai-stack-deployer .
docker-compose up -d
docker-compose logs -f
docker-compose down

Session Management

The project supports two types of Claude Code sessions:

🤖 Built-in Sessions (Automatic)

  • Created automatically by Claude Code for every conversation
  • Stored in ~/.claude/projects/.../
  • Resume with: claude --session-id {uuid} or claude --continue

📁 Custom Sessions (Optional, for organization)

  • Created explicitly via ./scripts/claude-start.sh {name}
  • Enable named sessions and Graphiti Memory auto-integration
  • Best for: feature development, bug fixes, multi-day work

Commands

# List ALL sessions (both built-in and custom)
bash scripts/claude-session.sh list

# Create/resume custom named session
./scripts/claude-start.sh feature-http-api

# Delete a custom session
bash scripts/claude-session.sh delete feature-http-api

# Override permission mode (default: bypassPermissions)
CLAUDE_PERMISSION_MODE=prompt ./scripts/claude-start.sh feature-name

Custom Session Benefits

Automatic Configuration:

  • Permission mode: bypassPermissions (no permission prompts for file operations)
  • Session ID: Persistent UUID throughout work session
  • Environment variables: Auto-set for Graphiti Memory integration

Environment Variables (Set Automatically):

CLAUDE_SESSION_ID=550e8400-e29b-41d4-a716-446655440000
CLAUDE_SESSION_NAME=feature-http-api
CLAUDE_SESSION_START=2026-01-09 20:16:00
CLAUDE_SESSION_PROJECT=ai-stack-deployer
CLAUDE_SESSION_MCP_GROUP=project_ai_stack_deployer

Graphiti Memory Integration:

// At session end, store learnings
graphiti-memory_add_memory({
  name: "Session: feature-http-api - 2026-01-09",
  episode_body: "Session ID: 550e8400. Implemented HTTP server endpoints for deploy API. Added SSE for progress updates. Tests passing.",
  group_id: "project_ai_stack_deployer"  // Auto-set from CLAUDE_SESSION_MCP_GROUP
})

Storage:

  • Custom sessions: $HOME/.claude/sessions/ai-stack-deployer/*.session
  • Built-in sessions: ~/.claude/projects/-home-odouhou-locale-projects-ai-stack-deployer/*.jsonl

Environment Variables

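Required for deployment operations:

  • DOKPLOY_URL - URL of the internal Dokploy instance (http://10.100.0.20:3000)
  • DOKPLOY_API_TOKEN - Dokploy API token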

Optional configuration:

  • PORT - HTTP server port (default: 3000)
  • HOST - HTTP server bind address (default: 0.0.0.0)
  • STACK_DOMAIN_SUFFIX - Domain suffix for stacks (default: ai.flexinit.nl)
  • STACK_IMAGE - Docker image for user stacks
  • RESERVED_NAMES - Comma-separated list of forbidden names

Not used in deployment (available for testing/manual operations):

  • HETZNER_API_TOKEN - Hetzner Cloud API token
  • HETZNER_ZONE_ID - DNS zone ID (343733 for flexinit.nl)
  • TRAEFIK_IP - Public IP (144.76.116.169) - only for reference

See .env.example for complete configuration template.
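
One way to read the optional settings with their documented defaults is sketched below. The `loadConfig` function and `AppConfig` shape are hypothetical. Falling back to the reserved list from the Name Validation section when RESERVED_NAMES is unset is an assumption, since no default is stated for it.

```typescript
interface AppConfig {
  port: number;
  host: string;
  stackDomainSuffix: string;
  stackImage?: string;
  reservedNames: string[];
}

// Assumed fallback: the reserved list from the Name Validation section.
const FALLBACK_RESERVED = "admin,api,www,root,system,test,demo,portal";

// Takes the environment as a parameter so it can be exercised without real env vars.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const port = Number(env["PORT"] ?? "3000");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env["PORT"]}`);
  }
  return {
    port,
    host: env["HOST"] ?? "0.0.0.0",
    stackDomainSuffix: env["STACK_DOMAIN_SUFFIX"] ?? "ai.flexinit.nl",
    stackImage: env["STACK_IMAGE"],
    reservedNames: (env["RESERVED_NAMES"] ?? FALLBACK_RESERVED)
      .split(",")
      .map((entry) => entry.trim())
      .filter((entry) => entry.length > 0),
  };
}
```

Under Bun or Node.js this would typically be called once at startup as `loadConfig(process.env)`.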

MCP Server Integration

The MCP server is configured in .mcp.json and provides these tools:

  • deploy_stack - Deploys a new AI stack (Dokploy orchestration only, no DNS creation)
  • check_deployment_status - Query deployment progress by ID
  • list_deployments - List all deployments in current session
  • check_name_availability - Validate name before deployment
  • test_api_connections - Verify Hetzner and Dokploy connectivity (both clients available for testing)

To test MCP functionality:

# Start MCP server
bun run mcp

# Test API connections
bun run src/test-clients.ts

Key Implementation Details

Error Handling

Both API clients throw errors on failure. The MCP server catches these and returns structured error responses. No automatic retry logic exists yet.
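
The catch side of that pattern might look like the sketch below, which converts a thrown value into an MCP-style tool result. The `toToolError` name and the `{ success, error }` payload are assumptions; `isError` is the flag the MCP tool-result format uses to mark a failed call.

```typescript
interface ToolErrorResult {
  content: { type: "text"; text: string }[];
  isError: true;
}

// Converts any thrown value into a structured tool response instead of letting it propagate.
function toToolError(err: unknown): ToolErrorResult {
  const message = err instanceof Error ? err.message : String(err);
  return {
    content: [{ type: "text", text: JSON.stringify({ success: false, error: message }) }],
    isError: true,
  };
}
```

Inside a tool handler this would be used as `try { ... } catch (err) { return toToolError(err); }`.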

Deployment Idempotency

  • Dokploy projects: Searches for existing project by name before creating
  • Creates only if not found
  • No automatic cleanup on partial failures
  • DNS is wildcard-based, so no per-deployment DNS operations needed
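
The search-before-create behavior can be sketched as follows. The `ProjectApi` interface and its method names are hypothetical stand-ins for whatever DokployClient exposes; the point is only that a second call with the same name reuses the existing project.

```typescript
interface Project {
  projectId: string;
  name: string;
}

// Assumed minimal API surface; method names are illustrative.
interface ProjectApi {
  listProjects(): Promise<Project[]>;
  createProject(name: string): Promise<Project>;
}

// Returns the existing project when one matches by name, otherwise creates it.
async function ensureProject(
  api: ProjectApi,
  projectName: string,
): Promise<{ project: Project; created: boolean }> {
  const projects = await api.listProjects();
  const existing = projects.find((p) => p.name === projectName);
  if (existing) {
    return { project: existing, created: false };
  }
  const project = await api.createProject(projectName);
  return { project, created: true };
}
```

This check is not atomic: two concurrent requests for the same name could both miss the lookup and both create a project.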

Concurrency

The MCP server handles one request at a time per invocation. No rate limiting or queue management exists yet.

Security Notes

  • All tokens in environment variables (never in code)
  • Dokploy URL is internal-only (10.100.0.x network)
  • No authentication on HTTP endpoints (portal will need auth)
  • Name validation prevents injection attacks

Testing Strategy

Currently implemented:

  • src/test-clients.ts - Manual testing of Hetzner and Dokploy clients
  • Requires real API credentials in .env
  • Note: Only Dokploy client is used in actual deployments

Missing (needs implementation):

  • Unit tests for validation logic
  • Integration tests for deployment flow
  • Mock API clients for testing without credentials
  • Health check monitoring
  • Rollback on failures

Common Patterns

Adding a New MCP Tool

  1. Define tool schema in tools array (src/mcp-server.ts:178)
  2. Add case to switch statement in CallToolRequestSchema handler (src/mcp-server.ts:249)
  3. Extract typed arguments: const { arg } = args as { arg: Type }
  4. Return structured response with content: [{ type: 'text', text: JSON.stringify(...) }]
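
The four steps above, condensed into a self-contained sketch. The `get_stack_url` tool is hypothetical, and `handleToolCall` stands in for the CallToolRequestSchema handler so the example runs without the MCP SDK.

```typescript
// Step 1: schema entry in the tools array (hypothetical tool for illustration).
const tools = [
  {
    name: "get_stack_url",
    description: "Return the public URL for a stack name",
    inputSchema: {
      type: "object",
      properties: { name: { type: "string" } },
      required: ["name"],
    },
  },
];

interface ToolResponse {
  content: { type: "text"; text: string }[];
}

// Steps 2-4: dispatch on tool name, extract typed arguments, return structured content.
function handleToolCall(toolName: string, args: unknown): ToolResponse {
  switch (toolName) {
    case "get_stack_url": {
      const { name } = args as { name: string };
      const url = `https://${name}.ai.flexinit.nl`;
      return { content: [{ type: "text", text: JSON.stringify({ name, url }) }] };
    }
    default:
      throw new Error(`Unknown tool: ${toolName}`);
  }
}
```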

Adding HTTP Endpoints

  1. Add route to Hono app in src/index.ts
  2. Use API clients from src/api/ directory
  3. Return JSON with consistent error format
  4. Consider adding SSE for long-running operations

Extending API Clients

  • Keep TypeScript interfaces at top of file
  • Use satisfies for type-safe request bodies
  • Throw descriptive errors (include API status codes)
  • Add methods to client class, use private request() helper

Production Deployment

Docker Build and Run

Build Architecture: The Dockerfile uses a hybrid approach to avoid AVX CPU requirements:

  • Build stage (Node.js 20): Builds React client with Vite (no AVX required)
  • Runtime stage (Bun 1.3): Runs the API server (Bun only needs AVX for builds, not runtime)

This approach lets the Docker image build on CPUs without AVX support, including older systems and some cloud build environments.

# Build the Docker image
docker build -t ai-stack-deployer:latest .

# Run with docker-compose (recommended)
docker-compose up -d

# Or run manually
docker run -d \
  --name ai-stack-deployer \
  -p 3000:3000 \
  --env-file .env \
  ai-stack-deployer:latest

Note: If you encounter "CPU lacks AVX support" errors during Docker builds, ensure you're using the latest Dockerfile which implements the Node.js/Bun hybrid build strategy.

Deploying to Dokploy

  1. Prepare Environment:

    • Ensure .env file has valid DOKPLOY_API_TOKEN
    • Verify DOKPLOY_URL points to internal Dokploy instance
  2. Build and Push Image (if using custom registry):

    docker build -t your-registry/ai-stack-deployer:latest .
    docker push your-registry/ai-stack-deployer:latest
    
  3. Deploy via Dokploy UI:

    • Create new project: ai-stack-deployer-portal
    • Create application from Docker image
    • Configure domain (e.g., portal.ai.flexinit.nl)
    • Set environment variables from .env
    • Deploy
  4. Verify Deployment:

    curl https://portal.ai.flexinit.nl/health
    

Health Monitoring

The application includes a /health endpoint that returns:

{
  "status": "healthy",
  "timestamp": "2026-01-09T...",
  "version": "0.1.0",
  "service": "ai-stack-deployer",
  "activeDeployments": 0
}

The Docker health check runs every 30 seconds and restarts the container if it is unhealthy.
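
For reference, a payload of that shape could be built as sketched below. The `buildHealthResponse` name is hypothetical, and the caller supplies the active deployment count because how "active" is determined is not specified here.

```typescript
interface HealthResponse {
  status: "healthy";
  timestamp: string; // ISO 8601
  version: string;
  service: string;
  activeDeployments: number;
}

// Builds the /health payload; `now` is injectable so the output is deterministic in tests.
function buildHealthResponse(activeDeployments: number, now: Date = new Date()): HealthResponse {
  return {
    status: "healthy",
    timestamp: now.toISOString(),
    version: "0.1.0",
    service: "ai-stack-deployer",
    activeDeployments,
  };
}
```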

Infrastructure Dependencies

  • Wildcard DNS - *.ai.flexinit.nl → 144.76.116.169 (pre-configured in Hetzner DNS)
  • Traefik at 144.76.116.169 - Pre-configured wildcard SSL certificate for *.ai.flexinit.nl
  • Dokploy at 10.100.0.20:3000 - Container orchestration platform (handles all deployments)
  • Docker image - oh-my-opencode-free (OpenCode + ttyd terminal)

Key Point: Individual DNS records are NOT created per deployment. The wildcard DNS and SSL are already configured, so Traefik automatically routes {name}.ai.flexinit.nl to the correct container based on hostname matching.

Logging Infrastructure

AI Stack logging integrates with the existing monitoring stack at logs.intra.flexinit.nl.

Components

| Component | Location | Purpose |
|-----------|----------|---------|
| Log-ingest | http://ai-stack-log-ingest:3000 (dokploy-network) | Receives events from AI stacks, pushes to Loki |
| Loki | monitor-grafanaloki-qkj16i-loki-1 | Log storage |
| Grafana | https://logs.intra.flexinit.nl | Visualization |
| Dashboard | /d/ai-stack-overview | AI Stack metrics and logs |

Datasource UIDs (Grafana)

  • Loki: af9a823s6iku8b
  • Prometheus: cf9r1fmfw9xxcf

Configuration

AI stacks send logs via environment variable:

LOG_INGEST_URL=http://ai-stack-log-ingest:3000/ingest

Local Development

The logging-stack/ directory contains a standalone docker-compose for local testing:

cd logging-stack && docker-compose up -d

Credentials

Grafana service account token stored in BWS:

  • Key: GRAFANA_OPENCODE_ACCESS_TOKEN
  • BWS ID: c77e58e3-fb34-41dc-9824-b3ce00da18a0

CI/CD - Gitea Actions

The oh-my-opencode-free Docker image is built automatically via Gitea Actions on push to main.

Check Workflow Status

Web UI:

https://git.app.flexinit.nl/oussamadouhou/oh-my-opencode-free/actions

API:

# Get token from BWS (key: GITEA_API_TOKEN)
GITEA_TOKEN="<token>"

# List recent runs with status
curl -s -H "Authorization: token $GITEA_TOKEN" \
  "https://git.app.flexinit.nl/api/v1/repos/oussamadouhou/oh-my-opencode-free/actions/runs?limit=5" | \
  jq '.workflow_runs[] | {run_number, status, conclusion, display_title, head_sha: .head_sha[0:7]}'

API Response Fields

| Field | Values |
|-------|--------|
| status | queued, in_progress, completed |
| conclusion | success, failure, cancelled, skipped |

Credentials

  • GITEA_API_TOKEN - Gitea API access (stored in BWS)

Project Status

Completed:

  • HTTP Server with REST API and SSE streaming
  • Frontend UI with real-time deployment tracking
  • MCP Server for Claude Code integration
  • Docker configuration for production deployment
  • Full deployment orchestration via Dokploy API
  • Name validation and availability checking
  • Error handling and progress reporting

Ready for Production Deployment