# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨
🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨

***YOU ARE FORCED TO FOLLOW THESE PRINCIPLE RULES***

- ***MUST*** USE SKILL TODO
- ***MUST*** FOLLOW YOUR TODO
- ***MUST*** USE DOCUMENTATION/REPOSITORIES AFTER 3 TRIES
- ***MUST*** PROPERLY TEST WHAT YOU ARE DOING
- ***NEVER*** ASSUME
- ***MUST*** BE SURE
- ***MUST*** DOCUMENT YOUR FINDINGS FOR THE NEXT TIME
- ***MUST*** CLEAN UP PROPERLY
- ***MUST*** USE/UPDATE YOUR TEST DOCUMENT

🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨🚨

## Project Overview

AI Stack Deployer is a self-service portal that deploys personal OpenCode AI coding assistant stacks. Users enter a name, and the system provisions containers via Dokploy to create a fully functional AI stack at `{name}.ai.flexinit.nl`. Wildcard DNS and SSL are pre-configured, so deployments only need to create the Dokploy project and application.

## Core Architecture

### Deployment Flow

The system orchestrates deployments through Dokploy, leveraging pre-configured infrastructure:

1. **Dokploy API** - Manages projects, applications, and container deployments
2. **Traefik** - Handles SSL termination and routing (pre-configured wildcard DNS and SSL)

Each deployment creates:
- Dokploy project: `ai-stack-{name}`
- Application with OpenCode server + ttyd terminal
- Domain configuration with automatic HTTPS (Traefik handles SSL via wildcard cert)

**Note**: DNS is pre-configured with wildcard `*.ai.flexinit.nl` → `144.76.116.169`. Individual DNS records are NOT created per deployment - Traefik routes based on hostname matching.

### Two Runtime Modes

1. **HTTP Server** (`src/index.ts`) - Hono-based API for web portal
   - **Fully implemented** production-ready web application
   - REST API endpoints for deployment management
   - Server-Sent Events (SSE) for real-time progress streaming
   - Static frontend serving (HTML/CSS/JS)
   - In-memory deployment state tracking
   - CORS and logging middleware

2. **MCP Server** (`src/mcp-server.ts`) - Model Context Protocol server
   - Development tool for Claude Code integration
   - Exposes deployment tools via stdio transport
   - Same deployment logic as HTTP server
   - Useful for testing and automation

### API Clients

**HetznerDNSClient** (`src/api/hetzner.ts`):
- Available for DNS management via Hetzner Cloud API
- **NOT used in deployment flow** (wildcard DNS already configured)
- Key methods: `createARecord()`, `recordExists()`, `findRRSetByName()`
- Could be used for manual DNS operations or testing

**DokployClient** (`src/api/dokploy.ts`):
- Orchestrates container deployments (primary deployment mechanism)
- Key flow: `createProject()` → `createApplication()` → `createDomain()` → `deployApplication()`
- Communicates with internal Dokploy at `http://10.100.0.20:3000`
- Traefik on Dokploy automatically handles SSL via pre-configured wildcard certificate

### HTTP API Endpoints

The HTTP server exposes the following endpoints:

**Health Check**:
- `GET /health` - Returns service health status

**Deployment**:
- `POST /api/deploy` - Start a new deployment
  - Body: `{ "name": "stack-name" }`
  - Returns: `{ deploymentId, url, statusEndpoint }`
- `GET /api/status/:deploymentId` - SSE stream for deployment progress
  - Events: `progress`, `complete`, `error`
- `GET /api/check/:name` - Check if a name is available
  - Returns: `{ available, valid, error?
    }`

**Frontend**:
- `GET /` - Serves the web UI (`src/frontend/index.html`)
- `GET /static/*` - Serves static assets (CSS, JS)

### State Management

Both servers track deployments in-memory using a Map:

```typescript
interface DeploymentState {
  id: string;              // dep_{timestamp}_{random}
  name: string;            // normalized username
  status: 'initializing' | 'creating_project' | 'creating_application' | 'deploying' | 'completed' | 'failed';
  url?: string;            // https://{name}.ai.flexinit.nl
  error?: string;
  projectId?: string;
  applicationId?: string;
  progress: number;        // 0-100 (HTTP server only)
  currentStep: string;     // Human-readable step (HTTP server only)
}
```

**Note**: State is in-memory only and lost on server restart. For production with persistence, implement database storage.

### Name Validation

Stack names must be:
- 3-20 characters
- Lowercase alphanumeric with hyphens
- Cannot start/end with hyphen
- Not in reserved list (admin, api, www, root, system, test, demo, portal)

### Frontend Architecture

**Location**: `src/frontend/`
- `index.html` - Main UI with state machine (form, progress, success, error)
- `style.css` - Modern gradient design with animations
- `app.js` - Vanilla JavaScript with SSE client and real-time validation

**Features**:
- Real-time name availability checking
- Client-side validation with server verification
- SSE-powered live deployment progress tracking
- State machine: Form → Progress → Success/Error
- Responsive design (mobile-friendly)
- No framework dependencies (vanilla JS)

## Development Commands

```bash
# Development server (HTTP API with hot reload)
bun run dev

# Production server (HTTP API)
bun run start

# MCP server (for Claude Code integration)
bun run mcp

# Type checking
bun run typecheck

# Build for production
bun run build

# Test API clients (requires valid credentials)
bun run src/test-clients.ts

# Docker commands
docker build -t ai-stack-deployer .
docker-compose up -d
docker-compose logs -f
docker-compose down
```

## Session Management

The project supports two types of Claude Code sessions:

**🤖 Built-in Sessions** (Automatic)
- Created automatically by Claude Code for every conversation
- Stored in `~/.claude/projects/.../`
- Resume with: `claude --session-id {uuid}` or `claude --continue`

**📁 Custom Sessions** (Optional, for organization)
- Created explicitly via `./scripts/claude-start.sh {name}`
- Enable named sessions and Graphiti Memory auto-integration
- Best for: feature development, bug fixes, multi-day work

### Commands

```bash
# List ALL sessions (both built-in and custom)
bash scripts/claude-session.sh list

# Create/resume custom named session
./scripts/claude-start.sh feature-http-api

# Delete a custom session
bash scripts/claude-session.sh delete feature-http-api

# Override permission mode (default: bypassPermissions)
CLAUDE_PERMISSION_MODE=prompt ./scripts/claude-start.sh feature-name
```

### Custom Session Benefits

**Automatic Configuration:**
- Permission mode: `bypassPermissions` (no permission prompts for file operations)
- Session ID: Persistent UUID throughout work session
- Environment variables: Auto-set for Graphiti Memory integration

**Environment Variables (Set Automatically):**
```bash
CLAUDE_SESSION_ID=550e8400-e29b-41d4-a716-446655440000
CLAUDE_SESSION_NAME=feature-http-api
CLAUDE_SESSION_START=2026-01-09 20:16:00
CLAUDE_SESSION_PROJECT=ai-stack-deployer
CLAUDE_SESSION_MCP_GROUP=project_ai_stack_deployer
```

**Graphiti Memory Integration:**
```javascript
// At session end, store learnings
graphiti-memory_add_memory({
  name: "Session: feature-http-api - 2026-01-09",
  episode_body: "Session ID: 550e8400. Implemented HTTP server endpoints for deploy API. Added SSE for progress updates. Tests passing.",
  group_id: "project_ai_stack_deployer"  // Auto-set from CLAUDE_SESSION_MCP_GROUP
})
```

**Storage:**
- Custom sessions: `$HOME/.claude/sessions/ai-stack-deployer/*.session`
- Built-in sessions: `~/.claude/projects/-home-odouhou-locale-projects-ai-stack-deployer/*.jsonl`

## Environment Variables

Required for deployment operations:
- `DOKPLOY_URL` - Dokploy API URL (http://10.100.0.20:3000)
- `DOKPLOY_API_TOKEN` - Dokploy API authentication token

Optional configuration:
- `PORT` - HTTP server port (default: 3000)
- `HOST` - HTTP server bind address (default: 0.0.0.0)
- `STACK_DOMAIN_SUFFIX` - Domain suffix for stacks (default: ai.flexinit.nl)
- `STACK_IMAGE` - Docker image for user stacks
- `RESERVED_NAMES` - Comma-separated list of forbidden names

Not used in deployment (available for testing/manual operations):
- `HETZNER_API_TOKEN` - Hetzner Cloud API token
- `HETZNER_ZONE_ID` - DNS zone ID (343733 for flexinit.nl)
- `TRAEFIK_IP` - Public IP (144.76.116.169) - only for reference

See `.env.example` for complete configuration template.

## MCP Server Integration

The MCP server is configured in `.mcp.json` and provides these tools:

- `deploy_stack` - Deploys a new AI stack (Dokploy orchestration only, no DNS creation)
- `check_deployment_status` - Query deployment progress by ID
- `list_deployments` - List all deployments in current session
- `check_name_availability` - Validate name before deployment
- `test_api_connections` - Verify Hetzner and Dokploy connectivity (both clients available for testing)

To test MCP functionality:
```bash
# Start MCP server
bun run mcp

# Test API connections
bun run src/test-clients.ts
```

## Key Implementation Details

### Error Handling

Both API clients throw errors on failure. The MCP server catches these and returns structured error responses. No automatic retry logic exists yet.
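
The catch-and-wrap behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `src/mcp-server.ts`: the helper names `toToolError` and `runTool`, the `success`/`error` payload keys, and the use of an `isError` flag are all hypothetical. Only the `content: [{ type: 'text', text: JSON.stringify(...) }]` shape is taken from this file.

```typescript
// Hypothetical response shape, mirroring the
// `content: [{ type: 'text', text: ... }]` format used by the MCP server.
interface ToolResponse {
  content: { type: 'text'; text: string }[];
  isError?: boolean; // assumption: flag marking a failed tool call
}

// Convert any thrown value into a structured error response.
function toToolError(err: unknown): ToolResponse {
  const message = err instanceof Error ? err.message : String(err);
  return {
    content: [{ type: 'text', text: JSON.stringify({ success: false, error: message }) }],
    isError: true,
  };
}

// Run an action exactly once (no retry) and wrap its result or its error.
async function runTool<T>(action: () => Promise<T>): Promise<ToolResponse> {
  try {
    const result = await action();
    return { content: [{ type: 'text', text: JSON.stringify({ success: true, result }) }] };
  } catch (err) {
    return toToolError(err);
  }
}
```

A handler could then call something like `runTool(() => client.createProject(name))`, where `client` stands for a `DokployClient` instance and the argument list is assumed. Because the action runs only once, any retry policy would have to be layered on top.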
### Deployment Idempotency

- Dokploy projects: Searches for existing project by name before creating
- Creates only if not found
- No automatic cleanup on partial failures
- DNS is wildcard-based, so no per-deployment DNS operations needed

### Concurrency

The MCP server handles one request at a time per invocation. No rate limiting or queue management exists yet.

### Security Notes

- All tokens in environment variables (never in code)
- Dokploy URL is internal-only (10.100.0.x network)
- No authentication on HTTP endpoints (portal will need auth)
- Name validation prevents injection attacks

## Testing Strategy

Currently implemented:
- `src/test-clients.ts` - Manual testing of Hetzner and Dokploy clients
  - Requires real API credentials in `.env`
  - Note: Only Dokploy client is used in actual deployments

Missing (needs implementation):
- Unit tests for validation logic
- Integration tests for deployment flow
- Mock API clients for testing without credentials
- Health check monitoring
- Rollback on failures

## Common Patterns

### Adding a New MCP Tool

1. Define tool schema in `tools` array (src/mcp-server.ts:178)
2. Add case to switch statement in `CallToolRequestSchema` handler (src/mcp-server.ts:249)
3. Extract typed arguments: `const { arg } = args as { arg: Type }`
4. Return structured response with `content: [{ type: 'text', text: JSON.stringify(...) }]`

### Adding HTTP Endpoints

1. Add route to Hono app in `src/index.ts`
2. Use API clients from `src/api/` directory
3. Return JSON with consistent error format
4. Consider adding SSE for long-running operations

### Extending API Clients

- Keep TypeScript interfaces at top of file
- Use `satisfies` for type-safe request bodies
- Throw descriptive errors (include API status codes)
- Add methods to client class, use `private request()` helper

## Production Deployment

### Docker Build and Run

```bash
# Build the Docker image
docker build -t ai-stack-deployer:latest .

# Run with docker-compose (recommended)
docker-compose up -d

# Or run manually
docker run -d \
  --name ai-stack-deployer \
  -p 3000:3000 \
  --env-file .env \
  ai-stack-deployer:latest
```

### Deploying to Dokploy

1. **Prepare Environment**:
   - Ensure `.env` file has valid `DOKPLOY_API_TOKEN`
   - Verify `DOKPLOY_URL` points to internal Dokploy instance

2. **Build and Push Image** (if using custom registry):
   ```bash
   docker build -t your-registry/ai-stack-deployer:latest .
   docker push your-registry/ai-stack-deployer:latest
   ```

3. **Deploy via Dokploy UI**:
   - Create new project: `ai-stack-deployer-portal`
   - Create application from Docker image
   - Configure domain (e.g., `portal.ai.flexinit.nl`)
   - Set environment variables from `.env`
   - Deploy

4. **Verify Deployment**:
   ```bash
   curl https://portal.ai.flexinit.nl/health
   ```

### Health Monitoring

The application includes a `/health` endpoint that returns:

```json
{
  "status": "healthy",
  "timestamp": "2026-01-09T...",
  "version": "0.1.0",
  "service": "ai-stack-deployer",
  "activeDeployments": 0
}
```

Docker health check runs every 30 seconds and restarts container if unhealthy.

## Infrastructure Dependencies

- **Wildcard DNS** - `*.ai.flexinit.nl` → `144.76.116.169` (pre-configured in Hetzner DNS)
- **Traefik** at 144.76.116.169 - Pre-configured wildcard SSL certificate for `*.ai.flexinit.nl`
- **Dokploy** at 10.100.0.20:3000 - Container orchestration platform (handles all deployments)
- **Docker image** - oh-my-opencode-free (OpenCode + ttyd terminal)

**Key Point**: Individual DNS records are NOT created per deployment. The wildcard DNS and SSL are already configured, so Traefik automatically routes `{name}.ai.flexinit.nl` to the correct container based on hostname matching.

## Logging Infrastructure

AI Stack logging integrates with the existing monitoring stack at `logs.intra.flexinit.nl`.

### Components

| Component | Location | Purpose |
|-----------|----------|---------|
| Log-ingest | `http://ai-stack-log-ingest:3000` (dokploy-network) | Receives events from AI stacks, pushes to Loki |
| Loki | `monitor-grafanaloki-qkj16i-loki-1` | Log storage |
| Grafana | https://logs.intra.flexinit.nl | Visualization |
| Dashboard | `/d/ai-stack-overview` | AI Stack metrics and logs |

### Datasource UIDs (Grafana)

- Loki: `af9a823s6iku8b`
- Prometheus: `cf9r1fmfw9xxcf`

### Configuration

AI stacks send logs via environment variable:

```
LOG_INGEST_URL=http://ai-stack-log-ingest:3000/ingest
```

### Local Development

The `logging-stack/` directory contains a standalone docker-compose for local testing:

```bash
cd logging-stack && docker-compose up -d
```

### Credentials

Grafana service account token stored in BWS:
- Key: `GRAFANA_OPENCODE_ACCESS_TOKEN`
- BWS ID: `c77e58e3-fb34-41dc-9824-b3ce00da18a0`

## CI/CD - Gitea Actions

The `oh-my-opencode-free` Docker image is built automatically via Gitea Actions on push to main.

### Check Workflow Status

**Web UI:**
```
https://git.app.flexinit.nl/oussamadouhou/oh-my-opencode-free/actions
```

**API:**
```bash
# Get token from BWS (key: GITEA_API_TOKEN)
GITEA_TOKEN=""

# List recent runs with status
curl -s -H "Authorization: token $GITEA_TOKEN" \
  "https://git.app.flexinit.nl/api/v1/repos/oussamadouhou/oh-my-opencode-free/actions/runs?limit=5" | \
  jq '.workflow_runs[] | {run_number, status, conclusion, display_title, head_sha: .head_sha[0:7]}'
```

### API Response Fields

| Field | Values |
|-------|--------|
| `status` | `queued`, `in_progress`, `completed` |
| `conclusion` | `success`, `failure`, `cancelled`, `skipped` |

### Credentials

- **GITEA_API_TOKEN** - Gitea API access (stored in BWS)

## Project Status

✅ **Completed**:
- HTTP Server with REST API and SSE streaming
- Frontend UI with real-time deployment tracking
- MCP Server for Claude Code integration
- Docker configuration for production deployment
- Full deployment orchestration via Dokploy API
- Name validation and availability checking
- Error handling and progress reporting

**Ready for Production Deployment**