# Dokploy API - Production Specification

**Date**: 2026-01-09

**Status**: ENTERPRISE GRADE - PRODUCTION READY

## API Authentication

- **Header**: `x-api-key: {token}`
- **Base URL**: `https://app.flexinit.nl` (public) or `http://10.100.0.20:3000` (internal)

## Production Deployment Flow

### Phase 1: Project & Environment Creation

```typescript
POST /api/project.create
Body: {
  name: string,          // "ai-stack-{username}"
  description?: string   // "AI Stack for {username}"
}
Response: {
  projectId: string,
  name: string,
  description: string,
  createdAt: string,
  organizationId: string,
  env: string
}
// Note: a "production" environment is created automatically with the project.
// Its environment ID is not part of this response and must be retrieved separately.
```

### Phase 2: Get Environment ID

```typescript
GET /api/environment.byProjectId?projectId={projectId}
Response: Array<{
  environmentId: string,
  name: string,          // "production"
  projectId: string,
  isDefault: boolean,
  env: string,
  createdAt: string
}>
```

### Phase 3: Create Application

```typescript
POST /api/application.create
Body: {
  name: string,          // "opencode-{username}"
  environmentId: string  // From Phase 2
}
Response: {
  applicationId: string,
  name: string,
  environmentId: string,
  applicationStatus: 'idle' | 'running' | 'done' | 'error',
  createdAt: string,
  // ... other fields
}
```

### Phase 4: Configure Application (Docker Image)

```typescript
POST /api/application.update
Body: {
  applicationId: string,
  dockerImage: string,   // "git.app.flexinit.nl/..."
  sourceType: 'docker'
}
Response: {
  applicationId: string,
  // ... updated fields
}
```

### Phase 5: Create Domain

```typescript
POST /api/domain.create
Body: {
  host: string,          // "{username}.ai.flexinit.nl"
  applicationId: string,
  https: boolean,        // true
  port: number           // 8080
}
Response: {
  domainId: string,
  host: string,
  applicationId: string,
  https: boolean,
  port: number
}
```

### Phase 6: Deploy Application

```typescript
POST /api/application.deploy
Body: {
  applicationId: string
}
Response: void | { deploymentId?: string }
```

## Error Handling - Enterprise Grade

### Retry Strategy

- **Transient errors** (5xx, network): Exponential backoff (1s, 2s, 4s, 8s, 16s)
- **Rate limiting** (429): Respect the `Retry-After` header
- **Authentication** (401): Fail immediately, no retry
- **Validation** (400): Fail immediately, log and report

### Rollback Strategy

On any phase failure:

1. Log the failure point and error details
2. Execute cleanup in reverse order of creation:
   - Delete domain (if created)
   - Delete application (if created)
   - Delete project (if it has no other resources)
3. Report the detailed failure to the user
4. Store a failure record for analysis

### Circuit Breaker

- **Threshold**: 5 consecutive failures
- **Timeout**: 60 seconds
- **Half-open**: After timeout, allow 1 test request
- **Reset**: After 3 consecutive successes

## Idempotency

### Project Creation

- Check if a project exists by name before creating
- If it exists, use the existing projectId
- Store creation timestamp for audit

### Application Creation

- Query existing applications by name in the environment
- If one exists and is in a valid state, reuse it
- If one exists but failed, delete and recreate it

### Domain Creation

- Query existing domains for the application
- If one exists with the same config, skip creation
- If one exists with a different config, update it

### Deployment

- Check current deployment status before triggering
- If a deployment is in progress, poll its status instead of re-triggering
- If a deployment failed, analyze logs before retrying

## Monitoring & Observability

### Structured Logging

```typescript
{
  timestamp: ISO8601,
  level: 'info' | 'warn' | 'error',
  phase: 'project' | 'environment' | 'application' | 'domain' | 'deploy',
  action: 'create' | 'update' | 'delete' | 'query',
  deploymentId: string,
  username: string,
  duration_ms: number,
  status: 'success' | 'failure',
  error?: {
    code: string,
    message: string,
    stack?: string,
    apiResponse?: unknown
  }
}
```

### Health Checks

- **Application health**: GET /health every 10s for 2 minutes
- **Container status**: Query application status via API
- **Domain resolution**: Verify DNS + HTTPS connectivity
- **Service availability**: Check if ttyd terminal is accessible

### Metrics

- Deployment success rate
- Average deployment time
- Failure reasons histogram
- API latency percentiles (p50, p95, p99)
- Retry counts per phase
- Rollback occurrences

## Security

### Input Validation

- Sanitize all user inputs before API calls
- Validate against injection attacks
- Enforce strict name regex
- Check reserved names list

### Secrets Management

- Never log API tokens
- Redact sensitive data in error messages
- Use environment variables for all credentials
- Rotate tokens periodically

### Rate Limiting

- Client-side: Max 10 deployments per user per hour
- Per-phase rate limiting to prevent API abuse
- Queue requests if limit exceeded

## Production Checklist

- [ ] All API calls use correct parameter names
- [ ] Environment ID retrieved and used for application creation
- [ ] Retry logic with exponential backoff implemented
- [ ] Circuit breaker pattern implemented
- [ ] Complete rollback on any failure
- [ ] Idempotency checks for all operations
- [ ] Structured logging with deployment tracking
- [ ] Health checks with timeout
- [ ] Input validation and sanitization
- [ ] Integration tests with real API
- [ ] Load testing (10 concurrent deployments)
- [ ] Failure scenario testing (network, auth, validation)
- [ ] Documentation and runbook complete
- [ ] Monitoring and alerting configured
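
The six phases of the Production Deployment Flow can be sketched as plain request builders. This is a minimal illustration, not the project's actual client: the helper names (`apiRequest`, `createProject`, `pickEnvironmentId`, and so on) are hypothetical, the `Content-Type` header assumes JSON bodies, and preferring `isDefault` over the name `"production"` when picking the environment is an assumption. Endpoints, field names, and naming patterns are taken from the phases above. The Docker image is passed in as a parameter because the spec elides its full path.

```typescript
// Sketch only: helper and type names are hypothetical.
const BASE_URL = "https://app.flexinit.nl";

interface ApiRequest {
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
  body?: Record<string, unknown>;
}

function apiRequest(
  token: string,
  method: "GET" | "POST",
  path: string,
  body?: Record<string, unknown>,
): ApiRequest {
  const headers: Record<string, string> = { "x-api-key": token };
  // Assumption: POST bodies are sent as JSON.
  if (body !== undefined) headers["Content-Type"] = "application/json";
  return { method, url: `${BASE_URL}${path}`, headers, body };
}

// Phase 1: create the project.
const createProject = (token: string, username: string) =>
  apiRequest(token, "POST", "/api/project.create", {
    name: `ai-stack-${username}`,
    description: `AI Stack for ${username}`,
  });

// Phase 2: list the project's environments.
const listEnvironments = (token: string, projectId: string) =>
  apiRequest(
    token,
    "GET",
    `/api/environment.byProjectId?projectId=${encodeURIComponent(projectId)}`,
  );

interface EnvironmentSummary {
  environmentId: string;
  name: string;
  isDefault: boolean;
}

// Assumption: prefer the default environment, else the one named "production".
function pickEnvironmentId(envs: EnvironmentSummary[]): string {
  const env =
    envs.find((e) => e.isDefault) ?? envs.find((e) => e.name === "production");
  if (!env) throw new Error("No default or production environment found");
  return env.environmentId;
}

// Phase 3: create the application inside that environment.
const createApplication = (token: string, username: string, environmentId: string) =>
  apiRequest(token, "POST", "/api/application.create", {
    name: `opencode-${username}`,
    environmentId,
  });

// Phase 4: point the application at a Docker image.
const configureImage = (token: string, applicationId: string, dockerImage: string) =>
  apiRequest(token, "POST", "/api/application.update", {
    applicationId,
    dockerImage,
    sourceType: "docker",
  });

// Phase 5: attach the per-user domain.
const createDomain = (token: string, username: string, applicationId: string) =>
  apiRequest(token, "POST", "/api/domain.create", {
    host: `${username}.ai.flexinit.nl`,
    applicationId,
    https: true,
    port: 8080,
  });

// Phase 6: trigger the deployment.
const deployApplication = (token: string, applicationId: string) =>
  apiRequest(token, "POST", "/api/application.deploy", { applicationId });
```

Each builder returns a plain object, so it can be handed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body ? JSON.stringify(req.body) : undefined })`. Keeping request construction separate from sending makes parameter names easy to unit test, which is the first item on the checklist.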
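
The Retry Strategy rules can be expressed as a small decision function plus a retry loop. The sketch below is one possible reading, with hypothetical names (`decideRetry`, `callWithRetry`, `HttpResult`). It assumes that `Retry-After` has already been parsed into seconds (the header may also carry an HTTP date, which is not handled here), that a 429 without `Retry-After` falls back to the backoff schedule, and that any other 4xx status is not retried.

```typescript
// Sketch only: names are hypothetical; status codes and delays follow the spec.
type RetryDecision =
  | { retry: false; reason: string }
  | { retry: true; delayMs: number };

const MAX_RETRIES = 5; // yields delays of 1s, 2s, 4s, 8s, 16s
const BASE_DELAY_MS = 1000;

/**
 * Decide whether to retry a failed call.
 * status: HTTP status, or undefined for a network-level failure.
 * attempt: number of retries already performed (0-based).
 */
function decideRetry(
  status: number | undefined,
  attempt: number,
  retryAfterSeconds?: number,
): RetryDecision {
  if (status === 401) return { retry: false, reason: "authentication failed" };
  if (status === 400) return { retry: false, reason: "validation failed" };
  if (attempt >= MAX_RETRIES) return { retry: false, reason: "retries exhausted" };
  const backoffMs = BASE_DELAY_MS * 2 ** attempt;
  if (status === 429) {
    // Respect Retry-After when the server supplied it.
    const delayMs = retryAfterSeconds !== undefined ? retryAfterSeconds * 1000 : backoffMs;
    return { retry: true, delayMs };
  }
  if (status === undefined || status >= 500) return { retry: true, delayMs: backoffMs };
  return { retry: false, reason: `non-retryable status ${status}` };
}

interface HttpResult {
  status: number;
  retryAfterSeconds?: number;
  body?: unknown;
}

/** Run `send` until it returns a 2xx result or the policy says to stop. */
async function callWithRetry(
  send: () => Promise<HttpResult>,
  sleep: (ms: number) => Promise<void>,
): Promise<HttpResult> {
  for (let attempt = 0; ; attempt++) {
    let result: HttpResult | undefined;
    try {
      result = await send();
      if (result.status >= 200 && result.status < 300) return result;
    } catch {
      result = undefined; // network-level failure, treated as transient
    }
    const decision = decideRetry(result?.status, attempt, result?.retryAfterSeconds);
    if (!decision.retry) throw new Error(`Request failed: ${decision.reason}`);
    await sleep(decision.delayMs);
  }
}
```

Passing `sleep` in, for instance `(ms) => new Promise((resolve) => setTimeout(resolve, ms))`, keeps the loop testable without real waiting. Production code would likely also add jitter to the delays, which the spec does not mention.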
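
The Rollback Strategy's reverse-order cleanup can be made explicit by deriving the cleanup steps from whatever was created before the failure. This is a hedged sketch: `CreatedResources`, `CleanupStep`, and `planRollback` are hypothetical names, and the spec does not name the delete endpoints, so the plan only lists which resources to remove and in what order.

```typescript
// Sketch only: type and function names are hypothetical.
interface CreatedResources {
  projectId?: string;
  applicationId?: string;
  domainId?: string;
  // Set by the caller when the project existed before or holds other resources.
  projectHasOtherResources?: boolean;
}

interface CleanupStep {
  kind: "domain" | "application" | "project";
  id: string;
}

/** Return cleanup steps in reverse order of creation. */
function planRollback(created: CreatedResources): CleanupStep[] {
  const steps: CleanupStep[] = [];
  if (created.domainId) steps.push({ kind: "domain", id: created.domainId });
  if (created.applicationId) {
    steps.push({ kind: "application", id: created.applicationId });
  }
  // Only delete the project when nothing else depends on it.
  if (created.projectId && !created.projectHasOtherResources) {
    steps.push({ kind: "project", id: created.projectId });
  }
  return steps;
}
```

For example, a failure during Phase 5 (project and application created, no domain yet) yields an application step followed by a project step. The caller would execute each step with the appropriate delete call and log the outcome of every step, so a partially failed rollback is still visible in the failure record.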
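
The Circuit Breaker parameters leave some behavior open, so the following sketch fixes one interpretation: in the half-open state a single probe request is allowed at a time, three consecutive successful probes close the breaker, and any failed probe reopens it and restarts the 60-second timeout. The class name and its methods are hypothetical, and the clock is injectable so the state machine can be tested without waiting.

```typescript
// Sketch only: one interpretation of the thresholds in the spec.
type BreakerState = "closed" | "open" | "half-open";

const FAILURE_THRESHOLD = 5;
const OPEN_TIMEOUT_MS = 60000;
const SUCCESSES_TO_CLOSE = 3;

class CircuitBreaker {
  private state: BreakerState = "closed";
  private consecutiveFailures = 0;
  private consecutiveSuccesses = 0;
  private openedAt = 0;
  private probeInFlight = false;
  private readonly now: () => number;

  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  /** Returns true if a request may be sent right now. */
  allowRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= OPEN_TIMEOUT_MS) {
      // Timeout elapsed: start probing.
      this.state = "half-open";
      this.consecutiveSuccesses = 0;
      this.probeInFlight = false;
    }
    if (this.state === "closed") return true;
    if (this.state === "open") return false;
    // Half-open: let a single probe through at a time.
    if (this.probeInFlight) return false;
    this.probeInFlight = true;
    return true;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    if (this.state !== "half-open") return;
    this.probeInFlight = false;
    this.consecutiveSuccesses += 1;
    if (this.consecutiveSuccesses >= SUCCESSES_TO_CLOSE) this.state = "closed";
  }

  recordFailure(): void {
    if (this.state === "open") return; // already tripped
    if (this.state === "half-open") {
      this.trip(); // a failed probe reopens the breaker
      return;
    }
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= FAILURE_THRESHOLD) this.trip();
  }

  getState(): BreakerState {
    return this.state;
  }

  private trip(): void {
    this.state = "open";
    this.openedAt = this.now();
    this.consecutiveFailures = 0;
    this.probeInFlight = false;
  }
}
```

A caller would check `allowRequest()` before each API call and report the outcome with `recordSuccess()` or `recordFailure()`. Whether one breaker guards the whole API or each phase gets its own is a design choice the spec leaves open.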
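
The Input Validation and Secrets Management items can be illustrated with two small helpers. The spec does not state the name regex or the reserved names, so both are assumptions here: the pattern (3 to 32 characters, lowercase letters, digits, and inner hyphens) is chosen because the username ends up inside a hostname and resource names, and the reserved list is purely illustrative. The function names are hypothetical as well.

```typescript
// Sketch only: the pattern and the reserved list are assumptions, not part of the spec.
const USERNAME_PATTERN = /^[a-z0-9][a-z0-9-]{1,30}[a-z0-9]$/;
const RESERVED_NAMES = new Set(["admin", "api", "root", "www"]);

/** Return the username unchanged if it is acceptable, otherwise throw. */
function validateUsername(input: string): string {
  if (!USERNAME_PATTERN.test(input)) {
    throw new Error("Invalid username: use 3-32 lowercase letters, digits, or hyphens");
  }
  if (RESERVED_NAMES.has(input)) {
    throw new Error("Invalid username: name is reserved");
  }
  return input;
}

/** Replace every occurrence of each secret before a message is logged. */
function redactSecrets(message: string, secrets: string[]): string {
  return secrets
    .filter((secret) => secret.length > 0)
    .reduce((text, secret) => text.split(secret).join("[REDACTED]"), message);
}
```

Rejecting anything outside the allowed character set, rather than trying to escape it, keeps separators, shell metacharacters, and path characters out of project names and hostnames. Running error messages and `apiResponse` payloads through `redactSecrets` with the API token before logging covers the "never log API tokens" rule for the structured log entries.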