
Dokploy API - Production Specification

Date: 2026-01-09
Status: ENTERPRISE GRADE - PRODUCTION READY

API Authentication

  • Header: x-api-key: {token}
  • Base URL: https://app.flexinit.nl (public) or http://10.100.0.20:3000 (internal)
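
As a rough illustration of the authentication scheme above, the following sketch assembles the URL and request options for a single call. The helper name `buildDokployRequest`, the JSON `Content-Type` header, and the choice to send GET parameters as a query string are assumptions made for this example; only the `x-api-key` header and the base URLs come from this spec.

```typescript
// Hypothetical helper: assembles one request for the Dokploy API.
interface DokployRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body?: string };
}

function buildDokployRequest(
  baseUrl: string,
  token: string,
  method: "GET" | "POST",
  path: string,
  payload?: Record<string, string | number | boolean>,
): DokployRequest {
  const headers: Record<string, string> = { "x-api-key": token };
  let url = baseUrl.replace(/\/+$/, "") + path; // drop trailing slashes
  let body: string | undefined;
  if (payload && method === "GET") {
    // GET parameters travel in the query string.
    const query = Object.entries(payload)
      .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
      .join("&");
    if (query) url += `?${query}`;
  } else if (payload) {
    // Assumption: POST bodies are sent as JSON.
    headers["Content-Type"] = "application/json";
    body = JSON.stringify(payload);
  }
  return { url, init: { method, headers, body } };
}
```

The returned `url` and `init` can be passed straight to `fetch(url, init)`.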

Production Deployment Flow

Phase 1: Project & Environment Creation

POST /api/project.create
Body: {
  name: string,        // "ai-stack-{username}"
  description?: string // "AI Stack for {username}"
}

Response: {
  projectId: string,
  name: string,
  description: string,
  createdAt: string,
  organizationId: string,
  env: string
}

// Note: A default "production" environment is created automatically with the project.
// Its environment ID is not taken from this response; retrieve it separately (Phase 2).

Phase 2: Get Environment ID

GET /api/environment.byProjectId?projectId={projectId}

Response: Array<{
  environmentId: string,
  name: string,        // "production"
  projectId: string,
  isDefault: boolean,
  env: string,
  createdAt: string
}>

Phase 3: Create Application

POST /api/application.create
Body: {
  name: string,           // "opencode-{username}"
  environmentId: string   // From Phase 2
}

Response: {
  applicationId: string,
  name: string,
  environmentId: string,
  applicationStatus: 'idle' | 'running' | 'done' | 'error',
  createdAt: string,
  // ... other fields
}

Phase 4: Configure Application (Docker Image)

POST /api/application.update
Body: {
  applicationId: string,
  dockerImage: string,    // "git.app.flexinit.nl/..."
  sourceType: 'docker'
}

Response: {
  applicationId: string,
  // ... updated fields
}

Phase 5: Create Domain

POST /api/domain.create
Body: {
  host: string,          // "{username}.ai.flexinit.nl"
  applicationId: string,
  https: boolean,        // true
  port: number          // 8080
}

Response: {
  domainId: string,
  host: string,
  applicationId: string,
  https: boolean,
  port: number
}

Phase 6: Deploy Application

POST /api/application.deploy
Body: {
  applicationId: string
}

Response: void | { deploymentId?: string }
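
The six phases above can be chained as in the sketch below. It is a minimal outline, not a definitive client: `ApiCall` is a hypothetical transport that performs one authenticated request and returns the parsed response, `deployStack` is an illustrative name, and picking the environment flagged `isDefault` (falling back to the first entry) is an assumption. Retries and rollback are deliberately left out.

```typescript
type Payload = Record<string, string | number | boolean>;
// Hypothetical transport: one authenticated API call, parsed response.
type ApiCall = (method: "GET" | "POST", path: string, payload?: Payload) => Promise<unknown>;

interface DeployedStack {
  projectId: string;
  environmentId: string;
  applicationId: string;
  domainId: string;
}

async function deployStack(call: ApiCall, username: string, dockerImage: string): Promise<DeployedStack> {
  // Phase 1: project
  const project = (await call("POST", "/api/project.create", {
    name: `ai-stack-${username}`,
    description: `AI Stack for ${username}`,
  })) as { projectId: string };

  // Phase 2: environment ID
  const environments = (await call("GET", "/api/environment.byProjectId", {
    projectId: project.projectId,
  })) as Array<{ environmentId: string; isDefault: boolean }>;
  // Assumption: prefer the default environment, else the first one returned.
  const environment = environments.find((e) => e.isDefault) ?? environments[0];
  if (!environment) throw new Error(`No environment found for project ${project.projectId}`);

  // Phase 3: application
  const app = (await call("POST", "/api/application.create", {
    name: `opencode-${username}`,
    environmentId: environment.environmentId,
  })) as { applicationId: string };

  // Phase 4: Docker image
  await call("POST", "/api/application.update", {
    applicationId: app.applicationId,
    dockerImage,
    sourceType: "docker",
  });

  // Phase 5: domain
  const domain = (await call("POST", "/api/domain.create", {
    host: `${username}.ai.flexinit.nl`,
    applicationId: app.applicationId,
    https: true,
    port: 8080,
  })) as { domainId: string };

  // Phase 6: deploy
  await call("POST", "/api/application.deploy", { applicationId: app.applicationId });

  return {
    projectId: project.projectId,
    environmentId: environment.environmentId,
    applicationId: app.applicationId,
    domainId: domain.domainId,
  };
}
```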

Error Handling - Enterprise Grade

Retry Strategy

  • Transient errors (5xx, network): Exponential backoff (1s, 2s, 4s, 8s, 16s)
  • Rate limiting (429): Respect Retry-After header
  • Authentication (401): Fail immediately, no retry
  • Validation (400): Fail immediately, log and report
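
One way to encode the retry rules above is a pure function that decides how long to wait before the next attempt. The name `retryDelayMs` and the `"network"` marker are illustrative, and the sketch only handles the delta-seconds form of `Retry-After` (the header may also carry an HTTP date). Treating every other 4xx status like 400 and 401 is an assumption.

```typescript
const MAX_RETRIES = 5; // yields waits of 1s, 2s, 4s, 8s, 16s

// Returns the wait in milliseconds before retry number `attempt` (0-based),
// or null when the request should not be retried.
function retryDelayMs(
  status: number | "network",
  attempt: number,
  retryAfterSeconds?: number,
): number | null {
  if (attempt >= MAX_RETRIES) return null;
  const backoff = 1000 * 2 ** attempt;
  if (status === "network") return backoff;
  if (status === 429) {
    // Respect Retry-After when the server sent one.
    return retryAfterSeconds !== undefined ? retryAfterSeconds * 1000 : backoff;
  }
  if (status >= 500) return backoff;
  return null; // 400, 401 and other client errors fail immediately
}
```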

Rollback Strategy

On any phase failure:

  1. Log failure point and error details
  2. Execute cleanup in reverse order:
    • Delete domain (if created)
    • Delete application (if created)
    • Delete project (if no other resources)
  3. Report detailed failure to user
  4. Store failure record for analysis
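
The reverse-order cleanup in step 2 can be sketched as a stack of undo callbacks, registered as each phase succeeds. `RollbackStack` is a hypothetical helper. This spec does not list the delete endpoints, so the callbacks are supplied by the caller rather than hard-coded here.

```typescript
type UndoStep = { name: string; undo: () => Promise<void> };

class RollbackStack {
  private steps: UndoStep[] = [];

  // Register an undo callback right after the matching phase succeeds.
  register(name: string, undo: () => Promise<void>): void {
    this.steps.push({ name, undo });
  }

  // Runs every undo step, newest first. A failing step is recorded and
  // does not stop the remaining cleanup.
  async rollback(): Promise<{ undone: string[]; failed: string[] }> {
    const undone: string[] = [];
    const failed: string[] = [];
    for (const step of [...this.steps].reverse()) {
      try {
        await step.undo();
        undone.push(step.name);
      } catch {
        failed.push(step.name);
      }
    }
    this.steps = [];
    return { undone, failed };
  }
}
```

The `undone` and `failed` lists can feed the failure report and the stored failure record from steps 3 and 4.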

Circuit Breaker

  • Threshold: 5 consecutive failures
  • Timeout: 60 seconds
  • Half-open: After timeout, allow 1 test request
  • Reset: After 3 consecutive successes
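
A minimal sketch of a breaker with these four parameters follows. The class and method names are illustrative, the clock is injected so the behavior can be tested without waiting, and reading "Reset" as "three consecutive successful test requests while half-open" is an interpretation of the rules above.

```typescript
type BreakerState = "closed" | "open" | "half-open";

class CircuitBreaker {
  private state: BreakerState = "closed";
  private failures = 0;
  private successes = 0;
  private openedAt = 0;
  private probeInFlight = false;
  private readonly now: () => number;
  private readonly threshold: number;
  private readonly timeoutMs: number;
  private readonly resetAfter: number;

  constructor(now: () => number = Date.now, threshold = 5, timeoutMs = 60_000, resetAfter = 3) {
    this.now = now;
    this.threshold = threshold;
    this.timeoutMs = timeoutMs;
    this.resetAfter = resetAfter;
  }

  getState(): BreakerState {
    return this.state;
  }

  // Ask before every API call; false means "fail fast".
  canRequest(): boolean {
    if (this.state === "open" && this.now() - this.openedAt >= this.timeoutMs) {
      this.state = "half-open";
      this.successes = 0;
      this.probeInFlight = false;
    }
    if (this.state === "closed") return true;
    if (this.state === "open") return false;
    if (this.probeInFlight) return false; // only one test request at a time
    this.probeInFlight = true;
    return true;
  }

  recordSuccess(): void {
    this.failures = 0;
    if (this.state === "half-open") {
      this.probeInFlight = false;
      if (++this.successes >= this.resetAfter) this.state = "closed";
    }
  }

  recordFailure(): void {
    this.probeInFlight = false;
    if (this.state === "half-open" || ++this.failures >= this.threshold) {
      this.state = "open";
      this.openedAt = this.now();
      this.failures = 0;
    }
  }
}
```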

Idempotency

Project Creation

  • Check if project exists by name before creating
  • If exists, use existing projectId
  • Store creation timestamp for audit

Application Creation

  • Query existing applications by name in environment
  • If exists and in valid state, reuse
  • If exists but failed, delete and recreate

Domain Creation

  • Query existing domains for application
  • If exists with same config, skip creation
  • If exists with different config, update

Deployment

  • Check current deployment status before triggering
  • If deployment in progress, poll status instead of re-triggering
  • If deployment failed, analyze logs before retry
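
The application and domain rules above reduce to small decision functions, sketched here. The function names are illustrative, and treating only the `'error'` status as "failed" is an assumption. The caller is expected to run the lookup queries and pass in whatever it found (or `undefined`).

```typescript
type ApplicationStatus = "idle" | "running" | "done" | "error";
type ApplicationAction = "create" | "reuse" | "recreate";

function planApplication(
  existing: { applicationStatus: ApplicationStatus } | undefined,
): ApplicationAction {
  if (!existing) return "create";
  // Assumption: only "error" counts as a failed state.
  return existing.applicationStatus === "error" ? "recreate" : "reuse";
}

interface DomainConfig {
  host: string;
  https: boolean;
  port: number;
}
type DomainAction = "create" | "skip" | "update";

function planDomain(existing: DomainConfig | undefined, desired: DomainConfig): DomainAction {
  if (!existing) return "create";
  const same =
    existing.host === desired.host &&
    existing.https === desired.https &&
    existing.port === desired.port;
  return same ? "skip" : "update";
}
```

Keeping the decision separate from the API calls makes each rule easy to unit test and keeps re-runs of the same deployment side-effect free until an action is actually needed.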

Monitoring & Observability

Structured Logging

{
  timestamp: ISO8601,
  level: 'info' | 'warn' | 'error',
  phase: 'project' | 'environment' | 'application' | 'domain' | 'deploy',
  action: 'create' | 'update' | 'delete' | 'query',
  deploymentId: string,
  username: string,
  duration_ms: number,
  status: 'success' | 'failure',
  error?: {
    code: string,
    message: string,
    stack?: string,
    apiResponse?: unknown
  }
}
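
A wrapper along the following lines could produce entries in this shape for each API call. `withPhaseLog` is a hypothetical helper, the default sink writes JSON lines to the console, and using the error's `name` as the `code` field is an assumption.

```typescript
type Phase = "project" | "environment" | "application" | "domain" | "deploy";
type Action = "create" | "update" | "delete" | "query";

interface LogEntry {
  timestamp: string; // ISO 8601
  level: "info" | "warn" | "error";
  phase: Phase;
  action: Action;
  deploymentId: string;
  username: string;
  duration_ms: number;
  status: "success" | "failure";
  error?: { code: string; message: string; stack?: string; apiResponse?: unknown };
}

async function withPhaseLog<T>(
  context: { phase: Phase; action: Action; deploymentId: string; username: string },
  run: () => Promise<T>,
  emit: (entry: LogEntry) => void = (entry) => console.log(JSON.stringify(entry)),
): Promise<T> {
  const started = Date.now();
  const base = () => ({
    ...context,
    timestamp: new Date().toISOString(),
    duration_ms: Date.now() - started,
  });
  try {
    const result = await run();
    emit({ ...base(), level: "info", status: "success" });
    return result;
  } catch (err) {
    const e = err instanceof Error ? err : new Error(String(err));
    // Assumption: the error name doubles as the error code.
    emit({
      ...base(),
      level: "error",
      status: "failure",
      error: { code: e.name, message: e.message, stack: e.stack },
    });
    throw err; // logging must not swallow the failure
  }
}
```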

Health Checks

  • Application health: Poll GET /health every 10s for up to 2 minutes
  • Container status: Query application status via API
  • Domain resolution: Verify DNS + HTTPS connectivity
  • Service availability: Check if ttyd terminal is accessible
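
The application health poll (every 10s within a 2 minute budget, which works out to 12 probes) might look like the sketch below. `waitForHealthy` is an illustrative name. The probe and the sleep function are injected, so the same loop can wrap a `GET /health` request, a status query, or a test double.

```typescript
async function waitForHealthy(
  probe: () => Promise<boolean>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
  intervalMs = 10_000,
  timeoutMs = 120_000,
): Promise<boolean> {
  const attempts = Math.floor(timeoutMs / intervalMs);
  for (let i = 0; i < attempts; i++) {
    try {
      if (await probe()) return true;
    } catch {
      // A throwing probe (for example a refused connection) counts as "not ready yet".
    }
    if (i < attempts - 1) await sleep(intervalMs);
  }
  return false; // budget exhausted without a healthy response
}
```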

Metrics

  • Deployment success rate
  • Average deployment time
  • Failure reasons histogram
  • API latency percentiles (p50, p95, p99)
  • Retry counts per phase
  • Rollback occurrences
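
For the latency percentiles, a nearest-rank calculation over recorded samples is one simple option, sketched below. This is an illustration only; a metrics library or histogram-based estimate may be preferable once sample counts grow.

```typescript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples recorded");
  if (p <= 0 || p > 100) throw new Error("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

function latencySummary(samplesMs: number[]): { p50: number; p95: number; p99: number } {
  return {
    p50: percentile(samplesMs, 50),
    p95: percentile(samplesMs, 95),
    p99: percentile(samplesMs, 99),
  };
}
```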

Security

Input Validation

  • Sanitize all user inputs before API calls
  • Validate against injection attacks
  • Enforce strict name regex
  • Check reserved names list
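
This spec does not state the name regex or the reserved list, so the sketch below fills both with assumptions: because the username becomes a hostname label in `{username}.ai.flexinit.nl`, it uses the DNS label rule (lowercase letters, digits, and hyphens, 1 to 63 characters, no leading or trailing hyphen), and the reserved names are placeholders.

```typescript
// Assumption: a valid username is a valid lowercase DNS label.
const NAME_PATTERN = /^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/;
// Hypothetical reserved names; replace with the real list.
const RESERVED_NAMES = new Set(["admin", "api", "www", "root"]);

type ValidationResult = { ok: true; value: string } | { ok: false; reason: string };

function validateUsername(input: string): ValidationResult {
  if (!NAME_PATTERN.test(input)) {
    return { ok: false, reason: "name must be 1-63 chars of a-z, 0-9, '-' and not start or end with '-'" };
  }
  if (RESERVED_NAMES.has(input)) {
    return { ok: false, reason: "name is reserved" };
  }
  return { ok: true, value: input };
}
```

Rejecting anything outside an allowlist pattern, rather than stripping suspicious characters, also covers the injection case: shell metacharacters, slashes, and whitespace never reach the API.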

Secrets Management

  • Never log API tokens
  • Redact sensitive data in error messages
  • Use environment variables for all credentials
  • Rotate tokens periodically
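
Redaction before logging can be approximated as follows. The set of sensitive key names is an assumption, `redact` is an illustrative name, and the sketch does not guard against circular references, so it should only be given plain JSON-like data.

```typescript
// Assumption: keys with these names always hold credentials.
const SENSITIVE_KEYS = new Set(["x-api-key", "authorization", "token", "password"]);
const MASK = "[REDACTED]";

// Returns a copy of `value` with sensitive keys masked and any known
// secret string removed from free text such as error messages.
function redact(value: unknown, secrets: string[] = []): unknown {
  if (typeof value === "string") {
    return secrets.reduce(
      (text, secret) => (secret ? text.split(secret).join(MASK) : text),
      value,
    );
  }
  if (Array.isArray(value)) return value.map((item) => redact(item, secrets));
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value)) {
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase()) ? MASK : redact(inner, secrets);
    }
    return out;
  }
  return value;
}
```

Passing the token read from the environment as `secrets` catches cases where an upstream error message echoes the credential back.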

Rate Limiting

  • Client-side: Max 10 deployments per user per hour
  • Per-phase rate limiting to prevent API abuse
  • Queue requests if limit exceeded
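
The per-user limit could be enforced with a sliding-window counter like the one sketched here. `DeploymentRateLimiter` is a hypothetical class that keeps its state in memory and takes an injectable clock. It only answers whether a deployment may start now; queuing a rejected request is left to the caller.

```typescript
class DeploymentRateLimiter {
  private readonly history = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  constructor(limit = 10, windowMs = 60 * 60 * 1000, now: () => number = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the deployment if the user is under the
  // limit for the current window, otherwise returns false.
  tryAcquire(username: string): boolean {
    const current = this.now();
    const cutoff = current - this.windowMs;
    const recent = (this.history.get(username) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(current);
    this.history.set(username, recent);
    return allowed;
  }
}
```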

Production Checklist

  • All API calls use correct parameter names
  • Environment ID retrieved and used for application creation
  • Retry logic with exponential backoff implemented
  • Circuit breaker pattern implemented
  • Complete rollback on any failure
  • Idempotency checks for all operations
  • Structured logging with deployment tracking
  • Health checks with timeout
  • Input validation and sanitization
  • Integration tests with real API
  • Load testing (10 concurrent deployments)
  • Failure scenario testing (network, auth, validation)
  • Documentation and runbook complete
  • Monitoring and alerting configured