# Deployment Notes - AI Stack Deployer
## Automated Deployment Documentation

**Date**: 2026-01-09
**Operator**: Claude Code
**Target**: Dokploy (10.100.0.20:3000)
**Domain**: portal.ai.flexinit.nl (or TBD)


---

## Phase 1: Pre-Deployment Verification

### Step 1.1: Environment Variables Check
**Purpose**: Verify all required credentials are available

**Commands**:
```bash
# Check if .env file exists
test -f .env && echo "✓ .env exists" || echo "✗ .env missing"

# Verify required variables are set to non-empty values (without exposing values)
grep -Eq '^DOKPLOY_API_TOKEN=.+' .env && echo "✓ DOKPLOY_API_TOKEN set" || echo "✗ DOKPLOY_API_TOKEN missing"
grep -Eq '^DOKPLOY_URL=.+' .env && echo "✓ DOKPLOY_URL set" || echo "✗ DOKPLOY_URL missing"
```

**Automation Notes**:
- Script must check for `.env` file existence
- Validate required variables: `DOKPLOY_API_TOKEN`, `DOKPLOY_URL`
- Exit with error if missing critical variables

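The checks in Step 1.1 can be folded into one function that fails fast. This is a minimal sketch, assuming `.env` holds plain `KEY=value` lines (the same format `docker run --env-file` reads); the name `require_env_vars` is hypothetical and not part of the project.

```bash
# Hypothetical helper: succeed only if every named variable has a
# non-empty, uncommented KEY=value line in the given env file.
require_env_vars() {
  local env_file=$1
  shift
  local missing=0 var

  if [ ! -f "${env_file}" ]; then
    echo "✗ ${env_file} missing" >&2
    return 1
  fi

  for var in "$@"; do
    # "^" skips commented-out entries; ".+" rejects empty values.
    # Only the variable name is printed, never its value.
    if grep -Eq "^${var}=.+" "${env_file}"; then
      echo "✓ ${var} set"
    else
      echo "✗ ${var} missing or empty" >&2
      missing=1
    fi
  done

  return "${missing}"
}

# Example call (not executed here):
#   require_env_vars .env DOKPLOY_API_TOKEN DOKPLOY_URL || exit 1
```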

---

### Step 1.2: Dokploy API Connectivity Test
**Purpose**: Ensure we can reach Dokploy API before attempting deployment

**Commands**:
```bash
# Test API connectivity (masked token in logs)
curl -s -o /dev/null -w "%{http_code}" \
  -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
  "${DOKPLOY_URL}/api/project.all"
```

**Expected Result**: HTTP 200
**On Failure**: Check network access to 10.100.0.20:3000

**Automation Notes**:
- Test API before proceeding
- Log HTTP status code
- Abort if not 200

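To log the status code and abort on anything other than 200, the request can be wrapped as below. This is a sketch only: `check_api_status` and `probe_dokploy` are hypothetical names, and the 10-second timeout is an assumed value. curl reports `000` when it gets no HTTP response at all.

```bash
# Hypothetical helper: classify the HTTP status code from the connectivity
# test. Returns 0 only for 200 and prints a hint for common failures.
check_api_status() {
  local code=$1
  case "${code}" in
    200)
      echo "✓ Dokploy API reachable (HTTP ${code})"
      ;;
    401|403)
      echo "✗ HTTP ${code}: token rejected, check DOKPLOY_API_TOKEN" >&2
      return 1
      ;;
    000)
      echo "✗ No response: check network access to the Dokploy host" >&2
      return 1
      ;;
    *)
      echo "✗ Unexpected HTTP ${code}" >&2
      return 1
      ;;
  esac
}

# Same request as in Step 1.2, with a timeout so an unreachable host
# fails quickly instead of hanging the script.
probe_dokploy() {
  local code
  code=$(curl -s -o /dev/null -w "%{http_code}" --max-time 10 \
    -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
    "${DOKPLOY_URL}/api/project.all") || true
  check_api_status "${code}"
}

# Example call (not executed here):
#   probe_dokploy || exit 1
```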

---

### Step 1.3: Docker Environment Check
**Purpose**: Verify Docker is available for building

**Commands**:
```bash
# Check Docker installation
docker --version

# Check Docker daemon is running
docker ps > /dev/null 2>&1 && echo "✓ Docker running" || echo "✗ Docker not running"

# Check available disk space (need ~500MB)
df -h . | awk 'NR==2 {print "Available:", $4}'
```

**Automation Notes**:
- Verify Docker installed and running
- Check minimum 500MB free space
- Fail fast if Docker unavailable

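The `df -h` output above is meant for reading, not for comparing against a threshold. A script could enforce the 500MB minimum with a numeric check like this sketch; `has_free_space` is a hypothetical name.

```bash
# Hypothetical helper: succeed if the filesystem holding DIR has at least
# MIN_MB megabytes available. "df -Pk" forces POSIX single-line output in
# 1024-byte blocks, so column 4 is the available space as a plain integer.
has_free_space() {
  local dir=$1 min_mb=$2 avail_kb
  avail_kb=$(df -Pk "${dir}" | awk 'NR==2 {print $4}')
  [ $(( avail_kb / 1024 )) -ge "${min_mb}" ]
}

# Example call (not executed here):
#   has_free_space . 500 || { echo "✗ Less than 500MB free" >&2; exit 1; }
```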

---

## Phase 2: Docker Image Build

### Step 2.1: Build Docker Image
**Purpose**: Create production Docker image

**Commands**:
```bash
# Build with timestamp tag
TIMESTAMP=$(date +%Y%m%d-%H%M%S)
IMAGE_TAG="ai-stack-deployer:${TIMESTAMP}"
IMAGE_TAG_LATEST="ai-stack-deployer:latest"

docker build \
  -t "${IMAGE_TAG}" \
  -t "${IMAGE_TAG_LATEST}" \
  --progress=plain \
  .
```

**Expected Duration**: 2-3 minutes
**Expected Size**: ~150-200MB

**Automation Notes**:
- Use timestamp tags for traceability
- Always tag as `:latest` as well
- Stream build logs for debugging
- Check exit code (0 = success)


---

### Step 2.2: Verify Build Success
**Purpose**: Confirm image was created successfully

**Commands**:
```bash
# List the newly created image
docker images ai-stack-deployer:latest

# Get image ID and size
IMAGE_ID=$(docker images -q ai-stack-deployer:latest)
echo "Image ID: ${IMAGE_ID}"
echo "Image size: $(docker images ai-stack-deployer:latest --format '{{.Size}}')"

# Inspect image metadata
docker inspect "${IMAGE_ID}" --format='{{.Config.ExposedPorts}}'
docker inspect "${IMAGE_ID}" --format='{{.Config.Healthcheck.Test}}'
```

**Automation Notes**:
- Verify image exists with correct name
- Log image ID and size
- Confirm healthcheck is configured


---

## Phase 3: Local Container Testing

### Step 3.1: Start Test Container
**Purpose**: Verify container runs before deploying to production

**Commands**:
```bash
# Start container in detached mode
docker run -d \
  --name ai-stack-deployer-test \
  -p 3001:3000 \
  --env-file .env \
  ai-stack-deployer:latest

# Wait for container to be ready (max 30 seconds)
timeout 30 bash -c 'until docker exec ai-stack-deployer-test curl -f http://localhost:3000/health 2>/dev/null; do sleep 1; done'
```

**Expected Result**: Container starts and responds to health check

**Automation Notes**:
- Use non-conflicting port (3001) for testing
- Wait for health check before proceeding
- Timeout after 30 seconds if unhealthy


---

### Step 3.2: Health Check Verification
**Purpose**: Verify application is running correctly

**Commands**:
```bash
# Test health endpoint from host
curl -s http://localhost:3001/health | jq .

# Check container logs for errors
docker logs ai-stack-deployer-test 2>&1 | tail -20

# Verify no crashes
docker ps -f name=ai-stack-deployer-test --format "{{.Status}}"
```

**Expected Response**:
```json
{
  "status": "healthy",
  "timestamp": "...",
  "version": "0.1.0",
  "service": "ai-stack-deployer",
  "activeDeployments": 0
}
```

**Automation Notes**:
- Parse JSON response and verify status="healthy"
- Check for ERROR/FATAL in logs
- Confirm container is "Up" status

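The log check in the notes can be automated with a small filter. This sketch assumes the application writes the words `ERROR` or `FATAL` in upper case on the lines that matter; `logs_are_clean` is a hypothetical name.

```bash
# Hypothetical helper: read log text on stdin and succeed only if no line
# contains ERROR or FATAL as a whole word.
logs_are_clean() {
  ! grep -Eqw 'ERROR|FATAL'
}

# Example calls (not executed here):
#   docker logs ai-stack-deployer-test 2>&1 | logs_are_clean \
#     || { echo "✗ Errors found in container logs" >&2; exit 1; }
#   curl -s http://localhost:3001/health | jq -e '.status == "healthy"' > /dev/null \
#     || { echo "✗ Health status is not healthy" >&2; exit 1; }
```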

---

### Step 3.3: Cleanup Test Container
**Purpose**: Remove test container after verification

**Commands**:
```bash
# Stop and remove test container
docker stop ai-stack-deployer-test
docker rm ai-stack-deployer-test

echo "✓ Test container cleaned up"
```

**Automation Notes**:
- Always cleanup test resources
- Use `--force` flags if automation needs to be idempotent


---

## Phase 4: Image Registry Push (Optional)

### Step 4.1: Tag for Registry
**Purpose**: Prepare image for remote registry (if not using local Dokploy)

**Commands**:
```bash
# Example for custom registry
REGISTRY="git.app.flexinit.nl"
docker tag ai-stack-deployer:latest "${REGISTRY}/ai-stack-deployer:latest"
docker tag ai-stack-deployer:latest "${REGISTRY}/ai-stack-deployer:${TIMESTAMP}"
```

**Automation Notes**:
- Skip if Dokploy can access local Docker daemon
- Required if Dokploy is on separate server


---

### Step 4.2: Push to Registry
**Purpose**: Upload image to registry

**Commands**:
```bash
# Login to registry (if required)
echo "${REGISTRY_PASSWORD}" | docker login "${REGISTRY}" -u "${REGISTRY_USER}" --password-stdin

# Push images
docker push "${REGISTRY}/ai-stack-deployer:latest"
docker push "${REGISTRY}/ai-stack-deployer:${TIMESTAMP}"
```

**Automation Notes**:
- Store registry credentials securely
- Verify push succeeded (check exit code)
- Log image digest for traceability


---

## Phase 5: Dokploy Deployment

### Step 5.1: Check for Existing Project
**Purpose**: Determine if this is a new deployment or update

**Commands**:
```bash
# Search for existing project
curl -s \
  -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
  "${DOKPLOY_URL}/api/project.all" | \
  jq -r '.projects[] | select(.name=="ai-stack-deployer-portal") | .projectId'
```

**Automation Notes**:
- If project exists: update existing
- If not found: create new project
- Store project ID for subsequent API calls


---

### Step 5.2: Create Dokploy Project (if new)
**Purpose**: Create project container in Dokploy

**Commands**:
```bash
# Create project via API
PROJECT_RESPONSE=$(curl -s -X POST \
  -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
  -H "Content-Type: application/json" \
  "${DOKPLOY_URL}/api/project.create" \
  -d '{
    "name": "ai-stack-deployer-portal",
    "description": "Self-service portal for deploying AI stacks"
  }')

# Extract project ID
PROJECT_ID=$(echo "${PROJECT_RESPONSE}" | jq -r '.projectId')
echo "Created project: ${PROJECT_ID}"
```

**Automation Notes**:
- Parse response for projectId
- Handle error if project name conflicts
- Store PROJECT_ID for next steps


---

### Step 5.3: Create Application
**Purpose**: Create application within project

**Commands**:
```bash
# Create application
APP_RESPONSE=$(curl -s -X POST \
  -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
  -H "Content-Type: application/json" \
  "${DOKPLOY_URL}/api/application.create" \
  -d "{
    \"name\": \"ai-stack-deployer-web\",
    \"projectId\": \"${PROJECT_ID}\",
    \"dockerImage\": \"ai-stack-deployer:latest\",
    \"env\": \"DOKPLOY_URL=${DOKPLOY_URL}\\nDOKPLOY_API_TOKEN=${DOKPLOY_API_TOKEN}\\nPORT=3000\\nHOST=0.0.0.0\"
  }")

# Extract application ID
APP_ID=$(echo "${APP_RESPONSE}" | jq -r '.applicationId')
echo "Created application: ${APP_ID}"
```

**Automation Notes**:
- Set all required environment variables
- Use escaped newlines for env variables
- Store APP_ID for domain and deployment


---

### Step 5.4: Configure Domain
**Purpose**: Set up domain routing through Traefik

**Commands**:
```bash
# Determine domain name (use portal.ai.flexinit.nl or ask user)
DOMAIN="portal.ai.flexinit.nl"

# Create domain mapping
curl -s -X POST \
  -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
  -H "Content-Type: application/json" \
  "${DOKPLOY_URL}/api/domain.create" \
  -d "{
    \"domain\": \"${DOMAIN}\",
    \"applicationId\": \"${APP_ID}\",
    \"https\": true,
    \"port\": 3000
  }"

echo "Configured domain: https://${DOMAIN}"
```

**Automation Notes**:
- Domain must match wildcard DNS pattern
- Enable HTTPS (Traefik handles SSL)
- Port 3000 matches container expose

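These notes do not state the wildcard pattern itself. Assuming it is `*.ai.flexinit.nl` (inferred from the example domain, not confirmed), a pre-flight check could look like the sketch below. `matches_wildcard` is a hypothetical name. The check is deliberately strict: it accepts exactly one lower-case label in front of the suffix, which is the rule wildcard TLS certificates follow.

```bash
# Hypothetical helper: succeed if DOMAIN is exactly one label in front of
# SUFFIX, e.g. "portal.ai.flexinit.nl" for the suffix "ai.flexinit.nl".
matches_wildcard() {
  local domain=$1 suffix=$2
  # Strip ".SUFFIX" from the end; if nothing was stripped, the domain
  # does not end with the suffix at all.
  local label=${domain%".${suffix}"}
  [ "${label}" != "${domain}" ] || return 1
  # What remains must be a single DNS label: letters, digits and inner
  # hyphens, with no dots.
  [[ "${label}" =~ ^[a-z0-9]([a-z0-9-]*[a-z0-9])?$ ]]
}

# Example call (not executed here):
#   matches_wildcard "${DOMAIN}" "ai.flexinit.nl" || exit 1
```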

---

### Step 5.5: Deploy Application
**Purpose**: Trigger deployment on Dokploy

**Commands**:
```bash
# Trigger deployment
DEPLOY_RESPONSE=$(curl -s -X POST \
  -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
  -H "Content-Type: application/json" \
  "${DOKPLOY_URL}/api/application.deploy" \
  -d "{
    \"applicationId\": \"${APP_ID}\"
  }")

# Extract deployment ID
DEPLOY_ID=$(echo "${DEPLOY_RESPONSE}" | jq -r '.deploymentId // "unknown"')
echo "Deployment started: ${DEPLOY_ID}"
echo "Monitor at: ${DOKPLOY_URL}/project/${PROJECT_ID}"
```

**Automation Notes**:
- Deployment is asynchronous
- Need to poll for completion
- Typical deployment: 1-3 minutes


---

## Phase 6: Deployment Verification

### Step 6.1: Wait for Deployment
**Purpose**: Monitor deployment until complete

**Commands**:
```bash
# Poll deployment status (example - adjust based on Dokploy API)
MAX_WAIT=300  # 5 minutes
ELAPSED=0
INTERVAL=10

while [ $ELAPSED -lt $MAX_WAIT ]; do
  # Check if application is running
  STATUS=$(curl -s \
    -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
    "${DOKPLOY_URL}/api/application.status?id=${APP_ID}" | \
    jq -r '.status // "unknown"')

  echo "Status: ${STATUS} (${ELAPSED}s elapsed)"

  if [ "${STATUS}" = "running" ]; then
    echo "✓ Deployment completed successfully"
    break
  fi

  sleep ${INTERVAL}
  ELAPSED=$((ELAPSED + INTERVAL))
done

if [ $ELAPSED -ge $MAX_WAIT ]; then
  echo "✗ Deployment timeout after ${MAX_WAIT}s"
  exit 1
fi
```

**Automation Notes**:
- Poll with exponential backoff
- Timeout after reasonable duration
- Log status changes

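The loop in Step 6.1 polls at a fixed 10-second interval, while the notes call for exponential backoff. The sketch below shows one way to do that: it retries any command and doubles the delay after each failure, up to a cap. `retry_with_backoff` is a hypothetical name, and the attempt count and delays in the example call are assumed values.

```bash
# Hypothetical helper: run a command until it succeeds, doubling the delay
# between attempts up to MAX_DELAY seconds.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY MAX_DELAY command [args...]
retry_with_backoff() {
  local max_attempts=$1 delay=$2 max_delay=$3
  shift 3
  local attempt=1

  while true; do
    if "$@"; then
      return 0
    fi
    if [ "${attempt}" -ge "${max_attempts}" ]; then
      echo "✗ Gave up after ${attempt} attempts" >&2
      return 1
    fi
    echo "Attempt ${attempt} failed, retrying in ${delay}s" >&2
    sleep "${delay}"
    # Double the delay for the next round, without exceeding the cap.
    delay=$(( delay * 2 ))
    if [ "${delay}" -gt "${max_delay}" ]; then
      delay=${max_delay}
    fi
    attempt=$(( attempt + 1 ))
  done
}

# Example call (not executed here); is_running would wrap the status
# request from Step 6.1 and succeed once the status is "running":
#   retry_with_backoff 8 2 60 is_running || exit 1
```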

---

### Step 6.2: Health Check via Domain
**Purpose**: Verify application is accessible via public URL

**Commands**:
```bash
# Test public endpoint
echo "Testing: https://${DOMAIN}/health"

# Allow time for DNS/SSL propagation
sleep 10

# Verify health endpoint
HEALTH_RESPONSE=$(curl -s "https://${DOMAIN}/health")
HEALTH_STATUS=$(echo "${HEALTH_RESPONSE}" | jq -r '.status // "error"')

if [ "${HEALTH_STATUS}" = "healthy" ]; then
  echo "✓ Application is healthy"
  echo "${HEALTH_RESPONSE}" | jq .
else
  echo "✗ Application health check failed"
  echo "${HEALTH_RESPONSE}"
  exit 1
fi
```

**Expected Response**:
```json
{
  "status": "healthy",
  "timestamp": "2026-01-09T...",
  "version": "0.1.0",
  "service": "ai-stack-deployer",
  "activeDeployments": 0
}
```

**Automation Notes**:
- Test via HTTPS (validate SSL works)
- Retry on first failure (DNS propagation)
- Verify JSON structure and status field


---

### Step 6.3: Frontend Accessibility Test
**Purpose**: Confirm frontend loads correctly

**Commands**:
```bash
# Test root endpoint returns HTML
curl -s "https://${DOMAIN}/" | head -20

# Check for expected HTML content
if curl -s "https://${DOMAIN}/" | grep -q "AI Stack Deployer"; then
  echo "✓ Frontend is accessible"
else
  echo "✗ Frontend not loading correctly"
  exit 1
fi
```

**Automation Notes**:
- Verify HTML contains expected title
- Check for 200 status code
- Test at least one static asset (CSS/JS)


---

### Step 6.4: API Endpoint Test
**Purpose**: Verify API endpoints respond correctly

**Commands**:
```bash
# Test name availability check
TEST_RESPONSE=$(curl -s "https://${DOMAIN}/api/check/test-deployment-123")
echo "API Test Response:"
echo "${TEST_RESPONSE}" | jq .

# Verify response structure: the "valid" key must be present.
# has("valid") is used because `jq -e '.valid'` exits non-zero when the
# value is false, which would misreport a well-formed response.
if echo "${TEST_RESPONSE}" | jq -e 'has("valid")' > /dev/null; then
  echo "✓ API endpoints functional"
else
  echo "✗ API response malformed"
  exit 1
fi
```

**Automation Notes**:
- Test each critical endpoint
- Verify JSON responses parse correctly
- Log any API errors for debugging


---

## Phase 7: Post-Deployment

### Step 7.1: Document Deployment Details
**Purpose**: Record deployment information for reference

**Commands**:
```bash
# Create deployment record
cat > deployment-record-${TIMESTAMP}.txt << EOF
Deployment Completed: $(date -Iseconds)
Project ID: ${PROJECT_ID}
Application ID: ${APP_ID}
Deployment ID: ${DEPLOY_ID}
Image: ai-stack-deployer:${TIMESTAMP}
Domain: https://${DOMAIN}
Health Check: https://${DOMAIN}/health
Dokploy Console: ${DOKPLOY_URL}/project/${PROJECT_ID}

Status: SUCCESS
EOF

echo "Deployment record saved: deployment-record-${TIMESTAMP}.txt"
```

**Automation Notes**:
- Save deployment metadata
- Include rollback information
- Log all IDs for future operations


---

### Step 7.2: Cleanup Build Artifacts
**Purpose**: Remove temporary files and images

**Commands**:
```bash
# Keep latest, remove older images
docker images ai-stack-deployer --format "{{.Tag}}" | \
  grep -v latest | \
  xargs -r -I {} docker rmi ai-stack-deployer:{} 2>/dev/null || true

# Clean up build cache if needed
# docker builder prune -f

echo "✓ Cleanup completed"
```

**Automation Notes**:
- Keep `:latest` tag
- Optional: clean build cache
- Don't fail script if no images to remove

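The pipeline in Step 7.2 removes every tag except `latest`, including the timestamp tag just built. If a few recent timestamp tags should stay available locally for rollback, the tag list can be filtered first. This is a sketch under that assumption; `tags_to_prune` is a hypothetical name and the retention count is up to the operator.

```bash
# Hypothetical helper: read image tags on stdin and print the ones to
# delete, skipping "latest" and the KEEP newest timestamp tags. Tags in
# the YYYYMMDD-HHMMSS format sort chronologically as plain strings.
tags_to_prune() {
  local keep=$1
  # "|| true" keeps the pipeline from failing under "set -o pipefail"
  # when no timestamp tags exist.
  { grep -E '^[0-9]{8}-[0-9]{6}$' || true; } \
    | sort -r \
    | tail -n "+$(( keep + 1 ))"
}

# Example call (not executed here), keeping the three newest builds:
#   docker images ai-stack-deployer --format '{{.Tag}}' \
#     | tags_to_prune 3 \
#     | xargs -r -I {} docker rmi "ai-stack-deployer:{}"
```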

---

## Automation Script Skeleton

```bash
#!/usr/bin/env bash
set -euo pipefail

# Configuration
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
PROJECT_ROOT="${SCRIPT_DIR}/.."
TIMESTAMP=$(date +%Y%m%d-%H%M%S)

# Load environment
source "${PROJECT_ROOT}/.env"
# DOMAIN is read in main(); fall back to the Step 5.4 value so "set -u" does not abort
DOMAIN="${DOMAIN:-portal.ai.flexinit.nl}"

# Functions (":" is a no-op placeholder; fill each body in from the phases above)
log_info() { echo "[INFO] $*"; }
log_error() { echo "[ERROR] $*" >&2; }
check_prerequisites() { :; }  # Phase 1
build_image() { :; }          # Phase 2
test_locally() { :; }         # Phase 3
deploy_to_dokploy() { :; }    # Phase 5
verify_deployment() { :; }    # Phase 6

# Main execution
main() {
  log_info "Starting deployment at ${TIMESTAMP}"

  check_prerequisites
  build_image
  test_locally
  deploy_to_dokploy
  verify_deployment

  log_info "Deployment completed successfully!"
  log_info "Access: https://${DOMAIN}"
}

main "$@"
```


---

## Rollback Procedure

If deployment fails:

```bash
# Get previous deployment
PREV_DEPLOY=$(curl -s \
  -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
  "${DOKPLOY_URL}/api/deployment.list?applicationId=${APP_ID}" | \
  jq -r '.deployments[1].deploymentId')

# Rollback (JSON body, so send the same Content-Type header as the other POST calls)
curl -X POST \
  -H "x-api-key: ${DOKPLOY_API_TOKEN}" \
  -H "Content-Type: application/json" \
  "${DOKPLOY_URL}/api/deployment.rollback" \
  -d "{\"deploymentId\": \"${PREV_DEPLOY}\"}"
```


---

## Notes for Future Automation

1. **Error Handling**: Add `|| exit 1` to critical steps
2. **Logging**: Redirect all output to log file: `2>&1 | tee deployment.log`
3. **Notifications**: Add Slack/email notifications on success/failure
4. **Parallel Testing**: Run multiple verification tests concurrently
5. **Metrics**: Collect deployment duration, image size, startup time
6. **CI/CD Integration**: Trigger on git push with GitHub Actions/GitLab CI


---

**End of Deployment Notes**

---

## Graphiti Memory Search Results

### Dokploy Infrastructure Details:
- **Location**: 10.100.0.20:3000 (shares VM with Grafana/Loki)
- **UI**: https://deploy.intra.flexinit.nl (requires login)
- **Config Location**: /etc/dokploy/compose/
- **API Token Format**: `app_deployment{random}`
- **Token Generation**: Via Dokploy UI → Settings → Profile → API Tokens
- **Token Storage**: BWS secret `6b3618fc-ba02-49bc-bdc8-b3c9004087bc`

### Previous Known Issues:
- 401 Unauthorized errors occurred (token might need regeneration)
- Credentials stored in Bitwarden at pass.cloud.flexinit.nl

### Registry Information:
- Docker image referenced: `git.app.flexinit.nl/oussamadouhou/oh-my-opencode-free:latest`
- This suggests git.app.flexinit.nl may have a Docker registry