Shipyard
A self-hosted internal deployment platform — think Vercel, running on your own Kubernetes cluster. Supports Next.js, Node.js, Spring Boot, PHP, and static sites with preview environments, one-click rollbacks, and a unified developer CLI. Projects can be organized into groups (e.g., "frontend", "backend") and tagged with labels for filtering and cost attribution.
| Component | Technology |
|---|---|
| Control plane API | Fastify 4, TypeScript, Node 22 |
| Dashboard | React 19, TanStack Query + Router, Vite |
| CLI | ship — Node.js, distributed via npm |
| Database | PostgreSQL 16, Drizzle ORM |
| Queue | BullMQ v5, Redis 7 |
| Container registry | Google Artifact Registry (prod) / local registry (dev) |
| Static storage | MinIO (S3-compatible) |
| Routing | Traefik v3 with IngressRoute CRDs |
| Runtime | Kubernetes — one namespace per project |
Architecture
Every deployment flows through three stages: build → deploy → live. Build and deploy each run in their own BullMQ worker process; live is the resulting state. A third worker, cleanup, runs cron jobs for maintenance.
Queue Details
| Queue | Concurrency | Retries | Purpose |
|---|---|---|---|
| build-queue | 3 | 0 | Clone, detect buildpack, install deps, build, push image |
| deploy-queue | 2 | 2 | Decrypt env vars, apply k8s resources, wait for rollout |
| cleanup-queue | 1 | 0 | Cron jobs: preview TTL, image GC, stale deployment cleanup |
Cleanup Routines
| Routine | Schedule | Description |
|---|---|---|
| preview-ttl | Hourly | Cancels preview deployments older than 7 days (PREVIEW_MAX_AGE_DAYS) and tears down k8s resources |
| image-gc | Every 6 hours | Retains only 10 builds (IMAGES_PER_BRANCH) per project/branch, deletes older build records |
| stale-deploy-check | Hourly | Force-fails deployments stuck: queued >30min, building >70min, deploying >15min |
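
The stale-deploy-check thresholds amount to a per-status lookup. A minimal sketch, assuming a hypothetical `DeploymentRow` shape and `isStale` helper; neither name is taken from the Shipyard codebase:

```typescript
// Hypothetical sketch: names and shapes are illustrative, not Shipyard's actual code.
type InFlightStatus = "queued" | "building" | "deploying";

interface DeploymentRow {
  id: string;
  status: InFlightStatus;
  statusChangedAt: Date; // when the deployment entered its current status
}

// Thresholds from the Cleanup Routines table, in minutes.
const STALE_AFTER_MINUTES: Record<InFlightStatus, number> = {
  queued: 30,
  building: 70,
  deploying: 15,
};

// Returns true when a deployment has sat in its current status longer than allowed.
function isStale(row: DeploymentRow, now: Date): boolean {
  const elapsedMs = now.getTime() - row.statusChangedAt.getTime();
  return elapsedMs > STALE_AFTER_MINUTES[row.status] * 60 * 1000;
}
```

A cron routine could filter in-flight rows with `isStale` and force-fail the matches.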
Namespaces
| Namespace | Contents |
|---|---|
| shipyard-system | API, dashboard, all three queue workers |
| shipyard-infra | Postgres, Redis, MinIO, Traefik |
| proj-<slug> | Production deployment for each project |
| proj-<slug>-preview | PR preview deployments, isolated by NetworkPolicy |
URL Scheme
| Pattern | Purpose |
|---|---|
| <project>.apps.shipyard.wake.co.ke | Production alias (updated on each prod deploy) |
| <project>-<sha7>.deploy.shipyard.wake.co.ke | Permanent per-deployment link |
| <project>-pr-<n>.deploy.shipyard.wake.co.ke | PR preview environments |
| <custom-domain> | After DNS TXT verification + cert issuance |
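
The three platform-managed patterns can be derived from a project slug, a commit SHA, and a PR number. The helper names below are hypothetical; only the domain suffixes come from the table:

```typescript
// Hypothetical helpers; only the domain suffixes are taken from the URL scheme table.
const APPS_DOMAIN = "apps.shipyard.wake.co.ke";
const DEPLOY_DOMAIN = "deploy.shipyard.wake.co.ke";

// Production alias, repointed on each production deploy.
function productionHost(project: string): string {
  return `${project}.${APPS_DOMAIN}`;
}

// Permanent per-deployment link, keyed by the first 7 characters of the commit SHA.
function deploymentHost(project: string, commitSha: string): string {
  return `${project}-${commitSha.slice(0, 7)}.${DEPLOY_DOMAIN}`;
}

// PR preview environment.
function previewHost(project: string, prNumber: number): string {
  return `${project}-pr-${prNumber}.${DEPLOY_DOMAIN}`;
}
```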
Secrets model
Two-tier encryption: project env vars are AES-256-GCM encrypted with a per-project DEK, itself encrypted with a platform-wide KEK stored only in the cluster Secret. The raw KEK never appears in logs, API responses, or application code paths beyond the single encrypt/decrypt call site. Git source OAuth tokens are encrypted at rest with the same KEK.
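
A minimal sketch of the two-tier scheme using Node's built-in crypto module. The record layout and the function names (`seal`, `unseal`, `createWrappedDek`, and so on) are assumptions for illustration and do not describe the actual contents of crypto.ts:

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical record layout for one AES-256-GCM ciphertext.
interface Sealed {
  iv: Buffer;         // 96-bit nonce, fresh for every encryption
  tag: Buffer;        // GCM authentication tag
  ciphertext: Buffer;
}

function seal(key: Buffer, plaintext: Buffer): Sealed {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext), cipher.final()]);
  return { iv, tag: cipher.getAuthTag(), ciphertext };
}

function unseal(key: Buffer, sealed: Sealed): Buffer {
  const decipher = createDecipheriv("aes-256-gcm", key, sealed.iv);
  decipher.setAuthTag(sealed.tag);
  // final() throws if the ciphertext or tag was tampered with.
  return Buffer.concat([decipher.update(sealed.ciphertext), decipher.final()]);
}

// Tier 1: a random per-project DEK, stored only in wrapped (KEK-encrypted) form.
function createWrappedDek(kek: Buffer): Sealed {
  return seal(kek, randomBytes(32));
}

// Tier 2: env var values are encrypted with the unwrapped DEK.
function encryptEnvVar(kek: Buffer, wrappedDek: Sealed, value: string): Sealed {
  return seal(unseal(kek, wrappedDek), Buffer.from(value, "utf8"));
}

function decryptEnvVar(kek: Buffer, wrappedDek: Sealed, sealedValue: Sealed): string {
  return unseal(unseal(kek, wrappedDek), sealedValue).toString("utf8");
}
```

With this layout, rotating the KEK only requires re-wrapping each project's DEK; the env var ciphertexts stay untouched.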
Authentication
JWT access tokens (15-minute expiry) + opaque refresh tokens (30-day expiry, rotated on each use). API tokens for CI/CD are prefixed ship_ and stored as SHA-256 hashes. Both clients (dashboard and CLI) auto-refresh silently on 401 before surfacing an error.
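
Storing only the SHA-256 hash means a database leak does not expose usable API tokens. A sketch of issuing and verifying a ship_-prefixed token; the token length and the function names are assumptions, not the actual implementation:

```typescript
import { createHash, randomBytes, timingSafeEqual } from "node:crypto";

// Hex-encoded SHA-256 of the full token string.
function hashToken(token: string): string {
  return createHash("sha256").update(token, "utf8").digest("hex");
}

// Returns the plaintext token (shown to the user once) and the hash to persist.
// Assumption: 32 random bytes, hex-encoded, after the ship_ prefix.
function issueApiToken(): { token: string; hash: string } {
  const token = `ship_${randomBytes(32).toString("hex")}`;
  return { token, hash: hashToken(token) };
}

// Hashes the presented token and compares it to the stored hash in constant time.
function verifyApiToken(presented: string, storedHash: string): boolean {
  if (!presented.startsWith("ship_")) return false;
  const a = Buffer.from(hashToken(presented), "hex");
  const b = Buffer.from(storedHash, "hex");
  return a.length === b.length && timingSafeEqual(a, b);
}
```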
Local Development
Requirements
- Node 22 LTS (.nvmrc pins the version)
- pnpm ≥ 9.x
- Docker Desktop
Start the full stack
# 1. Install all dependencies
pnpm install
# 2. Start Postgres, Redis, MinIO, and local registry
docker-compose -f docker-compose.dev.yml up -d
# 3. Run database migrations
pnpm --filter=@shipyard/db migrate
# 4. Seed local data
pnpm tsx scripts/seed.ts
# 5. Start the API (port 3001)
pnpm --filter=@shipyard/api dev
# 6. Start the dashboard (port 5173)
pnpm --filter=@shipyard/dashboard dev
Common commands
# Build all packages
pnpm turbo build
# Type-check everything
pnpm turbo typecheck
# Lint everything
pnpm turbo lint
# Run all tests
pnpm turbo test
# Run tests for one package
pnpm turbo test --filter=./packages/buildpack
# Generate a new migration
pnpm --filter=@shipyard/db migrate:generate
Environment flags
| Variable | Default | Effect |
|---|---|---|
| K8S_ENABLED | false | Apply real Kubernetes resources on deploy |
| CONTAINER_BUILD_ENABLED | false | Run real docker build + push to registry |
| STATIC_STORAGE_ENABLED | false | Upload static assets to MinIO |
| CACHE_ENABLED | false | Restore/save Docker layer cache and npm cache |
All four default to false so the workers run in sim mode locally — deployments complete without needing a real cluster.
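
One way to read these flags is sketched below. It assumes that only the literal string "true" enables a flag; the actual parsing rules are not documented here, and `readFlags` and `isSimMode` are hypothetical names:

```typescript
const FLAG_NAMES = [
  "K8S_ENABLED",
  "CONTAINER_BUILD_ENABLED",
  "STATIC_STORAGE_ENABLED",
  "CACHE_ENABLED",
] as const;

type FlagName = (typeof FLAG_NAMES)[number];

// Assumption: a flag is on only when its value is exactly "true"; unset means false.
function readFlags(env: Record<string, string | undefined>): Record<FlagName, boolean> {
  const flags = {} as Record<FlagName, boolean>;
  for (const name of FLAG_NAMES) {
    flags[name] = env[name] === "true";
  }
  return flags;
}

// Sim mode: no flag is enabled, so no real cluster, registry, or bucket is touched.
function isSimMode(flags: Record<FlagName, boolean>): boolean {
  return FLAG_NAMES.every((name) => !flags[name]);
}
```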
Completed Sprints
Each sprint is a focused feature slice shipped as a single commit.
Sprint 1 — Auth, RBAC, and Project CRUD
Foundation: user registration and login with bcrypt + JWT, team creation, role-based access control (owner / admin / developer / viewer), and full project CRUD. All routes protected from day one.
Key files: apps/api/src/routes/auth.ts, apps/api/src/lib/rbac.ts
Sprint 2 — Deployment CRUD, Webhooks, BullMQ Wiring
Deployment lifecycle (queued → building → deploying → live, with failed and cancelled as alternative end states), BullMQ queue setup for build and deploy queues, and a stub webhook handler for GitHub and GitLab.
Sprint 3 — Build Runner Sim, Deploy Worker, SSE Log Streaming
Build worker that clones repos and streams output line-by-line. Deploy worker that applies k8s resources. Real-time log streaming via Server-Sent Events over Redis pub/sub. CLI --follow flag with long-poll fallback.
Sprint 4 — Encrypted Env Vars and API Tokens
AES-256-GCM env var encryption with per-project DEK + platform KEK. API tokens with ship_ prefix stored as SHA-256 hashes. Reveal endpoint (admin-only, audit-logged). CI/CD token authentication flow.
Key files: apps/api/src/lib/crypto.ts, apps/api/src/routes/env-vars.ts
Sprint 5 — React Dashboard
React 19 SPA with TanStack Query and TanStack Router. Project list, deployment table with live polling, real-time build log viewer, env var management with masked values, domain management panel.
Sprint 6 — ship CLI
Full ship CLI: login, deploy --follow, logs --follow, status, ps, rollback, env list/set/unset. Auto-detects project from git remote. Streams build logs over SSE.
Key files: apps/cli/src/commands/
Sprint 7 — PR Preview Environments
Each open PR gets its own isolated deployment at <project>-pr-<n>.deploy.*. Separate k8s namespace (proj-<slug>-preview), scoped env vars, and proper status tracking. Preview scope separated from production scope in all deployment queries.
Sprint 8 — Cleanup Queue Worker
BullMQ cron worker with three routines: preview-ttl (cancel previews older than 7 days, hourly), image-gc (prune old build records beyond last 10 per branch, every 6 hours), log-archive (flag logs older than 90 days, nightly).
Sprint 9 — Custom Domain Management
Add custom domains to projects with DNS TXT verification. Traefik IngressRoute is updated to serve the domain once verified. Cert issuance via Let's Encrypt ACME handled by Traefik automatically.
Sprint 10 — Real Kubernetes Deploy Worker
Full k8s resource lifecycle using @kubernetes/client-node: namespace, ConfigMap, Secret (with base64-encoded env vars), ClusterIP Service, Deployment with readiness probe, and Traefik IngressRoute CRD. 3-minute rollout timeout with automatic rollback to previous image on failure.
Key files: apps/api/src/lib/k8s.ts, apps/api/src/workers/deploy.ts
Sprint 11 — Team Member Management
Invite members by email, change roles, remove members. Guards against demoting the last owner. Full UI in Team Settings page.
Sprint 13 — Git Sources and API Token UI
Git source management (GitHub / GitLab installation IDs + webhook secrets). API token creation and revocation UI for CI/CD. Webhook secret shown exactly once at creation.
Sprint 14 — Observability
Prometheus metrics: deployments_total, builds_total, build_duration_seconds. Paginated build log browsing endpoint (GET /deployments/:id/logs?after=<seq>). Metrics endpoint at /metrics.
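
The `after=<seq>` parameter suggests cursor-style pagination over a monotonically increasing sequence number. A sketch of that idea over an in-memory array; the `LogLine` shape, the `limit` parameter, and the `nextAfter` field are assumptions rather than the endpoint's documented contract:

```typescript
interface LogLine {
  seq: number;  // monotonically increasing per deployment
  text: string;
}

interface LogPage {
  lines: LogLine[];
  nextAfter: number | null; // cursor for the next request, or null when exhausted
}

// Assumes `all` is sorted by seq ascending.
function pageLogs(all: readonly LogLine[], after: number, limit: number): LogPage {
  const lines = all.filter((l) => l.seq > after).slice(0, limit);
  const last = lines[lines.length - 1];
  const hasMore = last !== undefined && all.some((l) => l.seq > last.seq);
  return { lines, nextAfter: hasMore ? last.seq : null };
}
```

A client would start with `after=0` and keep requesting with the returned cursor until `nextAfter` is null.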
Sprints 15–16 — shipyard.json Config and Production Helm Chart
A shipyard.json (or .shiprc) at the repo root overrides buildpack detection entirely. Full production Helm chart with ClusterRole/ClusterRoleBinding, ConfigMap, Secret, all four Deployments, Service, ServiceAccount, and Traefik IngressRoutes.
Key files: infra/helm/
Sprint 17 — E2E Test Fixtures and Runner
End-to-end test runner (scripts/e2e/run.ts) that logs in, creates temp git repos from fixtures, triggers deployments, and polls for live status. Fixture projects for: static HTML, Node.js app, Next.js, Spring Boot (Maven), and PHP (legacy).
Key files: scripts/e2e/
Sprint 19 — Static Site MinIO Upload and Serving
Static and static-node (Vite/Next.js export) builds upload dist/ to MinIO bucket shipyard-static-<project-id>. Deploy worker spins up an nginx:alpine pod that reverse-proxies to MinIO with SPA fallback. _current pointer updated atomically on each deploy.
Key files: apps/api/src/lib/storage.ts
Dockerfile Templates + Real Container Build Pipeline
Multi-stage Dockerfile templates for every buildpack type: Next.js (3-stage), Node, Spring Boot Maven and Gradle, PHP-FPM + NGINX via BuildKit heredocs. Build worker runs real docker build + docker push to Harbor/Artifact Registry, authenticating with docker login --password-stdin.
Key files: packages/buildpack/src/dockerfile.ts
Build Caching
Docker layer cache via BuildKit --cache-from / --cache-to type=registry referencing a :cache tag per project. npm dependency cache stored as a tarball in MinIO bucket shipyard-buildcache, keyed by project + buildpack type. Restore before install, save after — transparent to build scripts.
Key files: apps/api/src/lib/cache.ts
Sprint 20 — One-Click Rollback
Restore any past deployment as the live version in under 10 seconds — it's a pointer update, not a new build. API validates the target has a build artifact, updates the MinIO _current pointer (static) or patches the k8s Deployment image (containers), then swaps statuses atomically. Dashboard shows inline "Restore sha? Yes / No" confirmation.
Key files: apps/api/src/routes/deployments.ts, apps/dashboard/src/pages/ProjectPage.tsx
Sprint 21 — Audit Log
audit_events table persists security events to Postgres. Six instrumented actions: env var reveal, member role change, member removal, API token creation, API token revocation, domain verification. GET /teams/:teamId/audit-log endpoint (admin-only). Dashboard "Audit log" tab in Team Settings.
Key files: apps/api/src/lib/audit.ts, packages/db/migrations/0004_audit_events.sql
Sprint 22 — Webhook Auto-Deploy + Preview Teardown
GitHub and GitLab webhook handlers fully wired: push events trigger production or branch deploys; PR/MR open/reopen/synchronize creates preview environments; PR/MR close/merge tears them down. On PR close: cancels DB records, drains the deploy queue, and deletes all k8s resources (Deployment, Service, ConfigMaps, Secret, IngressRoute). On synchronize: cancels any queued build for the same PR before starting a new one.
Key files: apps/api/src/routes/webhooks.ts, apps/api/src/lib/k8s.ts
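
The routing rules above can be sketched as a pure function from event to action. The event and action types are hypothetical, the PR action names follow GitHub's webhook payloads, and treating a push to the default branch as the production deploy is an assumption:

```typescript
type WebhookAction =
  | { kind: "deploy"; target: "production" | "branch"; ref: string }
  | { kind: "preview"; prNumber: number; cancelQueuedBuild: boolean }
  | { kind: "teardown"; prNumber: number }
  | { kind: "ignore" };

interface PushEvent {
  type: "push";
  branch: string;
}

interface PullRequestEvent {
  type: "pull_request";
  action: string; // GitHub-style: opened, reopened, synchronize, closed
  number: number;
}

function routeWebhook(event: PushEvent | PullRequestEvent, defaultBranch: string): WebhookAction {
  if (event.type === "push") {
    // Assumption: pushes to the default branch are production deploys.
    const target = event.branch === defaultBranch ? "production" : "branch";
    return { kind: "deploy", target, ref: event.branch };
  }
  switch (event.action) {
    case "opened":
    case "reopened":
      return { kind: "preview", prNumber: event.number, cancelQueuedBuild: false };
    case "synchronize":
      // New commits on the PR: drop any queued build for it before starting another.
      return { kind: "preview", prNumber: event.number, cancelQueuedBuild: true };
    case "closed":
      // Covers both merged and unmerged closes.
      return { kind: "teardown", prNumber: event.number };
    default:
      return { kind: "ignore" };
  }
}
```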
Production Hardening — Token Refresh + Credential Encryption
Dashboard and CLI both auto-refresh JWT access tokens on 401 before surfacing errors. Dashboard deduplicates concurrent refresh attempts via a singleton promise. CLI persists the rotated token pair back to ~/.shipyard/config.json. Git source OAuth/PAT tokens now encrypted at rest with the platform KEK using AES-256-GCM.
Key files: apps/dashboard/src/api/client.ts, apps/cli/src/client.ts, apps/api/src/lib/crypto.ts
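
The singleton-promise deduplication can be sketched as follows. `createRefresher` and `TokenPair` are illustrative names, and the injected `doRefresh` stands in for whatever call actually exchanges the refresh token:

```typescript
type TokenPair = { accessToken: string; refreshToken: string };

// Wraps a refresh call so that concurrent callers share one in-flight request.
function createRefresher(doRefresh: () => Promise<TokenPair>): () => Promise<TokenPair> {
  let inFlight: Promise<TokenPair> | null = null;
  return () => {
    if (inFlight === null) {
      // Clear the slot once settled so a later 401 triggers a fresh refresh.
      inFlight = doRefresh().finally(() => {
        inFlight = null;
      });
    }
    return inFlight;
  };
}
```

This matters because refresh tokens are rotated on each use: if two requests hit 401 together and each spent the same refresh token, the second exchange would fail.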
Bug fix — Deploy Always Used main Branch
Manual "Deploy now" and the triggerDeploymentSchema had ref defaulting to 'main'. Because the schema filled in the default before the API's body.ref ?? project.defaultBranch fallback, the project's configured branch was silently ignored. Fixed by making ref optional in the schema so the nullish coalescing correctly falls through to project.defaultBranch.
Key files: packages/shared/src/schemas.ts
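
The bug reduces to a default being applied one layer too early. A self-contained reproduction, using plain functions in place of the real schema (the two parse functions are stand-ins, not triggerDeploymentSchema itself):

```typescript
interface Project {
  defaultBranch: string;
}

interface TriggerBody {
  ref?: string | undefined;
}

// Before the fix: the schema layer filled in "main" when ref was absent.
function parseWithSchemaDefault(input: TriggerBody): TriggerBody {
  return { ref: input.ref ?? "main" };
}

// After the fix: ref stays optional and passes through untouched.
function parseOptional(input: TriggerBody): TriggerBody {
  return { ref: input.ref };
}

// The API-side fallback from the description: body.ref ?? project.defaultBranch.
function resolveRef(body: TriggerBody, project: Project): string {
  return body.ref ?? project.defaultBranch;
}
```

With a project whose default branch is develop, `resolveRef(parseWithSchemaDefault({}), project)` yields "main", while `resolveRef(parseOptional({}), project)` yields "develop".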
Bitbucket Support + Private Repo Cloning (v1.1.0)
Added Bitbucket Cloud as a third git provider alongside GitHub and GitLab. POST /api/v1/webhooks/bitbucket handles repo:push, pullrequest:created/updated/fulfilled/rejected events with HMAC-SHA256 (X-Hub-Signature) verification. Pull-request teardown and stale-build cancellation follow the same path as GitHub/GitLab.
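
A sketch of HMAC-SHA256 signature verification with Node's crypto module. It assumes the header value has the form `sha256=<hex digest>` and that the handler has access to the raw request body; `verifySignature` is a hypothetical name:

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Returns true only when the header carries a valid HMAC-SHA256 of the raw body.
function verifySignature(secret: string, rawBody: string, header: string | undefined): boolean {
  const prefix = "sha256=";
  if (header === undefined || !header.startsWith(prefix)) return false;
  const expected = createHmac("sha256", secret).update(rawBody, "utf8").digest();
  const received = Buffer.from(header.slice(prefix.length), "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

The signature must be computed over the exact bytes received; re-serialising parsed JSON can change whitespace or key order and break the comparison.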
Private repo cloning is now supported for all three providers. The build worker receives a cloneToken (decrypted from the git source, stored ephemerally in Redis — never in the DB) and injects credentials into the clone URL using provider-specific prefixes: x-access-token (GitHub), oauth2 (GitLab), x-token-auth (Bitbucket). The token is never written to build logs. Migration 0005_add_bitbucket_provider.sql extends the git_provider enum.
Key files: apps/api/src/routes/webhooks.ts, apps/api/src/workers/build.ts, packages/db/migrations/0005_add_bitbucket_provider.sql
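
Credential injection with the provider-specific prefixes might look like the sketch below. The usernames come from the description above; `authenticatedCloneUrl` and `redactToken` are hypothetical helpers, not the build worker's actual functions:

```typescript
type GitProvider = "github" | "gitlab" | "bitbucket";

// Username each provider expects alongside an access token over HTTPS.
const CLONE_USER: Record<GitProvider, string> = {
  github: "x-access-token",
  gitlab: "oauth2",
  bitbucket: "x-token-auth",
};

// Builds https://<user>:<token>@host/path from a plain HTTPS clone URL.
function authenticatedCloneUrl(provider: GitProvider, repoUrl: string, cloneToken: string): string {
  const url = new URL(repoUrl);
  url.username = CLONE_USER[provider];
  url.password = cloneToken; // percent-encoded by the URL setter if needed
  return url.toString();
}

// Masks the token in any text that is about to be written to a build log.
function redactToken(line: string, cloneToken: string): string {
  return line.split(cloneToken).join("***");
}
```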
Upcoming Sprints
Prioritised backlog — each is a self-contained sprint.
- preview-ttl cron now calls teardownPreviewResources() for each age-expired preview after cancelling its DB record, deleting the Deployment, Service, ConfigMaps, Secret, and IngressRoute. Matches the same teardown path used by the webhook PR-close handler.
- git_provider enum. Shipped as v1.1.0.
- Every push to main automatically typechecks, builds a linux/amd64 image on the runner, pushes to Artifact Registry, runs pending migrations via a schema_migrations tracker, and Helm-upgrades the cluster. Auth uses Workload Identity Federation (no SA keys). Total run time ~3m30s.
- queued → pending/"Build queued", building → pending/"Building…", live → success, failed/cancelled → failure. teamId threaded through all callers (manual deploy + all 6 webhook paths). Best-effort: never throws.
- DELETE /projects/:id (owner-only): drains in-flight BullMQ jobs, tears down proj-<slug> and proj-<slug>-preview k8s namespaces (best-effort), empties and deletes the MinIO static bucket (best-effort), then DB cascade-deletes all child records. Dashboard confirmation panel was already in place.
- lib/registry.ts implements deleteRegistryImage() using the Docker Registry HTTP API v2: resolves the tag to a digest via HEAD /v2/{name}/manifests/{tag}, then DELETE /v2/{name}/manifests/{digest}. Works with both Harbor and Artifact Registry. The image-gc cron fetches each build's imageRef and calls it (best-effort) before pruning the DB record. Skips sim and static-only builds automatically.
- process.once('SIGTERM', …) added to all three workers. Build worker tracks the active deployment ID; if the 25s grace period expires before the build finishes, it marks the deployment failed and removes the workdir before exiting. Deploy and cleanup workers follow the same close-or-timeout pattern. A hard-exit timer is .unref()'d so it doesn't block the event loop if the worker drains cleanly.
- "scaling": {"minReplicas": 1, "maxReplicas": 5, "targetCPU": 60} in shipyard.json creates an autoscaling/v2 HPA targeting the project Deployment on CPU utilisation. Build worker extracts and validates the config; deploy worker calls applyHpa() when present or deleteHpa() when removed. HPA is also deleted during preview teardown.
- shipyard_deployments_active (Gauge, polled from DB every 15s alongside queue depth, grouped by project slug) and shipyard_rollbacks_total (Counter with project + reason labels). Incremented in the deploy worker on auto-rollback and in the manual rollback API route.
- runLogArchive() stub in the cleanup worker with a real implementation: for each deployment older than LOG_RETENTION_DAYS (90d), fetches its build_logs rows, gzips them, uploads to MinIO as shipyard-logs/<deployment-id>.log.gz, then deletes the DB rows to keep Postgres lean. Skipped gracefully when STATIC_STORAGE_ENABLED is false (local dev).
- lib/notify.ts: notifySlack() reads SHIPYARD_SLACK_WEBHOOK from encrypted project env vars and posts a Slack message. New postPrComment() in lib/git-status.ts posts comments to GitHub PRs, GitLab MRs, and Bitbucket PRs. Wired into both workers: build failure → Slack + PR comment; production deploy success → Slack; preview deploy success → PR comment with URL; auto-rollback → Slack critical. All best-effort.
- RuntimeAdapter interface in a new packages/runtime package covering deploy(), teardown(), rollback(), launchBuild(), and getLogs(). Move all existing @kubernetes/client-node logic in the deploy and build workers into a KubernetesRuntime class that implements the interface. Active adapter selected via RUNTIME=kubernetes|docker env var. All business logic above the adapter (queue workers, API, buildpack detection, storage, dashboard, CLI) remains unchanged. Prerequisite for all subsequent deployment-target sprints. Estimated effort: 1 week.
- KubernetesRuntime works identically on any conformant k8s cluster. Deliverables: (1) values.do.yaml targeting DOKS + DO Container Registry + DO Spaces; (2) values.aws.yaml targeting EKS + ECR + S3; (3) a parameterised CI/CD workflow that branches only on auth steps (doctl vs aws vs gcloud) based on a CLOUD_TARGET secret. Storage adapter already supports S3-compatible endpoints via env vars, so this is a config change only. Production deployment guides for both providers added to docs. Estimated effort: 1 week.
- DockerRuntime using the Docker socket (Dockerode) as a second RuntimeAdapter. Containers replace k8s Deployments; named Docker networks replace namespaces; Traefik switches to its Docker label provider in place of IngressRoute CRDs. Build jobs launch as short-lived containers instead of k8s Jobs. Rollback swaps the running container image. Isolation is enforced via Docker networks rather than NetworkPolicy, which is functional but less strict. HPA is not supported in this mode. Enables running the full Shipyard platform on a single VPS with no k8s dependency. Depends on Sprint 29. Estimated effort: 3–5 weeks.

Production Deployment Guide
Deploying Shipyard itself to GKE on Google Cloud Platform. Target domain: shipyard.wake.co.ke.
Use GKE Standard, not Autopilot. The build worker needs to mount the host Docker socket (/var/run/docker.sock) to run docker build. Autopilot blocks hostPath volumes.
Step 1 — GCP Setup
Install gcloud CLI
brew install --cask google-cloud-sdk
gcloud init # opens browser, log in, select project
Enable required APIs
gcloud services enable container.googleapis.com artifactregistry.googleapis.com compute.googleapis.com cloudbuild.googleapis.com --project=shipyard-254
Run each gcloud command on a single line. Backslash line continuations break silently in zsh when pasted into the terminal.
Step 2 — GKE Cluster
Create a Standard (not Autopilot) cluster in africa-south1 (Johannesburg — lowest latency for Kenya). The --scopes=cloud-platform flag is critical: without it, nodes get only devstorage.read_only scope and cannot pull from Artifact Registry.
gcloud container clusters create shipyard-prod --project=shipyard-254 --region=africa-south1 --release-channel=stable --cluster-version=latest --num-nodes=2 --machine-type=e2-standard-4 --enable-autoscaling --min-nodes=2 --max-nodes=6 --workload-pool=shipyard-254.svc.id.goog --scopes=cloud-platform
Do not use GKE Autopilot. Autopilot blocks hostPath volumes and privileged containers, which the build worker requires for Docker socket access. Always use a Standard cluster.
Connect kubectl
gcloud components install gke-gcloud-auth-plugin
gcloud container clusters get-credentials shipyard-prod --region=africa-south1 --project=shipyard-254
kubectl get nodes # should show 6 nodes (2 per zone × 3 zones)
Create namespaces
kubectl create namespace shipyard-system
kubectl create namespace shipyard-infra
Step 3 — Artifact Registry
gcloud artifacts repositories create shipyard --repository-format=docker --location=africa-south1 --project=shipyard-254 --description="Shipyard platform images"
# Authenticate Docker
gcloud auth configure-docker africa-south1-docker.pkg.dev
Image URL: africa-south1-docker.pkg.dev/shipyard-254/shipyard/platform:<tag>
Step 4 — Build and Push the Platform Image
The same image runs all four processes (API + 3 workers). The k8s Deployment command: selects which one starts.
GKE nodes are linux/amd64. If you are building on an Apple Silicon Mac, a plain docker build produces an arm64 image that GKE will refuse with "no match for platform in manifest". Use Google Cloud Build (recommended) or docker buildx --platform linux/amd64 to produce the correct architecture.
Option A — Google Cloud Build (recommended)
Builds on Google infrastructure, natively linux/amd64, pushes straight to Artifact Registry. Enable the API once, then use this for all future builds:
gcloud services enable cloudbuild.googleapis.com --project=shipyard-254
gcloud builds submit --tag africa-south1-docker.pkg.dev/shipyard-254/shipyard/platform:1.0.0 --project=shipyard-254 .
Option B — Local cross-platform build
docker buildx build --platform linux/amd64 -t africa-south1-docker.pkg.dev/shipyard-254/shipyard/platform:1.0.0 --push .
Dockerfile (repo root)
FROM node:22-alpine AS builder
RUN corepack enable pnpm
WORKDIR /app
COPY . .
RUN pnpm install --frozen-lockfile
RUN pnpm turbo build
FROM node:22-alpine
RUN apk add --no-cache git docker-cli
RUN corepack enable pnpm
WORKDIR /app
COPY --from=builder /app/package.json /app/pnpm-lock.yaml /app/pnpm-workspace.yaml /app/turbo.json ./
COPY --from=builder /app/node_modules ./node_modules
COPY --from=builder /app/packages ./packages
COPY --from=builder /app/apps/api ./apps/api
COPY --from=builder /app/apps/cli ./apps/cli
RUN pnpm install --frozen-lockfile --prod --ignore-scripts
CMD ["node", "apps/api/dist/index.js"]
Patch the build worker for Docker socket access
Edit infra/helm/templates/deployment-worker-build.yaml and add under volumeMounts and volumes:
volumeMounts:
  - name: build-workspace
    mountPath: {{ .Values.workers.build.workdir }}
  - name: docker-socket
    mountPath: /var/run/docker.sock
volumes:
  - name: build-workspace
    emptyDir: {}
  - name: docker-socket
    hostPath:
      path: /var/run/docker.sock
      type: Socket
Step 5 — Install Infrastructure Dependencies
Add Helm repos
helm repo add traefik https://traefik.github.io/charts
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo add minio https://charts.min.io/
helm repo update
Traefik v3
Uses a values file (infra/helm/traefik-values.yaml) to avoid shell line-continuation issues. Do not add ports.websecure.tls — Traefik v3 chart schema rejects that key; TLS on websecure is enabled by default.
Do not use TLS-ALPN-01 or HTTP-01 ACME challenges on GKE africa-south1. Let's Encrypt's validation servers cannot reach this region: both challenge types time out with "Timeout during connect (likely firewall problem)" even though the firewall is open. Use DNS-01 via the Cloudflare API instead. DNS-01 proves domain control through a TXT record, so it needs no inbound connection to the cluster and works regardless of region.
Create the Cloudflare API token
In the Cloudflare dashboard → My Profile → API Tokens → Create Token → Custom token:
- Permissions: Zone → DNS → Edit
- Zone Resources: Include → Specific zone → wake.co.ke
Store the token as a k8s secret before installing Traefik:
kubectl create secret generic traefik-cloudflare-token \
--from-literal=CF_DNS_API_TOKEN=<your-token> \
-n shipyard-infra
# infra/helm/traefik-values.yaml
providers:
  kubernetesCRD:
    enabled: true
    allowCrossNamespace: true
  kubernetesIngress:
    enabled: true
certificatesResolvers:
  letsencrypt:
    acme:
      email: alberto@zaoshinani.com
      storage: /data/acme.json
      dnsChallenge:
        provider: cloudflare
        resolvers:
          - "1.1.1.1:53"
          - "8.8.8.8:53"
env:
  - name: CF_DNS_API_TOKEN
    valueFrom:
      secretKeyRef:
        name: traefik-cloudflare-token
        key: CF_DNS_API_TOKEN
persistence:
  enabled: true
  size: 128Mi
podSecurityContext:
  fsGroup: 65532
  fsGroupChangePolicy: "OnRootMismatch"
helm upgrade --install traefik traefik/traefik --namespace shipyard-infra -f infra/helm/traefik-values.yaml
acme.json permissions: Traefik requires exactly 600 on acme.json. Kubernetes's default fsGroup behaviour resets files to 660 on every pod restart, causing Traefik to refuse to start. The fsGroupChangePolicy: "OnRootMismatch" option prevents this by skipping the recursive chmod when the volume root is already owned by the fsGroup. The recommended initContainer fix (runAsUser: 0) is blocked by GKE's non-root pod security policy.
PostgreSQL 16
helm upgrade --install postgres bitnami/postgresql --namespace shipyard-infra --set auth.username=shipyard --set auth.password=f2de63dbdda2af7c90c813999a32605e --set auth.database=shipyard
Redis 7
helm upgrade --install redis bitnami/redis --namespace shipyard-infra --set auth.password=78d00185c6a72eb62f715daba56ae34f --set architecture=standalone
MinIO
Set explicit resource requests and limits: the chart's default memory request is 16 GiB, which will not schedule on e2-standard-4 nodes.
helm upgrade --install minio minio/minio --namespace shipyard-infra --set rootUser=shipyard --set rootPassword=77fdc7516052eb7d883b747b65e49b73 --set mode=standalone --set persistence.size=20Gi --set resources.requests.memory=512Mi --set resources.requests.cpu=250m --set resources.limits.memory=2Gi --set resources.limits.cpu=1
Step 6 — Secrets
Back up ENCRYPTION_KEK before continuing. If it is ever lost, all project environment variables become permanently unreadable. Store it in a password manager or secrets vault, not just in this file.
The secrets generated for this deployment:
| Variable | Value |
|---|---|
| POSTGRES_PASSWORD | f2de63dbdda2af7c90c813999a32605e |
| REDIS_PASSWORD | 78d00185c6a72eb62f715daba56ae34f |
| MINIO_PASSWORD | 77fdc7516052eb7d883b747b65e49b73 |
| JWT_SECRET | 97dc32a1cb5fa01e157f44b330dcf9a1a9410fd8a67fde0eb218b8e069718d0a |
| ENCRYPTION_KEK | 3eeb918a1c26ea8b86d741602b914ba6c6dfaa827518a34248d3e98b84bde38e |
Step 7 — Deploy Shipyard via Helm
Run database migrations
Migrations are SQL files in packages/db/migrations/. Pipe them directly into the postgres pod — no migration job image required:
cat packages/db/migrations/0000_past_joseph.sql \
packages/db/migrations/0001_true_landau.sql \
packages/db/migrations/0002_milky_jamie_braddock.sql \
packages/db/migrations/0003_open_nehzno.sql \
packages/db/migrations/0004_audit_events.sql \
packages/db/migrations/0005_add_bitbucket_provider.sql \
| sed 's/--> statement-breakpoint/;/g' \
| kubectl exec -i -n shipyard-infra postgres-postgresql-0 \
-- env PGPASSWORD=f2de63dbdda2af7c90c813999a32605e psql -U shipyard -d shipyard
Create the platform secret
Secrets are stored in a separate k8s Secret that Helm references via existingSecret — never committed to git:
kubectl create secret generic shipyard-secrets --namespace shipyard-system --from-literal=jwtSecret="97dc32a1cb5fa01e157f44b330dcf9a1a9410fd8a67fde0eb218b8e069718d0a" --from-literal=encryptionKek="3eeb918a1c26ea8b86d741602b914ba6c6dfaa827518a34248d3e98b84bde38e" --from-literal=databaseUrl="postgres://shipyard:f2de63dbdda2af7c90c813999a32605e@postgres-postgresql.shipyard-infra.svc.cluster.local:5432/shipyard" --from-literal=redisUrl="redis://:78d00185c6a72eb62f715daba56ae34f@redis-master.shipyard-infra.svc.cluster.local:6379"
Production values file
# infra/helm/values.production.yaml
image:
repository: africa-south1-docker.pkg.dev/shipyard-254/shipyard/platform
tag: "1.1.0"
pullPolicy: IfNotPresent
namespace:
create: false
name: shipyard-system
api:
replicaCount: 1
extraEnv:
- name: MINIO_ENDPOINT
value: "http://minio.shipyard-infra.svc.cluster.local:9000"
- name: MINIO_ACCESS_KEY
value: "shipyard"
- name: MINIO_SECRET_KEY
value: "77fdc7516052eb7d883b747b65e49b73"
- name: MINIO_INTERNAL_URL
value: "http://minio.shipyard-infra.svc.cluster.local:9000"
- name: STATIC_STORAGE_ENABLED
value: "true"
- name: CONTAINER_BUILD_ENABLED
value: "false"
- name: HARBOR_REGISTRY
value: "africa-south1-docker.pkg.dev"
- name: HARBOR_PROJECT
value: "shipyard-254/shipyard"
config:
platformDomain: deploy.shipyard.wake.co.ke
internalDomain: apps.shipyard.wake.co.ke
corsOrigin: https://shipyard.wake.co.ke
k8sEnabled: "true"
logLevel: info
secrets:
existingSecret: "shipyard-secrets"
ingress:
enabled: true
certResolver: letsencrypt
entryPoints:
- websecure
api:
hostname: api.shipyard.wake.co.ke
dashboard:
hostname: shipyard.wake.co.ke
Helm install
helm upgrade --install shipyard ./infra/helm --namespace shipyard-system -f infra/helm/values.production.yaml --wait --timeout=3m
Fix image pull — GKE node pool scope limitation
Unless --scopes=cloud-platform is passed at creation (as in Step 2), GKE Standard node pools are created with the devstorage.read_only OAuth scope. This scope only covers the old Container Registry (gcr.io), not Artifact Registry (pkg.dev). IAM permissions alone cannot override this OAuth scope limit. Pods will fail with 403 Forbidden on image pull even with roles/artifactregistry.reader granted.
The permanent fix is to recreate the node pool with --scopes=cloud-platform. The immediate workaround is an image pull secret using a GCP OAuth token.
Create the pull secret (run in GCP Cloud Shell to use your user credentials — gcloud auth print-access-token returns a token with full Artifact Registry access):
# Run in GCP Cloud Shell — single line, no wrapping
kubectl create secret docker-registry artifact-registry-key --docker-server=africa-south1-docker.pkg.dev --docker-username=oauth2accesstoken --docker-password="$(gcloud auth print-access-token)" --namespace=shipyard-system
Patch all deployments to use it:
for d in shipyard-api shipyard-worker-build shipyard-worker-cleanup shipyard-worker-deploy; do kubectl patch deployment $d -n shipyard-system -p '{"spec":{"template":{"spec":{"imagePullSecrets":[{"name":"artifact-registry-key"}]}}}}'; done
OAuth tokens expire after ~1 hour. To renew: delete the secret and recreate it. For a permanent solution, recreate the node pool with --scopes=cloud-platform and remove the image pull secret patches.
Verify
kubectl get pods -n shipyard-system
Step 8 — DNS
Get Traefik's external IP:
kubectl get svc traefik -n shipyard-infra
In your DNS provider (for wake.co.ke), add these records pointing to that IP:
| Type | Name | Value |
|---|---|---|
| A | shipyard.wake.co.ke | Traefik external IP |
| A | api.shipyard.wake.co.ke | Traefik external IP |
| A | grafana.shipyard.wake.co.ke | Traefik external IP — Grafana monitoring dashboard |
| A (wildcard) | *.deploy.shipyard.wake.co.ke | Traefik external IP |
| A (wildcard) | *.apps.shipyard.wake.co.ke | Traefik external IP |
The two wildcard records are what make per-deployment preview URLs and production aliases work automatically without any per-project DNS change.
Step 9 — First Run
Register the first user
curl -X POST https://api.shipyard.wake.co.ke/api/v1/auth/register \
-H 'Content-Type: application/json' \
-d '{"name":"Your Name","email":"you@example.com","password":"your-password"}'
Log in with the CLI
The CLI stores tokens in ~/.shipyard/config.json. If the interactive prompt doesn't work (e.g. in scripts), write the file directly after calling the register endpoint above.
ship login --api-url https://api.shipyard.wake.co.ke
Create your first project and deploy
# From inside a git repo
ship deploy --follow
Configure GitHub webhooks
In your GitHub repo → Settings → Webhooks → Add webhook:
- Payload URL: https://api.shipyard.wake.co.ke/api/v1/webhooks/github
- Content type: application/json
- Secret: the webhookSecret returned when you created the git source
- Events: Pushes + Pull requests
Step 10 — CI/CD Pipeline
Every push to main automatically builds, migrates, and deploys the platform. Run time is ~3m30s. Auth uses Workload Identity Federation — no service account keys are stored anywhere.
This step only needs to be done once on a fresh GCP project. Once the pipeline is wired, deployments are fully hands-off.
How the pipeline works
| Step | What happens |
|---|---|
| Typecheck | pnpm turbo typecheck — fails fast before any build if types are broken |
| Authenticate | GitHub OIDC token → Workload Identity Federation → impersonate shipyard-cicd SA. No JSON key ever created or stored. |
| Build image | docker build runs on the GitHub Actions runner (linux/amd64, correct for GKE). Image tagged with the git SHA and pushed to Artifact Registry. |
| Refresh pull secret | Regenerates the artifact-registry-key k8s Secret using a fresh GCP OAuth token. Needed because the node pool uses devstorage.read_only scope (see Operational Notes). |
| Migrate | Idempotent migration runner: creates a schema_migrations tracking table on first run, seeds it with already-applied migrations, then applies only new *.sql files in order. |
| Helm upgrade | Detects and recovers any stuck pending-* Helm state, then upgrades with --set image.tag=<sha>. Image tag in values.production.yaml is overridden by the git SHA. |
| Verify rollout | Waits for all four Deployments to report ready before the job completes. |
One-time GCP setup
Run these once on the GCP project. They are already done for shipyard-254.
1. Create the CI/CD service account
gcloud iam service-accounts create shipyard-cicd --display-name="Shipyard CI/CD" --project=shipyard-254
2. Create the Workload Identity Pool and GitHub provider
gcloud iam workload-identity-pools create github-actions --location=global --display-name="GitHub Actions" --project=shipyard-254
gcloud iam workload-identity-pools providers create-oidc github --location=global --workload-identity-pool=github-actions --issuer-uri="https://token.actions.githubusercontent.com" --attribute-mapping="google.subject=assertion.sub,attribute.repository=assertion.repository,attribute.ref=assertion.ref" --attribute-condition="assertion.repository=='Alisao/shipyard'" --project=shipyard-254
The attribute-condition scopes the pool to a single repository. No fork or other repository can exchange tokens against this pool.
3. Bind the pool to the SA and grant IAM roles
# Allow GitHub Actions to impersonate the SA
gcloud iam service-accounts add-iam-policy-binding shipyard-cicd@shipyard-254.iam.gserviceaccount.com --role=roles/iam.workloadIdentityUser --member="principalSet://iam.googleapis.com/projects/346003678070/locations/global/workloadIdentityPools/github-actions/attribute.repository/Alisao/shipyard" --project=shipyard-254
# GCP roles: Artifact Registry writer, GKE access, token exchange
for role in roles/artifactregistry.writer roles/container.developer roles/iam.serviceAccountTokenCreator; do
  gcloud projects add-iam-policy-binding shipyard-254 --member="serviceAccount:shipyard-cicd@shipyard-254.iam.gserviceaccount.com" --role="$role" --condition=None
done
4. Grant Kubernetes cluster-admin to the SA
The Helm chart manages ClusterRoles and ClusterRoleBindings, so the CI/CD SA needs cluster-admin within Kubernetes (separate from GCP IAM):
kubectl create clusterrolebinding shipyard-cicd-cluster-admin --clusterrole=cluster-admin --user=shipyard-cicd@shipyard-254.iam.gserviceaccount.com
5. Add the GitHub repository secret
In GitHub → Alisao/shipyard → Settings → Secrets and variables → Actions, add:
| Secret name | Value |
|---|---|
| PG_PASSWORD | The Postgres password from Step 6 |
Or via the CLI: gh secret set PG_PASSWORD --repo Alisao/shipyard --body "<value>"
Workflow file
The full pipeline is at .github/workflows/deploy.yml. Key variables at the top of the file — update these if the project ID, region, cluster name, or repository changes:
env:
  PROJECT: shipyard-254
  REGION: africa-south1
  REGISTRY: africa-south1-docker.pkg.dev
  IMAGE: africa-south1-docker.pkg.dev/shipyard-254/shipyard/platform
  CLUSTER: shipyard-prod
  WIF_PROVIDER: projects/346003678070/locations/global/workloadIdentityPools/github-actions/providers/github
  SA: shipyard-cicd@shipyard-254.iam.gserviceaccount.com
Migration tracking
The runner maintains a schema_migrations table in the shipyard database. On the very first pipeline run against a pre-existing database, it seeds the table with all already-applied migration filenames (detected by checking whether the users table exists). From that point on, only files not yet in the table are applied. Adding a new migration is as simple as dropping a new *.sql file in packages/db/migrations/ — the next push picks it up automatically.
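The selection step described above can be sketched as follows. This is a minimal illustration, not the runner's actual code: the helper name `pendingMigrations` and the sample filenames are hypothetical, and it assumes that migration filenames sort lexicographically into the intended apply order (for example via a numeric prefix).

```typescript
// Hypothetical helper: given the filenames already recorded in
// schema_migrations and the files found on disk, return the *.sql files
// that still need to be applied, in order.
function pendingMigrations(applied: Iterable<string>, onDisk: string[]): string[] {
  const done = new Set(applied);
  return onDisk
    .filter((name) => name.endsWith(".sql") && !done.has(name))
    .sort(); // assumption: lexicographic order equals intended apply order
}

// Sample filenames are made up for illustration.
const stillToApply = pendingMigrations(
  ["0001_init.sql", "0002_teams.sql"],
  ["0003_audit.sql", "0001_init.sql", "README.md", "0002_teams.sql"],
);
console.log(stillToApply); // ["0003_audit.sql"]
```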
Known constraints
Cloud Build source upload is blocked. The org policy prevents the CI/CD SA from accessing the shipyard-254_cloudbuild GCS bucket. The pipeline therefore builds the image directly on the GitHub Actions runner (ubuntu-latest, which is linux/amd64) instead of using gcloud builds submit. Do not attempt to restore Cloud Build without first granting roles/storage.objectAdmin on that bucket.
Artifact Registry pull secret is refreshed on every deploy. The GKE node pool uses devstorage.read_only OAuth scope, which does not cover Artifact Registry. The pipeline deletes and recreates artifact-registry-key in shipyard-system on each run using a fresh token from gcloud auth print-access-token. The permanent fix is to recreate the node pool with --scopes=cloud-platform.
CLI Reference
Install: npm install -g @shipyard/cli. All commands require ship login first.
| Command | Description |
|---|---|
| ship login | Authenticate and save tokens to ~/.shipyard/config.json |
| ship init | Generate shipyard.json manifest by detecting project type |
| ship push | Register or update project with Shipyard (reads shipyard.json) |
| ship deploy | Trigger a deployment for the current project (auto-detected from git remote) |
| ship deploy --project <slug> --follow | Deploy and stream build logs; exit 0/1 based on result |
| ship logs <deploymentId> | Show build logs for a deployment |
| ship logs <deploymentId> --follow | Stream build logs live via long-polling |
| ship status --project <slug> | Show latest deployment status and URL |
| ship ps | List all projects and their live deployment status |
| ship rollback <deploymentId> | Promote a past deployment back to live (<10s) |
| ship env list --project <slug> | List env vars (values masked) |
| ship env set KEY=value --project <slug> | Set an env var (production scope by default) |
| ship env set KEY=value --target <target> | Set env var with target: production, preview, or all |
| ship env unset KEY --project <slug> | Remove an env var |
Environment Variable Targets
Env vars can be scoped to production (default), preview, or all (applies to both environments). Preview-scoped vars are only injected into PR/MR preview deployments.
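As a rough sketch of how the three targets could resolve at deploy time, the function below picks the variables injected into a given environment. The type and function names are hypothetical. The precedence rule (an environment-specific value wins over an `all` value with the same key) is an assumption, since the text above does not say what happens on a key collision.

```typescript
type Target = "production" | "preview" | "all";
type Environment = "production" | "preview";

interface EnvVar {
  key: string;
  value: string;
  target: Target;
}

// Returns the variables injected into a deployment of the given environment.
// Assumption: a variable scoped to the exact environment overrides an "all"
// variable with the same key.
function resolveEnv(vars: EnvVar[], env: Environment): Record<string, string> {
  const resolved: Record<string, string> = {};
  for (const v of vars) if (v.target === "all") resolved[v.key] = v.value;
  for (const v of vars) if (v.target === env) resolved[v.key] = v.value;
  return resolved;
}

// Sample data is made up for illustration.
const sampleVars: EnvVar[] = [
  { key: "API_URL", value: "https://api.example.com", target: "production" },
  { key: "API_URL", value: "https://staging.example.com", target: "preview" },
  { key: "LOG_LEVEL", value: "info", target: "all" },
];
console.log(resolveEnv(sampleVars, "preview"));
```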
API Reference
Base URL: https://api.shipyard.wake.co.ke/api/v1. Full OpenAPI docs at /api/docs. All endpoints except /auth/* and /webhooks/* require Authorization: Bearer <token>.
Auth
| Method | Path | Description |
|---|---|---|
| POST | /auth/register | Create account + personal team |
| POST | /auth/login | Returns access token (15m) + refresh token (30d) |
| POST | /auth/refresh | Rotate refresh token, issue new access token |
| POST | /auth/logout | Invalidate refresh token |
Deployments
| Method | Path | Description |
|---|---|---|
| GET | /projects/:id/deployments | List deployments (last 50) |
| POST | /projects/:id/deployments | Trigger manual deploy |
| GET | /deployments/:id | Get deployment detail |
| POST | /deployments/:id/cancel | Cancel queued or building deployment |
| POST | /deployments/:id/rollback | Restore this deployment as live |
| GET | /deployments/:id/logs | Paginated logs (?after=<seq>&limit=500) |
| GET | /deployments/:id/logs/stream | SSE live log stream |
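The `?after=<seq>&limit=500` cursor on the paginated logs endpoint lends itself to a simple loop. The sketch below is an illustration under assumptions: the shape of a log line (`seq`, `message`), the starting cursor of 0, and the "short page means done" stopping rule are not documented above, and `fetchPage` is a hypothetical stand-in for the actual HTTP call.

```typescript
interface LogLine {
  seq: number;
  message: string;
}

// Hypothetical page fetcher: resolves the lines with seq greater than `after`.
type FetchPage = (after: number, limit: number) => Promise<LogLine[]>;

// Follows the after/limit cursor until an empty or short page signals the end.
async function collectLogs(fetchPage: FetchPage, limit = 500): Promise<LogLine[]> {
  const all: LogLine[] = [];
  let after = 0; // assumption: sequence numbers start above 0
  for (;;) {
    const page = await fetchPage(after, limit);
    all.push(...page);
    if (page.length === 0 || page.length < limit) return all;
    after = page[page.length - 1].seq; // advance the cursor to the last seq seen
  }
}
```

In practice `fetchPage` would issue GET /deployments/:id/logs with the bearer token and return the parsed lines; injecting it keeps the cursor logic testable without a network.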
Webhooks
| Method | Path | Description |
|---|---|---|
| POST | /webhooks/github | GitHub push + pull_request events (HMAC-SHA256 via X-Hub-Signature-256) |
| POST | /webhooks/gitlab | GitLab Push Hook + Merge Request Hook (token via X-Gitlab-Token) |
| POST | /webhooks/bitbucket | Bitbucket repo:push + pullrequest:* events (HMAC-SHA256 via X-Hub-Signature) |
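For the GitHub row, the check amounts to recomputing the HMAC over the raw request body and comparing it with the header value in constant time. The sketch below shows one way to do that with Node's built-in crypto module. The function name is hypothetical and this is not taken from Shipyard's handler; the `sha256=<hex>` header format is GitHub's documented one.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Checks a GitHub-style "sha256=<hex>" signature header against the raw body.
// The body must be the exact bytes received, not a re-serialised JSON object.
function verifyGithubSignature(rawBody: string, secret: string, header: string): boolean {
  const expected = "sha256=" + createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(header);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The constant-time comparison avoids leaking how many leading characters of a forged signature were correct.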
Teams & Members
| Method | Path | Min Role |
|---|---|---|
| GET | /teams | viewer |
| GET | /teams/:id/members | viewer |
| POST | /teams/:id/members | admin |
| PATCH | /teams/:id/members/:userId | admin |
| DELETE | /teams/:id/members/:userId | admin |
| GET | /teams/:id/audit-log | admin |
API Tokens
API tokens for CI/CD authentication. Tokens are prefixed ship_ and stored as SHA-256 hashes. The raw value is shown only once at creation.
| Method | Path | Description |
|---|---|---|
| GET | /teams/:id/api-tokens | List tokens (hash excluded) |
| POST | /teams/:id/api-tokens | Create token; returns raw ship_ value once |
| DELETE | /teams/:id/api-tokens/:tokenId | Revoke token |
Tokens can be team-scoped (access to all team projects) or project-scoped (single project only). Set projectId when creating for project scope.
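The "store only the SHA-256 hash, show the raw value once" scheme can be sketched as below. Only the `ship_` prefix and the SHA-256 hashing come from the text above; the token length (32 random bytes, hex-encoded), the hex encoding of the hash, and the function names are assumptions for illustration.

```typescript
import { createHash, randomBytes } from "node:crypto";

// Hashes a raw token the same way at creation time and at lookup time.
function hashToken(raw: string): string {
  return createHash("sha256").update(raw).digest("hex");
}

// Assumption: 32 random bytes, hex-encoded, after the ship_ prefix.
// `raw` is returned to the caller once; only `hash` would be persisted.
function generateApiToken(): { raw: string; hash: string } {
  const raw = "ship_" + randomBytes(32).toString("hex");
  return { raw, hash: hashToken(raw) };
}

// On each request, hash the presented token and compare with the stored hash.
function matchesStoredHash(presented: string, storedHash: string): boolean {
  return hashToken(presented) === storedHash;
}
```

Because only the hash is kept, a database leak does not expose usable tokens, and a lost token cannot be recovered, only revoked and reissued.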
Git Sources & OAuth
| Method | Path | Description |
|---|---|---|
| POST | /oauth/github/initiate | Start GitHub OAuth flow |
| GET | /oauth/github/callback | OAuth callback (redirects to dashboard) |
| POST | /oauth/bitbucket/initiate | Start Bitbucket OAuth flow |
| GET | /oauth/bitbucket/callback | OAuth callback |
| GET | /teams/:id/git-sources | List connected git sources |
| GET | /git-sources/:id/repos | List repositories from connected source |
| GET | /projects/:id/branches | List branches for project's repo |
DNS Providers
Cloudflare API token management for automatic DNS record creation.
| Method | Path | Description |
|---|---|---|
| GET | /teams/:id/dns-providers | List configured providers |
| POST | /teams/:id/dns-providers | Configure Cloudflare provider (upsert) |
| DELETE | /teams/:id/dns-providers/:id | Remove provider |
RBAC Permissions
Four role levels control access to team and project resources.
| Role | Permissions |
|---|---|
| owner | Full access including team deletion, member role changes, DNS provider management |
| admin | Manage git sources, API tokens, DNS providers, invite members, view audit log |
| developer | Create projects, trigger deployments, manage env vars, add domains |
| viewer | View projects, deployments, and logs; cannot modify anything |
Permission Matrix
| Action | viewer | developer | admin | owner |
|---|---|---|---|---|
| View projects/deployments | ✓ | ✓ | ✓ | ✓ |
| View logs | ✓ | ✓ | ✓ | ✓ |
| Create project | — | ✓ | ✓ | ✓ |
| Trigger deployment | — | ✓ | ✓ | ✓ |
| Manage env vars | — | ✓ | ✓ | ✓ |
| Add custom domains | — | ✓ | ✓ | ✓ |
| Reveal env var values | — | — | ✓ | ✓ |
| Manage git sources | — | — | ✓ | ✓ |
| Manage API tokens | — | — | ✓ | ✓ |
| Manage DNS providers | — | — | ✓ | ✓ |
| Invite/change members | — | — | ✓ | ✓ |
| View audit log | — | — | ✓ | ✓ |
| Delete team | — | — | — | ✓ |
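Since every row of the matrix grants access to a role and all roles above it, the whole table reduces to a minimum role per action. The sketch below encodes it that way; the action identifiers (`createProject`, `revealEnvVar`, and so on) and the `can` helper are hypothetical names, not Shipyard's own.

```typescript
type Role = "viewer" | "developer" | "admin" | "owner";

type Action =
  | "viewProject" | "viewLogs"
  | "createProject" | "triggerDeployment" | "manageEnvVars" | "addDomain"
  | "revealEnvVar" | "manageGitSources" | "manageApiTokens"
  | "manageDnsProviders" | "manageMembers" | "viewAuditLog"
  | "deleteTeam";

const RANK: Record<Role, number> = { viewer: 0, developer: 1, admin: 2, owner: 3 };

// Minimum role per action, transcribed from the permission matrix.
const MIN_ROLE: Record<Action, Role> = {
  viewProject: "viewer",
  viewLogs: "viewer",
  createProject: "developer",
  triggerDeployment: "developer",
  manageEnvVars: "developer",
  addDomain: "developer",
  revealEnvVar: "admin",
  manageGitSources: "admin",
  manageApiTokens: "admin",
  manageDnsProviders: "admin",
  manageMembers: "admin",
  viewAuditLog: "admin",
  deleteTeam: "owner",
};

function can(role: Role, action: Action): boolean {
  return RANK[role] >= RANK[MIN_ROLE[action]];
}

console.log(can("developer", "manageEnvVars")); // true
console.log(can("developer", "revealEnvVar")); // false
```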
Commit Status Reporting
Build and deployment status is reported back to GitHub and Bitbucket as commit status checks. Developers see ✓ or ✗ directly in PRs without visiting the dashboard.
Status Flow
| Deployment Status | GitHub Check | Bitbucket Build Status |
|---|---|---|
| queued | pending: "Build queued" | INPROGRESS: "Queued" |
| building | pending: "Building" | INPROGRESS: "Building" |
| live | success: "Deploy live" | SUCCESSFUL: "Live" |
| failed / cancelled | failure: "Deploy failed" | FAILED: "Failed" |
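The table translates directly into a lookup keyed by deployment status. The sketch below is one way to write it; the type names and the `label` field are illustrative choices, and only the statuses listed in the table are covered.

```typescript
type DeploymentStatus = "queued" | "building" | "live" | "failed" | "cancelled";

interface ProviderStatus {
  github: { state: "pending" | "success" | "failure"; label: string };
  bitbucket: { state: "INPROGRESS" | "SUCCESSFUL" | "FAILED"; label: string };
}

// Values transcribed from the status flow table.
const failedStatus: ProviderStatus = {
  github: { state: "failure", label: "Deploy failed" },
  bitbucket: { state: "FAILED", label: "Failed" },
};

const STATUS_MAP: Record<DeploymentStatus, ProviderStatus> = {
  queued: {
    github: { state: "pending", label: "Build queued" },
    bitbucket: { state: "INPROGRESS", label: "Queued" },
  },
  building: {
    github: { state: "pending", label: "Building" },
    bitbucket: { state: "INPROGRESS", label: "Building" },
  },
  live: {
    github: { state: "success", label: "Deploy live" },
    bitbucket: { state: "SUCCESSFUL", label: "Live" },
  },
  failed: failedStatus,
  cancelled: failedStatus, // the table reports cancelled the same way as failed
};

function toProviderStatus(status: DeploymentStatus): ProviderStatus {
  return STATUS_MAP[status];
}

console.log(toProviderStatus("live").github.state); // "success"
```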
Implementation
Status updates are sent via provider APIs:
- GitHub: POST /repos/{owner}/{repo}/statuses/{sha}
- Bitbucket: POST /2.0/repositories/{fullName}/commit/{hash}/statuses/build
Status reporting requires a connected git source with a valid OAuth token. Statuses are updated in real time as deployments progress through the queue.
DNS Providers & Custom Domains
Teams can configure Cloudflare API tokens for automatic DNS management. When configured, adding a custom domain automatically creates the required DNS records.
Setup
In Team Settings → DNS Providers, add a Cloudflare API token with these permissions:
- Zone: DNS → Edit
- Zone Resources: Include → Specific zone → your domain
Automatic DNS Management
When a DNS provider is configured:
- Adding a domain auto-creates the TXT verification record (_shipyard-verify.<domain>)
- Upon verification, an A record pointing to PLATFORM_EXTERNAL_IP is created
- Domain status flows: pending → provisioning → active
Required Environment Variables
| Variable | Description |
|---|---|
| PLATFORM_EXTERNAL_IP | Cluster ingress IP for A records |
| GITHUB_CLIENT_ID | GitHub OAuth app client ID |
| GITHUB_CLIENT_SECRET | GitHub OAuth app secret |
| BITBUCKET_CLIENT_KEY | Bitbucket OAuth consumer key |
| BITBUCKET_CLIENT_SECRET | Bitbucket OAuth consumer secret |
Audit Events
Security-relevant actions are recorded in the audit_events table with immutable timestamps. Accessible to team admins and owners via Team Settings.
Recorded Actions
| Action | Description |
|---|---|
| api_token.created | New API token created |
| api_token.revoked | API token deleted |
| domain.verified | Custom domain passed DNS verification |
| domain.deleted | Custom domain removed |
| member.invited | New member invited to team |
| member.role_changed | Member role modified |
| member.removed | Member removed from team |
| env_var.revealed | Admin/owner viewed decrypted env var value |
API
GET /teams/:id/audit-log — Returns paginated audit events (admin/owner only).
Buildpack Detection
Detection runs on every build. Rules are evaluated in priority order — first match wins. A shipyard.json at the repo root overrides detection entirely.
| Priority | Signal | Type | Strategy |
|---|---|---|---|
| 0 | shipyard.json with type | explicit | Use config |
| 1 | package.json + next.config.* | nextjs | npm run build → container |
| 2 | package.json + (vite.config.* or react-scripts) | static-node | Build → upload dist/ to MinIO |
| 3 | package.json only | node | Containerise, npm start |
| 4 | pom.xml | spring-boot-maven | mvn package → JAR → JRE container |
| 5 | build.gradle | spring-boot-gradle | ./gradlew bootJar → JAR → JRE container |
| 6 | composer.json + *.php | php-composer | composer install → PHP-FPM + NGINX; NGINX root auto-detected (public/ if present, repo root otherwise) |
| 7 | *.php only | php-legacy | Copy → PHP-FPM + NGINX (root at /var/www/html) |
| 8 | index.html only | static | Upload entire dir to MinIO |
| 9 | Dockerfile present | dockerfile | docker build as-is |
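The "first match wins" evaluation can be sketched as an ordered rule list. This is an illustration under assumptions, not the build worker's code: it looks only at root-level file names, it treats react-scripts as a package.json dependency, it reads "only" as "no higher-priority rule matched", and it returns the manifest's own type for priority 0. The `RepoSnapshot` shape and helper names are hypothetical.

```typescript
interface RepoSnapshot {
  files: string[]; // root-level file names (assumption)
  dependencies?: string[]; // dependency names from package.json (assumption)
  manifestType?: string; // "type" from shipyard.json, if present
}

interface Rule {
  type: string;
  matches: (r: RepoSnapshot) => boolean;
}

const has = (r: RepoSnapshot, name: string): boolean => r.files.includes(name);
const hasPrefix = (r: RepoSnapshot, p: string): boolean => r.files.some((f) => f.startsWith(p));
const hasExt = (r: RepoSnapshot, ext: string): boolean => r.files.some((f) => f.endsWith(ext));

// Priorities 1 to 9 from the table, in order.
const RULES: Rule[] = [
  { type: "nextjs", matches: (r) => has(r, "package.json") && hasPrefix(r, "next.config.") },
  {
    type: "static-node",
    matches: (r) =>
      has(r, "package.json") &&
      (hasPrefix(r, "vite.config.") || (r.dependencies ?? []).includes("react-scripts")),
  },
  { type: "node", matches: (r) => has(r, "package.json") },
  { type: "spring-boot-maven", matches: (r) => has(r, "pom.xml") },
  { type: "spring-boot-gradle", matches: (r) => has(r, "build.gradle") },
  { type: "php-composer", matches: (r) => has(r, "composer.json") && hasExt(r, ".php") },
  { type: "php-legacy", matches: (r) => hasExt(r, ".php") },
  { type: "static", matches: (r) => has(r, "index.html") },
  { type: "dockerfile", matches: (r) => has(r, "Dockerfile") },
];

function detectBuildpack(repo: RepoSnapshot): string | undefined {
  if (repo.manifestType) return repo.manifestType; // priority 0: explicit config
  return RULES.find((rule) => rule.matches(repo))?.type;
}

console.log(detectBuildpack({ files: ["package.json", "Dockerfile"] })); // "node"
```

Note the consequence of the ordering: a repo with both package.json and a Dockerfile is detected as node, so using the Dockerfile requires setting the type explicitly in shipyard.json.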
shipyard.json override
{
  "type": "nextjs",
  "port": 3000,
  "healthCheckPath": "/api/health",
  "buildCommand": "npm run build:prod",
  "outputDirectory": ".next"
}
PHP webRoot
For php-composer projects, Shipyard auto-detects the NGINX document root:
- public/ directory present → NGINX serves from /var/www/html/public (Laravel / Symfony convention)
- No public/ directory → NGINX serves from /var/www/html (flat structure)
Override explicitly in shipyard.json or via project buildConfig:
{
  "type": "php-composer",
  "webRoot": "web"
}
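Put together, the auto-detection and the override could resolve to a document root as in the sketch below. The function name is hypothetical, and treating `webRoot` as a path relative to /var/www/html is an assumption; the source only shows the value "web".

```typescript
const APP_ROOT = "/var/www/html";

// dirs: root-level directory names in the repo; webRoot: optional override
// from shipyard.json or buildConfig.
function resolveWebRoot(dirs: string[], webRoot?: string): string {
  if (webRoot) {
    // Assumption: the override is relative to the app root; trim stray slashes.
    return `${APP_ROOT}/${webRoot.replace(/^\/+|\/+$/g, "")}`;
  }
  return dirs.includes("public") ? `${APP_ROOT}/public` : APP_ROOT;
}

console.log(resolveWebRoot(["public", "src"])); // "/var/www/html/public"
console.log(resolveWebRoot(["src"])); // "/var/www/html"
console.log(resolveWebRoot(["public"], "web")); // "/var/www/html/web"
```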
Operational Notes
ENCRYPTION_KEK is irreplaceable. All project env vars are encrypted with it. Losing it means they are permanently unreadable. Rotating it requires first decrypting every valueEncrypted row in the database with the old key and re-encrypting it with the new one. Back it up in a separate secrets manager.
| Topic | Detail |
|---|---|
| Database backups | Back up Postgres before every migration. The audit_events and env_vars tables are the most sensitive. |
| Access tokens | 15-minute expiry. Both clients auto-refresh. Refresh tokens are valid for 30 days; after expiry, users must re-run ship login. |
| Preview cleanup | The cron worker cancels previews older than 7 days and deletes all k8s resources (Deployment, Service, ConfigMaps, Secret, IngressRoute) via teardownPreviewResources(). |
| Harbor image GC | The DB-side GC runs every 6 hours. Deleting the actual images from Harbor/Artifact Registry is planned for Sprint 26; registry storage grows until then. |
| Build worker node | The build worker mounts the host Docker socket. Schedule it on a dedicated node pool with a taint/toleration to isolate untrusted build code from production workloads. |
| Metrics & monitoring | Prometheus metrics at GET /metrics (requires Authorization: Bearer <METRICS_TOKEN>). kube-prometheus-stack deployed in the monitoring namespace — scrapes every 30s. Alerts (API down, high error rate, pod crash-looping, queue backlog, high deploy failure rate) fire to isabokea@gmail.com via AlertManager. Grafana dashboard at grafana.shipyard.wake.co.ke (login: admin + GRAFANA_PASSWORD secret). |
| Audit log | Security events in audit_events table. API at GET /teams/:id/audit-log. Forward to a SIEM via a cron export if needed. |
| Cloudflare API token | Stored as k8s secret traefik-cloudflare-token in shipyard-infra. Cloudflare API tokens do not expire by default. If the token is ever rotated: kubectl delete secret traefik-cloudflare-token -n shipyard-infra, recreate it, then kubectl rollout restart deployment/traefik -n shipyard-infra. |
| Dashboard API proxy | The dashboard SPA uses relative /api/v1 paths. The nginx ConfigMap in infra/helm/templates/deployment-dashboard.yaml proxies /api/ to the API ClusterIP service internally. If you ever change the API service name or port, update that proxy rule and redeploy the dashboard. |
| Artifact Registry pull secret | The artifact-registry-key secret in shipyard-system uses a GCP OAuth token that expires in ~1 hour. Renew it from Cloud Shell: kubectl delete secret artifact-registry-key -n shipyard-system && kubectl create secret docker-registry artifact-registry-key --docker-server=africa-south1-docker.pkg.dev --docker-username=oauth2accesstoken --docker-password="$(gcloud auth print-access-token)" --namespace=shipyard-system. For a permanent fix, recreate the node pool with --scopes=cloud-platform. |
| Bitbucket webhooks | In the Bitbucket repository → Settings → Webhooks → Add webhook: set Payload URL to https://api.shipyard.wake.co.ke/api/v1/webhooks/bitbucket, add the webhookSecret shown at git source creation as the Secret. Enable Repository → Push and all Pull Request triggers. The token field when adding a Bitbucket git source should be a Repository Access Token (or Workspace Access Token) with repository:read scope — this is used to clone private repos. |
| Private repo cloning | All three providers support private repos via the PAT/token stored on the git source. Tokens are decrypted at enqueue time and passed to the build job as cloneToken in Redis (ephemeral). They are never written to the DB in plaintext or emitted in build logs. GitHub uses x-access-token:<token>, GitLab uses oauth2:<token>, Bitbucket uses x-token-auth:<token> as the URL credential prefix. |
| CI/CD pipeline | Every push to main on Alisao/shipyard triggers the Deploy workflow. Monitor at github.com/Alisao/shipyard/actions. If a run gets stuck mid-upgrade the Helm release may end up in pending-upgrade state — the pipeline auto-recovers on the next push. Manual recovery: helm rollback shipyard 0 -n shipyard-system. |
| Adding a new migration | Drop a new *.sql file in packages/db/migrations/. The CI/CD pipeline applies it automatically on the next push via the schema_migrations tracker. Never run ALTER TYPE ... ADD VALUE without IF NOT EXISTS — enum additions are not transactional in Postgres and cannot be rolled back. |
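The credential prefixes listed in the "Private repo cloning" row above can be applied to an HTTPS clone URL as in the sketch below. The function name, the repository URL, and the token value are hypothetical; only the three prefixes come from the table.

```typescript
type Provider = "github" | "gitlab" | "bitbucket";

// URL username per provider, as listed in the table.
const CREDENTIAL_USER: Record<Provider, string> = {
  github: "x-access-token",
  gitlab: "oauth2",
  bitbucket: "x-token-auth",
};

// Embeds the token in an HTTPS clone URL. The result is a secret and should
// never be logged.
function authenticatedCloneUrl(provider: Provider, repoUrl: string, token: string): string {
  const url = new URL(repoUrl);
  url.username = CREDENTIAL_USER[provider];
  url.password = token;
  return url.toString();
}

// Placeholder values for illustration only.
console.log(authenticatedCloneUrl("github", "https://github.com/acme/app.git", "tok123"));
// https://x-access-token:tok123@github.com/acme/app.git
```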