forge/plan.md
Jason Hall 4dc1b58f2f initial commit
Signed-off-by: Jason Hall <imjasonh@gmail.com>
2026-05-07 20:02:59 -04:00


Self-Hosted Forgejo on GCP: Complete Plan

A declarative, low-cost, low-maintenance plan for running a personal Forgejo instance on Google Cloud Platform using Container-Optimized OS, Caddy for HTTPS, and IAP for admin access.

Goals and constraints

  • Cost: minimize monthly spend; target ~$2-4/month
  • Maintenance: minimal ongoing effort; OS and app patches should apply automatically
  • Security: minimal attack surface; no public SSH; principle of least privilege for service accounts
  • Reproducibility: entire stack defined in code; terraform apply from a clean project produces a working instance
  • Personal scale: low traffic, single user, occasional pushes

Architectural decisions

| Decision | Choice | Rationale |
| --- | --- | --- |
| Compute | e2-micro VM in us-west1, us-central1, or us-east1 | Always-free tier covers the full month |
| OS | Container-Optimized OS (COS) | Read-only root, automatic patching by Google, minimal attack surface, container-first |
| Database | SQLite on persistent disk | Free, sufficient for personal scale, simple to back up |
| Repo storage | Local persistent disk | Fast, reliable, survives VM replacement |
| TLS | Caddy with Let's Encrypt | Auto-renewing certs with one-line config |
| Git access | HTTPS only with personal access token | No SSH port conflicts, no client-side gcloud setup |
| Admin SSH | IAP TCP forwarding | Public port 22 closed; SSH via authenticated Google tunnel |
| App updates | Watchtower with pinned major version tag | Patch updates automatic; major upgrades deliberate |
| OS updates | COS auto-update | Google manages OS patching |
| Backups | Nightly SQLite snapshot + repo tarball to GCS | Survives disk loss, accidental deletion, region failure |
| Secrets | Google Secret Manager, fetched at boot | Out of Terraform state, out of git, encrypted at rest |
| Infrastructure | Terraform | Declarative, replayable, well-documented for GCP |
| VM bootstrap | cloud-init via instance metadata | Native COS support, idempotent on VM replacement |

Cost estimate

| Item | Monthly cost |
| --- | --- |
| e2-micro VM (always-free region) | $0 |
| 30 GB standard persistent disk (boot + data combined under 30 GB free tier) | $0 |
| Static external IP attached to running VM | ~$2.92 |
| GCS storage for backups (~1 GB, 30-day retention) | ~$0.05 |
| Secret Manager (2 secrets, low access volume) | ~$0.06 |
| Cloud DNS (optional; can use registrar's DNS) | $0.20 or $0 |
| Egress beyond 1 GB free | $0-2 depending on usage |
| Total | ~$3-5/month |

Set a billing budget alert at $10/month to catch surprises early. GCP has no hard spending limit.
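
As a quick check on the arithmetic, the line items in the table can be summed in a few lines of shell. This is only a sketch: `monthly_total` is a hypothetical helper name, and the figures are copied from the table rather than fetched from any pricing API.

```shell
#!/bin/bash
# Sum monthly line items (USD) given as arguments; prints a two-decimal total.
# monthly_total is a hypothetical helper, not part of the repository layout.
monthly_total() {
  printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum }'
}

# Fixed items from the table: static IP, GCS backups, Secret Manager, Cloud DNS.
fixed=$(monthly_total 2.92 0.05 0.06 0.20)
# Egress is the only variable item; the table bounds it at 0 to 2 USD.
low=$(monthly_total "$fixed" 0)
high=$(monthly_total "$fixed" 2)
echo "fixed=$fixed range=$low..$high"
```

Dropping the Cloud DNS item (registrar DNS instead) lowers both ends of the range by $0.20, and either way the total stays well under the $10 budget alert.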

Network exposure

| Port | Protocol | Source | Purpose |
| --- | --- | --- | --- |
| 80 | TCP | 0.0.0.0/0 | Caddy HTTP → HTTPS redirect, ACME HTTP-01 challenge |
| 443 | TCP | 0.0.0.0/0 | Caddy HTTPS → Forgejo |
| 22 | TCP | 35.235.240.0/20 (IAP only) | Admin SSH via IAP tunnel |
| All others | | | Default deny |
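
To make the port 22 rule concrete, the sketch below tests whether a source address falls inside the IAP range 35.235.240.0/20 using plain integer arithmetic. The helper names `ip_to_int` and `ip_in_cidr` are hypothetical, and the code only illustrates what a /20 source range admits; it does not inspect the actual firewall.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  set -- $1          # split the address on dots into $1..$4
  IFS=$old_ifs
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Return 0 if address $1 lies inside CIDR block $2 (for example 35.235.240.0/20).
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  mask=$(( (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF ))
  [ $(( ip & mask )) -eq $(( net & mask )) ]
}

ip_in_cidr 35.235.241.7 35.235.240.0/20 && echo "inside IAP range"
ip_in_cidr 203.0.113.9 35.235.240.0/20 || echo "outside IAP range"
```

A /20 leaves 12 host bits, so the rule admits 35.235.240.0 through 35.235.255.255 and nothing else.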

Repository layout

forgejo-infra/
├── terraform/
│   ├── main.tf              # VM, disk, instance config
│   ├── network.tf           # Firewall rules, static IP
│   ├── iam.tf               # Service account, IAP bindings
│   ├── secrets.tf           # Secret Manager references (values out-of-band)
│   ├── backups.tf           # GCS bucket, lifecycle rules
│   ├── dns.tf               # Optional Cloud DNS record
│   ├── variables.tf
│   ├── outputs.tf
│   └── versions.tf
├── cloud-init/
│   └── user-data.yaml.tpl   # Systemd units, container startup, backup timer
├── config/
│   └── Caddyfile.tpl        # TLS reverse proxy config
├── scripts/
│   ├── bootstrap-secrets.sh # One-time: generate and upload secrets
│   ├── backup.sh            # Run on VM via systemd timer
│   ├── restore.sh           # Manual recovery from GCS tarball
│   └── test-restore.sh      # Verify a backup is restorable
├── docs/
│   ├── runbook.md           # Common operations, troubleshooting
│   └── disaster-recovery.md # Step-by-step recovery procedures
├── .gitignore
└── README.md

Terraform: key resources

main.tf

resource "google_compute_disk" "forgejo_data" {
  name = "forgejo-data"
  type = "pd-standard"
  size = 20
  zone = var.zone
  lifecycle { prevent_destroy = true }
}

resource "google_compute_instance" "forgejo" {
  name         = "forgejo"
  machine_type = "e2-micro"
  zone         = var.zone
  tags         = ["forgejo"]

  boot_disk {
    initialize_params {
      image = "cos-cloud/cos-stable"
      size  = 10
      type  = "pd-standard"
    }
  }

  attached_disk {
    source      = google_compute_disk.forgejo_data.id
    device_name = "forgejo-data"
  }

  network_interface {
    network = "default"
    access_config {
      nat_ip = google_compute_address.forgejo.address
    }
  }

  metadata = {
    user-data = templatefile("${path.module}/../cloud-init/user-data.yaml.tpl", {
      domain            = var.domain
      forgejo_image     = var.forgejo_image
      caddy_image       = var.caddy_image
      gcs_backup_bucket = google_storage_bucket.backups.name
      project_id        = var.project_id
    })
    google-logging-enabled = "true"
    cos-update-strategy    = "update_enabled"
    enable-oslogin         = "TRUE"
  }

  service_account {
    email  = google_service_account.forgejo.email
    scopes = ["cloud-platform"]
  }

  allow_stopping_for_update = true
}

network.tf

resource "google_compute_address" "forgejo" {
  name   = "forgejo-ip"
  region = var.region
}

resource "google_compute_firewall" "https" {
  name      = "allow-https"
  network   = "default"
  direction = "INGRESS"
  allow {
    protocol = "tcp"
    ports    = ["80", "443"]
  }
  source_ranges = ["0.0.0.0/0"]
  target_tags   = ["forgejo"]
}

resource "google_compute_firewall" "iap_ssh" {
  name      = "allow-iap-ssh"
  network   = "default"
  direction = "INGRESS"
  allow {
    protocol = "tcp"
    ports    = ["22"]
  }
  source_ranges = ["35.235.240.0/20"]
  target_tags   = ["forgejo"]
}

iam.tf

resource "google_service_account" "forgejo" {
  account_id   = "forgejo-vm"
  display_name = "Forgejo VM service account"
}

resource "google_secret_manager_secret_iam_member" "forgejo_secrets" {
  for_each  = toset(["forgejo-secret-key", "forgejo-internal-token"])
  project   = var.project_id
  secret_id = each.value
  role      = "roles/secretmanager.secretAccessor"
  member    = "serviceAccount:${google_service_account.forgejo.email}"
}

resource "google_storage_bucket_iam_member" "backups_writer" {
  bucket = google_storage_bucket.backups.name
  role   = "roles/storage.objectAdmin"
  member = "serviceAccount:${google_service_account.forgejo.email}"
}

resource "google_iap_tunnel_instance_iam_member" "ssh_admin" {
  project  = var.project_id
  zone     = var.zone
  instance = google_compute_instance.forgejo.name
  role     = "roles/iap.tunnelResourceAccessor"
  member   = "user:${var.admin_email}"
}

# osAdminLogin (rather than osLogin) grants sudo, which the operations below rely on
resource "google_project_iam_member" "ssh_os_login" {
  project = var.project_id
  role    = "roles/compute.osAdminLogin"
  member  = "user:${var.admin_email}"
}

backups.tf

resource "google_storage_bucket" "backups" {
  name                        = "${var.project_id}-forgejo-backups"
  location                    = var.region
  storage_class               = "STANDARD"
  uniform_bucket_level_access = true

  lifecycle_rule {
    condition { age = 30 }
    action    { type = "Delete" }
  }

  versioning { enabled = false }
}

secrets.tf

# Secrets are created out-of-band by scripts/bootstrap-secrets.sh
# This file only declares them as data sources and grants access (in iam.tf)

data "google_secret_manager_secret" "secret_key" {
  secret_id = "forgejo-secret-key"
}

data "google_secret_manager_secret" "internal_token" {
  secret_id = "forgejo-internal-token"
}

variables.tf

variable "project_id"   { type = string }
variable "region" {
  type    = string
  default = "us-central1"
}
variable "zone" {
  type    = string
  default = "us-central1-a"
}
variable "domain"       { type = string }
variable "admin_email"  { type = string }
variable "forgejo_image" {
  type    = string
  default = "codeberg.org/forgejo/forgejo:11"
}
variable "caddy_image" {
  type    = string
  default = "caddy:2-alpine"
}
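
Because the update strategy depends on image references carrying a major-version tag, a small check like the following could be run over the image strings before deploying. It is a sketch under assumptions: `is_pinned_image` is a hypothetical helper, and it only inspects the text of the reference, not the registry.

```shell
#!/bin/bash
# Return 0 if the image reference carries an explicit tag other than "latest".
# is_pinned_image is a hypothetical helper used for illustration only.
is_pinned_image() {
  name=${1##*/}            # drop registry and namespace, e.g. "forgejo:11"
  case "$name" in
    *:latest) return 1 ;;  # floating tag
    *:*)      return 0 ;;  # explicit tag such as ":11" or ":2-alpine"
    *)        return 1 ;;  # no tag means an implicit ":latest"
  esac
}

for image in codeberg.org/forgejo/forgejo:11 caddy:2-alpine containrrr/watchtower; do
  if is_pinned_image "$image"; then
    echo "pinned:   $image"
  else
    echo "floating: $image"
  fi
done
```

Run against the images in this plan, the check flags the untagged containrrr/watchtower reference in the cloud-init template, which is worth pinning if the "no :latest" rule is meant to cover every container.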

outputs.tf

output "static_ip" {
  value       = google_compute_address.forgejo.address
  description = "Point your domain's A record at this address"
}

output "ssh_command" {
  value       = "gcloud compute ssh forgejo --zone=${var.zone} --tunnel-through-iap"
  description = "Admin SSH via IAP tunnel"
}

Cloud-init: user-data.yaml.tpl

#cloud-config

write_files:
  # systemd requires a mount unit to be named after its mount point;
  # a "-" inside a path component is escaped as \x2d.
  - path: /etc/systemd/system/mnt-disks-forgejo\x2ddata.mount
    content: |
      [Unit]
      Description=Mount Forgejo data disk
      Before=docker.service

      [Mount]
      What=/dev/disk/by-id/google-forgejo-data
      Where=/mnt/disks/forgejo-data
      Type=ext4
      Options=defaults,nofail

      [Install]
      WantedBy=multi-user.target

  - path: /var/lib/forgejo/Caddyfile
    content: |
      ${domain} {
        reverse_proxy forgejo:3000
        encode gzip
      }

  - path: /var/lib/forgejo/fetch-secrets.sh
    permissions: '0755'
    content: |
      #!/bin/bash
      set -euo pipefail
      # JSON is parsed with sed so the script needs no Python or jq on the
      # host (COS is a minimal image).
      TOKEN=$(curl -sf -H "Metadata-Flavor: Google" \
        "http://metadata.google.internal/computeMetadata/v1/instance/service-accounts/default/token" \
        | sed -E 's/.*"access_token" *: *"([^"]+)".*/\1/')
      # Print the decoded value of the latest version of secret $1
      fetch() {
        curl -sf -H "Authorization: Bearer $TOKEN" \
          "https://secretmanager.googleapis.com/v1/projects/${project_id}/secrets/$1/versions/latest:access" \
          | tr -d '\n' | sed -E 's/.*"data" *: *"([^"]+)".*/\1/' | base64 -d
      }
      mkdir -p /run
      umask 077
      {
        echo "FORGEJO__security__SECRET_KEY=$(fetch forgejo-secret-key)"
        echo "FORGEJO__security__INTERNAL_TOKEN=$(fetch forgejo-internal-token)"
      } > /run/forgejo-secrets.env

  - path: /etc/systemd/system/forgejo-stack.service
    content: |
      [Unit]
      Description=Forgejo + Caddy + Watchtower
      After=network-online.target docker.service
      RequiresMountsFor=/mnt/disks/forgejo-data
      Wants=network-online.target

      [Service]
      Type=oneshot
      RemainAfterExit=true
      # /var is mounted noexec on COS, so scripts are run through bash
      ExecStartPre=/bin/bash /var/lib/forgejo/fetch-secrets.sh
      ExecStartPre=-/usr/bin/docker network create web
      # Remove containers left from a previous start so the names are free
      ExecStartPre=-/usr/bin/docker rm -f caddy forgejo watchtower
      ExecStart=/usr/bin/docker run -d --name caddy --network web \
        -p 80:80 -p 443:443 \
        -v /mnt/disks/forgejo-data/caddy:/data \
        -v /var/lib/forgejo/Caddyfile:/etc/caddy/Caddyfile:ro \
        --restart=unless-stopped \
        ${caddy_image}
      ExecStart=/usr/bin/docker run -d --name forgejo --network web \
        -e FORGEJO__server__DISABLE_SSH=true \
        -e FORGEJO__server__ROOT_URL=https://${domain}/ \
        -e FORGEJO__service__DISABLE_REGISTRATION=true \
        -e FORGEJO__database__DB_TYPE=sqlite3 \
        --env-file /run/forgejo-secrets.env \
        -v /mnt/disks/forgejo-data/forgejo:/data \
        --restart=unless-stopped \
        ${forgejo_image}
      ExecStart=/usr/bin/docker run -d --name watchtower \
        -v /var/run/docker.sock:/var/run/docker.sock \
        --restart=unless-stopped \
        containrrr/watchtower --cleanup --schedule "0 0 4 * * *"
      ExecStop=/usr/bin/docker stop watchtower forgejo caddy

      [Install]
      WantedBy=multi-user.target

  - path: /var/lib/forgejo/backup.sh
    permissions: '0755'
    content: |
      #!/bin/bash
      set -euo pipefail
      STAMP=$(date -u +%Y%m%dT%H%M%SZ)
      # Take a consistent SQLite snapshot next to the live database
      docker exec forgejo sqlite3 /data/gitea/gitea.db ".backup '/data/gitea/snapshot.db'"
      # Archive the whole Forgejo data directory (repos, config, snapshot)
      tar czf /tmp/forgejo-$STAMP.tar.gz -C /mnt/disks/forgejo-data forgejo
      # The host has no gsutil, so upload through the cloud-sdk container
      docker run --rm -v /tmp:/tmp google/cloud-sdk:slim \
        gsutil cp /tmp/forgejo-$STAMP.tar.gz gs://${gcs_backup_bucket}/
      rm /tmp/forgejo-$STAMP.tar.gz
      docker exec forgejo rm -f /data/gitea/snapshot.db

  - path: /etc/systemd/system/forgejo-backup.service
    content: |
      [Unit]
      Description=Backup Forgejo to GCS
      After=forgejo-stack.service
      Requires=forgejo-stack.service

      [Service]
      Type=oneshot
      ExecStart=/bin/bash /var/lib/forgejo/backup.sh

  - path: /etc/systemd/system/forgejo-backup.timer
    content: |
      [Unit]
      Description=Nightly Forgejo backup

      [Timer]
      OnCalendar=*-*-* 03:30:00
      Persistent=true

      [Install]
      WantedBy=timers.target

runcmd:
  - mkdir -p /mnt/disks/forgejo-data
  - if ! blkid /dev/disk/by-id/google-forgejo-data; then mkfs.ext4 -F /dev/disk/by-id/google-forgejo-data; fi
  - systemctl daemon-reload
  - systemctl enable --now 'mnt-disks-forgejo\x2ddata.mount'
  - mkdir -p /mnt/disks/forgejo-data/forgejo /mnt/disks/forgejo-data/caddy
  - systemctl enable --now forgejo-stack.service
  - systemctl enable --now forgejo-backup.timer
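
The fetch-secrets.sh step has to turn the JSON returned by Secret Manager's `:access` endpoint into the raw secret value, and that decoding can be tried offline. The sketch below uses only tr, sed, and base64, on the assumption that a minimal host image may not provide an interpreter or a JSON tool. `decode_secret_payload` is a hypothetical name, and the sample response is made up to match the documented response shape (`payload.data` holds the base64-encoded value).

```shell
#!/bin/bash
# Extract payload.data from a Secret Manager ":access" JSON response on stdin
# and print the base64-decoded value. Hypothetical helper for illustration.
decode_secret_payload() {
  tr -d '\n' \
    | sed -E 's/.*"data"[[:space:]]*:[[:space:]]*"([^"]*)".*/\1/' \
    | base64 -d
}

# Made-up sample response; "aGVsbG8=" is base64 for "hello".
sample='{
  "name": "projects/123/secrets/forgejo-secret-key/versions/1",
  "payload": {
    "data": "aGVsbG8=",
    "dataCrc32c": "907060870"
  }
}'
printf '%s\n' "$sample" | decode_secret_payload
```

The pattern anchors on the exact key `"data"`, so the neighbouring `"dataCrc32c"` field is not picked up by mistake. A JSON-aware tool would be more robust where one is available.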

Bootstrap procedure

One-time setup (before first terraform apply)

  1. Create the GCP project and enable required APIs:

    gcloud services enable \
      compute.googleapis.com \
      secretmanager.googleapis.com \
      iap.googleapis.com \
      storage.googleapis.com
    
  2. Generate and upload secrets (scripts/bootstrap-secrets.sh):

    #!/bin/bash
    set -euo pipefail
    for SECRET in forgejo-secret-key forgejo-internal-token; do
      if ! gcloud secrets describe "$SECRET" >/dev/null 2>&1; then
        openssl rand -hex 32 | gcloud secrets create "$SECRET" --data-file=-
        echo "Created $SECRET"
      else
        echo "$SECRET already exists, skipping"
      fi
    done
    
  3. Configure Terraform variables in terraform.tfvars:

    project_id  = "your-project-id"
    domain      = "git.yourdomain.com"
    admin_email = "you@yourdomain.com"
    

First deploy

cd terraform/
terraform init
terraform plan
terraform apply

Note the static_ip output. Point your domain's A record at it. Wait for DNS propagation (a few minutes typically).

Forgejo first-run installer

Visit https://yourdomain in a browser. Forgejo's installer will appear. Configure:

  • Database: SQLite3 (path /data/gitea/gitea.db)
  • Site title: whatever you want
  • Server domain: your domain
  • Server base URL: https://yourdomain/
  • Disable self-registration: yes
  • Create the admin user

After this, the installer is locked. Subsequent VM replacements (terraform-driven) will keep the database and skip the installer.

Generate a personal access token

In Forgejo: Settings → Applications → Generate New Token. Scope it minimally (read/write repository is usually enough). Configure your local git client:

git config --global credential.helper store
# On first push, enter username and the PAT as password; it'll be saved.

Operations

Admin SSH

gcloud compute ssh forgejo --zone=us-central1-a --tunnel-through-iap

Inspect containers

docker ps
docker logs forgejo
docker logs caddy
journalctl -u forgejo-stack.service

Force an update of containers

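# Run a one-off Watchtower check (the Watchtower image has no shell or kill binary to exec into)
docker run --rm -v /var/run/docker.sock:/var/run/docker.sock containrrr/watchtower --run-once --cleanup
# or
docker pull codeberg.org/forgejo/forgejo:11
sudo systemctl restart forgejo-stack.service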

Run a manual backup

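# /var is mounted noexec on COS, so run the script through bash
sudo bash /var/lib/forgejo/backup.sh
# The COS host has no gsutil; list the bucket through the cloud-sdk container
docker run --rm google/cloud-sdk:slim gsutil ls gs://YOUR_PROJECT-forgejo-backups/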

Restore from backup (scripts/restore.sh)

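#!/bin/bash
# Run on the VM. The host has no gsutil, so the download goes through the
# same cloud-sdk container that backup.sh uses.
set -euo pipefail
BACKUP=${1:?usage: restore.sh forgejo-YYYYMMDDTHHMMSSZ.tar.gz}
DATA=/mnt/disks/forgejo-data
sudo systemctl stop forgejo-stack.service
docker run --rm -v /tmp:/tmp google/cloud-sdk:slim \
  gsutil cp "gs://YOUR_PROJECT-forgejo-backups/$BACKUP" /tmp/
sudo rm -rf "$DATA/forgejo"
sudo tar xzf "/tmp/$BACKUP" -C "$DATA/"
# Prefer the consistent snapshot taken by backup.sh over the live database copy
if sudo test -f "$DATA/forgejo/gitea/snapshot.db"; then
  sudo mv "$DATA/forgejo/gitea/snapshot.db" "$DATA/forgejo/gitea/gitea.db"
fi
sudo systemctl start forgejo-stack.service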
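
The restore script takes a backup name as its argument. Because backup.sh names each tarball with a UTC timestamp in `YYYYMMDDTHHMMSSZ` form, a plain lexical sort is also a chronological sort, so the newest backup can be picked from a bucket listing without parsing dates. A minimal sketch, with `latest_backup` as a hypothetical helper and a made-up listing standing in for `gsutil ls` output:

```shell
#!/bin/bash
# Read object URLs or names on stdin and print the newest backup file name.
# Only names of the form forgejo-YYYYMMDDTHHMMSSZ.tar.gz are considered.
latest_backup() {
  sed 's#.*/##' \
    | grep -E '^forgejo-[0-9]{8}T[0-9]{6}Z\.tar\.gz$' \
    | sort \
    | tail -n 1
}

# Made-up listing in the shape that gsutil ls prints (one gs:// URL per line).
printf '%s\n' \
  'gs://example-forgejo-backups/forgejo-20260506T033000Z.tar.gz' \
  'gs://example-forgejo-backups/forgejo-20260507T033000Z.tar.gz' \
  'gs://example-forgejo-backups/notes.txt' \
  | latest_backup
```

The printed name can be passed straight to the restore script as its first argument.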

Major version upgrade of Forgejo

  1. Read the Forgejo release notes for breaking changes
  2. Take a manual backup
  3. Update the forgejo_image variable in Terraform (e.g. codeberg.org/forgejo/forgejo:12)
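  4. Run terraform apply -replace=google_compute_instance.forgejo so the VM is recreated with the new image tag in its user-data (a plain terraform apply normally updates instance metadata in place without recreating the VM)
  5. The data disk persists across the replacement; on first start the new Forgejo version runs any DB migrations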

Disaster recovery

Scenario: VM is unrecoverable

terraform apply recreates the VM. The persistent disk has prevent_destroy, so it survives. Forgejo comes back up with all data intact.

Scenario: Persistent disk is corrupted or deleted

  1. Remove prevent_destroy from the data disk resource (if needed)
  2. terraform apply to create a fresh disk
  3. SSH in and run the restore script with the latest GCS backup

Scenario: Whole project is lost

  1. Create a new GCP project
  2. Run bootstrap-secrets.sh in the new project (generates new secrets — DB tables encrypted with the old SECRET_KEY for things like 2FA will need re-setup, but repos and basic data are fine)
  3. Update project_id in tfvars
  4. terraform apply
  5. Manually copy the latest backup tarball from old project's GCS bucket to new one (do this BEFORE deleting the old project)
  6. Run restore script

Note: rotating SECRET_KEY invalidates 2FA tokens and some encrypted fields. For a true bit-exact recovery, also back up the secrets to a password manager you control.

Scenario: Backup itself is corrupt

This is why we test restores. scripts/test-restore.sh should:

  1. Spin up a temporary VM (or use a local Docker setup)
  2. Restore the latest backup
  3. Verify Forgejo starts and at least one repo is browsable
  4. Tear down

Run this monthly. Calendar reminder.
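
Before spinning anything up, test-restore.sh could begin with a cheap structural check on the downloaded tarball. The sketch below is one way to do that, under stated assumptions: `backup_looks_restorable` is a hypothetical function, the database path follows the `/data/gitea/` location used elsewhere in this plan, and the `forgejo/git/` directory for repositories is assumed from the usual Forgejo container layout and should be confirmed against a real backup.

```shell
#!/bin/bash
# Return 0 if the tarball at $1 is readable and contains the expected pieces:
# a SQLite database under forgejo/gitea/ and a repository tree under
# forgejo/git/ (the latter path is an assumption about the container layout).
backup_looks_restorable() {
  listing=$(tar tzf "$1") || return 1                              # unreadable or corrupt archive
  echo "$listing" | grep -q '^forgejo/gitea/.*\.db$' || return 1   # database file missing
  echo "$listing" | grep -q '^forgejo/git/' || return 1            # repositories missing
}
```

A call such as `backup_looks_restorable /tmp/forgejo-20260507T033000Z.tar.gz || echo "backup failed the structural check"` catches truncated uploads and empty archives early. It does not replace the full restore-and-browse test described above.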

Security checklist

  • Public SSH (port 22 from 0.0.0.0/0) blocked at firewall
  • Admin SSH only via IAP tunnel
  • OS Login enabled (no SSH keys in metadata)
  • HTTPS-only; HTTP redirects to HTTPS via Caddy
  • Forgejo registration disabled
  • Service account has minimum required permissions (Secret Manager read for two specific secrets, Storage write to one specific bucket)
  • Secrets in Secret Manager, not in Terraform state or git
  • COS auto-updates enabled for OS patching
  • Watchtower for application patch updates
  • Major version upgrades pinned (no :latest)
  • Billing budget alert at $10/month
  • Backups encrypted at rest in GCS (default), 30-day retention
  • Manual: enable 2FA on your GCP account (the IAP gate is only as strong as your Google login)
  • Manual: enable 2FA on your Forgejo admin account after first login
  • Manual: store secret values in a password manager for cross-project recovery

Maintenance schedule

| Frequency | Task |
| --- | --- |
| Continuous | Watchtower handles app patch updates; COS handles OS patches |
| Daily | Automatic backup at 03:30 UTC |
| Monthly | Run test-restore.sh to verify backups are restorable |
| Monthly | Review GCP billing for anomalies |
| Quarterly | Review Forgejo release notes; consider major version upgrade |
| Annually | Rotate SECRET_KEY and INTERNAL_TOKEN (requires care; see Forgejo docs) |
| Annually | Review IAM bindings; remove anything unused |

Open questions and future work

  • Email notifications: Forgejo can send issue/PR emails. Easiest path is configuring SMTP via a free-tier transactional email provider (e.g. Brevo, SendGrid). Not covered here; add as FORGEJO__mailer__* env vars when needed.
  • Forgejo Actions (CI): Runs on dedicated runners. The e2-micro is too small to host runners. If wanted, run a runner on a separate cheap host or skip CI.
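  • Repo size growth: the 20 GB data disk (30 GB of persistent disk in total with the 10 GB boot disk) holds a lot of personal repos but isn't infinite. Monitor with a simple disk-usage alert. Resizing the disk is online and non-disruptive on GCP, though the ext4 filesystem must be grown afterwards and capacity beyond the 30 GB free tier is billed.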
  • Multiple users: this design assumes one user. Adding more is fine (Forgejo handles it natively) but reconsider the registration-disabled and HTTPS-token approach if multiple humans need access.
  • Geographic redundancy: not in scope. Backups in GCS are regional; for multi-region durability use a multi-region bucket (slightly more expensive).
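
The disk-usage alert mentioned under repo size growth could start as a check on `df` output, run from a timer. This is a sketch under assumptions: the function names and the 80% threshold are illustrative, and how the alert is delivered (log line, email, Cloud Monitoring) is left open.

```shell
#!/bin/bash
# Read `df -P` output on stdin and print the use percentage for mount point $1.
mount_usage_percent() {
  awk -v mp="$1" 'NR > 1 && $6 == mp { sub(/%/, "", $5); print $5 }'
}

# Return 0 when usage of mount point $1 is at or above threshold $2 (percent).
# $1 must be the mount point itself, not a directory below it.
usage_exceeds() {
  pct=$(df -P "$1" | mount_usage_percent "$1")
  [ -n "$pct" ] && [ "$pct" -ge "$2" ]
}

# Made-up df -P output for the data disk, used to exercise the parser offline.
printf '%s\n' \
  'Filesystem 1024-blocks Used Available Capacity Mounted on' \
  '/dev/sdb 20511312 17434616 2011752 90% /mnt/disks/forgejo-data' \
  | mount_usage_percent /mnt/disks/forgejo-data
```

On the VM, `usage_exceeds /mnt/disks/forgejo-data 80 && echo "data disk above 80%"` is the kind of line a timer-driven script could log.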

Appendix: useful references