
add budget alert and nightly OS-update reboot

- $10/month project budget via google_billing_budget, alerts to admin_email
- forgejo-reboot.timer at 04:30 UTC applies staged COS updates
- relocate cloud-init scripts to /var/lib/google/forgejo (COS noexec on /var)
- runbook: updated zone, script paths, added "How updates work" section

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Jason Hall 2026-05-07 20:35:58 -04:00
parent 4dc1b58f2f
commit 15ea287728
5 changed files with 115 additions and 5 deletions

@@ -39,6 +39,30 @@ Single container only:
docker restart forgejo
```
## How updates work
| Layer | Mechanism | Schedule |
|---|---|---|
| Host OS (COS) | `cos-update-strategy=update_enabled` stages updates onto the inactive A/B partition; reboot applies them. | Applied on the nightly reboot below. |
| Forgejo & Caddy patch updates | Watchtower pulls new image digests for the pinned tags (`forgejo:11`, `caddy:2-alpine`). | 04:00 UTC daily (inside the watchtower container; cron `0 0 4 * * *`). |
| Forgejo major version (e.g. 11→12) | Bump `var.forgejo_image` in tfvars and `terraform apply` — VM is replaced, data disk persists, first boot runs DB migrations. | Manual / deliberate. |
| Watchtower itself | Runs the untagged `containrrr/watchtower` image (no tag = `latest`, so it is not actually pinned); self-updates and removes superseded images via `--cleanup`. | 04:00 UTC daily. |
| Backups | `forgejo-backup.service` via timer. | 03:30 UTC daily. |
| Reboot to apply COS updates | `forgejo-reboot.service` runs `shutdown -r +0`. Containers come back via `forgejo-stack.service` + `--restart=unless-stopped`. | 04:30 UTC daily. ~30-60s downtime. |
Nightly order: backup at 03:30 → container update check at 04:00 → reboot at 04:30 (all UTC). The backup always lands before the container update and the reboot, so a bad update can be rolled back from the copy in GCS.
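If any of the three schedules is changed later, the ordering is the property worth re-checking. The following is a minimal sketch of such a check, assuming the times are entered by hand from the table above; `to_minutes` and `check_order` are hypothetical helper names, not part of this repository.

```bash
#!/usr/bin/env bash
# Sketch: confirm the nightly jobs run in the order
# backup -> container update check -> reboot.
# Times are copied by hand from the table; nothing is read from the VM.

# Convert an HH:MM (UTC) string to minutes since midnight.
to_minutes() {
  local hh=${1%%:*} mm=${1##*:}
  # 10# forces base 10 so values like "08" are not parsed as octal.
  echo $(( 10#$hh * 60 + 10#$mm ))
}

# Succeed only if every time is strictly later than the one before it.
check_order() {
  local prev=-1 t cur
  for t in "$@"; do
    cur=$(to_minutes "$t")
    (( cur > prev )) || return 1
    prev=$cur
  done
}

if check_order 03:30 04:00 04:30; then
  echo "order ok"
else
  echo "order violated"
fi
```

The check only compares clock times within a single UTC day, so it would need adjusting if a job were ever moved across midnight.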
### Disable the nightly reboot
If the reboot ever causes trouble, turn it off without affecting backups or container updates:
```bash
gcloud compute ssh forgejo --zone=us-east1-b --tunnel-through-iap \
--command='sudo systemctl disable --now forgejo-reboot.timer'
```
Re-enable with `enable --now` instead of `disable --now`. Cloud-init will re-enable it on the next VM replacement regardless.
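To avoid retyping the long `gcloud` invocation, the three actions can be wrapped in a small helper. This is a sketch only: `reboot_timer_cmd` is a hypothetical name, the `status` action is an addition built on the standard `systemctl is-enabled` and `systemctl list-timers` subcommands, and the function prints the command rather than running it so it can be reviewed first.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print (not run) the gcloud command for a timer action.
# Instance name, zone and unit name are taken from the runbook command above.
reboot_timer_cmd() {
  local action=$1 remote
  case "$action" in
    enable|disable)
      remote="sudo systemctl $action --now forgejo-reboot.timer" ;;
    status)
      # is-enabled reports the enablement state; list-timers shows the next run.
      remote="systemctl is-enabled forgejo-reboot.timer; systemctl list-timers forgejo-reboot.timer" ;;
    *)
      echo "usage: reboot_timer_cmd enable|disable|status" >&2
      return 2 ;;
  esac
  printf "gcloud compute ssh forgejo --zone=us-east1-b --tunnel-through-iap --command='%s'\n" "$remote"
}

# Show what would be run to turn the nightly reboot off.
reboot_timer_cmd disable
```

Pipe the output to `bash` (for example `reboot_timer_cmd status | bash`) once the printed command looks right.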
## Update containers immediately
Watchtower pulls new images at 04:00 UTC by default. To force now:
@@ -106,5 +130,5 @@ Rotating `SECRET_KEY` invalidates 2FA and some encrypted DB fields. Read the For
## Cost / billing watch
-- Set a project budget alert at $10/month in Cloud Billing (manual; not in Terraform by design — the budget API requires the billing-account-admin role).
+- A $10/month project budget is managed by `terraform/budget.tf`. Email alerts at 50%, 90%, 100% (current spend) and 100% (forecasted) go to `admin_email`. Adjust the threshold via `budget_amount_usd` in tfvars.
- Skim the billing report monthly. Egress is the most likely surprise.
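
As a quick sanity check on the alert percentages, the dollar amount behind each one can be computed from the budget. This is a minimal sketch assuming a whole-dollar budget; `threshold_usd` is a hypothetical helper, and the shell variable below only mirrors the `budget_amount_usd` tfvars value by hand rather than reading it from Terraform.

```bash
#!/usr/bin/env bash
# Sketch: dollar amount at which each budget alert percentage is reached.
# Assumes a whole-dollar budget; set this to match budget_amount_usd in tfvars.
budget_amount_usd=10

# Print the spend (USD, two decimals) corresponding to a percentage of the budget.
threshold_usd() {
  local budget=$1 percent=$2
  # dollars * percent equals the threshold in cents, so integer math suffices.
  local cents=$(( budget * percent ))
  printf '%d.%02d\n' $(( cents / 100 )) $(( cents % 100 ))
}

for pct in 50 90 100; do
  echo "${pct}% of \$${budget_amount_usd} = \$$(threshold_usd "$budget_amount_usd" "$pct")"
done
```

With the default $10 budget, the alerts correspond to $5.00, $9.00 and $10.00 of spend.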