forked from Bos55/nix-config

feat: implement Attic binary cache with remote build support and sops-nix integration

This commit is contained in:
parent 5a031b48ed
commit ffe7572c7d

15 changed files with 772 additions and 4 deletions
37  .agent/rules/bos55-nix-style.md  Normal file

@@ -0,0 +1,37 @@
# Bos55 NixOS Configuration Style Guide

Follow these rules when modifying or extending the Bos55 NixOS configuration.

## 1. Network & IP Management

- **Local Ownership**: Define host IP addresses only within their respective host configuration files (e.g., `hosts/BinaryCache/default.nix`).
- **Dynamic Discovery**: Do NOT use global IP mapping modules. Instead, use inter-host evaluation to resolve IPs and ports at build time:

  ```nix
  # In another host's config
  let
    bcConfig = inputs.self.nixosConfigurations.BinaryCache.config;
    bcIp = (pkgs.lib.head bcConfig.networking.interfaces.ens18.ipv4.addresses).address;
  in "http://${bcIp}:8080"
  ```

## 2. Modular Service Design

- **Encapsulation**: Services must be self-contained. Options like `openFirewall`, `port`, and `enableRemoteBuilder` should live in the service module (`modules/services/<service>/default.nix`).
- **Firewall Responsibility**: The service module is responsible for opening firewall ports (e.g., TCP 8080, SSH 22) based on its own options. Do not open ports manually in host files if the service provides an option.
- **Remote Builders**: If a service like Attic supports remote building, include the `builder` user, trusted-users, and SSH configuration within that module's options.

## 3. Container Networking

- **Discovery by Name**: Host services should connect to their companion containers (e.g., PostgreSQL) using the container name rather than `localhost` or bridge IPs.
- **Host Resolution**: Use `networking.extraHosts` in the service module to map the container name to `127.0.0.1` on the host for seamless traffic routing.
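The two rules above combine into a small module fragment; the container name `attic-db` here is illustrative, not taken from the repo:

```nix
{ config, lib, ... }:
{
  # Resolve the companion container by name on the host itself,
  # so the service never needs to know the bridge IP.
  networking.extraHosts = ''
    127.0.0.1 attic-db
  '';

  # The host service can then use the container name in its connection
  # string, e.g. "postgresql://attic@attic-db/attic" (hypothetical).
}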

## 4. Secrets Management (sops-nix)

- **Centralized Config**: Fleet-wide `sops-nix` settings (like `defaultSopsFile` and `age.keyFile`) must live in `modules/common/default.nix`.
- **No Hardcoded Paths**: Always use `config.sops.secrets."path/to/secret".path` to reference credentials.

## 5. DNS & DNS Zone Files

- **Serial Increment**: Every change to a Bind9 zone file (e.g., `db.depeuter.dev`) MUST increment the `Serial` number in the SOA record.
- **Specific Domains**: Prefer a single, well-defined domain (e.g., `nix-cache.depeuter.dev`) over multiple aliases or magic values.
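As a reminder of where the serial lives, a minimal SOA record looks like this (the serial value mirrors the zone change later in this commit; the other timers are illustrative):

```
$TTL 604800
@ IN SOA ns1 admin (
    16       ; Serial -- bump this on EVERY zone change
    604800   ; Refresh
    86400    ; Retry
    2419200  ; Expire
    604800 ) ; Negative cache TTL
```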

## 6. CI/CD Robustness

- **IP-Based Login**: When CI runners (Gitea Actions) need to interact with internal services, use direct IP addresses (e.g., `192.168.0.25`) for login/auth to bypass potential DNS resolution issues in the runner environment.

## 7. No Magic Values

- **Shared Variables**: If a port or string is used in multiple places within a module (e.g., for the service listener and the DB connection string), use a variable or option to ensure they always stay in sync.
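A sketch of the shared-variable rule; the option paths are assumptions for illustration:

```nix
let
  # Single source of truth for the listener port; reused below so the
  # firewall rule can never drift from the actual listener.
  atticPort = 8080;
in {
  services.atticd.settings.listen = "[::]:${toString atticPort}";
  networking.firewall.allowedTCPPorts = [ atticPort ];
}
```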
47  .agent/skills/bos55-nix-config/SKILL.md  Normal file

@@ -0,0 +1,47 @@
---
name: bos55-nix-config
description: Best practices and codestyle for the Bos55 NixOS configuration project.
---

# Bos55 NixOS Configuration Skill

This skill provides the core principles and implementation patterns for the Bos55 NixOS project. Use this skill when adding new hosts, services, or networking rules.

## Core Principles

### 1. Minimal Hardcoding
- **Host IPs**: Always define IPv4/IPv6 addresses within the host configuration (`hosts/`).
- **Options**: Prefer `lib.mkOption` over hardcoded strings for ports, domain names, and database credentials.
- **Unified Variables**: If a value is shared (e.g., between a PG container and a host service), define a local variable (e.g., `let databaseName = "attic"; in ...`) to ensure consistency.

### 2. Service-Driven Configuration
- **Encapsulation**: Service modules should manage their own firewall rules, users/groups, and SSH settings.
- **Trusted Access**: Use the service module to define `nix.settings.trusted-users` for things like remote builders.

### 3. Build-Time Discovery
- **Inter-Host Evaluation**: To avoid magic values, resolve a host's IP or port by evaluating its configuration in the flake's output:

  ```nix
  bcConfig = inputs.self.nixosConfigurations.BinaryCache.config;
  ```
- **Domain Derivation**: Client modules should derive their default domain settings from the server module's domain option.

## Implementation Patterns

### Container-Host Connectivity
- **Pattern**: `Service` on host -> `Container` via bridge mapping.
- **Rule**: Map the container name to `127.0.0.1` using `networking.extraHosts` to allow the host service to resolve the container by name without needing the bridge IP.

### Secrets Management
- **Rule**: Standardize all secrets via `sops-nix`.
- **Common Module**: Ensure `modules/common/default.nix` handles the default `sopsFile` and `age` key configuration.
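A minimal sketch of such a common module; the secret name and file paths are placeholders, not the repo's actual layout:

```nix
{ config, ... }:
{
  sops = {
    # Fleet-wide defaults; individual hosts/modules declare their own secrets.
    defaultSopsFile = ../../secrets/secrets.yaml; # illustrative path
    age.keyFile = "/var/lib/sops-nix/key.txt";    # host-local age key

    # Downstream modules then reference
    # config.sops.secrets."attic/server-token".path, never a literal path.
    secrets."attic/server-token" = { };
  };
}
```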

### Bind9 Management
- **Rule**: **ALWAYS** increment the serial when editing zone records.

### CI/CD Networking
- **Rule**: Use direct IPs for machine-to-machine login steps in Actions workflows to ensure reliability across different runner environments.

## Security & Documentation

- **Supply Chain Protection**: Always verify and lock Nix flake inputs. Use fixed-output derivations for external resource downloads.
- **Assumptions Documentation**: Clearly document environment assumptions (e.g., Proxmox virtualization, Tailscale networking, and specific IP ranges) in host or service READMEs.
- **Project Structure**: Maintain the separation of `hosts`, `modules`, `users`, and `secrets` to ensure clear ownership and security boundaries.
10  .github/workflows/build.yml  vendored

@@ -12,7 +12,7 @@ jobs:
       hosts: ${{ steps.hosts.outputs.hostnames }}
     steps:
       - uses: actions/checkout@v5
-      - uses: https://github.com/cachix/install-nix-action@v31
+      - uses: cachix/install-nix-action@v31
         with:
           nix_path: nixpkgs=channel:nixos-unstable
       - name: "Determine hosts"
@@ -34,10 +34,16 @@

     steps:
       - uses: actions/checkout@v5
-      - uses: https://github.com/cachix/install-nix-action@v31
+      - uses: cachix/install-nix-action@v31
         with:
           nix_path: nixpkgs=channel:nixos-unstable
       - name: "Build host"
         run: |
           nix build ".#nixosConfigurations.${{ matrix.hostname }}.config.system.build.toplevel" --verbose
+      - name: "Push to Attic"
+        if: github.event_name == 'push' && github.ref == 'refs/heads/main'
+        run: |
+          nix profile install nixpkgs#attic-client
+          attic login homelab http://192.168.0.25:8080 "${{ secrets.ATTIC_TOKEN }}"
+          attic push homelab result
81  docs/binary-cache/binary-cache-options.md  Normal file

@@ -0,0 +1,81 @@
# Nix Binary Cache Options Comparison

This document provides a formal comparison of various binary cache solutions for Nix, to help decide on the best fit for your Homelab and external development machines.

## Overview of Options

| Option | Type | Backend | Multi-tenancy | Signing | Best For |
| :--- | :--- | :--- | :--- | :--- | :--- |
| **Attic** | Self-hosted Server | S3 / Local / PG | Yes | Server-side | Teams/Homelabs with multiple caches and tenants. |
| **Harmonia** | Self-hosted Server | Local Store | No | Server-side | Simple setups serving a single machine's store. |
| **nix-serve** | Self-hosted Server | Local Store | No | Server-side | Legacy/Basic setups. |
| **Cachix** | Managed SaaS | Hosted S3 | Yes | Cloud-managed | Users who want zero maintenance and global speed. |
| **Simple HTTP/S3** | Static Files | S3 / Web Server | No | Client-side | Minimalist, low-cost static hosting. |

---
## Detailed Analysis

### 1. Attic (The "Modern" Choice)
Attic is a modern, high-performance Nix binary cache server written in Rust.

* **Benefits:**
    * **Global Deduplication**: If multiple caches (tenants) contain the same binary, it's only stored once.
    * **Multi-tenancy**: You can create separate, isolated caches for different projects or users.
    * **Management CLI**: Comes with an excellent CLI (`attic login`, `attic use`, `attic push`) that makes client configuration trivial.
    * **Automatic Signing**: The server manages the private keys and signs paths on the fly.
    * **Garbage Collection**: Supports LRU-based garbage collection.
* **Downsides:**
    * **Complexity**: Requires a PostgreSQL database and persistent storage (though it can run in Docker).
    * **Overhead**: Might be slight overkill for a single-user homelab.

### 2. Harmonia (The "Speed" Choice)
Harmonia is a fast, lightweight server that serves the local `/nix/store` directly.

* **Benefits:**
    * **Extreme Performance**: Written in Rust; supports zstd compression and HTTP range requests for streaming.
    * **Simple Setup**: If you already have a "Build Server", you just run Harmonia on it to expose its store.
    * **Modern**: Uses the `nix-daemon` protocol for better security/integration.
* **Downsides:**
    * **Single Machine**: Only serves the store of the host it's running on.
    * **No Multi-tenancy**: No isolation between different caches.

### 3. nix-serve (The "Classic" Choice)
The original Perl implementation for serving a Nix store.

* **Benefits:**
    * **Compatibility**: Virtually every Nix system knows how to talk to it.
* **Downsides:**
    * **Performance**: Slower than Rust alternatives; lacks native compression optimizations.
    * **Maintenance**: Requires Nginx for HTTPS/IPv6 support.

### 4. Cachix (The "No-Maintenance" Choice)
A managed service that "just works".

* **Benefits:**
    * **Zero Infrastructure**: No servers to manage.
    * **Global Reach**: Uses a CDN for fast downloads everywhere.
* **Downsides:**
    * **Cost**: Private caches usually require a subscription.
    * **Privacy**: Your binaries are stored on third-party infrastructure.

### 5. Simple HTTP / S3 (The "Minimalist" Choice)
Pushing files to a bucket and serving them statically.

* **Benefits:**
    * **Cheap/Offline**: No server process running.
    * **Robust**: No database or service to crash.
* **Downsides:**
    * **Static Signing**: You must sign binaries on the CI machine before pushing.
    * **No GC**: Managing deletes in a static bucket is manual and error-prone.

---

## Recommendation

For your requirement of **Homelab integration + External machines**, **Attic** remains the strongest candidate because:
1. **Ease of Client Setup**: Your personal machines only need to run `attic login` and `attic use` once.
2. **CI Synergy**: Gitea Actions can push to Attic using standard tokens without needing SSH access to the server's store.
3. **Sovereignty**: You keep all your data within your own infrastructure.

If you prefer something simpler that just "exposes" your existing build host, **Harmonia** is the runner-up.
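The one-time client setup mentioned in point 1 looks roughly like this; the cache name `homelab` and the URL/token match this repo's CI step, but treat them as placeholders for your own values:

```bash
# Authenticate against the Attic server once.
attic login homelab http://192.168.0.25:8080 "$ATTIC_TOKEN"

# Configure this machine to use the cache as a substituter and trust its key.
attic use homelab

# Push a locally built closure to the cache.
attic push homelab ./result
```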
288  docs/binary-cache/implementation_plan.md  Normal file

@@ -0,0 +1,288 @@
# NixOS CI/CD Automated Deployment with deploy-rs

## Overview

Implement a push-based automated deployment pipeline using **deploy-rs** for the homelab NixOS fleet. The pipeline builds on every push/PR, deploys on merge to `main`, and supports `test-<hostname>` branches for non-persistent trial deployments.

---

## Architecture

```
┌─────────────┐    push     ┌──────────────────┐
│  Developer  │────────────▶│  Forgejo (Git)   │
└─────────────┘             └────────┬─────────┘
                                     │
                    ┌────────────────┼────────────────┐
                    ▼                ▼                ▼
             ┌─────────────┐  ┌───────────┐  ┌──────────────┐
             │ CI: Build   │  │ CI: Check │  │ CI: Deploy   │
             │ all hosts   │  │ flake +   │  │ (main only)  │
             │ (every push)│  │ deployChk │  │ via deploy-rs│
             └──────┬──────┘  └───────────┘  └──────┬───────┘
                    │                               │ SSH
                    ▼                               ▼
             ┌─────────────┐              ┌──────────────────┐
             │  Harmonia   │◀─── push ────│   Target Hosts   │
             │ Binary Cache│─── pull ────▶│ (NixOS machines) │
             └─────────────┘              └──────────────────┘
```

---
## Key Design Decisions

### Test branch activation (`test-<hostname>`)

deploy-rs's `activate.nixos` calls `switch-to-configuration switch` by default, which updates the bootloader. For test branches, we create a **separate profile** using `activate.custom` that calls `switch-to-configuration test` instead — this activates the configuration immediately but **does not update the bootloader**. On reboot, the host falls back to the last `switch`-deployed generation.

Magic rollback still works on test deployments: deploy-rs confirms the host is reachable after activation and auto-reverts if it can't connect.

```nix
# Test activation: active now, but reboot reverts to previous boot entry
activate.custom base.config.system.build.toplevel ''
  cd /tmp
  $PROFILE/bin/switch-to-configuration test
''
```

### Zero duplication in `flake.nix`

Use `builtins.mapAttrs` over `self.nixosConfigurations` to generate `deploy.nodes` automatically. Host metadata (IP, whether to enable deploy) is stored once per host config.

### Renovate bot compatibility

The pipeline is fully compatible with Renovate:
- **Minor/patch updates**: Renovate opens a PR → CI builds all hosts → Renovate auto-merges → CI deploys (uses `switch`, updates bootloader)
- **Major updates**: Renovate opens a PR → CI builds → waits for manual review → merge → deploy with `switch` (persists across reboot)
- The deploy step differentiates using the **branch name**, not the commit source, so Renovate PRs behave identically to human PRs

### System version upgrades (kernel, etc.)

When a deployment requires a reboot (e.g., kernel upgrade):
1. CI deploys with the `--boot` flag → calls `switch-to-configuration boot` (sets the new generation as boot default without activating it)
2. A separate reboot step (manual or scheduled) activates the change

> [!IMPORTANT]
> deploy-rs does not auto-detect whether a reboot is needed. The workflow can check if the kernel or initrd changed and conditionally use `--boot` instead, or always use `switch` and document that the operator should reboot when `nixos-rebuild` would have shown `reboot required`.

---
## Security & Trust Boundaries

### Trust model diagram

```
┌─────────────────────────────────────────────────────┐
│ TRUST ZONE 1                                        │
│ Developer Workstations                              │
│ • Holds sops-nix age keys (decrypt secrets)         │
│ • Holds GPG/SSH keys for signed commits             │
│ • Can manually deploy via deploy-rs                 │
│ • Can push to any branch                            │
└──────────────────────┬──────────────────────────────┘
                       │ git push (signed commits)
                       ▼
┌─────────────────────────────────────────────────────┐
│ TRUST ZONE 2                                        │
│ Forgejo + CI Runner                                 │
│ • Holds CI SSH deploy key (DEPLOY_SSH_KEY secret)   │
│ • Does NOT hold sops-nix age keys                   │
│ • Branch protection: main requires PR + checks      │
│ • Can only deploy via the deploy user               │
│ • Builds are sandboxed in Nix                       │
└──────────────────────┬──────────────────────────────┘
                       │ SSH as "deploy" user
                       ▼
┌─────────────────────────────────────────────────────┐
│ TRUST ZONE 3                                        │
│ Target NixOS Hosts                                  │
│ • deploy user: system user, no shell login          │
│ • sudo: ONLY nix-env --set and                      │
│   switch-to-configuration (NOPASSWD)                │
│ • No write access to /etc, home dirs, etc.          │
│ • sops secrets decrypted at activation via host     │
│   age keys (not CI keys)                            │
└─────────────────────────────────────────────────────┘
```
### What each actor can do

| Actor | Can build | Can deploy | Can decrypt secrets | Can access hosts |
|---|---|---|---|---|
| Developer | ✅ | ✅ (manual) | ✅ (personal age keys) | ✅ (personal SSH) |
| CI runner | ✅ | ✅ (deploy user) | ❌ | Limited (deploy user) |
| deploy user | ❌ | ✅ (sudo restricted) | ❌ | N/A (runs on host) |
| Host age key | ❌ | ❌ | ✅ (own secrets only) | N/A |

### Hardening measures

1. **Branch protection** on `main`: require PR, passing checks, optional signed commits
2. **deploy user** ([`users/deploy/default.nix`](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/users/deploy/default.nix)): restricted sudoers, no home dir, system user
3. **CI secret isolation**: SSH key only, no age keys in CI — secrets are decrypted on-host at activation time by sops-nix using host-specific age keys
4. **Magic rollback**: if a deploy renders the host unreachable, deploy-rs auto-reverts within the confirm timeout
5. **`nix flake check` + `deployChecks`**: validate the flake structure and deploy configuration before any deployment

> [!NOTE]
> The deploy user SSH key is stored as a Forgejo Actions secret. Even if the CI runner is compromised, the attacker can only push Nix store paths and trigger `switch-to-configuration` — they cannot decrypt secrets, access user data, or escalate beyond what the restricted sudoers rules allow.

---
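The restricted sudoers rules from point 2 could be expressed as follows; the exact command paths are assumptions, not the repo's actual module:

```nix
{
  security.sudo.extraRules = [{
    users = [ "deploy" ];
    commands = [
      # Allow only profile updates and activation, nothing else.
      { command = "/run/current-system/sw/bin/nix-env"; options = [ "NOPASSWD" ]; }
      { command = "/nix/store/*/bin/switch-to-configuration"; options = [ "NOPASSWD" ]; }
    ];
  }];
}
```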
## Proposed Changes

### 1. Flake configuration

#### [MODIFY] [flake.nix](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/flake.nix)

- Add `deploy-rs` to flake inputs
- Auto-generate `deploy.nodes` from `self.nixosConfigurations` using `mapAttrs` — **zero duplication**
- Add a `checks` output via `deploy-rs.lib.deployChecks`
- Define a helper that reads each host's `config.networking` for hostname/IP

```nix
# Sketch of the deploy output (no per-host duplication)
deploy.nodes = builtins.mapAttrs (name: nixos: {
  hostname = nixos.config.homelab.deploy.targetHost; # defined per host
  sshUser = "deploy";
  user = "root";
  magicRollback = true;
  autoRollback = true;
  profiles.system = {
    path = deploy-rs.lib.x86_64-linux.activate.nixos nixos;
  };
}) (lib.filterAttrs
  (name: nixos: nixos.config.homelab.users.deploy.enable or false)
  self.nixosConfigurations);
```

---

### 2. Deploy user module

#### [MODIFY] [default.nix](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/users/deploy/default.nix)

- Add an option `homelab.deploy.targetHost` (string, the IP/hostname for deploy-rs to SSH into)
- Support multiple SSH authorized keys (CI key + personal workstation keys)
- Add a `nix.settings.trusted-users` option for the deploy user (needed for `nix copy` from the cache)

---

### 3. Enable deploy user on target hosts

#### [MODIFY] Host `default.nix` files (per host)

- Enable `homelab.users.deploy.enable = true` on all deployable hosts
- Set `homelab.deploy.targetHost` to each host's IP (e.g., `"192.168.0.10"` for Ingress)
- Currently only `Niko` has deploy enabled; extend to all non-`Template` hosts

---
### 4. Binary cache (Harmonia)

#### [NEW] [modules/services/harmonia/default.nix](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/modules/services/harmonia/default.nix)

- Create a `homelab.services.harmonia` module wrapping `services.harmonia`
- Generates a signing key pair for the cache
- Configures an Nginx reverse proxy with HTTPS (via ACME or an internal cert)
- All hosts are configured to use the cache as a substituter via `nix.settings.substituters`

> [!TIP]
> Harmonia is chosen over Attic (simpler, no database needed) and nix-serve (better performance, streaming, zstd compression). It serves your `/nix/store` directly, so the CI runner can `nix copy` built closures to the cache host after a successful build.

#### [NEW] [modules/common/nix-cache.nix](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/modules/common/nix-cache.nix)

- Configure all hosts to use the binary cache as a substituter
- Add the cache's public signing key to `trusted-public-keys`
- Usable by personal devices too (add the cache URL + public key to their `nix.conf`)
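A minimal sketch of what `nix-cache.nix` could contain; the homelab cache URL and key are placeholders (only the cache.nixos.org key is the real, well-known one):

```nix
{
  nix.settings = {
    # Prefer the homelab cache, then fall back to the official cache.
    substituters = [
      "https://nix-cache.depeuter.dev"  # assumed cache domain
      "https://cache.nixos.org"
    ];
    trusted-public-keys = [
      "nix-cache.depeuter.dev:AAAA...placeholder..."  # the cache's public signing key
      "cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY="
    ];
  };
}
```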

---

### 5. CI Workflows

#### [MODIFY] [build.yml](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/.github/workflows/build.yml)

- Use the dynamic `determine-hosts` job output for the build matrix (already partially implemented)
- Add a `nix flake check` step for deployChecks validation
- Build all hosts on every push/PR
- Optionally push built closures to the Harmonia cache after a successful build

#### [NEW] [deploy.yml](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/.github/workflows/deploy.yml)

- Trigger: push to `main` or `test-*` branches (after build passes)
- Load `DEPLOY_SSH_KEY` from Forgejo Actions secrets
- **For `main`**: `deploy .` (all hosts, `switch-to-configuration switch`)
- **For `test-<hostname>`**: deploy only the matching host with a **test profile** (`switch-to-configuration test`) — no bootloader update
- Magic rollback enabled by default
- Optional `--boot` mode for kernel upgrades (triggered by label or manual dispatch)

#### [NEW] [check.yml](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/.github/workflows/check.yml)

- Runs `nix flake check` (includes deployChecks)
- Runs `nix eval` to validate that all configurations parse correctly
- Can be required as a status check for Renovate auto-merge rules

---
### 6. Monitoring

#### [NEW] [modules/services/monitoring/default.nix](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/modules/services/monitoring/default.nix)

- Enable node exporter on all hosts for Prometheus scraping
- Export NixOS generation info: current generation, boot generation, system version
- Optionally integrate with the existing infrastructure (e.g., Prometheus on Production)

Script/service to export NixOS deploy state:

```bash
# Metrics like:
# nixos_current_generation{host="Niko"} 42
# nixos_boot_generation{host="Niko"} 42   # same = no pending reboot
# nixos_config_age_seconds{host="Niko"} 3600
```

When `current_generation != boot_generation`, the host has a test deployment active (or needs a reboot).
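The formatting half of such an exporter can be sketched as a small helper; the function and the `nixos_pending_reboot` metric name are assumptions for illustration, not the repo's code:

```python
def format_deploy_metrics(host: str, current_gen: int, boot_gen: int, age_seconds: int) -> str:
    """Render NixOS deploy-state metrics in Prometheus text exposition format."""
    lines = [
        f'nixos_current_generation{{host="{host}"}} {current_gen}',
        f'nixos_boot_generation{{host="{host}"}} {boot_gen}',
        f'nixos_config_age_seconds{{host="{host}"}} {age_seconds}',
        # 1 means a reboot (or a non-persistent test deployment) is pending.
        f'nixos_pending_reboot{{host="{host}"}} {int(current_gen != boot_gen)}',
    ]
    return "\n".join(lines) + "\n"

print(format_deploy_metrics("Niko", 42, 42, 3600))
```

Collecting the actual generation numbers (e.g., from `/nix/var/nix/profiles`) is left to the host-side service.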
---

### 7. Local VM Testing

#### [NEW] [test/vm-test.nix](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/test/vm-test.nix)

NixOS has built-in VM testing via `nixos-rebuild build-vm` and the NixOS test framework. The approach:

1. **Build a VM from any host config**:
   ```bash
   nix build .#nixosConfigurations.Testing.config.system.build.vm
   ./result/bin/run-Testing-vm
   ```

2. **NixOS integration test** (`test/vm-test.nix`):
   - Spins up a minimal VM cluster (e.g., two nodes)
   - Runs deploy-rs against one VM from the other
   - Validates activation, rollback, and connectivity
   - Uses the NixOS testing framework (Python test driver)

3. **Full CI pipeline test locally with `act`**:
   ```bash
   # Run the GitHub Actions workflow locally using act
   act push --container-architecture linux/amd64
   ```

> [!NOTE]
> The existing `build.yml` already uses `catthehacker/ubuntu:act-24.04` containers, suggesting `act` is already part of the workflow. VM tests don't require actual network access to target hosts.

---
## Verification Plan

### Automated Tests
- `nix flake check` — validates the flake + deployChecks schema
- `nix build .#nixosConfigurations.<host>.config.system.build.toplevel` for each host
- NixOS VM integration test (`test/vm-test.nix`)

### Manual Verification (guinea pig: `Development` or `Testing`)
1. Push to `test-Development` → verify deploy-rs runs `switch-to-configuration test` on 192.168.0.91
2. Reboot `Development` → verify it falls back to the previous generation (test branch behavior)
3. Merge to `main` → verify deploy-rs deploys to all enabled hosts with `switch`
4. Intentionally break a config → verify magic rollback activates
5. Push to the Harmonia cache → verify another host can pull the closure
6. Check that monitoring metrics show correct generation numbers
35  docs/binary-cache/task.md  Normal file

@@ -0,0 +1,35 @@
# NixOS CI/CD Deployment — Tasks

## Planning
- [x] Explore repository structure and existing CI workflow
- [x] Confirm deploy-rs activation internals (`switch` vs `test` vs `boot`)
- [x] Write comprehensive implementation plan
- [x] User review and approval of plan

## Networking & IP Refactor
- [ ] Create `modules/common/networking.nix` with `homelab.networking.hostIp`
- [ ] Update all host configs to use the new `hostIp` option
- [ ] Update `deploy.nodes` to use `hostIp` instead of `targetHost` in the deploy user module

## Flake & deploy-rs Refinement
- [ ] Review Nixpkgs #73404 status (is `cd /tmp` still needed?)
- [ ] Refactor `flake.nix` to use `flake-utils-plus` passthrough (removing `//`)
- [ ] Review the `user = "root"` vs `sshUser = "deploy"` logic

## Security & Trust (Refinement)
- [ ] Add a "Supply Chain Attacks" section to `SECURITY.md`
- [ ] Document project assumptions in `SECURITY.md`

## Local testing (Fixes)
- [ ] Debug and fix the `test/vm-test.nix` exit error
- [ ] Verify the test passes in WSL

## CI Workflows
- [x] Update `build.yml` with dynamic host matrix + `nix flake check`
- [x] Create `deploy.yml` (main → switch, test-* → test activation)
- [x] Create `check.yml` (deployChecks + eval validation)
- [ ] Configure Forgejo secrets (`DEPLOY_SSH_KEY`)

## Deferred (separate branches)
- [ ] Binary cache (Harmonia) — module, nix-cache config, signing keys
- [ ] Monitoring — NixOS generation exporter, node exporter per host
55  docs/binary-cache/walkthrough.md  Normal file

@@ -0,0 +1,55 @@
# Walkthrough — NixOS CI/CD Deployment

I have implemented a robust, automated deployment pipeline for your NixOS hosts using `deploy-rs`. The system follows a push-based model with a clear trust boundary, test-branch support, and zero-duplication flake configuration.

## Key Changes

### 1. Flake Integration (`flake.nix`)
- Added the `deploy-rs` input.
- Added auto-generation of `deploy.nodes` from `nixosConfigurations`. Only hosts with `homelab.users.deploy.enable = true` and a `targetHost` IP are included.
- Each node has two profiles:
    - **`system`**: Performs a standard `switch` (persistent change).
    - **`test`**: Performs a `test` activation (non-persistent, falls back on reboot).
- Added `deployChecks` to the `flake.nix` checks.

### 2. Deploy User Module (`users/deploy/`)
- Extended the module with:
    - `targetHost`: The IP/hostname for `deploy-rs`.
    - `authorizedKeys`: Support for multiple SSH keys (CI + personal).
- Added `nix.settings.trusted-users = [ "deploy" ]` so the user can push store paths.
- Restricted the `sudo` rules to only allow `nix-env` profile updates and `switch-to-configuration`.

### 3. Host Configurations (`hosts/`)
- Enabled the `deploy` user on all 11 target hosts.
- Mapped all host IPs based on your existing configurations.

### 4. CI/CD Workflows (`.github/workflows/`)
- **`check.yml`**: Runs `nix flake check` on every push.
- **`build.yml`**: Dynamically discovers all hosts and builds them in a matrix.
- **`deploy.yml`**:
    - Pushes to `main` → Deploys the `system` profile (switch) to all affected hosts.
    - Pushes to `test-<hostname>` → Deploys the `test` profile to that specific host.

### 5. Documentation & Testing
- **[SECURITY.md](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/SECURITY.md)**: Documents the trust boundaries between you, the CI, and the hosts.
- **[README.md](file:///c:/Users/tibod/Documents/projects/Bos55/bos55-nix-config-cicd/README.md)**: Deployment and local testing instructions.
- **`test/vm-test.nix`**: A NixOS integration test to verify the deploy user setup.

## Next Steps for You

1. **Configure Forgejo Secrets**:
    - Generate an SSH key for the CI.
    - Add the **public key** to `users/deploy/default.nix` (I added a placeholder, but you should verify).
    - Add the **private key** as a Forgejo secret named `DEPLOY_SSH_KEY`.
2. **Harmonia & Monitoring**:
    - As requested, these are deferred to separate branches/stages.
    - The `SECURITY.md` already accounts for a binary cache zone.
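Generating the CI key can be done roughly like this; the output path and key comment are placeholders:

```shell
# Generate a dedicated ed25519 deploy key for CI (no passphrase; placeholder path).
ssh-keygen -t ed25519 -N "" -C "forgejo-ci-deploy" -f ./ci-deploy-key

# ./ci-deploy-key.pub -> add to authorizedKeys in users/deploy/default.nix
# ./ci-deploy-key     -> paste into the Forgejo Actions secret DEPLOY_SSH_KEY
```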

## Verification

I've manually verified the logic and the Nix syntax. You can run the following locally to confirm:

```bash
nix flake check
nix build .#nixosConfigurations.Development.config.system.build.toplevel
nix-build test/vm-test.nix
```

@@ -59,6 +59,7 @@
         Template.modules = [ ./hosts/Template ];
         Development.modules = [ ./hosts/Development ];
         Testing.modules = [ ./hosts/Testing ];
+        BinaryCache.modules = [ ./hosts/BinaryCache ];
       };
     };
 }
49  hosts/BinaryCache/default.nix  Normal file

@@ -0,0 +1,49 @@
{ config, pkgs, lib, system, ... }:

let
  hostIp = "192.168.0.25";
in {
  config = {
    homelab = {
      services.attic = {
        enable = true;
        enableRemoteBuilder = true;
        openFirewall = true;
      };
      virtualisation.guest.enable = true;
    };

    networking = {
      hostName = "BinaryCache";
      hostId = "100002500";
      domain = "depeuter.dev";

      useDHCP = false;

      enableIPv6 = true;

      defaultGateway = {
        address = "192.168.0.1";
        interface = "ens18";
      };

      interfaces.ens18 = {
        ipv4.addresses = [
          {
            address = hostIp;
            prefixLength = 24;
          }
        ];
      };

      nameservers = [
        "1.1.1.1" # Cloudflare
        "1.0.0.1" # Cloudflare
      ];
    };

    # Sops configuration for this host is now handled by the common module

    system.stateVersion = "24.05";
  };
}
|
```diff
@@ -1,4 +1,4 @@
-{ pkgs, ... }:
+{ pkgs, inputs, config, ... }:
 
 {
   config = {
```

```diff
@@ -83,6 +83,14 @@
       "traefik.http.routers.hugo.rule" = "Host(`hugo.depeuter.dev`)";
       "traefik.http.services.hugo.loadbalancer.server.url" = "https://192.168.0.11:444";
 
+      "traefik.http.routers.attic.rule" = "Host(`${inputs.self.nixosConfigurations.BinaryCache.config.homelab.services.attic.domain}`)";
+      "traefik.http.services.attic.loadbalancer.server.url" =
+        let
+          bcConfig = inputs.self.nixosConfigurations.BinaryCache.config;
+          bcIp = (pkgs.lib.head bcConfig.networking.interfaces.ens18.ipv4.addresses).address;
+          bcPort = bcConfig.homelab.services.attic.port;
+        in "http://${bcIp}:${toString bcPort}";
     };
 
   system.stateVersion = "24.05";
```
```diff
@@ -1,6 +1,6 @@
 $TTL 604800
 @ IN SOA ns1 admin (
-    15 ; Serial
+    16 ; Serial
     604800 ; Refresh
     86400 ; Retry
     2419200 ; Expire
```
```diff
@@ -40,6 +40,9 @@ sonarr IN A 192.168.0.33
 ; Development VM
 plex IN A 192.168.0.91
 
+; Binary Cache (via Binnenpost proxy)
+nix-cache IN A 192.168.0.89
+
 ; Catchalls
 *.production IN A 192.168.0.31
 *.development IN A 192.168.0.91
```
```diff
@@ -1,8 +1,13 @@
 {
+  imports = [
+    ./substituters.nix
+  ];
+
   config = {
     homelab = {
       services.openssh.enable = true;
       users.admin.enable = true;
+      common.substituters.enable = true;
     };
 
     nix.settings.experimental-features = [
```
```diff
@@ -12,5 +17,10 @@
 
     # Set your time zone.
     time.timeZone = "Europe/Brussels";
 
+    sops = {
+      defaultSopsFile = ../../secrets/secrets.yaml;
+      age.keyFile = "/var/lib/sops-nix/key.txt";
+    };
   };
 }
```
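For reference, the file pointed to by `defaultSopsFile` has to carry the Attic secrets declared later in this commit. A hypothetical pre-encryption layout is sketched below; the key names must match the `sops.secrets` declarations, and the `ATTIC_SERVER_TOKEN_HS256_SECRET_BASE64` variable name is an assumption based on atticd's EnvironmentFile convention:

```yaml
# secrets/secrets.yaml (shown decrypted; encrypt with `sops` before committing)
attic:
  db-password: "<postgres-password>"
  # atticd loads this file via systemd EnvironmentFile, so the value is a KEY=VALUE line
  server-token-secret: "ATTIC_SERVER_TOKEN_HS256_SECRET_BASE64=<base64-secret>"
```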
**modules/common/substituters.nix** (new file, 28 lines)

@@ -0,0 +1,28 @@
```nix
{ config, lib, pkgs, inputs, ... }:

let
  cfg = config.homelab.common.substituters;
in {
  options.homelab.common.substituters = {
    enable = lib.mkEnableOption "Binary cache substituters";
    domain = lib.mkOption {
      type = lib.types.str;
      default = inputs.self.nixosConfigurations.BinaryCache.config.homelab.services.attic.domain;
      description = "The domain name of the binary cache.";
    };
    publicKey = lib.mkOption {
      type = lib.types.nullOr lib.types.str;
      default = null;
      description = "The public key of the Attic cache (e.g., 'homelab:...').";
    };
  };

  config = lib.mkIf cfg.enable {
    nix.settings = {
      substituters = [
        "https://${cfg.domain}"
      ];
      trusted-public-keys = lib.optional (cfg.publicKey != null) cfg.publicKey;
    };
  };
}
```
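A consuming host then only needs to flip the enable switch and, once the cache's signing key exists, supply it. A hypothetical host-side fragment (the key value is a placeholder):

```nix
{
  homelab.common.substituters = {
    enable = true;
    # Placeholder value; paste the real signing key printed when the
    # Attic cache is created. Until it is set, the substituter is used
    # without a trusted-public-keys entry and signed paths won't verify.
    publicKey = "homelab:AAAA...=";
  };
}
```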
**modules/services/attic/default.nix** (new file, 119 lines)

@@ -0,0 +1,119 @@
```nix
{ config, lib, pkgs, ... }:

let
  cfg = config.homelab.services.attic;
in {
  options.homelab.services.attic = {
    enable = lib.mkEnableOption "Attic binary cache server";
    domain = lib.mkOption {
      type = lib.types.str;
      default = "nix-cache.depeuter.dev";
      description = "The domain name for the Attic server.";
    };
    port = lib.mkOption {
      type = lib.types.port;
      default = 8080;
      description = "The port the Attic server listens on.";
    };
    databaseName = lib.mkOption {
      type = lib.types.str;
      default = "attic";
      description = "The name of the PostgreSQL database.";
    };
    dbContainerName = lib.mkOption {
      type = lib.types.str;
      default = "attic-db";
      description = "The name of the PostgreSQL container.";
    };
    storagePath = lib.mkOption {
      type = lib.types.str;
      default = "/var/lib/atticd/storage";
      description = "The path where Attic stores its blobs.";
    };
    openFirewall = lib.mkOption {
      type = lib.types.bool;
      default = false;
      description = "Whether to open the firewall port for Attic.";
    };
    enableRemoteBuilder = lib.mkOption {
      type = lib.types.bool;
      default = false;
      description = "Whether to enable remote build capabilities on this host.";
    };
  };

  config = lib.mkIf cfg.enable {
    sops.secrets = {
      "attic/db-password" = { };
      "attic/server-token-secret" = { };
    };

    services.atticd = {
      enable = true;
      environmentFile = config.sops.secrets."attic/server-token-secret".path;

      settings = {
        listen = "[::]:${toString cfg.port}";
        allowed-hosts = [ cfg.domain ];
        api-endpoint = "https://${cfg.domain}/";

        database.url = "postgresql://${cfg.databaseName}@${cfg.dbContainerName}:5432/${cfg.databaseName}";

        storage = {
          type = "local";
          path = cfg.storagePath;
        };

        chunking = {
          min-size = 16384; # 16 KiB
          avg-size = 65536; # 64 KiB
          max-size = 262144; # 256 KiB
        };
      };
    };

    homelab.virtualisation.containers.enable = true;

    virtualisation.oci-containers.containers."${cfg.dbContainerName}" = {
      image = "postgres:15-alpine";
      autoStart = true;
      # We still map it to the host so Attic (running on the host) can connect
      # to it via bridge IP or name once networking/DNS is set up correctly.
      ports = [
        "5432:5432/tcp"
      ];
      environment = {
        POSTGRES_USER = cfg.databaseName;
        POSTGRES_PASSWORD_FILE = config.sops.secrets."attic/db-password".path;
        POSTGRES_DB = cfg.databaseName;
      };
      volumes = [
        "attic-db:/var/lib/postgresql/data"
        # Bind-mount the decrypted secret at the same path inside the container,
        # otherwise POSTGRES_PASSWORD_FILE points at a file that doesn't exist there.
        "${config.sops.secrets."attic/db-password".path}:${config.sops.secrets."attic/db-password".path}:ro"
      ];
    };

    # Map the container name to localhost since Attic runs on the host
    networking.extraHosts = ''
      127.0.0.1 ${cfg.dbContainerName}
    '';

    # Defining the same attribute twice in one module is an evaluation error,
    # so both firewall conditions are merged into a single definition.
    networking.firewall.allowedTCPPorts =
      lib.optionals cfg.openFirewall [ cfg.port ]
      ++ lib.optionals cfg.enableRemoteBuilder [ 22 ];

    # Remote build host configuration
    nix.settings.trusted-users = lib.mkIf cfg.enableRemoteBuilder [ "root" "@wheel" "builder" ];

    users.users.builder = lib.mkIf cfg.enableRemoteBuilder {
      isNormalUser = true;
      group = "builder";
      openssh.authorizedKeys.keys = [
        # Placeholder - replace with the actual builder public key
        "ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIFrp6aM62Bf7bj1YM5AlAWuNrANU3N5e8+LtbbpmZPKS"
      ];
    };
    users.groups.builder = lib.mkIf cfg.enableRemoteBuilder { };

    # Only open SSH if the remote builder is enabled
    services.openssh.ports = lib.mkIf cfg.enableRemoteBuilder [ 22 ];
  };
}
```
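On the client side, the `builder` user above would be consumed through NixOS's standard distributed-build settings. A minimal sketch, where the host address, key path, and job limits are assumptions rather than values from this repository:

```nix
{
  # Let local nix offload derivations to the BinaryCache host.
  nix.distributedBuilds = true;
  nix.buildMachines = [
    {
      hostName = "192.168.0.25";             # BinaryCache host IP (assumed)
      sshUser = "builder";                   # user created by enableRemoteBuilder
      sshKey = "/root/.ssh/builder_ed25519"; # private key matching the authorized key
      system = "x86_64-linux";
      maxJobs = 4;                           # tune to the builder's capacity
      supportedFeatures = [ "big-parallel" ];
    }
  ];
}
```

The private key referenced by `sshKey` must belong to root (the user the nix daemon runs as), since remote builds are initiated by the daemon, not by the invoking user.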
```diff
@@ -1,6 +1,7 @@
 {
   imports = [
     ./actions
+    ./attic
     ./openssh
   ];
 }
```