Traefik — Reverse Proxy & TLS Terminator
Project: Traefik Proxy
Host: traefik / traefik.home.lab
IP: 10.69.20.40
VLAN: 20 (SERVERS)
Web UI: https://traefik.home.helix9.org (LAN-only or Authentik)
Role path: roles/traefik/
Playbook: playbooks/traefik.yml
Overview
Traefik is the single ingress point for every HTTP(S) service in the home network. It terminates TLS, fetches and renews ACME certificates via Cloudflare DNS-01, and routes requests to backends across the SERVERS, MGMT, and DMZ zones. It also fronts the Authentik forward-auth flow for services that need SSO.
Request flow:
 client                                                  backend
   │                                                        ▲
   │  *.home.helix9.org → VyOS DNS forwarder → 10.69.20.40  │
   │                                                        │
   ▼                                                        │
┌──────────────────┐   middleware    ┌─────────────────┐    │
│ Traefik :443     │ ─ local-only ─► │ Authentik fwd   │    │
│ TLS terminator   │ ─ authentik  ─► │ auth (9000)     │    │
└────────┬─────────┘                 └─────────────────┘    │
         │   HTTP / HTTPS (proxmox-transport)               │
         └──────────────────────────────────────────────────┘
Three properties define the architecture:
- File provider only — no Docker label discovery, no Kubernetes CRDs. Routers and services are declared statically in templated YAML rendered by Ansible.
- DNS-01 ACME — certificates are issued via Cloudflare's API, so Traefik does not need to be reachable from the public internet to renew. Wildcards (*.home.helix9.org) are possible with no firewall holes.
- Split-horizon DNS — *.home.helix9.org resolves to 10.69.20.40 for clients inside the network (via the VyOS static host mappings), and to the public VPS IP outside. The same Traefik instance handles both.
Infrastructure
LXC Container
| Setting | Value |
|---|---|
| Node | pve02 |
| IP | 10.69.20.40/24 |
| Gateway | 10.69.20.1 |
| CPU | 2 cores |
| RAM | 1024 MB |
| Swap | 512 MB |
| Disk | 8 GB |
| Template | rockylinux-10-default_20251001_amd64 |
| OS family | RedHat (uses dnf + firewalld) |
| Unprivileged | yes |
| Tags | servers, traefik, ingress |
Provisioned via Terraform from inventory/host_vars/traefik/vars.yml. See New Host for the standard LXC-creation workflow.
Ansible
| Item | Path |
|---|---|
| Role | roles/traefik/ |
| Playbook | playbooks/traefik.yml |
| Host vars | inventory/host_vars/traefik/vars.yml |
| Vault keys | vault_traefik_acme_email, vault_traefik_cf_dns_token |
Deploy:
ansible-playbook playbooks/traefik.yml
The role is idempotent. It downloads the pinned Traefik binary only when the installed version does not match traefik_version, then renders the three config templates and restarts Traefik through the Restart traefik handler.
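The version check described above could look roughly like the tasks below. This is a sketch, not the role's actual task file: the register name traefik_installed is hypothetical, and it assumes traefik_version holds a bare version number without a leading v.

```yaml
# Hypothetical tasks; the real role may structure this differently.
- name: Read the installed Traefik version (a missing binary is not an error)
  ansible.builtin.command: /usr/local/bin/traefik version
  register: traefik_installed
  changed_when: false
  failed_when: false

- name: Download the pinned release only when the version differs
  ansible.builtin.unarchive:
    src: "https://github.com/traefik/traefik/releases/download/v{{ traefik_version }}/traefik_v{{ traefik_version }}_linux_amd64.tar.gz"
    dest: /usr/local/bin
    remote_src: true
    include:
      - traefik
  when: traefik_version not in (traefik_installed.stdout | default(''))
  notify: Restart traefik
```

Because the download is guarded by the version comparison, a second run with an unchanged traefik_version reports no change for this task.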
Installation Layout
| Path | Purpose |
|---|---|
| /usr/local/bin/traefik | Binary (capability cap_net_bind_service=+ep) |
| /etc/traefik/traefik.yml | Static config (rendered) |
| /etc/traefik/dynamic/dynamic.yml | Middlewares + TLS options (rendered) |
| /etc/traefik/dynamic/services.yml | Routers + backends (rendered) |
| /etc/traefik/acme/acme.json | ACME state (mode 0600, owned by traefik) |
| /var/log/traefik/traefik.log | Daemon log |
| /var/log/traefik/access.log | HTTP access log |
| /etc/systemd/system/traefik.service | Hardened unit |
The binary is run as the unprivileged traefik user with cap_net_bind_service so it can bind 80/443 without root. The systemd unit applies NoNewPrivileges, ProtectSystem=full, ProtectHome, PrivateTmp, and PrivateDevices. Only the ACME and log directories are writable.
Static Configuration
Rendered from roles/traefik/templates/traefik.yml.j2.
Entrypoints
| Name | Address | Behaviour |
|---|---|---|
| web | :80 | Permanent redirect to websecure (HTTPS) |
| websecure | :443 | Primary TLS entrypoint |
| metrics | :8082 | Prometheus scrape (internal-only) |
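For orientation, the entrypoint table maps onto Traefik's static configuration roughly as follows. This is a minimal sketch that assumes the template uses the standard entryPoints redirection block; the rendered traefik.yml may carry additional options.

```yaml
# Sketch of the entryPoints section of /etc/traefik/traefik.yml
entryPoints:
  web:
    address: ":80"
    http:
      redirections:
        entryPoint:
          to: websecure      # send all plain-HTTP traffic to the TLS entrypoint
          scheme: https
          permanent: true    # permanent redirect, as in the table
  websecure:
    address: ":443"
  metrics:
    address: ":8082"
```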
Providers
Only the file provider is enabled, watching /etc/traefik/dynamic/. watch: false — config reloads happen via Ansible handler (Restart traefik), not file inotify. This is deliberate: changes go through Git → Ansible, not hand-edits.
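A provider block matching this description would be short. The sketch below assumes the directory form of the file provider, since two files live under /etc/traefik/dynamic/.

```yaml
# Sketch of the providers section of the static config
providers:
  file:
    directory: /etc/traefik/dynamic   # loads dynamic.yml and services.yml
    watch: false                      # no inotify reloads; a restart picks up changes
```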
Certificate Resolver
| Setting | Value |
|---|---|
| Resolver name | letsencrypt |
| Challenge | DNS-01 |
| DNS provider | cloudflare |
| Storage | /etc/traefik/acme/acme.json |
| API resolvers | 1.1.1.1:53, 9.9.9.9:53 |
The Cloudflare API token is injected at runtime via systemd Environment=CF_DNS_API_TOKEN=..., sourced from the Ansible vault key vault_traefik_cf_dns_token. The token requires only Zone:DNS:Edit scope on helix9.org.
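The resolver table corresponds to a certificatesResolvers block along these lines. This is a sketch under the assumption that the template reads the e-mail address from the vault key listed earlier; the Cloudflare token does not appear here because it arrives through the environment.

```yaml
# Sketch of the resolver section in traefik.yml.j2
certificatesResolvers:
  letsencrypt:
    acme:
      email: "{{ vault_traefik_acme_email }}"
      storage: /etc/traefik/acme/acme.json
      dnsChallenge:
        provider: cloudflare      # reads CF_DNS_API_TOKEN from the process environment
        resolvers:                # used to check TXT propagation
          - "1.1.1.1:53"
          - "9.9.9.9:53"
```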
Metrics & Dashboard
- Prometheus metrics: enabled, scraped on :8082 with entry-point and service labels.
- Dashboard: enabled, served on the api@internal service, exposed only via the traefik-dashboard router (see below) — never on :8080 insecurely.
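In static-config terms, the two bullets above plausibly translate to the following. It is a sketch that assumes the standard Prometheus and API options; the actual template may set more.

```yaml
# Sketch of the metrics and dashboard sections of the static config
metrics:
  prometheus:
    entryPoint: metrics           # the :8082 entrypoint
    addEntryPointsLabels: true
    addServicesLabels: true
api:
  dashboard: true
  insecure: false                 # keeps the dashboard off :8080; a router must expose api@internal
```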
Dynamic — Middlewares & TLS
Rendered from roles/traefik/templates/dynamic.yml.j2.
Middlewares
| Name | Type | Purpose |
|---|---|---|
| secure-headers | headers | HSTS 1y + preload, no-sniff, XSS filter, strict-origin referrer |
| local-only | ipAllowList | Permit 10.69.0.0/16 and 192.168.178.0/24 only |
| authentik | forwardAuth | Delegate auth to Authentik outpost on :9000/outpost.goauthentik.io/auth/traefik |
| local-or-auth | chain | local-only first, then authentik — used for the dashboard |
The authentik middleware passes a fixed set of X-authentik-* headers (username, groups, email, name, uid, JWT, meta) downstream so backends can read identity claims without re-authenticating.
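As an illustration, the middleware table could be expressed in the file provider's YAML like this. It is a sketch rather than a copy of dynamic.yml.j2: the header option values are inferred from the table, and only part of the X-authentik-* header list is shown.

```yaml
# Sketch of the middlewares section of dynamic.yml
http:
  middlewares:
    secure-headers:
      headers:
        stsSeconds: 31536000          # HSTS for one year
        stsPreload: true
        contentTypeNosniff: true
        browserXssFilter: true
        referrerPolicy: strict-origin
    local-only:
      ipAllowList:
        sourceRange:
          - 10.69.0.0/16
          - 192.168.178.0/24
    authentik:
      forwardAuth:
        address: http://authentik:9000/outpost.goauthentik.io/auth/traefik
        trustForwardHeader: true      # assumption; not stated in this document
        authResponseHeaders:          # identity headers passed to the backend
          - X-authentik-username
          - X-authentik-groups
          - X-authentik-email
          - X-authentik-name
          - X-authentik-uid
          - X-authentik-jwt
    local-or-auth:
      chain:
        middlewares:
          - local-only
          - authentik
```

One thing worth checking against the real template: a Traefik chain applies each listed middleware in order, so a request has to pass local-only and then authentik. If the intent is "LAN or Authentik", the dual-router pattern described later in this page is the usual way to get it.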
TLS options (default)
| Setting | Value |
|---|---|
| minVersion | VersionTLS12 |
| sniStrict | true |
| Cipher suites | TLS 1.3 AEAD suites + ECDHE-ECDSA / ECDHE-RSA AES-256-GCM |
sniStrict: true rejects connections that don't present an SNI matching a known router — limits scanner noise.
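A TLS options block consistent with the table might look like the sketch below. The exact suite list is an assumption based on the "ECDHE-ECDSA / ECDHE-RSA AES-256-GCM" description; TLS 1.3 suites are not listed because Go's TLS stack enables them unconditionally.

```yaml
# Sketch of the tls section of dynamic.yml
tls:
  options:
    default:
      minVersion: VersionTLS12
      sniStrict: true
      cipherSuites:                   # applies to TLS 1.2 handshakes only
        - TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384
        - TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
```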
Routers & Services
Rendered from roles/traefik/templates/services.yml.j2. Hostnames mostly resolve from Ansible inventory hostvars[<name>].ansible_host, so service IPs follow inventory/hosts.yml. A few are hardcoded (e.g. mediastack at 10.69.20.6x) because those hosts are not yet in inventory.
Routing rule patterns
There are five exposure patterns in use. They are not interchangeable — each has a different security posture:
| Pattern | How | Where used |
|---|---|---|
| Public | Bare Host() rule, no middleware | paperless, pulse, onedev, docusaurus, dns, uptimekuma, pbs |
| Public + Authentik | Host() + middlewares: [authentik] | proxmox-pve01/02, copyparty, openclaw |
| LAN-only via router rule | Host() AND ClientIP(...) in the same rule | seerr, sonarr, jellyfin-local |
| LAN-only via middleware | Host() + middlewares: [local-only] | radarr, sabnzbd |
| LAN bypass + external Authentik | Two routers, different priorities | jellyfin |
The two LAN-only patterns are functionally similar but differ in failure mode:
- Router rule (ClientIP) — non-matching clients get 404 Not Found. The router doesn't even match.
- Middleware (local-only) — non-matching clients hit the router but get 403 Forbidden. The router matches, the middleware rejects.
The 404 form is preferable because it leaks less information about which services exist. Prefer ClientIP in the router rule for any new LAN-only entry.
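Side by side, the two LAN-only forms differ only in where the address check lives. The sketch below uses the sonarr-local and radarr routers from this page; the entryPoints, tls and service keys are assumptions about how services.yml.j2 wires them up.

```yaml
http:
  routers:
    # Router-rule form: a non-LAN client matches no router and gets 404.
    sonarr-local:
      rule: "Host(`sonarr.home.helix9.org`) && (ClientIP(`10.69.0.0/16`) || ClientIP(`192.168.178.0/24`))"
      entryPoints: [websecure]
      service: sonarr                 # hypothetical service name
      tls:
        certResolver: letsencrypt
    # Middleware form: the router matches, then local-only answers 403.
    radarr:
      rule: "Host(`radarr.home.helix9.org`)"
      entryPoints: [websecure]
      middlewares: [local-only]
      service: radarr                 # hypothetical service name
      tls:
        certResolver: letsencrypt
```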
Dual-router pattern (Jellyfin)
Two routers share the same Host() and differ on priority:
jellyfin-local:
  rule: "Host(`jellyfin.home.helix9.org`) && (ClientIP(`10.69.0.0/16`) || ClientIP(`192.168.178.0/24`))"
  priority: 100
  # no middleware
jellyfin:
  rule: "Host(`jellyfin.home.helix9.org`)"
  priority: 50
  middlewares: [authentik]
Higher priority + more specific match wins for LAN clients (no Authentik prompt — the Jellyfin app handles its own auth). External clients fall through to the lower-priority router and are gated by Authentik.
Path-priority pattern (OpenClaw)
openclaw-api:
  rule: "Host(`openclaw.helix9.org`) && PathPrefix(`/__openclaw__/`)"
  priority: 200
  # no middleware — API path
openclaw:
  rule: "Host(`openclaw.helix9.org`)"
  priority: 100
  middlewares: [authentik]
The API prefix bypasses Authentik (machine-to-machine), the rest of the host requires it.
Authentik outpost — priority 1000
authentik-outpost:
  rule: "PathPrefix(`/outpost.goauthentik.io/`)"
  priority: 1000
This must outrank every per-service router. Any host with the Authentik forward-auth middleware redirects unauthenticated users to /outpost.goauthentik.io/... on the same host — and the outpost router needs to claim that path on every host. Hence the very high priority and the lack of Host() constraint.
Proxmox transport
Proxmox uses self-signed certificates on :8006 (PVE) and :8007 (PBS). A dedicated serversTransport named proxmox-transport sets insecureSkipVerify: true for those backends only — no other backend opts into TLS skip.
serversTransports:
  proxmox-transport:
    insecureSkipVerify: true
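A backend opts into that transport by naming it on its load balancer. The following sketch shows how the pve02 service might reference it; the service name is hypothetical and the URL is taken from the router inventory below.

```yaml
http:
  services:
    proxmox-pve02:                    # hypothetical service name
      loadBalancer:
        serversTransport: proxmox-transport   # skip TLS verification for this backend only
        servers:
          - url: "https://pve02:8006"
```

Services that do not set serversTransport keep Traefik's default transport, which verifies backend certificates.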
Current router inventory
Note: the dns router pointing at technitium:5380 is currently inactive. The Technitium LXC at 10.69.20.53 is still running but is not in the live DNS path — VyOS forwards directly to upstream resolvers. The router stays in services.yml.j2 so the path is one playbook run away from being live again, but treat the entry as dormant.
| Router | Host | Backend | Middleware |
|---|---|---|---|
| paperless | paperless.home.helix9.org | paperless:8000 | — |
| pulse | pulse.home.helix9.org | pulse:7655 | — |
| uptimekuma | status.home.helix9.org | uptime-kuma:3001 | — |
| onedev | onedev.home.helix9.org | onedev:6610 | — |
| docusaurus | docs.home.helix9.org | docusaurus:8080 | — |
| dns (inactive) | dns.home.helix9.org | technitium:5380 | — |
| pbs | pbs01.home.helix9.org | pbs01:8007 (proxmox-transport) | — |
| proxmox-pve01 | pve01.home.helix9.org | 10.69.10.5:8006 (proxmox-transport) | authentik |
| proxmox-pve02 | pve02.home.helix9.org | pve02:8006 (proxmox-transport) | authentik |
| authentik | auth.home.helix9.org | authentik:9000 | — |
| authentik-outpost | PathPrefix(/outpost.goauthentik.io/) | authentik:9000 | — |
| copyparty | copyparty.home.helix9.org | copyparty:3923 | authentik |
| openclaw | openclaw.helix9.org | openclaw:18789 | authentik |
| openclaw-api | openclaw.helix9.org/__openclaw__/ | openclaw:18789 | — |
| jellyfin-local | jellyfin.home.helix9.org (LAN) | 10.69.20.64:8096 | — |
| jellyfin | jellyfin.home.helix9.org (external) | 10.69.20.64:8096 | authentik |
| seerr-local | seerr.home.helix9.org (LAN) | 10.69.20.65:5055 | — |
| sonarr-local | sonarr.home.helix9.org (LAN) | 10.69.20.62:8989 | — |
| radarr | radarr.home.helix9.org | 10.69.20.63:7878 | local-only |
| sabnzbd | sabnzbd.home.helix9.org | 10.69.20.61:8080 | local-only |
| traefik-dashboard | traefik.home.helix9.org | api@internal | local-or-auth |
Adding a New Service
The end-to-end checklist for exposing a new backend (e.g. foo.home.helix9.org → 10.69.20.99:8000):
- Inventory — add the host to inventory/hosts.yml with the right ansible_host IP. Run ansible-playbook playbooks/vyos_dns.yml to push the shortname → IP mapping into VyOS.
- DNS — add foo.home.helix9.org → 10.69.20.40 to the static-host mappings on VyOS (currently hand-curated; see Home Router — Static Host Mappings). External access also needs a Cloudflare record pointing to the VPS public IP.
- Firewall — confirm the SERVERS zone reaches the backend port. For non-SERVERS backends (DMZ, MGMT) add a specific rule in the relevant cross-zone policy on VyOS.
- Traefik router — append a router + service to roles/traefik/templates/services.yml.j2. Pick the routing pattern from the table above. For LAN-only, prefer ClientIP in the router rule.
- Deploy — ansible-playbook playbooks/traefik.yml. The handler restarts Traefik, which requests the certificate for the new host when it loads the router.
- Verify — curl -I https://foo.home.helix9.org from a TRUSTED client; check journalctl -u traefik for ACME success. The acme.json file should grow on disk.
If the service does its own SSO (Authelia-style backends, Authentik itself), skip the authentik middleware.
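For the foo example in the checklist, the entry appended to services.yml.j2 might look like the sketch below. It assumes foo is already in inventory (so the hostvars lookup resolves to 10.69.20.99), uses the LAN-only router-rule pattern, and guesses at the entryPoints and tls keys; adapt it to whichever pattern the service needs.

```yaml
http:
  routers:
    foo:
      rule: "Host(`foo.home.helix9.org`) && (ClientIP(`10.69.0.0/16`) || ClientIP(`192.168.178.0/24`))"
      entryPoints: [websecure]
      service: foo
      tls:
        certResolver: letsencrypt
  services:
    foo:
      loadBalancer:
        servers:
          # Jinja lookup rendered by Ansible; 'foo' is the hypothetical inventory name
          - url: "http://{{ hostvars['foo'].ansible_host }}:8000"
```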
ACME Certificates
Storage
Certificates and account keys live in /etc/traefik/acme/acme.json (0600, owned by traefik). This is the single most important file on the host: losing it forces Traefik to re-issue every certificate, which works but is rate-limited by Let's Encrypt (50 certificates per registered domain per week).
Back it up. The role supports seeding a fresh install from a controller-side copy via traefik_acme_seed_src:
# in host_vars/traefik/vars.yml or extra-vars
traefik_acme_seed_src: "/path/to/acme.json"
traefik_acme_seed_force: false # true to overwrite existing
This was used to migrate existing certs from the old podman host without re-issuing.
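How the role consumes those two variables is not shown here. One plausible shape, offered only as a sketch, is a single copy task whose force flag follows traefik_acme_seed_force:

```yaml
# Hypothetical task; the real role may implement seeding differently.
- name: Seed acme.json from a controller-side copy
  ansible.builtin.copy:
    src: "{{ traefik_acme_seed_src }}"
    dest: /etc/traefik/acme/acme.json
    owner: traefik
    group: traefik
    mode: "0600"
    # force: false leaves an existing acme.json untouched
    force: "{{ traefik_acme_seed_force | default(false) }}"
  when: traefik_acme_seed_src is defined
```

With force set to false, the copy module only writes the file when the destination does not exist, which matches the "true to overwrite existing" comment above.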
Renewal
Traefik renews automatically 30 days before expiry. No cron, no hooks. The DNS-01 challenge runs every renewal — the Cloudflare API token must remain valid.
DNS-01 mechanics
For each cert request:
- Traefik calls the Cloudflare API to create a _acme-challenge.<host> TXT record.
- Let's Encrypt's CA queries that TXT record from the authoritative Cloudflare nameservers.
- Traefik deletes the TXT record on success.
Traefik uses the configured resolvers (1.1.1.1, 9.9.9.9) to verify that the TXT record has propagated before notifying Let's Encrypt, so they need to remain reachable from the SERVERS zone.
Wildcard certs
Currently every service has its own cert. To switch to a wildcard *.home.helix9.org, add a tls.domains block to one router and Traefik will request the wildcard once, then reuse it. Worth doing if the service count grows past ~30 (rate-limit headroom).
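As a sketch of the wildcard switch, the tls.domains block could be attached to the dashboard router as shown below. Which router carries it is a free choice; the rule and middleware are copied from the router inventory, and the entryPoints key is an assumption.

```yaml
http:
  routers:
    traefik-dashboard:
      rule: "Host(`traefik.home.helix9.org`)"
      entryPoints: [websecure]
      service: api@internal
      middlewares: [local-or-auth]
      tls:
        certResolver: letsencrypt
        domains:
          - main: "*.home.helix9.org"   # one wildcard certificate, reused by the other routers
```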
Authentik Integration
Forward-auth is configured at the middleware level (see dynamic.yml.j2). The flow:
- Client requests https://copyparty.home.helix9.org.
- Traefik calls http://authentik:9000/outpost.goauthentik.io/auth/traefik with the original request headers.
- If the user has a valid Authentik session cookie, the outpost returns 200 and the request continues to copyparty with X-authentik-* identity headers.
- If not, the outpost returns 302 to /outpost.goauthentik.io/start?rd=<original-url>.
- The high-priority authentik-outpost router (priority 1000) catches /outpost.goauthentik.io/* on every host and proxies to the same outpost backend, which renders the login UI.
- After login, the outpost redirects back to the original URL.
Backends that read the identity headers (e.g. X-authentik-email) get user info without doing their own auth. Backends that don't are still gated — the request only reaches them after the outpost approves.
Common pitfall: if you forget to add the authentik middleware to a new router, the service is publicly accessible at *.home.helix9.org (which resolves externally via the VPS). LAN-only services should use a ClientIP rule or the local-only middleware instead; never rely on "nobody knows the URL".
Operations
Reload / restart
systemctl reload traefik # SIGUSR1 — log rotation only
systemctl restart traefik # full restart, ACME state preserved
Config changes require a full restart (watch: false). The Ansible handler does this automatically.
Logs
| Stream | Path |
|---|---|
| Daemon | /var/log/traefik/traefik.log |
| Access | /var/log/traefik/access.log |
| systemd | journalctl -u traefik -f |
Log level is INFO. Bump to DEBUG for ACME troubleshooting only (very noisy).
Health checks
curl -sk --resolve traefik.home.helix9.org:443:127.0.0.1 https://traefik.home.helix9.org/ # dashboard router (SNI must match because sniStrict is on)
curl -s http://localhost:8082/metrics | head # Prometheus metrics
ss -tlnp | grep traefik # 80, 443, 8082 bound
Firewall (host)
Rocky's firewalld opens 80 and 443 only. The Prometheus endpoint on :8082 is therefore reachable only from inside the container; Pulse or Prometheus would need an explicit firewalld rule to scrape it, and none is currently open.
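If scraping is wanted later, the opening could be added to the role as a source-restricted rich rule. The task below is a sketch: prometheus_scraper_ip is a hypothetical variable, and the role currently contains no such task.

```yaml
# Hypothetical task; opens :8082 to a single scraper address only.
- name: Allow the Prometheus scraper to reach the metrics entrypoint
  ansible.posix.firewalld:
    rich_rule: 'rule family="ipv4" source address="{{ prometheus_scraper_ip }}/32" port port="8082" protocol="tcp" accept'
    permanent: true
    immediate: true
    state: enabled
```

Restricting by source address keeps the metrics port closed to the rest of the SERVERS zone.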
Dashboard access
traefik.home.helix9.org is gated by the local-or-auth chain: any LAN client (10.69.0.0/16, 192.168.178.0/24) gets straight through; anything else must authenticate via Authentik.
Known Gotchas
- watch: false — editing /etc/traefik/dynamic/*.yml directly does nothing until restart. This is intentional, but surprising the first time. Always go through Ansible.
- pve01 hardcoded — traefik_pve01_ip: 10.69.10.5 is in defaults/main.yml because pve01 is not yet in inventory. Move to hostvars[pve01].ansible_host once it is.
- Mediastack hardcoded IPs — Jellyfin, Sonarr, Radarr, Sabnzbd, Seerr use literal 10.69.20.6x addresses. Same fix: add them to inventory and switch to hostvars[].ansible_host.
- Cloudflare token scope — must be Zone:DNS:Edit on helix9.org, not the global API key. Rotating the token requires re-running the playbook (the env var is read at process start).
- Outpost priority — if you ever add a router with priority: >= 1000, double-check it doesn't shadow the authentik-outpost PathPrefix matcher. Login flows will silently break.
- sniStrict — clients that don't send SNI (rare, ancient clients, some monitoring probes) get a TLS error before any router matches. Disable per-router only if you have a known offender.
- ACME storage permissions — if acme.json becomes group-readable for any reason, Traefik refuses to start. Re-run the role to fix mode.
- Restart drops connections — Traefik is not graceful on restart; in-flight long-poll connections (Home Assistant, Synapse) will drop. Restart during low-traffic windows or accept brief blips.
Related Docs
- Home Router — Static Host Mappings — split-horizon DNS source of truth
- Technitium DNS — (currently inactive — LXC retained but not in DNS path)
- Pulse — Proxmox monitoring, fronted by Traefik via pulse.home.helix9.org
- Paperless — example of a public-pattern backend
- Ansible Setup — playbook + vault structure