The startup issue came from a timing dependency around coturn TLS certs:
- `matrix-coturn.service` depends on
`matrix-traefik-certs-dumper-wait-for-domain@<matrix-fqdn>.service`
- That waiter succeeds only after Traefik has obtained and dumped a cert for
the Matrix hostname (typically driven by homeserver labels/routes becoming
active)
- If coturn is started too early, it can block/fail waiting for cert files
that are not yet present
Historically, coturn priority was mode-dependent:
- `one-by-one`: coturn at 1500 (delayed after homeserver)
- other modes: coturn at 900 (before homeserver)
This could still trigger undesirable startup ordering and confusing behavior
in non-`one-by-one` modes, especially during initial bootstrap/restart flows
where cert availability lags service startup.
This change makes ordering explicit and consistent:
1. Introduce `matrix_homeserver_systemd_service_manager_priority` (default 1000)
in `roles/custom/matrix-base/defaults/main.yml`.
2. Use that variable for the homeserver service entry in
`group_vars/matrix_servers`.
3. Set coturn priority relative to homeserver priority in all modes:
`matrix_homeserver_systemd_service_manager_priority + 500`.
4. Update inline documentation comments in `group_vars/matrix_servers` to
match the new behavior and rationale.
Result:
- Homeserver/coturn ordering is deterministic and mode-agnostic.
- Coturn is intentionally started later than the homeserver by default,
reducing first-start certificate wait/fail races.
- Priority intent is now centralized and configurable via a dedicated
homeserver priority variable.
- Coturn may still be stated earlier, because the homeserver typically
has a `Wants` "dependency" on it, but that's alright