* feat: support container_network=host across all roles + systemd templates
Mirror the pattern Slavi introduced for matrix-coturn (aafa8f0) across the
fork: every 'Ensure X container network is created' task gets a
'when: <var> not in ["", "host"]' guard so MDAD does not try to
docker_network create a network literally named 'host' (returns 403,
since host is a pre-defined Docker network).
Mirror the same guard in every systemd unit template that does
'ExecStartPre=docker network connect <addnet> <container>' loops over
matrix_<role>_container_additional_networks: skip the connects when the
container is on host networking (where additional --network attaches
are invalid).
Unblocks DiD setups where MDAD-managed containers share their host's
network namespace (matrix-mdad outer compose service joined to central
postgres/openldap networks) to reach external services on the outer
Docker daemon.
* Simplify container network guards (!= 'host') and fix duplicate when
Guarding on the empty string ('') as well was misleading: systemd unit
templates still render an unconditional --network= flag, so an empty
network value produces a broken docker create command. Only 'host' is
actually supported, so only guard on that. This also matches the
existing convention in the Traefik role
(when: traefik_container_network != 'host').
Also fix a duplicate when key in the meshtastic-relay role, where the
network-creation task already had a when condition - the two are now
combined into a list.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
---------
Co-authored-by: Slavi Pantaleev <slavi@devture.com>
Co-authored-by: Claude Fable 5 <noreply@anthropic.com>
MAS now connects to the playbook-managed Postgres via a UNIX socket by
default (when available), matching the approach already used by Synapse.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
People often report and ask about these "failures".
More-so previously, when the `docker kill/rm` output was collected,
but it still happens now when people do `systemctl status
matrix-something` and notice that it says "FAILURE".
Suppressing to avoid further time being wasted on saying "this is
expected".
Reverts b1b4ba501fdfaa, 90c9801c560b6, a3c84f78ca9c65a, ..
I haven't really traced it (yet), but on some servers, I'm observing
`ansible-playbook ... --tags=start` completing very slowly, waiting
to stop services. I can't reproduce this on all Matrix servers I manage.
I suspect that either the systemd version is to blame or that some
specific service is not responding well to some `docker kill/rm` command.
`ExecStop` seems to work great in all cases and it's what we've been
using for a very long time, so I'm reverting to that.
These are just defensive cleanup tasks that we run.
In the good case, there's nothing to kill or remove, so they trigger an
error like this:
> Error response from daemon: Cannot kill container: something: No such container: something
and:
> Error: No such container: something
People often ask us if this is a problem, so instead of always having to
answer with "no, this is to be expected", we'd rather eliminate it now
and make logs cleaner.
In the event that:
- a container is really stuck and needs cleanup using kill/rm
- and cleanup fails, and we fail to report it because of error
suppression (`2>/dev/null`)
.. we'd still get an error when launching ("container name already in use .."),
so it shouldn't be too hard to investigate.
The Docker 19.04 -> 20.10 upgrade contains the following change
in `/usr/lib/systemd/system/docker.service`:
```
-BindsTo=containerd.service
-After=network-online.target firewalld.service containerd.service
+After=network-online.target firewalld.service containerd.service multi-user.target
-Requires=docker.socket
+Requires=docker.socket containerd.service
Wants=network-online.target
```
The `multi-user.target` requirement in `After` seems to be in conflict
with our `WantedBy=multi-user.target` and `After=docker.service` /
`Requires=docker.service` definitions, causing the following error on
startup for all of our systemd services:
> Job matrix-synapse.service/start deleted to break ordering cycle starting with multi-user.target/start
A workaround which appears to work is to add `DefaultDependencies=no`
to all of our services.