| @@ -1,3 +1,12 @@ | |||
| # 2020-08-21 | |||
| ## rust-synapse-compress-state support | |||
| The playbook can now help you use [rust-synapse-compress-state](https://github.com/matrix-org/rust-synapse-compress-state) to compress the state groups in your Synapse database. | |||
| See our [Compressing state with rust-synapse-compress-state](docs/maintenance-synapse.md#compressing-state-with-rust-synapse-compress-state) documentation page to get started. | |||
| # 2020-07-22 | |||
| ## Synapse Admin support | |||
| @@ -22,6 +22,8 @@ If you are using an [external Postgres server](configuring-playbook-external-pos | |||
| ## Vacuuming PostgreSQL | |||
| Deleting lots of data from Postgres does not make it release disk space until you perform a `VACUUM` operation. | |||
| To perform a `FULL` Postgres [VACUUM](https://www.postgresql.org/docs/current/sql-vacuum.html), run the playbook with `--tags=run-postgres-vacuum`. | |||
| Example: | |||
| @@ -42,7 +44,7 @@ docker run \ | |||
| --rm \ | |||
| --network=matrix \ | |||
| --env-file=/matrix/postgres/env-postgres-psql \ | |||
| postgres:12.1-alpine \ | |||
| postgres:12.4-alpine \ | |||
| pg_dumpall -h matrix-postgres \ | |||
| | gzip -c \ | |||
| > /postgres.sql.gz | |||
| @@ -9,75 +9,74 @@ Table of contents: | |||
| - [Purging old data with the Purge History API](#purging-old-data-with-the-purge-history-api), for when you wish to delete in-use (but old) data from the Synapse database | |||
| - [Synapse maintenance](#synapse-maintenance) | |||
| - [Purging unused data with synapse-janitor](#purging-unused-data-with-synapse-janitor) | |||
| - [Vacuuming Postgres](#vacuuming-postgres) | |||
| - [Purging old data with the Purge History API](#purging-old-data-with-the-purge-history-api) | |||
| - [Compressing state with rust-synapse-compress-state](#compressing-state-with-rust-synapse-compress-state) | |||
| - [Purging unused data with synapse-janitor](#purging-unused-data-with-synapse-janitor) | |||
| - [Browse and manipulate the database](#browse-and-manipulate-the-database) | |||
| - [Browse and manipulate the database](#browse-and-manipulate-the-database), for when you really need to take matters into your own hands | |||
| ## Purging unused data with synapse-janitor | |||
| **NOTE**: There are [reports](https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/465) that **synapse-janitor is dangerous to use and causes database corruption**. You may wish to refrain from using it. | |||
| ## Purging old data with the Purge History API | |||
| When you **leave** and **forget** a room, Synapse can clean up its data, but currently doesn't. | |||
| This **unused and unreachable data** remains in your database forever. | |||
| You can use the **Purge History API** to delete in-use (but old) data. | |||
| There are external tools (like [synapse-janitor](https://github.com/xwiki-labs/synapse_scripts)), which are meant to solve this problem. | |||
| **This is destructive** (especially for non-federated rooms), because it means **people will no longer have access to history past a certain point**. | |||
| To ask the playbook to run synapse-janitor, execute: | |||
| Synapse's [Purge History API](https://github.com/matrix-org/synapse/blob/master/docs/admin_api/purge_history_api.rst) can be used to purge on a per-room basis. | |||
| ```bash | |||
| ansible-playbook -i inventory/hosts setup.yml --tags=run-postgres-synapse-janitor,start | |||
| ``` | |||
| To make use of this API, **you'll need an admin access token** first. You can find your access token in the settings of some clients (like Element). | |||
| Alternatively, you can log in and obtain a new access token like this: | |||
| **Note**: this will automatically stop Synapse temporarily and restart it later. | |||
| ```bash | |||
| curl \ | |||
| --data '{"identifier": {"type": "m.id.user", "user": "YOUR_MATRIX_USERNAME" }, "password": "YOUR_MATRIX_PASSWORD", "type": "m.login.password", "device_id": "Synapse-Purge-History-API"}' \ | |||
| https://matrix.DOMAIN/_matrix/client/r0/login | |||
| ``` | |||
| Follow the [Purge History API](https://github.com/matrix-org/synapse/blob/master/docs/admin_api/purge_history_api.rst) documentation page for the actual purging instructions. | |||
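A minimal sketch of what such a purge request could look like, assuming the `/_synapse/admin` API is reachable at `https://matrix.DOMAIN` (it may not be exposed publicly on your setup). `ACCESS_TOKEN`, `ROOM_ID` and the 90-day cutoff are placeholders chosen for illustration. The endpoint and the `delete_local_events` / `purge_up_to_ts` fields are the ones described in the linked API documentation; the room ID may need URL-encoding when used in the path.

```shell
# Placeholder values: replace with your admin token and the room to purge.
ACCESS_TOKEN="YOUR_ADMIN_ACCESS_TOKEN"
ROOM_ID="ROOM_ID_HERE"

# Cutoff: roughly 90 days ago, as a Unix timestamp in milliseconds.
RETENTION_DAYS=90
PURGE_UP_TO_TS=$(( ($(date +%s) - RETENTION_DAYS * 24 * 3600) * 1000 ))

# Request body: do not delete locally-sent events, purge up to the cutoff.
BODY="{\"delete_local_events\": false, \"purge_up_to_ts\": ${PURGE_UP_TO_TS}}"

# Defined as a function so that nothing is deleted until you call it explicitly.
purge_room_history() {
  curl \
    --request POST \
    --header "Authorization: Bearer ${ACCESS_TOKEN}" \
    --header "Content-Type: application/json" \
    --data "${BODY}" \
    "https://matrix.DOMAIN/_synapse/admin/v1/purge_history/${ROOM_ID}"
}

echo "Would send: ${BODY}"
```

Calling `purge_room_history` sends the request. Since purging is destructive, it is probably worth checking the printed body (and having a Postgres backup) before doing so.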
| ### Vacuuming Postgres | |||
| After deleting data, you may wish to run a [`FULL` Postgres `VACUUM`](./maintenance-postgres.md#vacuuming-postgresql). | |||
| Running synapse-janitor potentially deletes a lot of data from the Postgres database. | |||
| However, disk space only ever gets released after a [`FULL` Postgres `VACUUM`](./maintenance-postgres.md#vacuuming-postgresql). | |||
| It's easiest if you ask the playbook to run both synapse-janitor and a `VACUUM FULL` in one call: | |||
| ## Compressing state with rust-synapse-compress-state | |||
| ```bash | |||
| ansible-playbook -i inventory/hosts setup.yml --tags=run-postgres-synapse-janitor,run-postgres-vacuum,start | |||
| ``` | |||
| [rust-synapse-compress-state](https://github.com/matrix-org/rust-synapse-compress-state) can be used to optimize some `_state` tables used by Synapse. | |||
| **Note**: this will automatically stop Synapse temporarily and restart it later. You'll also need plenty of available disk space in your Postgres data directory (usually `/matrix/postgres/data`). | |||
| This tool should be safe to use (even when Synapse is running), but it's always a good idea to [make Postgres backups](./maintenance-postgres.md#backing-up-postgresql) first. | |||
| To ask the playbook to run rust-synapse-compress-state, execute: | |||
| ## Purging old data with the Purge History API | |||
| ```bash | |||
| ansible-playbook -i inventory/hosts setup.yml --tags=rust-synapse-compress-state | |||
| ``` | |||
| If [purging unused and unreachable data](#purging-unused-data-with-synapse-janitor) is not enough for you, you can start deleting in-use (but old) data. | |||
| By default, all rooms with more than `100000` state group rows will be compressed. | |||
| If you need to adjust this, pass: `--extra-vars='matrix_synapse_rust_synapse_compress_state_min_state_groups_required=SOME_NUMBER_HERE'` to the command above. | |||
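For instance, to lower the threshold to a hypothetical `50000` state group rows (the number is only an illustration, not a recommendation), the full invocation could be assembled as sketched here. The snippet only prints the command, so it can be reviewed before being run by hand.

```shell
# Hypothetical threshold; rooms with more state group rows than this get compressed.
MIN_STATE_GROUPS=50000

# Assemble the playbook invocation described above, with the threshold override.
COMMAND="ansible-playbook -i inventory/hosts setup.yml --tags=rust-synapse-compress-state --extra-vars='matrix_synapse_rust_synapse_compress_state_min_state_groups_required=${MIN_STATE_GROUPS}'"

echo "${COMMAND}"
```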
| **This is destructive** (especially for non-federated rooms), because it means **people will no longer have access to history past a certain point**. | |||
| After state compression, you may wish to run a [`FULL` Postgres `VACUUM`](./maintenance-postgres.md#vacuuming-postgresql). | |||
| Synapse provides a [Purge History API](https://github.com/matrix-org/synapse/blob/master/docs/admin_api/purge_history_api.rst) that you can use to purge on a per-room basis. | |||
| To make use of this API, **you'll need an admin access token** first. You can find your access token in the settings of some clients (like Element). | |||
| Alternatively, you can log in and obtain a new access token like this: | |||
| ## Purging unused data with synapse-janitor | |||
| ``` | |||
| curl \ | |||
| --data '{"identifier": {"type": "m.id.user", "user": "YOUR_MATRIX_USERNAME" }, "password": "YOUR_MATRIX_PASSWORD", "type": "m.login.password", "device_id": "Synapse-Purge-History-API"}' \ | |||
| https://matrix.DOMAIN/_matrix/client/r0/login | |||
| ``` | |||
| **NOTE**: There are [reports](https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/465) that **synapse-janitor is dangerous to use and causes database corruption**. You may wish to refrain from using it. | |||
| Follow the [Purge History API](https://github.com/matrix-org/synapse/blob/master/docs/admin_api/purge_history_api.rst) documentation page for the actual purging instructions. | |||
| When you **leave** and **forget** a room, Synapse can clean up its data, but currently doesn't. | |||
| This **unused and unreachable data** remains in your database forever. | |||
| Don't forget that disk space only ever gets released after a [`FULL` Postgres `VACUUM`](./maintenance-postgres.md#vacuuming-postgresql) - something the playbook can help you with. | |||
| There are external tools (like [synapse-janitor](https://github.com/xwiki-labs/synapse_scripts)), which are meant to solve this problem. | |||
| To ask the playbook to run synapse-janitor, execute: | |||
| ## Compressing state with rust-synapse-compress-state | |||
| ```bash | |||
| ansible-playbook -i inventory/hosts setup.yml --tags=run-postgres-synapse-janitor,start | |||
| ``` | |||
| [rust-synapse-compress-state](https://github.com/matrix-org/rust-synapse-compress-state) can be used to optimize some `_state` tables used by Synapse. | |||
| **Note**: this will automatically stop Synapse temporarily and restart it later. | |||
| Unfortunately, at this time the playbook can't help you run this **experimental tool**. | |||
| Running synapse-janitor potentially deletes a lot of data from the Postgres database. | |||
| You may wish to run a [`FULL` Postgres `VACUUM`](./maintenance-postgres.md#vacuuming-postgresql) after that. | |||
| Since it's also experimental, you may wish to stay away from it, or at least [make Postgres backups](./maintenance-postgres.md#backing-up-postgresql) first. | |||
| ## Browse and manipulate the database | |||
| @@ -101,6 +101,7 @@ run_postgres_vacuum: true | |||
| run_synapse_register_user: true | |||
| run_synapse_update_user_password: true | |||
| run_synapse_import_media_store: true | |||
| run_synapse_rust_synapse_compress_state: true | |||
| run_setup: true | |||
| run_self_check: true | |||
| run_start: true | |||
| @@ -364,6 +364,13 @@ matrix_synapse_redaction_retention_period: 7d | |||
| matrix_synapse_user_ips_max_age: 28d | |||
| matrix_synapse_rust_synapse_compress_state_docker_image: "devture/rust-synapse-compress-state:v0.1.0" | |||
| matrix_synapse_rust_synapse_compress_state_docker_image_force_pull: "{{ matrix_synapse_rust_synapse_compress_state_docker_image.endswith(':latest') }}" | |||
| matrix_synapse_rust_synapse_compress_state_base_path: "{{ matrix_base_data_path }}/rust-synapse-compress-state" | |||
| # Default Synapse configuration template which covers the generic use case. | |||
| # You can customize it by controlling the various variables inside it. | |||
| # | |||
| @@ -43,6 +43,11 @@ | |||
| tags: | |||
| - update-user-password | |||
| - import_tasks: "{{ role_path }}/tasks/rust-synapse-compress-state/main.yml" | |||
| when: run_synapse_rust_synapse_compress_state|bool | |||
| tags: | |||
| - rust-synapse-compress-state | |||
| - name: Mark matrix-synapse role as executed | |||
| set_fact: | |||
| matrix_synapse_role_executed: true | |||
| @@ -0,0 +1,48 @@ | |||
| - debug: | |||
| msg: "Compressing room `{{ room_details.room_id }}` having {{ room_details.count }} state group rows" | |||
| - name: Generate rust-synapse-compress-state room compression command | |||
| set_fact: | |||
| matrix_synapse_rust_synapse_compress_state_compress_room_command: >- | |||
| {{ matrix_host_command_docker }} run --rm --name matrix-rust-synapse-compress-state-compress-room | |||
| --user={{ matrix_user_uid }}:{{ matrix_user_gid }} | |||
| --cap-drop=ALL | |||
| --network={{ matrix_docker_network }} | |||
| -v {{ matrix_synapse_rust_synapse_compress_state_base_path }}:/work | |||
| {{ matrix_synapse_rust_synapse_compress_state_docker_image }} | |||
| /synapse-compress-state -t -o /work/state-compressor.sql | |||
| -p "host={{ matrix_synapse_database_host }} user={{ matrix_synapse_database_user }} password={{ matrix_synapse_database_password }} dbname={{ matrix_synapse_database_database }}" | |||
| -r '{{ room_details.room_id }}' | |||
| - name: Run rust-synapse-compress-state room compression command (SQL generation) | |||
| command: "{{ matrix_synapse_rust_synapse_compress_state_compress_room_command }}" | |||
| async: "{{ matrix_synapse_rust_synapse_compress_state_compress_room_time }}" | |||
| poll: 10 | |||
| register: matrix_synapse_rust_synapse_compress_state_compress_room_command_result | |||
| - debug: var="matrix_synapse_rust_synapse_compress_state_compress_room_command_result" | |||
| - name: Generate Postgres compression SQL import command | |||
| set_fact: | |||
| matrix_synapse_rust_synapse_compress_state_psql_import_command: >- | |||
| {{ matrix_host_command_docker }} run --rm --name matrix-rust-synapse-compress-state-psql-import | |||
| --user={{ matrix_user_uid }}:{{ matrix_user_gid }} | |||
| --cap-drop=ALL | |||
| --network={{ matrix_docker_network }} | |||
| --env-file={{ matrix_postgres_base_path }}/env-postgres-psql | |||
| -v {{ matrix_synapse_rust_synapse_compress_state_base_path }}:/work:ro | |||
| --entrypoint=/bin/sh | |||
| {{ matrix_postgres_docker_image_latest }} | |||
| -c "cat /work/state-compressor.sql | | |||
| psql -v ON_ERROR_STOP=1 -h matrix-postgres" | |||
| - name: Import compression SQL into Postgres | |||
| command: "{{ matrix_synapse_rust_synapse_compress_state_psql_import_command }}" | |||
| async: "{{ matrix_synapse_rust_synapse_compress_state_psql_import_time }}" | |||
| poll: 10 | |||
| register: matrix_synapse_rust_synapse_compress_state_psql_import_command_result | |||
| - name: Clean up | |||
| file: | |||
| path: "{{ matrix_synapse_rust_synapse_compress_state_base_path }}/state-compressor.sql" | |||
| state: absent | |||
| @@ -0,0 +1,118 @@ | |||
| # Pre-checks | |||
| - name: Fail if Postgres not enabled | |||
| fail: | |||
| msg: "Postgres via the matrix-postgres role is not enabled (`matrix_postgres_enabled`). Cannot use rust-synapse-compress-state." | |||
| when: "not matrix_postgres_enabled|bool" | |||
| # Defaults | |||
| - name: Set matrix_synapse_rust_synapse_compress_state_find_rooms_command_wait_time, if not provided | |||
| set_fact: | |||
| matrix_synapse_rust_synapse_compress_state_find_rooms_command_wait_time: 15 | |||
| when: "matrix_synapse_rust_synapse_compress_state_find_rooms_command_wait_time|default('') == ''" | |||
| - name: Set matrix_synapse_rust_synapse_compress_state_compress_room_time, if not provided | |||
| set_fact: | |||
| matrix_synapse_rust_synapse_compress_state_compress_room_time: 1800 | |||
| when: "matrix_synapse_rust_synapse_compress_state_compress_room_time|default('') == ''" | |||
| - name: Set matrix_synapse_rust_synapse_compress_state_psql_import_time, if not provided | |||
| set_fact: | |||
| matrix_synapse_rust_synapse_compress_state_psql_import_time: 1800 | |||
| when: "matrix_synapse_rust_synapse_compress_state_psql_import_time|default('') == ''" | |||
| - name: Set matrix_synapse_rust_synapse_compress_state_min_state_groups_required, if not provided | |||
| set_fact: | |||
| # The minimum number of state groups we're looking for before we consider a room eligible for compression. | |||
| # Rooms with a smaller state groups count will not be compressed. | |||
| matrix_synapse_rust_synapse_compress_state_min_state_groups_required: 100000 | |||
| when: "matrix_synapse_rust_synapse_compress_state_min_state_groups_required|default('') == ''" | |||
| # Actual compression work | |||
| - name: Ensure rust-synapse-compress-state paths exist | |||
| file: | |||
| path: "{{ matrix_synapse_rust_synapse_compress_state_base_path }}" | |||
| state: directory | |||
| mode: 0750 | |||
| owner: "{{ matrix_user_username }}" | |||
| group: "{{ matrix_user_groupname }}" | |||
| - name: Ensure rust-synapse-compress-state image is pulled | |||
| docker_image: | |||
| name: "{{ matrix_synapse_rust_synapse_compress_state_docker_image }}" | |||
| source: "{{ 'pull' if ansible_version.major > 2 or ansible_version.minor > 7 else omit }}" | |||
| force_source: "{{ matrix_synapse_rust_synapse_compress_state_docker_image_force_pull if ansible_version.major > 2 or ansible_version.minor >= 8 else omit }}" | |||
| force: "{{ omit if ansible_version.major > 2 or ansible_version.minor >= 8 else matrix_synapse_rust_synapse_compress_state_docker_image_force_pull }}" | |||
| - name: Generate rust-synapse-compress-state room find command | |||
| set_fact: | |||
| matrix_synapse_rust_synapse_compress_state_find_rooms_command: >- | |||
| {{ matrix_host_command_docker }} run --rm --name matrix-rust-synapse-compress-state-find-rooms | |||
| --user={{ matrix_user_uid }}:{{ matrix_user_gid }} | |||
| --cap-drop=ALL | |||
| --network={{ matrix_docker_network }} | |||
| --env-file={{ matrix_postgres_base_path }}/env-postgres-psql | |||
| {{ matrix_postgres_docker_image_latest }} | |||
| psql -v ON_ERROR_STOP=1 -h matrix-postgres {{ matrix_synapse_database_database }} -c | |||
| 'SELECT array_to_json(array_agg(row_to_json (r))) FROM (SELECT room_id, count(*) AS count FROM state_groups_state GROUP BY room_id HAVING count(*) > {{ matrix_synapse_rust_synapse_compress_state_min_state_groups_required }} ORDER BY count DESC) r;' | |||
| - name: Find rooms eligible for compression with rust-synapse-compress-state | |||
| command: "{{ matrix_synapse_rust_synapse_compress_state_find_rooms_command }}" | |||
| async: "{{ matrix_synapse_rust_synapse_compress_state_find_rooms_command_wait_time }}" | |||
| poll: 10 | |||
| register: matrix_synapse_rust_synapse_compress_state_find_rooms_command_result | |||
| # We expect the output to be like this: | |||
| # | |||
| # "stdout_lines": [ | |||
| # " array_to_json ", | |||
| # "----------------------------------------------------------------------------------------------------------------------------", | |||
| # " [{\"room_id\":\"!some-id\",\"count\":2461329},{\"room_id\":\"!another-id\",\"count\":512017}]", | |||
| # "(1 row)" | |||
| # ] | |||
| # | |||
| # Row 3 (out of 4) contains the actual result. | |||
| # | |||
| # Row 3 contains a space when there's no result. | |||
| - block: | |||
| - debug: var="matrix_synapse_rust_synapse_compress_state_find_rooms_command_result" | |||
| - name: Fail if room find result is not what we expect | |||
| fail: | |||
| msg: >- | |||
| Expecting 4 lines in the "find rooms" result. | |||
| when: "matrix_synapse_rust_synapse_compress_state_find_rooms_command_result.failed or matrix_synapse_rust_synapse_compress_state_find_rooms_command_result.stdout_lines|length != 4" | |||
| - block: | |||
| # matrix_synapse_rust_synapse_compress_state_eligible_rooms is a list | |||
| # of dictionaries like this: {'room_id': '!some-id', 'count': 2461329} | |||
| - set_fact: | |||
| matrix_synapse_rust_synapse_compress_state_eligible_rooms: "{{ matrix_synapse_rust_synapse_compress_state_find_rooms_command_result.stdout_lines[2] | from_json }}" | |||
| - name: Display rooms that will be compressed | |||
| debug: | |||
| msg: >- | |||
| The following rooms contain more than {{ matrix_synapse_rust_synapse_compress_state_min_state_groups_required }} state group rows | |||
| (configurable via `matrix_synapse_rust_synapse_compress_state_min_state_groups_required`) | |||
| and will be compressed: | |||
| {{ matrix_synapse_rust_synapse_compress_state_eligible_rooms }} | |||
| - name: Compress room state | |||
| include_tasks: "{{ role_path }}/tasks/rust-synapse-compress-state/compress_room.yml" | |||
| with_items: "{{ matrix_synapse_rust_synapse_compress_state_eligible_rooms }}" | |||
| loop_control: | |||
| loop_var: room_details | |||
| when: "matrix_synapse_rust_synapse_compress_state_find_rooms_command_result.stdout_lines[2] != ' '" | |||
| - name: Show notice about lack of rooms to compress | |||
| debug: | |||
| msg: >- | |||
| No rooms were found to contain more than {{ matrix_synapse_rust_synapse_compress_state_min_state_groups_required }} state group rows | |||
| (configurable via `matrix_synapse_rust_synapse_compress_state_min_state_groups_required`), | |||
| so there's nothing to compress. | |||
| when: "matrix_synapse_rust_synapse_compress_state_find_rooms_command_result.stdout_lines[2] == ' '" | |||