Add workers settings to configuration manual (#14086)

* Add workers settings to configuration manual
* Update `pusher_instances`
* update url to python logger
* update headlines
* update links after headline change
* remove link from `daemon process`

There are no docs in Synapse for this

* extend example for `federation_sender_instances` and `pusher_instances`
* more infos about stream writers
* add link to DAG
* update `pusher_instances`
* update `worker_listeners`
* update `stream_writers`
* Update `worker_name`

Co-authored-by: David Robertson <davidr@element.io>
Dirk Klimpel 2022-10-27 15:39:47 +02:00 committed by GitHub
parent 4dc05f3019
commit 1357ae869f
5 changed files with 291 additions and 82 deletions

changelog.d/14086.doc Normal file

@ -0,0 +1 @@
Add workers settings to [configuration manual](https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#individual-worker-configuration).


@ -6,7 +6,7 @@
# Synapse also supports structured logging for machine readable logs which can
# be ingested by ELK stacks. See [2] for details.
#
# [1]: https://docs.python.org/3.7/library/logging.config.html#configuration-dictionary-schema
# [1]: https://docs.python.org/3/library/logging.config.html#configuration-dictionary-schema
# [2]: https://matrix-org.github.io/synapse/latest/structured_logging.html
version: 1


@ -99,7 +99,7 @@ modules:
config: {}
```
---
## Server ##
## Server
Define your homeserver name and other base options.
@ -159,7 +159,7 @@ including _matrix/...). This is the same URL a user might enter into the
'Custom Homeserver URL' field on their client. If you use Synapse with a
reverse proxy, this should be the URL to reach Synapse via the proxy.
Otherwise, it should be the URL to reach Synapse's client HTTP listener (see
'listeners' below).
['listeners'](#listeners) below).
Defaults to `https://<server_name>/`.
@ -570,7 +570,7 @@ Example configuration:
delete_stale_devices_after: 1y
```
## Homeserver blocking ##
## Homeserver blocking
Useful options for Synapse admins.
---
@ -922,7 +922,7 @@ retention:
interval: 1d
```
---
## TLS ##
## TLS
Options related to TLS.
@ -1012,7 +1012,7 @@ federation_custom_ca_list:
- myCA3.pem
```
---
## Federation ##
## Federation
Options related to federation.
@ -1071,7 +1071,7 @@ Example configuration:
allow_device_name_lookup_over_federation: true
```
---
## Caching ##
## Caching
Options related to caching.
@ -1185,7 +1185,7 @@ file in Synapse's `contrib` directory, you can send a `SIGHUP` signal by using
`systemctl reload matrix-synapse`.
---
## Database ##
## Database
Config options related to database settings.
---
@ -1332,20 +1332,21 @@ databases:
cp_max: 10
```
---
## Logging ##
## Logging
Config options related to logging.
---
### `log_config`
This option specifies a yaml python logging config file as described [here](https://docs.python.org/3.7/library/logging.config.html#configuration-dictionary-schema).
This option specifies a yaml python logging config file as described
[here](https://docs.python.org/3/library/logging.config.html#configuration-dictionary-schema).
Example configuration:
```yaml
log_config: "CONFDIR/SERVERNAME.log.config"
```
---
## Ratelimiting ##
## Ratelimiting
Options related to ratelimiting in Synapse.
Each ratelimiting configuration is made of two parameters:
@ -1576,7 +1577,7 @@ Example configuration:
federation_rr_transactions_per_room_per_second: 40
```
---
## Media Store ##
## Media Store
Config options related to Synapse's media store.
---
@ -1766,7 +1767,7 @@ url_preview_ip_range_blacklist:
- 'ff00::/8'
- 'fec0::/10'
```
----
---
### `url_preview_ip_range_whitelist`
This option sets a list of IP address CIDR ranges that the URL preview spider is allowed
@ -1860,7 +1861,7 @@ Example configuration:
- 'fr;q=0.8'
- '*;q=0.7'
```
----
---
### `oembed`
oEmbed allows for easier embedding content from a website. It can be
@ -1877,7 +1878,7 @@ oembed:
- oembed/my_providers.json
```
---
## Captcha ##
## Captcha
See [here](../../CAPTCHA_SETUP.md) for full details on setting up captcha.
@ -1926,7 +1927,7 @@ Example configuration:
recaptcha_siteverify_api: "https://my.recaptcha.site"
```
---
## TURN ##
## TURN
Options related to adding a TURN server to Synapse.
---
@ -1947,7 +1948,7 @@ Example configuration:
```yaml
turn_shared_secret: "YOUR_SHARED_SECRET"
```
----
---
### `turn_username` and `turn_password`
The Username and password if the TURN server needs them and does not use a token.
@ -2366,7 +2367,7 @@ Example configuration:
```yaml
session_lifetime: 24h
```
----
---
### `refresh_access_token_lifetime`
Time that an access token remains valid for, if the session is using refresh tokens.
@ -2422,7 +2423,7 @@ nonrefreshable_access_token_lifetime: 24h
```
---
## Metrics ###
## Metrics
Config options related to metrics.
---
@ -2519,7 +2520,7 @@ Example configuration:
report_stats_endpoint: https://example.com/report-usage-stats/push
```
---
## API Configuration ##
## API Configuration
Config settings related to the client/server API
---
@ -2619,7 +2620,7 @@ Example configuration:
form_secret: <PRIVATE STRING>
```
---
## Signing Keys ##
## Signing Keys
Config options relating to signing keys
---
@ -2728,7 +2729,7 @@ Example configuration:
key_server_signing_keys_path: "key_server_signing_keys.key"
```
---
## Single sign-on integration ##
## Single sign-on integration
The following settings can be used to make Synapse use a single sign-on
provider for authentication, instead of its internal password database.
@ -3348,7 +3349,7 @@ email:
email_validation: "[%(server_name)s] Validate your email"
```
---
## Push ##
## Push
Configuration settings related to push notifications
---
@ -3381,7 +3382,7 @@ push:
group_unread_count_by_room: false
```
---
## Rooms ##
## Rooms
Config options relating to rooms.
---
@ -3627,7 +3628,7 @@ default_power_level_content_override:
```
---
## Opentracing ##
## Opentracing
Configuration options related to Opentracing support.
---
@ -3670,14 +3671,71 @@ opentracing:
false
```
---
## Workers ##
Configuration options related to workers.
## Coordinating workers
Configuration options related to workers which belong in the main config file
(usually called `homeserver.yaml`).
A Synapse deployment can scale horizontally by running multiple Synapse processes
called _workers_. Incoming requests are distributed between workers to handle higher
loads. Some workers are privileged and can accept requests from other workers.
As a result, the worker configuration is divided into two parts.
1. The first part (in this section of the manual) defines which shardable tasks
are delegated to privileged workers. This allows unprivileged workers to
request that a privileged worker act on their behalf.
1. [The second part](#individual-worker-configuration)
controls the behaviour of individual workers in isolation.
For guidance on setting up workers, see the [worker documentation](../../workers.md).
---
### `worker_replication_secret`
A shared secret used by the replication APIs on the main process to authenticate
HTTP requests from workers.
By default, this value is omitted (equivalently `null`), which means that
traffic between the workers and the main process is not authenticated.
Example configuration:
```yaml
worker_replication_secret: "secret_secret"
```
---
### `start_pushers`
Controls sending of push notifications on the main process. Set to `false`
if using a [pusher worker](../../workers.md#synapseapppusher). Defaults to `true`.
Example configuration:
```yaml
start_pushers: false
```
---
### `pusher_instances`
It is possible to run multiple [pusher workers](../../workers.md#synapseapppusher),
in which case the work is balanced across them. Use this setting to list the pushers by
[`worker_name`](#worker_name). Ensure the main process and all pusher workers are
restarted after changing this option.
If no or only one pusher worker is configured, this setting is not necessary.
The main process will send out push notifications by default if you do not disable
it by setting [`start_pushers: false`](#start_pushers).
Example configuration:
```yaml
start_pushers: false
pusher_instances:
- pusher_worker1
- pusher_worker2
```
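Each name listed in `pusher_instances` has to match the `worker_name` in that worker's own config file. A minimal sketch of the worker-side file for the first pusher, assuming the main process exposes its HTTP replication listener on `127.0.0.1:9093` (the address and port are illustrative assumptions, not values this section prescribes):

```yaml
# Worker config file for the first pusher (the address below is assumed).
worker_app: synapse.app.pusher
# Must match the corresponding entry in `pusher_instances` in the shared config.
worker_name: pusher_worker1

# Assumed location of the main process's HTTP replication listener.
worker_replication_host: 127.0.0.1
worker_replication_http_port: 9093
```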
---
### `send_federation`
Controls sending of outbound federation transactions on the main process.
Set to false if using a federation sender worker. Defaults to true.
Set to `false` if using a [federation sender worker](../../workers.md#synapseappfederation_sender).
Defaults to `true`.
Example configuration:
```yaml
@ -3686,8 +3744,9 @@ send_federation: false
---
### `federation_sender_instances`
It is possible to run multiple federation sender workers, in which case the
work is balanced across them. Use this setting to list the senders.
It is possible to run multiple
[federation sender workers](../../workers.md#synapseappfederation_sender), in which
case the work is balanced across them. Use this setting to list the senders.
This configuration setting must be shared between all federation sender workers, and if
changed all federation sender workers must be stopped at the same time and then
@ -3696,14 +3755,19 @@ events may be dropped).
Example configuration:
```yaml
send_federation: false
federation_sender_instances:
- federation_sender1
```
---
### `instance_map`
When using workers this should be a map from worker name to the
When using workers this should be a map from [`worker_name`](#worker_name) to the
HTTP replication listener of the worker, if configured.
Each worker declared under [`stream_writers`](../../workers.md#stream-writers) needs
an HTTP replication listener, and that listener should be included in the `instance_map`.
(The main process also needs an HTTP replication listener, but it should not be
listed in the `instance_map`.)
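To illustrate how `instance_map` and `stream_writers` relate, here is a sketch of a shared configuration that hands the `events` stream to one dedicated worker. The worker name `event_persister1`, the host and the port are assumptions chosen for the example; the port would need to match the `replication` listener in that worker's own config.

```yaml
# Shared configuration (main config file). Names and ports are illustrative.
instance_map:
  # Key: the worker's `worker_name`. Value: where its HTTP replication listener can be reached.
  event_persister1:
    host: localhost
    port: 8034

stream_writers:
  # The worker named here must also appear in `instance_map` above.
  events: event_persister1
```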
Example configuration:
```yaml
@ -3716,8 +3780,11 @@ instance_map:
### `stream_writers`
Experimental: When using workers you can define which workers should
handle event persistence and typing notifications. Any worker
specified here must also be in the `instance_map`.
handle writing to streams such as event persistence and typing notifications.
Any worker specified here must also be in the [`instance_map`](#instance_map).
See the list of available streams in the
[worker documentation](../../workers.md#stream-writers).
Example configuration:
```yaml
@ -3728,29 +3795,18 @@ stream_writers:
---
### `run_background_tasks_on`
The worker that is used to run background tasks (e.g. cleaning up expired
data). If not provided this defaults to the main process.
The [worker](../../workers.md#background-tasks) that is used to run
background tasks (e.g. cleaning up expired data). If not provided this
defaults to the main process.
Example configuration:
```yaml
run_background_tasks_on: worker1
```
---
### `worker_replication_secret`
A shared secret used by the replication APIs to authenticate HTTP requests
from workers.
By default this is unused and traffic is not authenticated.
Example configuration:
```yaml
worker_replication_secret: "secret_secret"
```
### `redis`
Configuration for Redis when using workers. This *must* be enabled when
using workers (unless using old style direct TCP configuration).
Configuration for Redis when using workers. This *must* be enabled when using workers.
This setting has the following sub-options:
* `enabled`: whether to use Redis support. Defaults to false.
* `host` and `port`: Optional host and port to use to connect to redis. Defaults to
@ -3765,7 +3821,123 @@ redis:
port: 6379
password: <secret_password>
```
## Background Updates ##
---
## Individual worker configuration
These options configure an individual worker, in its worker configuration file.
They should not be provided when configuring the main process.
Note also the configuration above for
[coordinating a cluster of workers](#coordinating-workers).
For guidance on setting up workers, see the [worker documentation](../../workers.md).
---
### `worker_app`
The type of worker. The currently available worker applications are listed
in [worker documentation](../../workers.md#available-worker-applications).
The most common worker is the
[`synapse.app.generic_worker`](../../workers.md#synapseappgeneric_worker).
Example configuration:
```yaml
worker_app: synapse.app.generic_worker
```
---
### `worker_name`
A unique name for the worker. The name is used to refer to the worker in
other settings and to identify it in log files. We strongly recommend
giving each worker a unique `worker_name`.
Example configuration:
```yaml
worker_name: generic_worker1
```
---
### `worker_replication_host`
The HTTP replication endpoint that the worker should talk to on the main Synapse process.
The main Synapse process defines this with a `replication` resource in the
[`listeners` option](#listeners).
Example configuration:
```yaml
worker_replication_host: 127.0.0.1
```
---
### `worker_replication_http_port`
The HTTP replication port that the worker should talk to on the main Synapse process.
The main Synapse process defines this with a `replication` resource in the
[`listeners` option](#listeners).
Example configuration:
```yaml
worker_replication_http_port: 9093
```
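The host and port given in the worker config have to line up with a `replication` listener on the main process. A sketch of both sides, written as two YAML documents for brevity (in practice they live in separate files); the bind address and port `9093` are assumptions chosen to match the example above:

```yaml
# Main process config: an HTTP listener exposing the `replication` resource.
listeners:
  - type: http
    port: 9093
    bind_addresses: ['127.0.0.1']
    resources:
      - names: [replication]
---
# Worker config: points at the listener defined above.
worker_replication_host: 127.0.0.1
worker_replication_http_port: 9093
```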
---
### `worker_listeners`
A worker can handle HTTP requests. To do so, a `worker_listeners` option
must be declared, in the same way as the [`listeners` option](#listeners)
in the shared config.
Workers declared in [`stream_writers`](#stream_writers) will need to include a
`replication` listener here, in order to accept internal HTTP requests from
other workers.
Example configuration:
```yaml
worker_listeners:
  - type: http
    port: 8083
    resources:
      - names: [client, federation]
```
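For a worker that is listed in `stream_writers`, a second listener carrying the `replication` resource could sit alongside the client-facing one. This is only a sketch: port `8034` and the loopback bind address are assumptions, and the port would have to match that worker's entry in `instance_map`.

```yaml
worker_listeners:
  # Client and federation traffic, typically routed here by the reverse proxy.
  - type: http
    port: 8083
    resources:
      - names: [client, federation]
  # Internal HTTP replication requests from other workers (assumed port).
  - type: http
    port: 8034
    bind_addresses: ['127.0.0.1']
    resources:
      - names: [replication]
```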
---
### `worker_daemonize`
Specifies whether the worker should be started as a daemon process.
If Synapse is being managed by [systemd](../../systemd-with-workers/README.md), this option
must be omitted or set to `false`.
Defaults to `false`.
Example configuration:
```yaml
worker_daemonize: true
```
---
### `worker_pid_file`
When running a worker as a daemon, we need a place to store the
[PID](https://en.wikipedia.org/wiki/Process_identifier) of the worker.
This option defines the location of that "pid file".
This option is required if `worker_daemonize` is `true` and ignored
otherwise. It has no default.
See also the [`pid_file` option](#pid_file) for the main Synapse process.
Example configuration:
```yaml
worker_pid_file: DATADIR/generic_worker1.pid
```
---
### `worker_log_config`
This option specifies a yaml python logging config file as described
[here](https://docs.python.org/3/library/logging.config.html#configuration-dictionary-schema).
See also the [`log_config` option](#log_config) for the main Synapse process.
Example configuration:
```yaml
worker_log_config: /etc/matrix-synapse/generic-worker-log.yaml
```
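Put together, the options in this section might form a worker config file like the following sketch. It reuses the example values from the individual options above; none of them are required values, and `worker_daemonize` and `worker_pid_file` are left out on the assumption that the worker is managed by systemd.

```yaml
# Sketch of a complete worker config file, built from the examples above.
worker_app: synapse.app.generic_worker
worker_name: generic_worker1

# Where to reach the main process's HTTP replication listener.
worker_replication_host: 127.0.0.1
worker_replication_http_port: 9093

# HTTP requests this worker handles itself.
worker_listeners:
  - type: http
    port: 8083
    resources:
      - names: [client, federation]

worker_log_config: /etc/matrix-synapse/generic-worker-log.yaml
```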
---
## Background Updates
Configuration settings related to background updates.
---


@ -88,10 +88,12 @@ shared configuration file.
### Shared configuration
Normally, only a couple of changes are needed to make an existing configuration
file suitable for use with workers. First, you need to enable an "HTTP replication
listener" for the main process; and secondly, you need to enable redis-based
replication. Optionally, a shared secret can be used to authenticate HTTP
traffic between workers. For example:
file suitable for use with workers. First, you need to enable an
["HTTP replication listener"](usage/configuration/config_documentation.md#listeners)
for the main process; and secondly, you need to enable
[redis-based replication](usage/configuration/config_documentation.md#redis).
Optionally, a [shared secret](usage/configuration/config_documentation.md#worker_replication_secret)
can be used to authenticate HTTP traffic between workers. For example:
```yaml
# extend the existing `listeners` section. This defines the ports that the
@ -111,25 +113,28 @@ redis:
enabled: true
```
See the [configuration manual](usage/configuration/config_documentation.html) for the full documentation of each option.
See the [configuration manual](usage/configuration/config_documentation.md)
for the full documentation of each option.
Under **no circumstances** should the replication listener be exposed to the
public internet; replication traffic is:
* always unencrypted
* unauthenticated, unless `worker_replication_secret` is configured
* unauthenticated, unless [`worker_replication_secret`](usage/configuration/config_documentation.md#worker_replication_secret)
is configured
### Worker configuration
In the config file for each worker, you must specify:
* The type of worker (`worker_app`). The currently available worker applications are listed below.
* A unique name for the worker (`worker_name`).
* The type of worker ([`worker_app`](usage/configuration/config_documentation.md#worker_app)).
The currently available worker applications are listed [below](#available-worker-applications).
* A unique name for the worker ([`worker_name`](usage/configuration/config_documentation.md#worker_name)).
* The HTTP replication endpoint that it should talk to on the main synapse process
(`worker_replication_host` and `worker_replication_http_port`)
* If handling HTTP requests, a `worker_listeners` option with an `http`
listener, in the same way as the [`listeners`](usage/configuration/config_documentation.md#listeners)
option in the shared config.
([`worker_replication_host`](usage/configuration/config_documentation.md#worker_replication_host) and
[`worker_replication_http_port`](usage/configuration/config_documentation.md#worker_replication_http_port)).
* If handling HTTP requests, a [`worker_listeners`](usage/configuration/config_documentation.md#worker_listeners) option
with an `http` listener.
* If handling the `^/_matrix/client/v3/keys/upload` endpoint, the HTTP URI for
the main process (`worker_main_http_uri`).
@ -146,7 +151,6 @@ plain HTTP endpoint on port 8083 separately serving various endpoints, e.g.
Obviously you should configure your reverse-proxy to route the relevant
endpoints to the worker (`localhost:8083` in the above example).
### Running Synapse with workers
Finally, you need to start your worker processes. This can be done with either
@ -288,7 +292,8 @@ For multiple workers not handling the SSO endpoints properly, see
[#9427](https://github.com/matrix-org/synapse/issues/9427).
Note that an [HTTP listener](usage/configuration/config_documentation.md#listeners)
with `client` and `federation` `resources` must be configured in the `worker_listeners`
with `client` and `federation` `resources` must be configured in the
[`worker_listeners`](usage/configuration/config_documentation.md#worker_listeners)
option in the worker config.
#### Load balancing
@ -331,9 +336,10 @@ of the main process to a particular worker.
To enable this, the worker must have an
[HTTP `replication` listener](usage/configuration/config_documentation.md#listeners) configured,
have a `worker_name` and be listed in the `instance_map` config. The same worker
can handle multiple streams, but unless otherwise documented, each stream can only
have a single writer.
have a [`worker_name`](usage/configuration/config_documentation.md#worker_name)
and be listed in the [`instance_map`](usage/configuration/config_documentation.md#instance_map)
config. The same worker can handle multiple streams, but unless otherwise documented,
each stream can only have a single writer.
For example, to move event persistence off to a dedicated worker, the shared
configuration would include:
@ -360,9 +366,26 @@ streams and the endpoints associated with them:
##### The `events` stream
The `events` stream experimentally supports having multiple writers, where work
is sharded between them by room ID. Note that you *must* restart all worker
instances when adding or removing event persisters. An example `stream_writers`
The `events` stream experimentally supports having multiple writer workers, where load
is sharded between them by room ID. Each writer is called an _event persister_. They are
responsible for
- receiving new events,
- linking them to those already in the room [DAG](development/room-dag-concepts.md),
- persisting them to the DB, and finally
- updating the events stream.
Because load is sharded in this way, you *must* restart all worker instances when
adding or removing event persisters.
An `event_persister` should not be mistaken for an `event_creator`.
An `event_creator` listens for requests from clients to create new events and does
so. It will then pass those events over HTTP replication to any configured event
persisters (or the main process if none are configured).
Note that `event_creator`s and `event_persister`s are implemented using the same
[`synapse.app.generic_worker`](#synapseappgeneric_worker).
An example [`stream_writers`](usage/configuration/config_documentation.md#stream_writers)
configuration with multiple writers:
```yaml
@ -416,16 +439,18 @@ worker. Background tasks are run periodically or started via replication. Exactl
which tasks are configured to run depends on your Synapse configuration (e.g. if
stats is enabled). This worker doesn't handle any REST endpoints itself.
To enable this, the worker must have a `worker_name` and can be configured to run
background tasks. For example, to move background tasks to a dedicated worker,
the shared configuration would include:
To enable this, the worker must have a unique
[`worker_name`](usage/configuration/config_documentation.md#worker_name)
and can be configured to run background tasks. For example, to move background tasks
to a dedicated worker, the shared configuration would include:
```yaml
run_background_tasks_on: background_worker
```
You might also wish to investigate the `update_user_directory_from_worker` and
`media_instance_running_background_jobs` settings.
You might also wish to investigate the
[`update_user_directory_from_worker`](#updating-the-user-directory) and
[`media_instance_running_background_jobs`](#synapseappmedia_repository) settings.
An example for a dedicated background worker instance:
@ -478,13 +503,17 @@ worker application type.
### `synapse.app.pusher`
Handles sending push notifications to sygnal and email. Doesn't handle any
REST endpoints itself, but you should set `start_pushers: False` in the
REST endpoints itself, but you should set
[`start_pushers: false`](usage/configuration/config_documentation.md#start_pushers) in the
shared configuration file to stop the main synapse sending push notifications.
To run multiple instances at once the `pusher_instances` option should list all
pusher instances by their worker name, e.g.:
To run multiple instances at once the
[`pusher_instances`](usage/configuration/config_documentation.md#pusher_instances)
option should list all pusher instances by their
[`worker_name`](usage/configuration/config_documentation.md#worker_name), e.g.:
```yaml
start_pushers: false
pusher_instances:
- pusher_worker1
- pusher_worker2
@ -512,15 +541,20 @@ Note this worker cannot be load-balanced: only one instance should be active.
### `synapse.app.federation_sender`
Handles sending federation traffic to other servers. Doesn't handle any
REST endpoints itself, but you should set `send_federation: False` in the
shared configuration file to stop the main synapse sending this traffic.
REST endpoints itself, but you should set
[`send_federation: false`](usage/configuration/config_documentation.md#send_federation)
in the shared configuration file to stop the main synapse sending this traffic.
If running multiple federation senders then you must list each
instance in the `federation_sender_instances` option by their `worker_name`.
instance in the
[`federation_sender_instances`](usage/configuration/config_documentation.md#federation_sender_instances)
option by their
[`worker_name`](usage/configuration/config_documentation.md#worker_name).
All instances must be stopped and started when adding or removing instances.
For example:
```yaml
send_federation: false
federation_sender_instances:
- federation_sender1
- federation_sender2
@ -547,7 +581,9 @@ Handles the media repository. It can handle all endpoints starting with:
^/_synapse/admin/v1/quarantine_media/.*$
^/_synapse/admin/v1/users/.*/media$
You should also set `enable_media_repo: False` in the shared configuration
You should also set
[`enable_media_repo: False`](usage/configuration/config_documentation.md#enable_media_repo)
in the shared configuration
file to stop the main synapse running background jobs related to managing the
media repository. Note that doing so will prevent the main process from being
able to handle the above endpoints.


@ -53,7 +53,7 @@ DEFAULT_LOG_CONFIG = Template(
# Synapse also supports structured logging for machine readable logs which can
# be ingested by ELK stacks. See [2] for details.
#
# [1]: https://docs.python.org/3.7/library/logging.config.html#configuration-dictionary-schema
# [1]: https://docs.python.org/3/library/logging.config.html#configuration-dictionary-schema
# [2]: https://matrix-org.github.io/synapse/latest/structured_logging.html
version: 1