Scaling synapse via workers
===========================

Synapse has experimental support for splitting out functionality into
multiple separate python processes, helping greatly with scalability. These
processes are called 'workers', and are (eventually) intended to scale
horizontally independently.

All of the below is highly experimental and subject to change as Synapse evolves,
but it is documented here to help folks needing highly scalable Synapses similar
to the one running matrix.org!

All processes continue to share the same database instance, and as such, workers
only work with postgres-based synapse deployments (sharing a single sqlite
database across multiple processes is a recipe for disaster, plus you should be
using postgres anyway if you care about scalability).

The workers communicate with the master synapse process via a synapse-specific
TCP protocol called 'replication' - analogous to MySQL or Postgres style
database replication; feeding a stream of relevant data to the workers so they
can be kept in sync with the main synapse process and database state.

Configuration
-------------

To make effective use of the workers, you will need to configure an HTTP
reverse-proxy such as nginx or haproxy, which will direct incoming requests to
the correct worker, or to the main synapse instance. Note that this includes
requests made to the federation port. The caveats regarding running a
reverse-proxy on the federation port still apply (see
https://github.com/matrix-org/synapse/blob/master/README.rst#reverse-proxying-the-federation-port).

To enable workers, you need to add two replication listeners to the master
synapse, e.g.::

    listeners:
      # The TCP replication port
      - port: 9092
        bind_address: '127.0.0.1'
        type: replication
      # The HTTP replication port
      - port: 9093
        bind_address: '127.0.0.1'
        type: http
        resources:
          - names: [replication]

Under **no circumstances** should these replication API listeners be exposed to
the public internet; they currently implement no authentication whatsoever and
are unencrypted.

(Roughly, the TCP port is used for streaming data from the master to the
workers, and the HTTP port for the workers to send data to the main
synapse process.)

You then create a set of configs for the various worker processes. These
should be worker configuration files, and should be stored in a dedicated
subdirectory, to allow synctl to manipulate them. An additional configuration
for the master synapse process will need to be created because the process will
not be started automatically. That configuration should look like this::

    worker_app: synapse.app.homeserver
    daemonize: true

Each worker configuration file inherits the configuration of the main homeserver
configuration file. You can then override configuration specific to that worker,
e.g. the HTTP listener that it provides (if any); logging configuration; etc.
You should minimise the number of overrides though to maintain a usable config.

You must specify the type of worker application (``worker_app``). The currently
available worker applications are listed below. You must also specify the
replication endpoints that it's talking to on the main synapse process.
``worker_replication_host`` should specify the host of the main synapse,
``worker_replication_port`` should point to the TCP replication listener port and
``worker_replication_http_port`` should point to the HTTP replication port.

Currently, only the ``event_creator`` worker requires specifying
``worker_replication_http_port``.

For instance::

    worker_app: synapse.app.synchrotron

    # The replication listener on the synapse to talk to.
    worker_replication_host: 127.0.0.1
    worker_replication_port: 9092
    worker_replication_http_port: 9093

    worker_listeners:
     - type: http
       port: 8083
       resources:
         - names:
           - client

    worker_daemonize: True
    worker_pid_file: /home/matrix/synapse/synchrotron.pid
    worker_log_config: /home/matrix/synapse/config/synchrotron_log_config.yaml

...is a full configuration for a synchrotron worker instance, which will expose a
plain HTTP ``/sync`` endpoint on port 8083 separately from the ``/sync`` endpoint provided
by the main synapse.

Obviously you should configure your reverse-proxy to route the relevant
endpoints to the worker (``localhost:8083`` in the above example).

Finally, to actually run your worker-based synapse, you must pass synctl the
``-a`` command-line option to tell it to operate on all the worker
configurations found in the given directory, e.g.::

    synctl -a $CONFIG/workers start

Currently you should always restart all workers when restarting or upgrading
synapse, unless you explicitly know it's safe not to. For instance, restarting
synapse without restarting all the synchrotrons may result in broken typing
notifications.

To manipulate a specific worker, you pass the ``-w`` option to synctl::

    synctl -w $CONFIG/workers/synchrotron.yaml restart

Available worker applications
-----------------------------

``synapse.app.pusher``
~~~~~~~~~~~~~~~~~~~~~~

Handles sending push notifications to sygnal and email. Doesn't handle any
REST endpoints itself, but you should set ``start_pushers: False`` in the
shared configuration file to stop the main synapse sending these notifications.

Note this worker cannot be load-balanced: only one instance should be active.

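
As an illustration, a pusher worker configuration might look like the sketch
below. The file name, PID file and log config paths are assumptions modelled on
the synchrotron example in the Configuration section, and the replication port
assumes the TCP replication listener configured there::

    # $CONFIG/workers/pusher.yaml (hypothetical file name)
    worker_app: synapse.app.pusher

    # The TCP replication listener on the main synapse to talk to.
    worker_replication_host: 127.0.0.1
    worker_replication_port: 9092

    # Assumed paths; adjust to your own layout.
    worker_daemonize: True
    worker_pid_file: /home/matrix/synapse/pusher.pid
    worker_log_config: /home/matrix/synapse/config/pusher_log_config.yaml

This sketch has no ``worker_listeners`` section because the pusher serves no
REST endpoints. It would be paired with ``start_pushers: False`` in the shared
configuration file.
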
``synapse.app.synchrotron``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

The synchrotron handles ``sync`` requests from clients. In particular, it can
handle REST endpoints matching the following regular expressions::

    ^/_matrix/client/(v2_alpha|r0)/sync$
    ^/_matrix/client/(api/v1|v2_alpha|r0)/events$
    ^/_matrix/client/(api/v1|r0)/initialSync$
    ^/_matrix/client/(api/v1|r0)/rooms/[^/]+/initialSync$

The above endpoints should all be routed to the synchrotron worker by the
reverse-proxy configuration.

It is possible to run multiple instances of the synchrotron to scale
horizontally. In this case the reverse-proxy should be configured to
load-balance across the instances, though it will be more efficient if all
requests from a particular user are routed to a single instance. Extracting
a user ID from the access token is currently left as an exercise for the reader.

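
A second synchrotron instance could, for example, reuse the synchrotron
configuration from the Configuration section with its own port and PID file.
The port ``8084`` and the file names in this sketch are assumptions chosen for
illustration::

    # $CONFIG/workers/synchrotron2.yaml (hypothetical file name)
    worker_app: synapse.app.synchrotron

    worker_replication_host: 127.0.0.1
    worker_replication_port: 9092

    worker_listeners:
     - type: http
       port: 8084          # assumed; must differ from the first instance
       resources:
         - names:
           - client

    worker_daemonize: True
    worker_pid_file: /home/matrix/synapse/synchrotron2.pid
    worker_log_config: /home/matrix/synapse/config/synchrotron2_log_config.yaml

The reverse-proxy would then balance the synchrotron endpoints across ports
8083 and 8084.
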
``synapse.app.appservice``
~~~~~~~~~~~~~~~~~~~~~~~~~~

Handles sending output traffic to Application Services. Doesn't handle any
REST endpoints itself, but you should set ``notify_appservices: False`` in the
shared configuration file to stop the main synapse sending these notifications.

Note this worker cannot be load-balanced: only one instance should be active.

``synapse.app.federation_reader``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Handles a subset of federation endpoints. In particular, it can handle REST
endpoints matching the following regular expressions::

    ^/_matrix/federation/v1/event/
    ^/_matrix/federation/v1/state/
    ^/_matrix/federation/v1/state_ids/
    ^/_matrix/federation/v1/backfill/
    ^/_matrix/federation/v1/get_missing_events/
    ^/_matrix/federation/v1/publicRooms

The above endpoints should all be routed to the federation_reader worker by the
reverse-proxy configuration.

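
A federation_reader worker needs an HTTP listener for the reverse-proxy to
forward to. The sketch below assumes port ``8011`` and assumes that the listener
exposes the ``federation`` resource, by analogy with the ``client`` resource in
the synchrotron example; check both against your own deployment::

    # $CONFIG/workers/federation_reader.yaml (hypothetical file name)
    worker_app: synapse.app.federation_reader

    worker_replication_host: 127.0.0.1
    worker_replication_port: 9092

    worker_listeners:
     - type: http
       port: 8011          # assumed port
       resources:
         - names:
           - federation    # assumed resource name

    worker_daemonize: True
    worker_pid_file: /home/matrix/synapse/federation_reader.pid
    worker_log_config: /home/matrix/synapse/config/federation_reader_log_config.yaml
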
``synapse.app.federation_sender``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Handles sending federation traffic to other servers. Doesn't handle any
REST endpoints itself, but you should set ``send_federation: False`` in the
shared configuration file to stop the main synapse sending this traffic.

Note this worker cannot be load-balanced: only one instance should be active.

``synapse.app.media_repository``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Handles the media repository. It can handle all endpoints starting with::

    /_matrix/media/

You should also set ``enable_media_repo: False`` in the shared configuration
file to stop the main synapse running background jobs related to managing the
media repository.

Note this worker cannot be load-balanced: only one instance should be active.

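
As a rough sketch, a media_repository worker configuration could look like the
following. The port ``8085``, the file paths and the ``media`` resource name
are assumptions made for illustration::

    # $CONFIG/workers/media_repository.yaml (hypothetical file name)
    worker_app: synapse.app.media_repository

    worker_replication_host: 127.0.0.1
    worker_replication_port: 9092

    worker_listeners:
     - type: http
       port: 8085          # assumed port
       resources:
         - names:
           - media         # assumed resource name

    worker_daemonize: True
    worker_pid_file: /home/matrix/synapse/media_repository.pid
    worker_log_config: /home/matrix/synapse/config/media_repository_log_config.yaml

The reverse-proxy would route everything under ``/_matrix/media/`` to this
port, and the shared configuration file would carry
``enable_media_repo: False``.
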
``synapse.app.client_reader``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Handles client API endpoints. It can handle REST endpoints matching the
following regular expressions::

    ^/_matrix/client/(api/v1|r0|unstable)/publicRooms$

``synapse.app.user_dir``
~~~~~~~~~~~~~~~~~~~~~~~~

Handles searches in the user directory. It can handle REST endpoints matching
the following regular expressions::

    ^/_matrix/client/(api/v1|r0|unstable)/user_directory/search$

``synapse.app.frontend_proxy``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Proxies some frequently-requested client endpoints to add caching and remove
load from the main synapse. It can handle REST endpoints matching the following
regular expressions::

    ^/_matrix/client/(api/v1|r0|unstable)/keys/upload

It will proxy any requests it cannot handle to the main synapse instance. It
must therefore be configured with the location of the main instance, via
the ``worker_main_http_uri`` setting in the frontend_proxy worker configuration
file. For example::

    worker_main_http_uri: http://127.0.0.1:8008

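
Put together with the other worker settings, a complete frontend_proxy
configuration might resemble this sketch. The port ``8086`` and the file paths
are assumptions, and the ``client`` resource follows the synchrotron example in
the Configuration section::

    # $CONFIG/workers/frontend_proxy.yaml (hypothetical file name)
    worker_app: synapse.app.frontend_proxy

    worker_replication_host: 127.0.0.1
    worker_replication_port: 9092

    # Where to forward requests this worker cannot handle itself.
    worker_main_http_uri: http://127.0.0.1:8008

    worker_listeners:
     - type: http
       port: 8086          # assumed port
       resources:
         - names:
           - client

    worker_daemonize: True
    worker_pid_file: /home/matrix/synapse/frontend_proxy.pid
    worker_log_config: /home/matrix/synapse/config/frontend_proxy_log_config.yaml
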
``synapse.app.event_creator``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Handles non-state event creation. It can handle REST endpoints matching::

    ^/_matrix/client/(api/v1|r0|unstable)/rooms/.*/send

It will create events locally and then send them on to the main synapse
instance to be persisted and handled.

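
Because the event_creator sends events back to the main synapse, its
configuration needs ``worker_replication_http_port`` as well as the TCP
replication port. A minimal sketch, assuming the replication listeners on ports
9092 and 9093 from the Configuration section and an arbitrary worker port of
``8087``::

    # $CONFIG/workers/event_creator.yaml (hypothetical file name)
    worker_app: synapse.app.event_creator

    worker_replication_host: 127.0.0.1
    worker_replication_port: 9092        # TCP replication listener
    worker_replication_http_port: 9093   # HTTP replication listener

    worker_listeners:
     - type: http
       port: 8087          # assumed port
       resources:
         - names:
           - client

    worker_daemonize: True
    worker_pid_file: /home/matrix/synapse/event_creator.pid
    worker_log_config: /home/matrix/synapse/config/event_creator_log_config.yaml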