The main use case is to see how many requests are being made, and how
many are second/third/etc attempts. If there are large number of retries
then that likely indicates a delivery problem.
If the script fails (or is CTRL-C'ed) between porting some of the events table and copying of the sequences then the port script will immediately die if run again due to the postgres DB having inconsistencies between sequences and tables.
The fix is to move the porting of sequences to before porting the tables, so that there is never a period where the Postgres DB is inconsistent. To do that we need to change how we port the sequences so that it calculates the values from the SQLite DB rather than the Postgres DB.
Fixes#8619
This should hopefully speed up `get_auth_chain_difference` a bit in the case of repeated state res on the same rooms.
`get_auth_chain_difference` does a breadth first walk of the auth graphs by repeatedly looking up events' auth events. Different state resolutions on the same room will end up doing a lot of the same event to auth events lookups, so by caching them we should speed things up in cases of repeated state resolutions on the same room.
`adbapi.ConnectionPool` let's you turn on auto reconnect of DB connections. This is off by default.
As far as I can tell if its not enabled dead connections never get removed from the pool.
Maybe helps #8574
* Check support room has only two users
* Create 8728.bugfix
* Update synapse/server_notices/server_notices_manager.py
Co-authored-by: Erik Johnston <erik@matrix.org>
Co-authored-by: Erik Johnston <erik@matrix.org>
If SSO login is used (e.g. SAML) in a multi worker setup, it should be mentioned that currently all SAML logins must run on the same worker, see https://github.com/matrix-org/synapse/issues/7530
Also, if you are using different ports (for example 443 and 8448) in a reverse proxy for client and federation, the path `/_matrix/media` on the client and federation port must point to the listener of the `media_repository` worker, otherwise you'll get a 404 on the federation port for the path `/_matrix/media`, if a remote server is trying to get the media object on federation port, see https://github.com/matrix-org/synapse/issues/8695
This PR adds some documentation that:
* Describes who the audience for the `docs/`, `docs/dev/` and `docs/admin/` directories are, as well as Synapse's wiki page.
* Stresses that we'd like all documentation to be down in markdown.
I idly noticed that these lists were out of sync with each other, causing us to miss a table in a test case (`local_invites`). Let's consolidate this list instead to prevent this from happening in the future.
This PR fixes two things:
* Corrects the copy/paste error of telling the client their displayname is wrong when they are submitting an `avatar_url`.
* Returns a `M_INVALID_PARAM` instead of `M_UNKNOWN` for non-str type parameters.
Reported by @t3chguy.
* Tie together matches_user_in_member_list and get_users_in_room
* changelog
* Remove type to fix mypy
* Add `on_invalidate` to the function signature in the hopes that may make things work well
* Remove **kwargs
* Update 8676.bugfix
* Tie together matches_user_in_member_list and get_users_in_room
* changelog
* Remove type to fix mypy
* Add `on_invalidate` to the function signature in the hopes that may make things work well
* Remove **kwargs
* Update 8676.bugfix
We do it this way round so that only the "owner" can delete the access token (i.e. `/logout/all` by the "owner" also deletes that token, but `/logout/all` by the "target user" doesn't).
A future PR will add an API for creating such a token.
When the target user and authenticated entity are different the `Processed request` log line will be logged with a: `{@admin:server as @bob:server} ...`. I'm not convinced by that format (especially since it adds spaces in there, making it harder to use `cut -d ' '` to chop off the start of log lines). Suggestions welcome.
Cached functions accept an `on_invalidate` function, which we failed to add to the type signature. It's rarely used in the files that we have typed, which is why we haven't noticed it before.
otherwise non-state events get written as `<FrozenEvent ... state_key='None'>`
which is indistinguishable from state events with the actual state_key `None`.
This modifies the configuration of structured logging to be usable from
the standard Python logging configuration.
This also separates the formatting of logs from the transport allowing
JSON logs to files or standard logs to sockets.
Not being able to serialise `frozendicts` is fragile, and it's annoying to have
to think about which serialiser you want. There's no real downside to
supporting frozendicts, so let's just have one json encoder.
I was trying to make it so that we didn't have to start a background task when handling RDATA, but that is a bigger job (due to all the code in `generic_worker`). However I still think not pulling the event from the DB may help reduce some DB usage due to replication, even if most workers will simply go and pull that event from the DB later anyway.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
This allows trailing commas in multi-line arg lists.
Minor, but we might as well keep our formatting current with regard to
our minimum supported Python version.
Signed-off-by: Dan Callahan <danc@element.io>
The test runner isn't present in the `[all]` set of extras, so the
previous instructions did not work without also installing `[test]`.
Note that this does not include the `[lint]` extras, since those do not
install on all supported Python versions (specifically, isort 5.x
requires Python 3.6, while we still support 3.5). Instructions for that
are included in our pull request template, so we should be fine there.
I've also dropped the `--no-use-pep517` arg to `pip install` since it
seems to have been added to address a temporary regression in pip 19.1
which was fixed in pip 19.1.1 the following month.
Lastly, updated the example output of the test suite to set more
realistic expectations around run time.
Signed-off-by: Dan Callahan <danc@element.io>
This is a requirement for [knocking](https://github.com/matrix-org/synapse/pull/6739), and is abstracting some code that was originally used by the invite flow. I'm separating it out into this PR as it's a fairly contained change.
For a bit of context: when you invite a user to a room, you send them [stripped state events](https://matrix.org/docs/spec/server_server/unstable#put-matrix-federation-v2-invite-roomid-eventid) as part of `invite_room_state`. This is so that their client can display useful information such as the room name and avatar. The same requirement applies to knocking, as it would be nice for clients to be able to display a list of rooms you've knocked on - room name and avatar included.
The reason we're sending membership events down as well is in the case that you are invited to a room that does not have an avatar or name set. In that case, the client should use the displayname/avatar of the inviter. That information is located in the inviter's membership event.
This is optional as knocks don't really have any user in the room to link up to. When you knock on a room, your knock is sent by you and inserted into the room. It wouldn't *really* make sense to show the avatar of a random user - plus it'd be a data leak. So I've opted not to send membership events to the client here. The UX on the client for when you knock on a room without a name/avatar is a separate problem.
In essence this is just moving some inline code to a reusable store method.
it seems to be possible that only one of them ends up to be cached.
when this was the case, the missing one was not fetched via federation,
and clients then failed to validate cross-signed devices.
Signed-off-by: Jonas Jelten <jj@sft.lol>
Split admin API for reported events in detail und list view.
API was introduced with #8217 in synapse v.1.21.0.
It makes the list (`GET /_synapse/admin/v1/event_reports`) less complex and provides a better overview.
The details can be queried with: `GET /_synapse/admin/v1/event_reports/<report_id>`.
It is similar to room and users API.
It is a kind of regression in `GET /_synapse/admin/v1/event_reports`. `event_json` was removed. But the api was introduced one version before and it is an admin API (not under spec).
Signed-off-by: Dirk Klimpel dirk@klimpel.org
==============================
Bugfixes
--------
- Fix bugs where ephemeral events were not sent to appservices. Broke in v1.22.0rc1. ([\#8648](https://github.com/matrix-org/synapse/issues/8648), [\#8656](https://github.com/matrix-org/synapse/issues/8656))
- Fix `user_daily_visits` table to not have duplicate rows per user/device due to multiple user agents. Broke in v1.22.0rc1. ([\#8654](https://github.com/matrix-org/synapse/issues/8654))
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEBTGR3/RnAzBGUif3pULk7RsPrAkFAl+W6QIQHGVyaWtAbWF0
cml4Lm9yZwAKCRClQuTtGw+sCR/EB/9Afs/QiqD7uJ7XtAASwlOxcpSrvB6xSMzU
VIiYLrUSCvHe8SwcPG6nToy69hHmwLBPu9e8gvlQkOY4nJpW9VTGvhFcP5eCgeKd
vDp+elVD6BPg+6S3RaI/F5hyQ7gn9qZ/WexkVixDkVs5eAL3gNlZcjw9RwaT9ei9
fEOYlMi1CRSRzKVBPMEFCAvOpJJ8rA/iq4JK7+P0TplkhD+HydG0cYr6OoL7CmaD
L/au565Ynp6o0hSpNpTFqW6C5W5JRvz/TyG9/N95h9+WqubZzsPPDUzh98B3eBoX
43Wu9Qoc3mNamf0++HbHGUPI8dRV1gssxL01hs/iLUZz7CxXY+Ir
=Z3sQ
-----END PGP SIGNATURE-----
Merge tag 'v1.22.0rc2' into develop
Synapse 1.22.0rc2 (2020-10-26)
==============================
Bugfixes
--------
- Fix bugs where ephemeral events were not sent to appservices. Broke in v1.22.0rc1. ([\#8648](https://github.com/matrix-org/synapse/issues/8648), [\#8656](https://github.com/matrix-org/synapse/issues/8656))
- Fix `user_daily_visits` table to not have duplicate rows per user/device due to multiple user agents. Broke in v1.22.0rc1. ([\#8654](https://github.com/matrix-org/synapse/issues/8654))
* Fix user_daily_visits to not have duplicate rows for UA.
Fixes#8641.
* Newsfile
* Fix typo.
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
#8567 started a span for every background process. This is good as it means all Synapse code that gets run should be in a span (unless in the sentinel logging context), but it means we generate about 15x the number of spans as we did previously.
This PR attempts to reduce that number by a) not starting one for send commands to Redis, and b) deferring starting background processes until after we're sure they're necessary.
I don't really know how much this will help.
* Limit AS transactions to 100 events
* Update changelog.d/8606.feature
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
* Add tests
* Update synapse/appservice/scheduler.py
Co-authored-by: Andrew Morgan <1342360+anoadragon453@users.noreply.github.com>
I noticed in https://github.com/matrix-org/synapse/issues/8575 that the `end_error` variable in `synapse_port_db` is set to an `Exception`, even though later we expect it to be a `str`.
This PR simply casts an exception raised to a string. I'm doing this instead of having `end_error` be of type exception as we explicitly set `end_error` to a str here:
d25eb8f370/scripts/synapse_port_db (L542-L547)
This whole file could probably use some heavy refactoring, but until then at least this fix will prevent exception contents from being hidden from us and users.
* Add `DeferredCache.get_immediate` method
A bunch of things that are currently calling `DeferredCache.get` are only
really interested in the result if it's completed. We can optimise and simplify
this case.
* Remove unused 'default' parameter to DeferredCache.get()
* another get_immediate instance
* type annotations for LruCache
* changelog
* Apply suggestions from code review
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
* review comments
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
This implements a more standard API for instantiating a homeserver and
moves some of the dependency injection into the test suite.
More concretely this stops using `setattr` on all `kwargs` passed to `HomeServer`.
This PR makes several changes to the `./scripts-dev/lint.sh` script, which lints the codebase with a number of tools:
* Adds usage information, with `-h` flag to show it. Otherwise it will show when providing an unknown flag.
* Adds option `-d` which will check both staged and unstaged files that have changed since the last commit and add them to the list of files to lint.
- Note that only files without an extension, or with a `.py` extension will be allowed. This prevents editing bash scripts causing the linters to break on non-python files.
* Improves the print-out of which files/directories are being linted.
We asserted that the IDs returned by postgres sequence was greater than
any we had seen, however this is technically racey as we may update the
current positions out of order.
We now assert that the sequences are correct on startup, so the
assertion is no longer really required, so we remove them.
Autocommit means that we don't wrap the functions in transactions, and instead get executed directly. Introduced in #8456. This will help:
1. reduce the number of `could not serialize access due to concurrent delete` errors that we see (though there are a few functions that often cause serialization errors that we don't fix here);
2. improve the DB performance, as it no longer needs to deal with the overhead of `REPEATABLE READ` isolation levels; and
3. improve wall clock speed of these functions, as we no longer need to send `BEGIN` and `COMMIT` to the DB.
Some notes about the differences between autocommit mode and our default `REPEATABLE READ` transactions:
1. Currently `autocommit` only applies when using PostgreSQL, and is ignored when using SQLite (due to silliness with [Twisted DB classes](https://twistedmatrix.com/trac/ticket/9998)).
2. Autocommit functions may get retried on error, which means they can get applied *twice* (or more) to the DB (since they are not in a transaction the previous call would not get rolled back). This means that the functions need to be idempotent (or otherwise not care about being called multiple times). Read queries, simple deletes, and updates/upserts that replace rows (rather than generating new values from existing rows) are all idempotent.
3. Autocommit functions no longer get executed in [`REPEATABLE READ`](https://www.postgresql.org/docs/current/transaction-iso.html) isolation level, and so data can change queries, which is fine for single statement queries.
We asserted that the IDs returned by postgres sequence was greater than
any we had seen, however this is technically racey as we may update the
current positions out of order.
We now assert that the sequences are correct on startup, so the
assertion is no longer really required, so we remove them.
* Fix outbound federaion with multiple event persisters.
We incorrectly notified federation senders that the minimum persisted
stream position had advanced when we got an `RDATA` from an event
persister.
Notifying of federation senders already correctly happens in the
notifier, so we just delete the offending line.
* Change some interfaces to use RoomStreamToken.
By enforcing use of `RoomStreamTokens` we make it less likely that
people pass in random ints that they got from somewhere random.
Currently background proccesses stream the events stream use the "minimum persisted position" (i.e. `get_current_token()`) rather than the vector clock style tokens. This is broadly fine as it doesn't matter if the background processes lag a small amount. However, in extreme cases (i.e. SyTests) where we only write to one event persister the background processes will never make progress.
This PR changes it so that the `MultiWriterIDGenerator` keeps the current position of a given instance as up to date as possible (i.e using the latest token it sees if its not in the process of persisting anything), and then periodically announces that over replication. This then allows the "minimum persisted position" to advance, albeit with a small lag.
This could, very occasionally, cause:
```
tests.test_visibility.FilterEventsForServerTestCase.test_large_room
===============================================================================
[ERROR]
Traceback (most recent call last):
File "/src/tests/rest/media/v1/test_media_storage.py", line 86, in test_ensure_media_is_in_local_cache
self.wait_on_thread(x)
File "/src/tests/unittest.py", line 296, in wait_on_thread
self.reactor.advance(0.01)
File "/src/.tox/py35/lib/python3.5/site-packages/twisted/internet/task.py", line 826, in advance
self._sortCalls()
File "/src/.tox/py35/lib/python3.5/site-packages/twisted/internet/task.py", line 787, in _sortCalls
self.calls.sort(key=lambda a: a.getTime())
builtins.ValueError: list modified during sort
tests.rest.media.v1.test_media_storage.MediaStorageTests.test_ensure_media_is_in_local_cache
```
This PR allows Synapse modules making use of the `ModuleApi` to create and send non-membership events into a room. This can useful to have modules send messages, or change power levels in a room etc. Note that they must send event through a user that's already in the room.
The non-membership event limitation is currently arbitrary, as it's another chunk of work and not necessary at the moment.