synapse

mirror of https://github.com/element-hq/synapse.git synced 2024-11-25 02:55:46 +03:00

Author	SHA1	Message	Date
Eric Eastwood	a5e16a4ab5	Sliding Sync: Reset `forgotten` status when membership changes (like rejoining a room) (#17835 ) Reset `sliding_sync_membership_snapshots` -> `forgotten` status when membership changes (like rejoining a room). Fix https://github.com/element-hq/synapse/issues/17781 ### What was the problem before? Previously, if someone used `/forget` on one of their rooms, it would update `sliding_sync_membership_snapshots` as expected but when someone rejoined the room (or had any membership change), the upsert didn't overwrite and reset the `forgotten` status so it remained `forgotten` and invisible down the Sliding Sync endpoint.	2024-10-22 11:06:46 +01:00
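The upsert behaviour described in this commit can be illustrated with a minimal, self-contained sketch. It uses an in-memory SQLite database and a simplified, hypothetical `membership_snapshots` table rather than Synapse's real `sliding_sync_membership_snapshots` schema. The helper name `upsert_membership` is also an assumption. The point is only that the `DO UPDATE SET` clause has to list `forgotten` explicitly, otherwise the old value survives a rejoin.

```python
import sqlite3


def upsert_membership(conn: sqlite3.Connection, room_id: str, user_id: str, membership: str) -> None:
    """Insert or update a membership row, always resetting `forgotten` to 0.

    Hypothetical helper: the table and column set are simplified for illustration.
    """
    conn.execute(
        """
        INSERT INTO membership_snapshots (room_id, user_id, membership, forgotten)
        VALUES (?, ?, ?, 0)
        ON CONFLICT (room_id, user_id) DO UPDATE SET
            membership = EXCLUDED.membership,
            -- Without this line the previous `forgotten` value would be kept.
            forgotten = EXCLUDED.forgotten
        """,
        (room_id, user_id, membership),
    )


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE membership_snapshots (
        room_id TEXT NOT NULL,
        user_id TEXT NOT NULL,
        membership TEXT NOT NULL,
        forgotten INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (room_id, user_id)
    )
    """
)

# The user leaves and then forgets the room.
upsert_membership(conn, "!room:example.org", "@alice:example.org", "leave")
conn.execute(
    "UPDATE membership_snapshots SET forgotten = 1 WHERE room_id = ? AND user_id = ?",
    ("!room:example.org", "@alice:example.org"),
)

# The user rejoins: the upsert resets `forgotten`.
upsert_membership(conn, "!room:example.org", "@alice:example.org", "join")
row = conn.execute(
    "SELECT membership, forgotten FROM membership_snapshots WHERE room_id = ? AND user_id = ?",
    ("!room:example.org", "@alice:example.org"),
).fetchone()
```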
Eric Eastwood	adda2a4613	Sliding Sync: Slight optimization when fetching state for the room (`get_events_as_list(...)`) (#17718 ) Spawning from @kegsay [pointing out](https://matrix.to/#/!cnVVNLKqgUzNTOFQkz:matrix.org/$ExOO7J8uPUQSyH-9Uxc_QCa8jlXX9uK4VRtkSC0EI3o?via=element.io&via=matrix.org&via=jki.re) that the Sliding Sync endpoint doesn't handle a large room with a lot of state well on initial sync (requesting all state via `required_state: [ ["",""] ]`) (it just takes forever). After investigating further, the slow part is just `get_events_as_list(...)` fetching all of the current state ID's out for the room (which can be 100k+ events for rooms with a lot of membership). This is just a slow thing in Synapse in general and the same thing happens in Sync v2 or the `/state` endpoint. --- The only idea I had to improve things was to use `batch_iter` to only try fetching a fixed amount at a time instead of working with large maps, lists, and sets. This doesn't seem to have much effect though. There is already a `batch_iter(event_ids, 200)` in `_fetch_event_rows(...)` for when we actually have to touch the database and that's inside a queue to deduplicate work. I did notice one slight optimization to use `get_events_as_list(...)` directly instead of `get_events(...)`. `get_events(...)` just turns the result from `get_events_as_list(...)` into a dict and since we're just iterating over the events, we don't need the dict/map.	2024-10-14 13:47:35 +01:00
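For readers unfamiliar with the batching idea mentioned in this commit, here is a minimal standalone sketch of a `batch_iter`-style helper. It is written from scratch for illustration and is not copied from Synapse, so the exact signature and return type of the real helper may differ.

```python
from itertools import islice
from typing import Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")


def batch_iter(iterable: Iterable[T], size: int) -> Iterator[Tuple[T, ...]]:
    """Yield tuples of at most `size` items until the input is exhausted."""
    it = iter(iterable)
    # iter(callable, sentinel) keeps calling until an empty tuple is returned.
    return iter(lambda: tuple(islice(it, size)), ())


# Hypothetical event IDs, processed two at a time instead of all at once.
event_ids = [f"$event{i}" for i in range(5)]
batches = list(batch_iter(event_ids, 2))
```

Iterating over a plain list of events, as the commit does with `get_events_as_list(...)`, avoids building an intermediate dict when the caller never looks events up by ID.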
Eric Eastwood	c2e5e9e67c	Sliding Sync: Avoid fetching left rooms and add back `newly_left` rooms (#17725 ) Performance optimization: We can avoid fetching rooms that the user has left themselves (which could be a significant amount), then only add back rooms that the user has `newly_left` (left in the token range of an incremental sync). In most cases, it's a lot faster to fetch fewer rooms than to fetch them all and throw most of them away. Since the user only leaves a room (or is state reset out) once in a blue moon, we can avoid a lot of work. Based on @erikjohnston's branch, erikj/ss_perf --------- Co-authored-by: Erik Johnston <erik@matrix.org>	2024-09-19 10:07:18 -05:00
Eric Eastwood	16af80b8fb	Sliding Sync: Use Sliding Sync tables for sorting (#17693 ) Use Sliding Sync tables for sorting (`bulk_get_last_event_pos_in_room_before_stream_ordering(...)` -> `_bulk_get_max_event_pos(...)`)	2024-09-11 12:16:24 -05:00
Erik Johnston	596b96411b	Sliding sync: various fixups to the background update (#17652 )	2024-09-11 15:38:46 +01:00
Erik Johnston	588e5b521d	Sliding Sync: Retrieve fewer events from DB in sync (#17688 ) When using a timeline limit of 1 we end up fetching 2 events from the DB purely to tell if the response was "limited" or not. Let's not do that.	2024-09-10 09:52:42 +01:00
Eric Eastwood	e1ed959a68	Sliding Sync: Get `bump_stamp` from new sliding sync tables because it's faster (#17658 ) Get `bump_stamp` from [new sliding sync tables](https://github.com/element-hq/synapse/pull/17512) which should be faster (performance) than flipping through the latest events in the room.	2024-09-09 16:41:25 +01:00
Eric Eastwood	26f81fb5be	Sliding Sync: Fix outlier re-persisting causing problems with sliding sync tables (#17635 ) Fix outlier re-persisting causing problems with sliding sync tables Follow-up to https://github.com/element-hq/synapse/pull/17512 When running on `matrix.org`, we discovered that a remote invite is first persisted as an `outlier` and then re-persisted again where it is de-outliered. The first time, the `outlier` is persisted with one `stream_ordering` but when persisted again and de-outliered, it is assigned a different `stream_ordering` that won't end up being used. Since we call `_calculate_sliding_sync_table_changes()` before `_update_outliers_txn()` which fixes this discrepancy (always use the `stream_ordering` from the first time it was persisted), we're working with an unreliable `stream_ordering` value that will possibly be unused and not make it into the `events` table.	2024-08-30 08:53:57 +01:00
Erik Johnston	bb80894391	Fix background update for sliding sync (#17631 ) This reverts commit `ab414f2ab8`. Introduced in https://github.com/element-hq/synapse/pull/17599	2024-08-29 16:58:53 +01:00
Eric Eastwood	1a6b718f8c	Sliding Sync: Pre-populate room data for quick filtering/sorting (#17512 ) Pre-populate room data for quick filtering/sorting in the Sliding Sync API Spawning from https://github.com/element-hq/synapse/pull/17450#discussion_r1697335578 This PR is acting as the Synapse version `N+1` step in the gradual migration being tracked by https://github.com/element-hq/synapse/issues/17623 Adding two new database tables: - `sliding_sync_joined_rooms`: A table for storing room meta data that the local server is still participating in. The info here can be shared across all `Membership.JOIN`. Keyed on `(room_id)` and updated when the relevant room current state changes or a new event is sent in the room. - `sliding_sync_membership_snapshots`: A table for storing a snapshot of room meta data at the time of the local user's membership. Keyed on `(room_id, user_id)` and only updated when a user's membership in a room changes. Also adds background updates to populate these tables with all of the existing data. We want to have the guarantee that if a row exists in the sliding sync tables, we are able to rely on it (accurate data). And if a row doesn't exist, we use a fallback to get the same info until the background updates fill in the rows or a new event comes in triggering it to be fully inserted. This means we need a couple extra things in place until we bump `SCHEMA_COMPAT_VERSION` and run the foreground update in the `N+2` part of the gradual migration. For context on why we can't rely on the tables without these things see [1]. 1. On start-up, block until we clear out any rows for the rooms that have had events since the max-`stream_ordering` of the `sliding_sync_joined_rooms` table (compare to max-`stream_ordering` of the `events` table). For `sliding_sync_membership_snapshots`, we can compare to the max-`stream_ordering` of `local_current_membership` - This accounts for when someone downgrades their Synapse version and then upgrades it again. 
This will ensure that we don't have any stale/out-of-date data in the `sliding_sync_joined_rooms`/`sliding_sync_membership_snapshots` tables since any new events sent in rooms would have also needed to be written to the sliding sync tables. For example a new event needs to bump `event_stream_ordering` in `sliding_sync_joined_rooms` table or some state in the room changing (like the room name). Or another example of someone's membership changing in a room affecting `sliding_sync_membership_snapshots`. 1. Add another background update that will catch-up with any rows that were just deleted from the sliding sync tables (based on the activity in the `events`/`local_current_membership`). The rooms that need recalculating are added to the `sliding_sync_joined_rooms_to_recalculate` table. 1. Making sure rows are fully inserted. Instead of partially inserting, we need to check if the row already exists and fully insert all data if not. All of this extra functionality can be removed once the `SCHEMA_COMPAT_VERSION` is bumped with support for the new sliding sync tables so people can no longer downgrade (the `N+2` part of the gradual migration). <details> <summary><sup>[1]</sup></summary> For `sliding_sync_joined_rooms`, since we partially insert rows as state comes in, we can't rely on the existence of the row for a given `room_id`. We can't even rely on looking at whether the background update has finished. There could still be partial rows from when someone reverted their Synapse version after the background update finished, had some state changes (or new rooms), then upgraded again and more state changes happen leaving a partial row. For `sliding_sync_membership_snapshots`, we insert items as a whole except for the `forgotten` column ~~so we can rely on rows existing and just need to always use a fallback for the `forgotten` data. 
We can't use the `forgotten` column in the table for the same reasons above about `sliding_sync_joined_rooms`.~~ We could have an out-of-date membership from when someone reverted their Synapse version. (same problems as outlined for `sliding_sync_joined_rooms` above) Discussed in an [internal meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7) </details> ### TODO - [x] Update `stream_ordering`/`bump_stamp` - [x] Handle remote invites - [x] Handle state resets - [x] Consider adding `sender` so we can filter `LEAVE` memberships and distinguish from kicks. - [x] We should add it to be able to tell leaves from kicks - [x] Consider adding `tombstone` state to help address https://github.com/element-hq/synapse/issues/17540 - [x] We should add it `tombstone_successor_room_id` - [x] Consider adding `forgotten` status to avoid extra lookup/table-join on `room_memberships` - [x] We should add it - [x] Background update to fill in values for all joined rooms and non-join membership - [x] Clean-up tables when room is deleted - [ ] Make sure tables are useful to our use case - First explored in https://github.com/element-hq/synapse/compare/erikj/ss_use_new_tables - Also explored in `76b5a576eb` - [x] Plan for how can we use this with a fallback - See plan discussed above in main area of the issue description - Discussed in an [internal meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.dz5x6ef4mxz7) - [x] Plan for how we can rely on this new table without a fallback - Synapse version `N+1`: (this PR) Bump `SCHEMA_VERSION` to `87`. Add new tables and background update to backfill all rows. Since this is a new table, we don't have to add any `NOT VALID` constraints and validate them when the background update completes. Read from new tables with a fallback in cases where the rows aren't filled in yet. 
- Synapse version `N+2`: Bump `SCHEMA_VERSION` to `88` and bump `SCHEMA_COMPAT_VERSION` to `87` because we don't want people to downgrade and miss writes while they are on an older version. Add a foreground update to finish off the backfill so we can read from new tables without the fallback. Application code can now rely on the new tables being populated. - Discussed in an [internal meeting](https://docs.google.com/document/d/1MnuvPkaCkT_wviSQZ6YKBjiWciCBFMd-7hxyCO-OCbQ/edit#bookmark=id.hh7shg4cxdhj) ### Dev notes ``` SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase SYNAPSE_POSTGRES=1 SYNAPSE_POSTGRES_USER=postgres SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.storage.test_events.SlidingSyncPrePopulatedTablesTestCase ``` ``` SYNAPSE_TEST_LOG_LEVEL=INFO poetry run trial tests.handlers.test_sliding_sync.FilterRoomsTestCase ``` Reference: - [Development docs on background updates and worked examples of gradual migrations ](`1dfa59b238/docs/development/database_schema.md (background-updates)`) - A real example of a gradual migration: https://github.com/matrix-org/synapse/pull/15649#discussion_r1213779514 - Adding `rooms.creator` field that needed a background update to backfill data, https://github.com/matrix-org/synapse/pull/10697 - Adding `rooms.room_version` that needed a background update to backfill data, https://github.com/matrix-org/synapse/pull/6729 - Adding `room_stats_state.room_type` that needed a background update to backfill data, https://github.com/matrix-org/synapse/pull/13031 - Tables from MSC2716: `insertion_events`, `insertion_event_edges`, `insertion_event_extremities`, `batch_events` - `current_state_events` updated in `synapse/storage/databases/main/events.py` --- ``` persist_event (adds to queue) _persist_event_batch _persist_events_and_state_updates (assigns `stream_ordering` to events) _persist_events_txn _store_event_txn _update_metadata_tables_txn _store_room_members_txn 
_update_current_state_txn ``` --- > Concatenated Indexes [...] (also known as multi-column, composite or combined index) > > [...] key consists of multiple columns. > > We can take advantage of the fact that the first index column is always usable for searching > > -- https://use-the-index-luke.com/sql/where-clause/the-equals-operator/concatenated-keys --- Dealing with `portdb` (`synapse/_scripts/synapse_port_db.py`), https://github.com/element-hq/synapse/pull/17512#discussion_r1725998219 --- <details> <summary>SQL queries:</summary> Both of these are equivalent and work in SQLite and Postgres Options 1: ```sql WITH data_table (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) AS ( VALUES ( ?, ?, ?, (SELECT membership FROM room_memberships WHERE event_id = ?), (SELECT stream_ordering FROM events WHERE event_id = ?), {", ".join("?" for _ in insert_values)} ) ) INSERT INTO sliding_sync_non_join_memberships (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) SELECT * FROM data_table WHERE membership != ? ON CONFLICT (room_id, user_id) DO UPDATE SET membership_event_id = EXCLUDED.membership_event_id, membership = EXCLUDED.membership, event_stream_ordering = EXCLUDED.event_stream_ordering, {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)} ``` Option 2: ```sql INSERT INTO sliding_sync_non_join_memberships (room_id, user_id, membership_event_id, membership, event_stream_ordering, {", ".join(insert_keys)}) SELECT column1 as room_id, column2 as user_id, column3 as membership_event_id, column4 as membership, column5 as event_stream_ordering, {", ".join("column" + str(i) for i in range(6, 6 + len(insert_keys)))} FROM ( VALUES ( ?, ?, ?, (SELECT membership FROM room_memberships WHERE event_id = ?), (SELECT stream_ordering FROM events WHERE event_id = ?), {", ".join("?" for _ in insert_values)} ) ) as v WHERE membership != ? 
ON CONFLICT (room_id, user_id) DO UPDATE SET membership_event_id = EXCLUDED.membership_event_id, membership = EXCLUDED.membership, event_stream_ordering = EXCLUDED.event_stream_ordering, {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)} ``` If we don't need the `membership` condition, we could use: ```sql INSERT INTO sliding_sync_non_join_memberships (room_id, membership_event_id, user_id, membership, event_stream_ordering, {", ".join(insert_keys)}) VALUES ( ?, ?, ?, (SELECT membership FROM room_memberships WHERE event_id = ?), (SELECT stream_ordering FROM events WHERE event_id = ?), {", ".join("?" for _ in insert_values)} ) ON CONFLICT (room_id, user_id) DO UPDATE SET membership_event_id = EXCLUDED.membership_event_id, membership = EXCLUDED.membership, event_stream_ordering = EXCLUDED.event_stream_ordering, {", ".join(f"{key} = EXCLUDED.{key}" for key in insert_keys)} ``` </details> --------- Co-authored-by: Erik Johnston <erik@matrix.org>	2024-08-29 16:09:51 +01:00
Eric Eastwood	11db575218	Sliding Sync: Use `stream_ordering` based timeline pagination for incremental sync (#17510 ) Use `stream_ordering` based `timeline` pagination 
for incremental `/sync` in Sliding Sync. Previously, we were always using a `topological_ordering` but we should only be using that for historical scenarios (initial `/sync`, newly joined, or haven't sent the room down the connection before). This is slightly different than what the [spec suggests](https://spec.matrix.org/v1.10/client-server-api/#syncing) > Events are ordered in this API according to the arrival time of the event on the homeserver. This can conflict with other APIs which order events based on their partial ordering in the event graph. This can result in duplicate events being received (once per distinct API called). Clients SHOULD de-duplicate events based on the event ID when this happens. But we've had a [discussion below in this PR](https://github.com/element-hq/synapse/pull/17510#discussion_r1699105569) and this matches what Sync v2 already does and seems like it makes sense. Created a spec issue https://github.com/matrix-org/matrix-spec/issues/1917 to clarify this. Related issues: - https://github.com/matrix-org/matrix-spec/issues/1917 - https://github.com/matrix-org/matrix-spec/issues/852 - https://github.com/matrix-org/matrix-spec-proposals/pull/4033	2024-08-07 11:27:50 -05:00
Eric Eastwood	3fee32ed6b	Order `heroes` by `stream_ordering` (as spec'ed) (#17435 ) The spec specifically mentions `stream_ordering` but that's a Synapse specific 
concept. In any case, the essence of the spec is basically the first 5 members of the room which `stream_ordering` accomplishes. Split off from https://github.com/element-hq/synapse/pull/17419#discussion_r1671342794 ## Spec compliance > This should be the first 5 members of the room, ordered by stream ordering, which are joined or invited. The list must never include the client’s own user ID. When no joined or invited members are available, this should consist of the banned and left users. > > -- https://spec.matrix.org/v1.10/client-server-api/#_matrixclientv3sync_roomsummary Related to https://github.com/matrix-org/matrix-spec/issues/1334	2024-07-17 13:10:15 -05:00
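The spec rule quoted in this commit can be sketched as a small pure function. This is an illustrative reading of the quoted text, not Synapse's implementation: the function name `pick_heroes`, the tuple layout, and the sample user IDs are all assumptions.

```python
from typing import List, Sequence, Tuple

# Each entry is (stream_ordering, user_id, membership); this layout is hypothetical.
Membership = Tuple[int, str, str]


def pick_heroes(memberships: Sequence[Membership], own_user_id: str, limit: int = 5) -> List[str]:
    """Return up to `limit` heroes, ordered by stream ordering.

    Joined or invited members are preferred; if there are none, fall back to
    banned and left users. The requesting user is never included.
    """
    # Tuples sort by their first element, i.e. by stream_ordering.
    others = sorted(m for m in memberships if m[1] != own_user_id)
    active = [user for _, user, state in others if state in ("join", "invite")]
    if active:
        return active[:limit]
    return [user for _, user, state in others if state in ("ban", "leave")][:limit]


heroes = pick_heroes(
    [(5, "@me:hs", "join"), (3, "@bob:hs", "join"), (1, "@carol:hs", "invite"), (2, "@dave:hs", "leave")],
    own_user_id="@me:hs",
)
fallback_heroes = pick_heroes(
    [(2, "@dave:hs", "leave"), (1, "@erin:hs", "ban"), (4, "@me:hs", "join")],
    own_user_id="@me:hs",
)
```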
Eric Eastwood	3fef535ff2	Add `rooms.bump_stamp` to Sliding Sync `/sync` for easier client-side sorting (#17395 ) `bump_stamp` corresponds to the `stream_ordering` of the latest `DEFAULT_BUMP_EVENT_TYPES` in the room. This helps clients sort more readily without them needing to pull in a bunch of the timeline to determine the last activity. `bump_event_types` is a thing because for example, we don't want display name changes to mark the room as unread and bump it to the top. For encrypted rooms, we just have to consider any activity as a bump because we can't see the content and the client has to figure it out for themselves. Outside of Synapse, `bump_stamp` is just a free-form counter so other implementations could use `received_ts` or `origin_server_ts` (see the [Security considerations section in MSC3575 about the potential pitfalls of using `origin_server_ts`](https://github.com/matrix-org/matrix-spec-proposals/blob/kegan/sync-v3/proposals/3575-sync.md#security-considerations)). It doesn't have any guarantee about always going up. In the Synapse case, it could go down if an event was redacted/removed (or purged in cases of retention policies). In the future, we could add `bump_event_types` as [MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575) mentions if people need to customize the event types. --- In the Sliding Sync proxy, a similar [`timestamp` field was added](https://github.com/matrix-org/sliding-sync/pull/247) for the same purpose but its name does not make obvious what it pertains to or what it's for. The `timestamp` field was also added to Ruma in https://github.com/ruma/ruma/pull/1622	2024-07-08 13:17:08 -05:00
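As a rough illustration of the `bump_stamp` idea described in this commit, the sketch below picks the highest `stream_ordering` among events whose type counts as a "bump". The set of event types here is an assumed, illustrative subset and should not be read as the actual contents of `DEFAULT_BUMP_EVENT_TYPES`. The function name and the `(stream_ordering, event_type)` tuple layout are hypothetical as well.

```python
from typing import FrozenSet, Iterable, Optional, Tuple

# Illustrative subset only (assumption), not Synapse's real default list.
EXAMPLE_BUMP_EVENT_TYPES: FrozenSet[str] = frozenset({"m.room.message", "m.room.encrypted"})


def compute_bump_stamp(
    events: Iterable[Tuple[int, str]],
    bump_event_types: FrozenSet[str] = EXAMPLE_BUMP_EVENT_TYPES,
) -> Optional[int]:
    """Return the stream_ordering of the latest bump-worthy event, or None."""
    stamps = [stream_ordering for stream_ordering, event_type in events if event_type in bump_event_types]
    return max(stamps) if stamps else None


bump_stamp = compute_bump_stamp(
    [(1, "m.room.create"), (2, "m.room.message"), (3, "m.room.member")]
)
no_bump = compute_bump_stamp([(1, "m.room.member")])
```

In this example the membership event at position 3 does not move the room, so a client sorting by `bump_stamp` would place it according to the message at position 2.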
Eric Eastwood	fa91655805	Return some room data in Sliding Sync `/sync` (#17320 ) - Timeline events - Stripped `invite_state` Based on 
[MSC3575](https://github.com/matrix-org/matrix-spec-proposals/pull/3575): Sliding Sync	2024-07-02 11:07:05 -05:00
Erik Johnston	554a92601a	Reintroduce "Reduce device lists replication traffic."" (#17361 ) Reintroduces https://github.com/element-hq/synapse/pull/17333 Turns out the reason for the revert was down to two master instances running	2024-06-25 10:34:34 +01:00
Erik Johnston	a98cb87bee	Revert "Reduce device lists replication traffic." (#17360 ) Reverts element-hq/synapse#17333 It looks like master was still sending out replication RDATA with the old format... somehow	2024-06-25 09:57:34 +01:00
Erik Johnston	930a64b6c1	Reintroduce #17291 . (#17338 ) This is #17291 (which got reverted), with some added fixups, and a change so that tests actually pick up the error. The problem was that we were not calculating any new chain IDs due to a missing `not` in a condition.	2024-06-24 14:40:28 +00:00
Erik Johnston	cf711ac03c	Reduce device lists replication traffic. (#17333 ) Reduce the replication traffic of device lists, by not sending every destination that needs to be sent the device list update over replication. Instead we send a "hosts to send to have been calculated" notification over replication, and then federation senders read the destinations from the DB. For non federation senders this should heavily reduce the impact of a user in many large rooms changing a device.	2024-06-24 14:15:13 +01:00
Erik Johnston	4243c1f074	Revert "Handle large chain calc better (#17291 )" (#17334 ) This reverts commit `bdf82efea5` (#17291) This seems to have stopped persisting auth chains for new events, and so is causing state res to fall back to the slow methods	2024-06-19 17:39:33 +01:00
Erik Johnston	bdf82efea5	Handle large chain calc better (#17291 ) We calculate the auth chain links outside of the main persist event transaction to ensure that we do not block other event sending during the calculation.	2024-06-19 10:33:53 +01:00
Eric Eastwood	e5b8a3e37f	Add `stream_ordering` sort to Sliding Sync `/sync` (#17293 ) Sort is no longer configurable and we always sort rooms by the `stream_ordering` of the last event in the room or the point where the user can see up to in cases of leave/ban/invite/knock.	2024-06-17 11:27:14 -05:00
Quentin Gliech	e88332b5f4	Merge branch 'release-v1.109' into develop	2024-06-17 15:51:16 +02:00
Quentin Gliech	f983a77ab0	Set our own stream position from the current sequence value on startup (#17309 )	2024-06-17 11:50:00 +00:00
Erik Johnston	a3cb244755	Automatically apply SQL for inconsistent sequence (#17305 ) Rather than forcing the server operator to apply the SQL manually. This should be safe, as there should be only one writer for these sequences.	2024-06-14 16:40:29 +01:00
Eric Eastwood	8c58eb7f17	Add `event.internal_metadata.instance_name` (#17300 ) Add `event.internal_metadata.instance_name` (the worker instance that persisted the event) to go alongside the existing `event.internal_metadata.stream_ordering`. `instance_name` is useful to properly compare and query for events with a token since you need to compare both the `stream_ordering` and `instance_name` against the vector clock/`instance_map` in the `RoomStreamToken`. This is pre-requisite work and may be used in https://github.com/element-hq/synapse/pull/17293 Adding `event.internal_metadata.instance_name` was first mentioned in the initial Sliding Sync PR while pairing with @erikjohnston, see `09609cb0db (diff-5cd773fb307aa754bd3948871ba118b1ef0303f4d72d42a2d21e38242bf4e096R405-R410)`	2024-06-13 11:32:50 -05:00
Eric Eastwood	ebdce69f6a	Fix `get_last_event_in_room_before_stream_ordering(...)` finding the wrong last event (#17295 ) PR where this was introduced: https://github.com/matrix-org/synapse/pull/14817 ### What does this affect? `get_last_event_in_room_before_stream_ordering(...)` is used in Sync v2 in a lot of different state calculations. `get_last_event_in_room_before_stream_ordering(...)` is also used in `/rooms/{roomId}/members`	2024-06-13 11:00:52 -05:00
Erik Johnston	aabf577166	Handle hyphens in user dir search porperly (#17254 ) c.f. #16675	2024-06-05 10:40:34 +01:00
Erik Johnston	d16910ca02	Replaces all usages of `StreamIdGenerator` with `MultiWriterIdGenerator` (#17229 ) Replaces all usages of `StreamIdGenerator` with `MultiWriterIdGenerator`, which is safer.	2024-05-30 11:07:32 +00:00
Erik Johnston	466f344547	Move towards using `MultiWriterIdGenerator` everywhere (#17226 ) There is a problem with `StreamIdGenerator` where it can go backwards over restarts when a stream ID is requested but then not inserted into the DB. This is problematic if we want to land #17215, and is generally a potential cause for all sorts of nastiness. Instead of trying to fix `StreamIdGenerator`, we may as well move to `MultiWriterIdGenerator` that does not suffer from this problem (the latest positions are stored in `stream_positions` table). This involves adding SQLite support to the class. This only changes id generators that were already using `MultiWriterIdGenerator` under postgres, a separate PR will move the rest of the uses of `StreamIdGenerator` over.	2024-05-29 12:19:10 +00:00
Shay	37558d5e4c	Add support for MSC3823 - Account Suspension (#17051 )	2024-05-01 17:45:17 +01:00
Melvyn Laïly	59710437e4	Return the search terms as search highlights for SQLite instead of nothing (#17000 ) Fixes https://github.com/element-hq/synapse/issues/16999 and https://github.com/element-hq/element-android/pull/8729 by returning the search terms as search highlights.	2024-04-26 09:43:52 +01:00
Erik Johnston	55b0aa847a	Fix GHSA-3h7q-rfh9-xm4v Weakness in auth chain indexing allows DoS from remote room members through disk fill and high CPU usage. A remote Matrix user with malicious intent, sharing a room with Synapse instances before 1.104.1, can dispatch specially crafted events to exploit a weakness in how the auth chain cover index is calculated. This can induce high CPU consumption and accumulate excessive data in the database of such instances, resulting in a denial of service. Servers in private federations, or those that do not federate, are not affected.	2024-04-23 15:25:49 +01:00
dependabot[bot]	1e68b56a62	Bump black from 23.10.1 to 24.2.0 (#16936 )	2024-03-13 16:46:44 +00:00
Erik Johnston	23740eaa3d	Correctly mention previous copyright (#16820 ) During the migration the automated script to update the copyright headers accidentally got rid of some of the existing copyright lines. Reinstate them.	2024-01-23 11:26:48 +00:00
Erik Johnston	5d3850b038	Port `EventInternalMetadata` class to Rust (#16782 ) There are a couple of things we need to be careful of here: 1. The current python code does no validation when loading from the DB, so we need to be careful to ignore such errors (at least on jki.re there are some old events with internal metadata fields of the wrong type). 2. We want to be memory efficient, as we often have many hundreds of thousands of events in the cache at a time. --------- Co-authored-by: Quentin Gliech <quenting@element.io>	2024-01-08 14:06:48 +00:00
Patrick Cloke	8e1e62c9e0	Update license headers	2023-11-21 15:29:58 -05:00
Erik Johnston	ef5329a9f9	Revert "Add a Postgres `REPLICA IDENTITY` to tables that do not have an implicit one. This should allow use of Postgres logical replication. (#16456 )" (#16651 ) This reverts commit `69afe3f7a0`.	2023-11-16 16:48:48 +00:00
reivilibre	830988ae72	Fix test not detecting tables with missing primary keys and missing replica identities, then add more replica identities. (#16647 ) * Fix the CI query that did not detect all cases of missing primary keys * Add more missing REPLICA IDENTITY entries * Newsfile Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org> --------- Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org>	2023-11-16 12:26:27 +00:00
David Robertson	43d1aa75e8	Add an Admin API to temporarily grant the ability to update an existing cross-signing key without UIA (#16634 )	2023-11-15 17:28:10 +00:00
Patrick Cloke	f2f2c7c1f0	Use full GitHub links instead of bare issue numbers. (#16637 )	2023-11-15 08:02:11 -05:00
reivilibre	69afe3f7a0	Add a Postgres `REPLICA IDENTITY` to tables that do not have an implicit one. This should allow use of Postgres logical replication. (#16456 ) * Add Postgres replica identities to tables that don't have an implicit one Fixes #16224 * Newsfile Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org> * Move the delta to version 83 as we missed the boat for 82 * Add a test that all tables have a REPLICA IDENTITY * Extend the test to include when indices are deleted * isort * black * Fully qualify `oid` as it is a 'hidden attribute' in Postgres 11 * Update tests/storage/test_database.py Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com> * Add missed tables --------- Signed-off-by: Olivier Wilkinson (reivilibre) <oliverw@matrix.org> Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>	2023-11-13 16:03:22 +00:00
Patrick Cloke	ab3f1b3b53	Convert simple_select_one_txn and simple_select_one to return tuples. (#16612 )	2023-11-09 11:13:31 -05:00
David Robertson	91587d4cf9	Bulk-invalidate e2e cached queries after claiming keys (#16613 ) Co-authored-by: Patrick Cloke <patrickc@matrix.org>	2023-11-09 15:57:09 +00:00
Patrick Cloke	455ef04187	Avoid updating the same rows multiple times with simple_update_many_txn. (#16609 ) simple_update_many_txn had a bug in it which would cause each update to be applied twice.	2023-11-07 14:02:09 -05:00
Patrick Cloke	9738b1c497	Avoid executing no-op queries. (#16583 ) If simple_{insert,upsert,update}_many_txn is called without any data to modify then return instead of executing the query. This matches the behavior of simple_{select,delete}_many_txn.	2023-11-07 14:00:25 -05:00
Patrick Cloke	ec9ff389f4	More tests for the simple_* methods. (#16596 ) Expand tests for the simple_* database methods, additionally test against both PostgreSQL and SQLite variants.	2023-11-07 09:34:23 -05:00
Patrick Cloke	cfb6d38c47	Remove remaining usage of cursor_to_dict. (#16564 )	2023-10-31 13:13:28 -04:00
Patrick Cloke	679c691f6f	Remove more usages of cursor_to_dict. (#16551 ) Mostly to improve type safety.	2023-10-26 15:12:28 -04:00
Patrick Cloke	9407d5ba78	Convert simple_select_list and simple_select_list_txn to return lists of tuples (#16505 ) This should use fewer allocations and improves type hints.	2023-10-26 13:01:36 -04:00
Erik Johnston	8f35f8148e	Fix bug where a new writer advances their token too quickly (#16473 ) * Fix bug where a new writer advances their token too quickly When starting a new writer (for e.g. persisting events), the `MultiWriterIdGenerator` doesn't have a minimum token for it as there are no rows matching that new writer in the DB. This results in the the first stream ID it acquired being announced as persisted before it actually finishes persisting, if another writer gets and persists a subsequent stream ID. This is due to the logic of setting the minimum persisted position to the minimum known position of across all writers, and the new writer starts off not being considered. * Fix sending out POSITIONs when our token advances without update Broke in #14820 * For replication HTTP requests, only wait for minimal position	2023-10-23 16:57:30 +01:00

735 commits