Commit graph

49 commits

Author SHA1 Message Date
Eric Eastwood
adda2a4613
Sliding Sync: Slight optimization when fetching state for the room (get_events_as_list(...)) (#17718)
Spawning from @kegsay [pointing
out](https://matrix.to/#/!cnVVNLKqgUzNTOFQkz:matrix.org/$ExOO7J8uPUQSyH-9Uxc_QCa8jlXX9uK4VRtkSC0EI3o?via=element.io&via=matrix.org&via=jki.re)
that the Sliding Sync endpoint doesn't handle a large room with a lot of
state well on initial sync when requesting all state via `required_state: [
["*","*"] ]` (it just takes forever).

After investigating further, the slow part is just
`get_events_as_list(...)` fetching all of the current state IDs out for
the room (which can be 100k+ events for rooms with a lot of membership).
This is just a slow thing in Synapse in general and the same thing
happens in Sync v2 or the `/state` endpoint.


---

The only idea I had to improve things was to use `batch_iter` to only
try fetching a fixed amount at a time instead of working with large
maps, lists, and sets. This doesn't seem to have much effect though.

There is already a `batch_iter(event_ids, 200)` in
`_fetch_event_rows(...)` for when we actually have to touch the database
and that's inside a queue to deduplicate work.

I did notice one slight optimization to use `get_events_as_list(...)`
directly instead of `get_events(...)`. `get_events(...)` just turns the
result from `get_events_as_list(...)` into a dict and since we're just
iterating over the events, we don't need the dict/map.
2024-10-14 13:47:35 +01:00
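As a rough illustration of the #17718 change above, the sketch below contrasts iterating a list of events with first building a dict from it, and shows batching with a fixed size. The functions here are simplified stand-ins written for this example, not Synapse's real implementations, and the event IDs are made up.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, Tuple


def batch_iter(items: Iterable[str], size: int) -> Iterator[Tuple[str, ...]]:
    """Yield tuples of at most `size` items (simplified stand-in)."""
    iterator = iter(items)
    while True:
        batch = tuple(islice(iterator, size))
        if not batch:
            return
        yield batch


def get_events_as_list(event_ids: Iterable[str]) -> List[dict]:
    """Hypothetical stand-in: return fake events in the order requested."""
    return [{"event_id": event_id} for event_id in event_ids]


def get_events(event_ids: Iterable[str]) -> Dict[str, dict]:
    """Same lookup, plus an extra pass that builds a dict keyed by event ID."""
    return {event["event_id"]: event for event in get_events_as_list(event_ids)}


state_event_ids = [f"$event{i}" for i in range(5)]

# When we only iterate over the events, the list is enough and the
# intermediate dict is an allocation we never use.
seen = [event["event_id"] for event in get_events_as_list(state_event_ids)]

# Fixed-size batches, as in the `batch_iter(event_ids, 200)` idea.
batches = list(batch_iter(state_event_ids, 2))
```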
Erik Johnston
23740eaa3d
Correctly mention previous copyright (#16820)
During the migration the automated script to update the copyright
headers accidentally got rid of some of the existing copyright lines.
Reinstate them.
2024-01-23 11:26:48 +00:00
Erik Johnston
5d3850b038
Port EventInternalMetadata class to Rust (#16782)
There are a couple of things we need to be careful of here:

1. The current python code does no validation when loading from the DB,
so we need to be careful to ignore such errors (at least on jki.re there
are some old events with internal metadata fields of the wrong type).
2. We want to be memory efficient, as we often have many hundreds of
thousands of events in the cache at a time.

---------

Co-authored-by: Quentin Gliech <quenting@element.io>
2024-01-08 14:06:48 +00:00
Patrick Cloke
8e1e62c9e0 Update license headers 2023-11-21 15:29:58 -05:00
David Robertson
43d1aa75e8
Add an Admin API to temporarily grant the ability to update an existing cross-signing key without UIA (#16634) 2023-11-15 17:28:10 +00:00
Patrick Cloke
f2f2c7c1f0
Use full GitHub links instead of bare issue numbers. (#16637) 2023-11-15 08:02:11 -05:00
David Robertson
91587d4cf9
Bulk-invalidate e2e cached queries after claiming keys (#16613)
Co-authored-by: Patrick Cloke <patrickc@matrix.org>
2023-11-09 15:57:09 +00:00
Patrick Cloke
9407d5ba78
Convert simple_select_list and simple_select_list_txn to return lists of tuples (#16505)
This should use fewer allocations and improve type hints.
2023-10-26 13:01:36 -04:00
Patrick Cloke
aa483cb4c9
Update ruff config (#16283)
Enable additional checks & clean-up unneeded configuration.
2023-09-08 11:24:36 -04:00
Erik Johnston
a2e0d4cd60
Fix rare bug that broke looping calls (#16210)
* Fix rare bug that broke looping calls

We can't interact with the reactor from the main thread via looping
call.

Introduced in v1.90.0 / #15791.

* Newsfile
2023-08-30 14:18:42 +01:00
Erik Johnston
eb0dbab15b
Fix database performance of read/write worker locks (#16061)
We were seeing serialization errors when taking out multiple read locks.

The transactions were retried, so this wasn't causing any failures.

Introduced in #15782.
2023-08-17 14:07:57 +01:00
Erik Johnston
ae55cc1e6b
Add ability to wait for locks and add locks to purge history / room deletion (#15791)
c.f. #13476
2023-07-31 10:58:03 +01:00
Erik Johnston
39d131b016
Add basic read/write lock (#15782) 2023-07-05 17:25:00 +01:00
Jason Little
21fea6b749
Prefill events after invalidate not before when persisting events (#15758)
Fixes #15757
2023-06-14 09:42:18 +01:00
Erik Johnston
c485ed1c5a
Clear event caches when we purge history (#15609)
This should help a little with #13476

---------

Co-authored-by: Patrick Cloke <patrickc@matrix.org>
2023-06-08 13:14:40 +01:00
dependabot[bot]
9bb2eac719
Bump black from 22.12.0 to 23.1.0 (#15103) 2023-02-22 15:29:09 -05:00
Patrick Cloke
42aea0d8af
Add final type hint to tests.unittest. (#15072)
Adds a return type to HomeServerTestCase.make_homeserver and deals
with any variables which are no longer Any.
2023-02-14 14:03:35 -05:00
Patrick Cloke
230a831c73
Attempt to delete more duplicate rows in receipts_linearized table. (#14915)
The previous assumption was that the stream_id column was unique
(for a room ID, receipt type, user ID tuple), but this turned out to be
incorrect.

Now find the max stream ID, then map this back to a database-specific
row identifier and delete other rows which match the (room ID, receipt type,
user ID) tuple, but *not* the row ID.
2023-02-01 15:45:10 -05:00
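The three steps described in #14915 above (find the max stream ID, map it to a row identifier, delete the other matching rows) can be sketched against an in-memory SQLite database. This is an assumption-laden illustration: the table is cut down to a few columns, the data is invented, and SQLite's `rowid` stands in for the "database-specific row identifier".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE receipts_linearized ("
    "stream_id INTEGER, room_id TEXT, receipt_type TEXT, user_id TEXT, event_id TEXT)"
)
conn.executemany(
    "INSERT INTO receipts_linearized VALUES (?, ?, ?, ?, ?)",
    [
        (1, "!room", "m.read", "@alice", "$a"),
        (2, "!room", "m.read", "@alice", "$b"),
        (2, "!room", "m.read", "@alice", "$b"),  # duplicate sharing the max stream_id
        (1, "!room", "m.read", "@bob", "$a"),
    ],
)

# Step 1: find the max stream ID for each tuple that has duplicates.
duplicates = conn.execute(
    "SELECT room_id, receipt_type, user_id, MAX(stream_id) "
    "FROM receipts_linearized "
    "GROUP BY room_id, receipt_type, user_id HAVING COUNT(*) > 1"
).fetchall()

for room_id, receipt_type, user_id, max_stream_id in duplicates:
    # Step 2: map the max stream ID back to exactly one row identifier.
    (keep_rowid,) = conn.execute(
        "SELECT rowid FROM receipts_linearized "
        "WHERE room_id = ? AND receipt_type = ? AND user_id = ? AND stream_id = ? "
        "ORDER BY rowid LIMIT 1",
        (room_id, receipt_type, user_id, max_stream_id),
    ).fetchone()
    # Step 3: delete every other row for the tuple, but *not* the kept row.
    conn.execute(
        "DELETE FROM receipts_linearized "
        "WHERE room_id = ? AND receipt_type = ? AND user_id = ? AND rowid != ?",
        (room_id, receipt_type, user_id, keep_rowid),
    )

remaining = conn.execute(
    "SELECT user_id, stream_id, event_id FROM receipts_linearized ORDER BY user_id"
).fetchall()
```

Deleting by row identifier rather than by `stream_id` is what handles the case where two duplicate rows share the same maximum stream ID.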
Patrick Cloke
82d3efa312
Skip processing stats for broken rooms. (#14873)
* Skip processing stats for broken rooms.

* Newsfragment

* Use a custom exception.
2023-01-23 11:36:20 +00:00
Patrick Cloke
3ac412b4e2
Require types in tests.storage. (#14646)
Adds missing type hints to `tests.storage` package
and does not allow untyped definitions.
2022-12-09 12:36:32 -05:00
Sean Quah
882277008c
Fix background updates failing to add unique indexes on receipts (#14453)
As part of the database migration to support threaded receipts, there is
a possible window in between
`73/08thread_receipts_non_null.sql.postgres` removing the original
unique constraints on `receipts_linearized` and `receipts_graph` and the
`receipts_linearized_unique_index` and `receipts_graph_unique_index`
background updates from `72/08thread_receipts.sql` completing where
the unique constraints on `receipts_linearized` and `receipts_graph` are
missing. Any emulated upserts on these tables must therefore be
performed with a lock held, otherwise duplicate rows can end up in the
tables when there are concurrent emulated upserts. Fix the missing lock.

Note that emulated upserts no longer happen by default on sqlite, since
the minimum supported version of sqlite supports native upserts by
default now.

Finally, clean up any duplicate receipts that may have crept in before
trying to create the `receipts_graph_unique_index` and
`receipts_linearized_unique_index` unique indexes.

Signed-off-by: Sean Quah <seanq@matrix.org>
2022-11-16 15:01:22 +00:00
Andrew Morgan
828b5502cf
Remove _get_events_cache check optimisation from _have_seen_events_dict (#14161) 2022-10-18 10:33:21 +01:00
Eric Eastwood
29269d9d3f
Fix have_seen_event cache not being invalidated (#13863)
Fix https://github.com/matrix-org/synapse/issues/13856
Fix https://github.com/matrix-org/synapse/issues/13865

> Discovered while trying to make Synapse fast enough for [this MSC2716 test for importing many batches](https://github.com/matrix-org/complement/pull/214#discussion_r741678240). As an example, disabling the `have_seen_event` cache saves 10 seconds for each `/messages` request in that MSC2716 Complement test because we're not making as many federation requests for `/state` (speeding up `have_seen_event` itself is related to https://github.com/matrix-org/synapse/issues/13625) 
> 
> But this will also make `/messages` faster in general so we can include it in the [faster `/messages` milestone](https://github.com/matrix-org/synapse/milestone/11).
> 
> *-- https://github.com/matrix-org/synapse/issues/13856*


### The problem

`_invalidate_caches_for_event` doesn't run in monolith mode which means we never even tried to clear the `have_seen_event` and other caches. And even in worker mode, it only runs on the workers, not the master (AFAICT).

Additionally, there was a bug with the key being wrong, so `_invalidate_caches_for_event` never invalidated the `have_seen_event` cache even when it did run.

Because we were using the `@cachedList` wrong, it was putting items in the cache under keys like `((room_id, event_id),)` with a tuple in a tuple (ex. `(('!TnCIJPKzdQdUlIyXdQ:test', '$Iu0eqEBN7qcyF1S9B3oNB3I91v2o5YOgRNPwi_78s-k'),)`) and we were trying to invalidate with just `(room_id, event_id)` which did nothing.
2022-09-27 15:55:43 -05:00
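The key mismatch described in #13863 above can be reproduced with a plain dict. This is only a sketch of the failure mode, using a dict as a hypothetical stand-in for the cache and made-up IDs; it does not use Synapse's caching decorators.

```python
from typing import Dict

cache: Dict[tuple, bool] = {}
room_id = "!room:test"
event_id = "$event"

# The buggy path stored the entry under a one-element tuple wrapping the real key.
buggy_key = ((room_id, event_id),)
cache[buggy_key] = True

# Invalidating with the plain key finds nothing, so the stale entry survives.
removed_with_plain_key = cache.pop((room_id, event_id), None)
entry_still_cached = buggy_key in cache

# Only the wrapped key actually matches the stored entry.
removed_with_wrapped_key = cache.pop(buggy_key, None)
```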
reivilibre
c2fe48a6ff
Rename the EventFormatVersions enum values so that they line up with room version numbers. (#13706) 2022-09-07 11:08:20 +01:00
Nick Mills-Barrett
cc21a431f3
Async get event cache prep (#13242)
Some experimental prep work to enable external event caching based on #9379 & #12955. Doesn't actually move the cache at all, just lays the groundwork for async implemented caches.

Signed off by Nick @ Beeper (@Fizzadar)
2022-07-15 09:30:46 +00:00
Šimon Brandner
13e359aec8
Implement MSC3827: Filtering of /publicRooms by room type (#13031)
Signed-off-by: Šimon Brandner <simon.bra.ag@gmail.com>
2022-06-29 17:12:45 +00:00
Sumner Evans
bda4600399
LockStore: fix acquiring a lock via LockStore.try_acquire_lock (#12832)
Signed-off-by: Sumner Evans <sumner@beeper.com>
2022-05-30 09:41:13 +01:00
Erik Johnston
fcf951d5dc
Track in memory events using weakrefs (#10533) 2022-05-17 10:34:27 +01:00
Richard van der Hoff
96e0cdbc5a
Add a consistency check on events read from the database (#12620)
I've seen a few errors which can only plausibly be explained by the calculated
event id for an event being different from the ID of the event in the
database. It should be cheap to check this, so let's do so and raise an
exception.
2022-05-03 21:27:52 +01:00
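A minimal sketch of the kind of check #12620 above describes: recompute an ID from the event content and compare it with the ID stored alongside the row. The hashing scheme here is a deliberate simplification (real event IDs are derived from a redacted form of the event under room-version-specific rules), and both function names are hypothetical.

```python
import hashlib
import json
from base64 import urlsafe_b64encode


def compute_event_id(event: dict) -> str:
    """Simplified stand-in: hash a canonical JSON encoding of the event."""
    canonical = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    digest = hashlib.sha256(canonical).digest()
    return "$" + urlsafe_b64encode(digest).decode("ascii").rstrip("=")


def check_event_id(row_event_id: str, event: dict) -> dict:
    """Raise if the ID stored in the database differs from the calculated one."""
    calculated = compute_event_id(event)
    if calculated != row_event_id:
        raise ValueError(
            f"event stored as {row_event_id} has calculated ID {calculated}"
        )
    return event


event = {"type": "m.room.message", "content": {"body": "hi"}}
good_id = compute_event_id(event)
checked = check_event_id(good_id, event)
```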
Sean Quah
8a87b4435a
Handle cancellation in EventsWorkerStore._get_events_from_cache_or_db (#12529)
Multiple calls to `EventsWorkerStore._get_events_from_cache_or_db` can
reuse the same database fetch, which is initiated by the first call.
Ensure that cancelling the first call doesn't cancel the other calls
sharing the same database fetch.

Signed-off-by: Sean Quah <seanq@element.io>
2022-04-25 19:39:17 +01:00
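The behaviour #12529 above aims for can be illustrated with `asyncio`, which is swapped in here for readability (Synapse itself is built on Twisted, so this is not the actual mechanism). Two callers share one fetch, and cancelling the first caller leaves the shared fetch, and therefore the second caller, unaffected.

```python
import asyncio
from typing import List, Tuple


async def main() -> Tuple[str, bool, List[str]]:
    db_calls: List[str] = []

    async def fetch_from_db() -> str:
        # Hypothetical database fetch, initiated once and shared by all callers.
        db_calls.append("fetch")
        await asyncio.sleep(0.01)
        return "event"

    shared_fetch = asyncio.ensure_future(fetch_from_db())

    async def caller() -> str:
        # shield() stops a cancelled caller from cancelling the shared fetch.
        return await asyncio.shield(shared_fetch)

    first = asyncio.ensure_future(caller())
    second = asyncio.ensure_future(caller())
    await asyncio.sleep(0)  # let both callers start waiting on the fetch

    first.cancel()
    result = await second
    return result, first.cancelled(), db_calls


result, first_was_cancelled, db_calls = asyncio.run(main())
```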
Richard van der Hoff
f0b03186d9
Add type hints for tests/unittest.py. (#12347)
In particular, add type hints for get_success and friends, which are then helpful in a bunch of places.
2022-04-01 16:04:16 +00:00
reivilibre
c7b2f1ccdc
Back out in-flight state caching changes. (#12126) 2022-03-02 10:37:04 +00:00
reivilibre
c893632319
Order in-flight state group queries in biggest-first order (#11610)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2022-03-01 13:41:57 +00:00
Patrick Cloke
02d708568b
Replace assertEquals and friends with non-deprecated versions. (#12092) 2022-02-28 07:12:29 -05:00
Richard van der Hoff
e24ff8ebe3
Remove HomeServer.get_datastore() (#12031)
The presence of this method was confusing, and mostly present for backwards
compatibility. Let's get rid of it.

Part of #11733
2022-02-23 11:04:02 +00:00
reivilibre
dcb6a37837
Cap the number of in-flight requests for state from a single group (#11608) 2022-02-22 14:24:31 +00:00
reivilibre
546b9c9e64
Add more tests for in-flight state query duplication. (#12033) 2022-02-22 11:44:11 +00:00
reivilibre
284ea2025a
Track and deduplicate in-flight requests to _get_state_for_groups. (#10870)
Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
2022-02-18 17:23:31 +00:00
Sean Quah
c675a18071
Track ongoing event fetches correctly (again) (#11376)
The previous fix for the ongoing event fetches counter
(8eec25a1d9) was both insufficient and
incorrect.

When the database is unreachable, `_do_fetch` never gets run and so
`_event_fetch_ongoing` is never decremented.

The previous fix also moved the `_event_fetch_ongoing` decrement outside
of the `_event_fetch_lock` which allowed race conditions to corrupt the
counter.
2021-11-26 13:47:24 +00:00
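The two failure modes named in #11376 above (a counter that is never decremented when the work fails to run, and a decrement performed outside the lock) suggest the general pattern sketched below. The class and its names are hypothetical and use `threading` primitives for illustration; this is not the Synapse implementation.

```python
import threading
from typing import Callable, TypeVar

T = TypeVar("T")


class FetchTracker:
    """Counts ongoing fetches, only ever touching the counter under the lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.ongoing = 0

    def run_fetch(self, fetch: Callable[[], T]) -> T:
        with self._lock:
            self.ongoing += 1
        try:
            return fetch()
        finally:
            # Runs even if the fetch fails, and holds the lock so that
            # concurrent updates cannot corrupt the counter.
            with self._lock:
                self.ongoing -= 1


def unreachable_db() -> str:
    raise ConnectionError("database unreachable")


tracker = FetchTracker()
try:
    tracker.run_fetch(unreachable_db)
except ConnectionError:
    pass
ongoing_after_failure = tracker.ongoing
```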
Brendan Abolivier
0d88c4f903
Improve performance of remove_{hidden,deleted}_devices_from_device_inbox (#11421)
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2021-11-25 15:14:54 +00:00
Dirk Klimpel
4535532526
Delete messages for hidden devices from device_inbox (#11199) 2021-11-02 13:18:30 +00:00
Dirk Klimpel
8d46fac98e
Delete messages from device_inbox table when deleting device (#10969)
Fixes: #9346
2021-10-27 16:01:18 +01:00
David Robertson
370bca32e6
Don't drop user dir deltas when server leaves room (#10982)
Fix a long-standing bug where a batch of user directory changes would be
silently dropped if the server left a room early in the batch.

* Pull out `wait_for_background_update` in tests

Co-authored-by: Patrick Cloke <clokep@users.noreply.github.com>
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
2021-10-06 12:56:45 +00:00
Eric Eastwood
dc75fb7f05
Populate rooms.creator field for easy lookup (#10697)
Part of https://github.com/matrix-org/synapse/pull/10566

 - Fill in creator whenever we insert into the rooms table
 - Add background update to backfill any missing creator values
2021-09-01 16:27:58 +01:00
reivilibre
642a42edde
Flatten the synapse.rest.client package (#10600) 2021-08-17 11:57:58 +00:00
Erik Johnston
c37dad67ab
Improve event caching code (#10119)
Ensure we only load an event from the DB once when the same event is requested multiple times at once.
2021-08-04 13:54:51 +01:00
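One common way to get the behaviour #10119 above describes (a single database load when the same event is requested several times at once) is to track in-flight fetches by event ID. The sketch below uses `asyncio` and a hypothetical `EventFetcher` class for illustration; Synapse is built on Twisted and its actual code differs.

```python
import asyncio
from typing import Dict, List


class EventFetcher:
    def __init__(self) -> None:
        self.db_calls = 0
        self._in_flight: Dict[str, asyncio.Future] = {}

    async def _fetch_from_db(self, event_id: str) -> dict:
        # Hypothetical database load; counted so the deduplication is visible.
        self.db_calls += 1
        await asyncio.sleep(0.01)
        return {"event_id": event_id}

    async def get_event(self, event_id: str) -> dict:
        fetch = self._in_flight.get(event_id)
        if fetch is None:
            # First request for this event: start the load and remember it.
            fetch = asyncio.ensure_future(self._fetch_from_db(event_id))
            self._in_flight[event_id] = fetch
            fetch.add_done_callback(lambda _: self._in_flight.pop(event_id, None))
        # Later concurrent requests wait on the same in-flight load.
        return await fetch


async def main(fetcher: EventFetcher) -> List[dict]:
    return await asyncio.gather(*(fetcher.get_event("$event") for _ in range(3)))


fetcher = EventFetcher()
results = asyncio.run(main(fetcher))
```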
Erik Johnston
54389d5697
Fix dropping locks on shut down (#10433) 2021-07-20 14:24:25 +01:00
Erik Johnston
85d237eba7
Add a distributed lock (#10269)
This adds a simple best-effort locking mechanism that works across workers.
2021-06-29 19:15:47 +01:00
Richard van der Hoff
b4b2fd2ece
add a cache to have_seen_event (#9953)
Empirically, this helped my server considerably when handling gaps in Matrix HQ. The problem was that we would repeatedly call have_seen_events for the same set of (50K or so) auth_events, each of which would take many minutes to complete, even though it's only an index scan.
2021-06-01 12:04:47 +01:00