Don't lock up when joining large rooms (#16903)

Co-authored-by: Andrew Morgan <andrew@amorgan.xyz>
Erik Johnston 2024-02-20 14:29:18 +00:00 committed by GitHub
parent c51a2240d1
commit cdbbf3653d
2 changed files with 18 additions and 9 deletions

changelog.d/16903.bugfix Normal file

@@ -0,0 +1 @@
+Fix performance issue when joining very large rooms that can cause the server to lock up. Introduced in v1.100.0.
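
The handler change in this commit does two things, going by its own comments: it yields to the reactor every 1000 iterations of a loop whose awaited body usually does not suspend, and it persists the collected events in batches of 1000 instead of in one call. The following is a minimal, standalone sketch of that pattern, not the Synapse code. It assumes `asyncio` in place of Twisted, uses `asyncio.sleep(0)` where the commit uses `self._clock.sleep(0)`, and the names `batches`, `prepare_all` and `demo` are hypothetical helpers invented for illustration (`batches` stands in for the `batch_iter` helper the diff calls).

```python
import asyncio
from itertools import islice
from typing import Awaitable, Callable, Iterable, Iterator, List, Tuple, TypeVar

T = TypeVar("T")


def batches(items: Iterable[T], size: int) -> Iterator[Tuple[T, ...]]:
    """Yield successive tuples of at most `size` items (illustrative stand-in)."""
    it = iter(items)
    while True:
        batch = tuple(islice(it, size))
        if not batch:
            return
        yield batch


async def prepare_all(
    items: Iterable[T],
    prep: Callable[[T], Awaitable[None]],
    yield_every: int = 1000,
) -> int:
    """Await `prep` on each item, handing control back to the event loop
    after every `yield_every` items. Returns how many times it yielded."""
    yields = 0
    for i, item in enumerate(items):
        await prep(item)
        # Awaiting a coroutine that never suspends does not give other tasks
        # a chance to run, so yield explicitly at a fixed interval.
        if (i + 1) % yield_every == 0:
            await asyncio.sleep(0)
            yields += 1
    return yields


async def demo() -> Tuple[int, List[int]]:
    prepared: List[int] = []

    async def prep(item: int) -> None:
        # Declared async, but completes without ever suspending.
        prepared.append(item)

    yields = await prepare_all(range(2500), prep)
    # Handle the prepared items in bounded chunks rather than all at once.
    batch_sizes = [len(batch) for batch in batches(prepared, 1000)]
    return yields, batch_sizes
```

Running `asyncio.run(demo())` processes 2500 items, yields to the loop twice, and splits the result into three batches. The interval of 1000 mirrors the value in the diff; a different workload may want a different trade-off between scheduling overhead and responsiveness.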


@@ -1757,17 +1757,25 @@ class FederationEventHandler:
             events_and_contexts_to_persist.append((event, context))
-        for event in sorted_auth_events:
+        for i, event in enumerate(sorted_auth_events):
             await prep(event)
-        await self.persist_events_and_notify(
-            room_id,
-            events_and_contexts_to_persist,
-            # Mark these events backfilled as they're historic events that will
-            # eventually be backfilled. For example, missing events we fetch
-            # during backfill should be marked as backfilled as well.
-            backfilled=True,
-        )
+            # The above function is typically not async, and so won't yield to
+            # the reactor. For large rooms let's yield to the reactor
+            # occasionally to ensure we don't block other work.
+            if (i + 1) % 1000 == 0:
+                await self._clock.sleep(0)
+        # Also persist the new event in batches for similar reasons as above.
+        for batch in batch_iter(events_and_contexts_to_persist, 1000):
+            await self.persist_events_and_notify(
+                room_id,
+                batch,
+                # Mark these events as backfilled as they're historic events that will
+                # eventually be backfilled. For example, missing events we fetch
+                # during backfill should be marked as backfilled as well.
+                backfilled=True,
+            )
     @trace
     async def _check_event_auth(