mirror of
https://github.com/element-hq/synapse.git
synced 2024-12-20 02:24:54 +03:00
Initial room and user statistics documentation
Signed-off-by: Olivier Wilkinson (reivilibre) <olivier@librepush.net>
This commit is contained in:
parent
20ec9698ac
commit
16e2ffd166
1 changed files with 146 additions and 0 deletions
146
docs/room_and_user_statistics.md
Normal file
@ -0,0 +1,146 @@
Room and User Statistics
========================

Synapse maintains room and user statistics (as well as a cache of room state)
in various tables.

These can be used for administrative purposes but are also used when generating
the public room directory. If these tables get stale or out of sync (possibly
after database corruption), you may wish to regenerate them.


# Synapse Administrator Documentation

## Various SQL scripts that you may find useful

### Delete stats, including historical stats

```sql
DELETE FROM room_stats_current;
DELETE FROM room_stats_historical;
DELETE FROM user_stats_current;
DELETE FROM user_stats_historical;
```

### Regenerate stats (all subjects)

```sql
-- Reset the incremental position so that stats are recounted from scratch.
BEGIN;
    DELETE FROM stats_incremental_position;
    INSERT INTO stats_incremental_position (
        state_delta_stream_id,
        total_events_min_stream_ordering,
        total_events_max_stream_ordering,
        is_background_contract
    ) VALUES (NULL, NULL, NULL, FALSE), (NULL, NULL, NULL, TRUE);
COMMIT;

-- Remove the current rows; historical rows are left in place.
DELETE FROM room_stats_current;
DELETE FROM user_stats_current;
```

Then follow the steps below for **'Regenerate stats (missing subjects only)'**.

### Regenerate stats (missing subjects only)

```sql
-- Set up staging tables.
-- We depend on current_state_events_membership because this is used
-- in our counting.
INSERT INTO background_updates (update_name, progress_json, depends_on) VALUES
    ('populate_stats_prepare', '{}', 'current_state_events_membership');

-- Run through each room and update stats.
INSERT INTO background_updates (update_name, progress_json, depends_on) VALUES
    ('populate_stats_process_rooms', '{}', 'populate_stats_prepare');

-- Run through each user and update stats.
INSERT INTO background_updates (update_name, progress_json, depends_on) VALUES
    ('populate_stats_process_users', '{}', 'populate_stats_process_rooms');

-- Clean up staging tables.
INSERT INTO background_updates (update_name, progress_json, depends_on) VALUES
    ('populate_stats_cleanup', '{}', 'populate_stats_process_users');
```

Then **restart Synapse**.
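
The four background updates queued above form a simple dependency chain. The
following Python sketch illustrates that chain against an in-memory SQLite
stand-in table. The table definition, `queue_stats_regeneration` and
`dependency_order` are assumptions made for illustration only; the sketch does
not model how Synapse itself schedules background updates.

```python
import sqlite3


def queue_stats_regeneration(conn):
    """Queue the four stats updates in a stand-in background_updates table."""
    # Assumed, simplified schema: only the three columns used in this document.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS background_updates ("
        " update_name TEXT PRIMARY KEY,"
        " progress_json TEXT NOT NULL,"
        " depends_on TEXT)"
    )
    chain = [
        ("populate_stats_prepare", "current_state_events_membership"),
        ("populate_stats_process_rooms", "populate_stats_prepare"),
        ("populate_stats_process_users", "populate_stats_process_rooms"),
        ("populate_stats_cleanup", "populate_stats_process_users"),
    ]
    with conn:  # commit all four inserts together
        conn.executemany(
            "INSERT INTO background_updates (update_name, progress_json, depends_on)"
            " VALUES (?, '{}', ?)",
            chain,
        )


def dependency_order(conn):
    """Return queued update names so that each comes after its dependency."""
    pending = dict(
        conn.execute("SELECT update_name, depends_on FROM background_updates")
    )
    order = []
    while pending:
        # An update is ready once its dependency is no longer queued.
        ready = sorted(name for name, dep in pending.items() if dep not in pending)
        if not ready:
            raise ValueError("dependency cycle in background_updates")
        for name in ready:
            order.append(name)
            del pending[name]
    return order
```

In this sketch a dependency that is not itself queued (here
`current_state_events_membership`) is treated as already satisfied.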


# Synapse Developer Documentation

## High-Level Concepts

### Definitions

* **subject**: Something we are tracking stats about – currently a room or user.
* **current row**: An entry for a subject in the appropriate current statistics
  table. Each subject can have only one.
* **historical row**: An entry for a subject in the appropriate historical
  statistics table. Each subject can have any number of these.

### Overview

Stats are maintained as time series. There are two kinds of column:

* absolute columns – where the value is correct for the time given by `end_ts`
  in the stats row. (Imagine a line graph for these values.)
* per-slice columns – where the value counts the occurrences that fell within
  the time slice given by `(end_ts − bucket_size)…end_ts`
  or `start_ts…end_ts`. (Imagine a histogram for these values.)

Currently, only absolute columns are in use.
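
To make the distinction between the two kinds of column concrete, here is a
minimal Python sketch. The column names `joined_members` and `events_sent`, the
`StatsRow` class and the `build_rows` helper are hypothetical, and the sketch
assumes that a slice is the half-open interval from `end_ts − bucket_size`
(inclusive) to `end_ts` (exclusive).

```python
from dataclasses import dataclass


@dataclass
class StatsRow:
    end_ts: int          # end of the time slice
    bucket_size: int     # length of the time slice
    joined_members: int  # absolute column (hypothetical): value as of end_ts
    events_sent: int     # per-slice column (hypothetical): count within the slice


def build_rows(join_times, event_times, bucket_size, num_buckets):
    """Build one row per time slice from raw timestamps."""
    rows = []
    for i in range(1, num_buckets + 1):
        end_ts = i * bucket_size
        start_ts = end_ts - bucket_size
        # Absolute: everything that has happened before end_ts (line graph).
        absolute = sum(1 for t in join_times if t < end_ts)
        # Per-slice: only what happened inside this slice (histogram).
        per_slice = sum(1 for t in event_times if start_ts <= t < end_ts)
        rows.append(StatsRow(end_ts, bucket_size, absolute, per_slice))
    return rows
```

With joins at times 5, 15 and 25 and a `bucket_size` of 10, the absolute column
keeps growing from one row to the next, whereas the per-slice column starts
again from zero in every slice.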

Stats are maintained in two tables (for each type): current and historical.

Current stats correspond to the present values. Each subject can only have one
entry.

Historical stats correspond to values in the past. Subjects may have multiple
entries.

## Concepts around the management of stats

### current rows

#### dirty current rows

Current rows can be **dirty**, which means that they have changed since the
latest historical row for the same subject.
**Dirty** current rows possess an end timestamp, `end_ts`.

#### old current rows and old collection

When a (necessarily dirty) current row has an `end_ts` in the past, it is said
to be **old**.
Old current rows must be copied into a historical row, and cleared of their dirty
status, before further statistics can be tracked for that subject.
The process which does this is referred to as **old collection**.
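
As a rough illustration, old collection can be sketched in Python over
in-memory rows. This is not Synapse's implementation: `CurrentRow`, the
`joined_members` column and the `old_collect` helper are hypothetical, and the
sketch assumes that a clean row is represented by `end_ts` being `None`.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class CurrentRow:
    subject_id: str
    joined_members: int           # hypothetical absolute column
    end_ts: Optional[int] = None  # assumed to be set only while the row is dirty


def is_dirty(row):
    return row.end_ts is not None


def is_old(row, now):
    # Old: dirty, with an end_ts that lies in the past.
    return is_dirty(row) and row.end_ts < now


def old_collect(current, historical, now):
    """Copy every old current row into `historical` and clear its dirty status."""
    collected = 0
    for row in current.values():
        if is_old(row, now):
            historical.append(replace(row))  # snapshot that keeps the old end_ts
            row.end_ts = None                # the current row is clean again
            collected += 1
    return collected
```

Rows that are clean, or dirty with an `end_ts` that has not yet passed, are
left untouched.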

#### incomplete current rows

There are also **incomplete** current rows, which are current rows that do not
contain a full count yet – this is because they are waiting for the regeneration
process to give them an initial count. Incomplete current rows DO NOT contain
correct and up-to-date values. As such, *incomplete rows are not old-collected*.
Instead, old incomplete rows will be extended so they are no longer old.
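
The different treatment of complete and incomplete old rows can be sketched as
follows. The `completed` flag, the `handle_old_row` helper and the rule used to
pick the new `end_ts` (the end of the slice containing the present time) are
assumptions made for illustration; the text above does not specify how far a
row is extended.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CurrentRow:
    subject_id: str
    end_ts: Optional[int]  # None means the row is not dirty (assumption)
    completed: bool        # hypothetical flag: False while awaiting an initial count


def handle_old_row(row, now, bucket_size, historical):
    """Old-collect a complete old row, or extend an incomplete one."""
    if row.end_ts is None or row.end_ts >= now:
        return "not old"
    if not row.completed:
        # Incomplete rows hold no trustworthy values, so nothing is copied.
        # Assumed rule: move end_ts to the end of the slice containing `now`.
        row.end_ts = (now // bucket_size + 1) * bucket_size
        return "extended"
    historical.append((row.subject_id, row.end_ts))
    row.end_ts = None
    return "collected"
```

After an extension the row's `end_ts` lies in the future again, so the row is
no longer old and no historical row has been written for it.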

### historical rows

Historical rows can always be considered to be valid for the time slice and
end time specified. (This, of course, assumes a lack of defects in the code
to track the statistics, and assumes integrity of the database.)

Even so, there are two considerations that we may need to bear in mind:

* historical rows will not exist for every time slice – they will be omitted
  if there were no changes. In this case, the following assumptions can be
  made to interpolate/recreate missing rows:
    - absolute fields have the same values as in the preceding row
    - per-slice fields are zero (`0`)
* historical rows will not be retained forever – rows older than a configurable
  time will be purged.
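
The interpolation assumptions from the first consideration can be sketched in
Python. The `fill_gaps` helper and the dict-based row layout are hypothetical;
the sketch assumes that rows are sorted by `end_ts` and that consecutive slices
are exactly `bucket_size` apart.

```python
def fill_gaps(rows, bucket_size, absolute_fields, per_slice_fields):
    """Recreate omitted historical rows between the ones that exist.

    `rows` is a list of dicts sorted by "end_ts".
    """
    filled = []
    for row in rows:
        if filled:
            expected = filled[-1]["end_ts"] + bucket_size
            while expected < row["end_ts"]:
                previous = filled[-1]
                gap = {"end_ts": expected}
                for field in absolute_fields:
                    gap[field] = previous[field]  # unchanged since the preceding row
                for field in per_slice_fields:
                    gap[field] = 0                # nothing happened in this slice
                filled.append(gap)
                expected += bucket_size
        filled.append(dict(row))
    return filled
```

Rows that have already been purged cannot be recreated this way, because the
preceding row they would be derived from no longer exists.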

#### purge

The purging of historical rows is not yet implemented.