Commit graph

118 commits

Author SHA1 Message Date
Till
699f5ca8c1
More rows.Close() and rows.Err() (#3262)
Looks like we missed some `rows.Close()`

Even though `rows.Err()` is mostly not necessary, we should be more
consistent in the DB layer.

[skip ci]
2023-11-09 08:42:33 +01:00
Sam Wedgwood
9a12420428
[pseudoID] More pseudo ID fixes (#3167)
Signed-off-by: `Sam Wedgwood <sam@wedgwood.dev>`
2023-08-15 12:37:04 +01:00
Till
297479ea49
Use pointer when passing the connection manager around (#3152)
As otherwise existing connections aren't reused.
2023-07-19 13:37:04 +02:00
Till
23cd7877a1
Add MXIDMapping for pseudoID rooms (#3112)
Add `MXIDMapping` on membership events when
creating/joining rooms.
2023-06-28 20:29:49 +02:00
Devon Hudson
5aaa539e3e
Fix senderID/key conversions 2023-06-14 16:42:09 +01:00
devonh
e4665979bf
Merge SenderID & Per Room User Key work (#3109) 2023-06-14 14:23:46 +00:00
Till
832ccc32f6
Add initial support for storing user room keys (#3098) 2023-06-12 12:45:42 +02:00
Till
61341aca50
Add tests for the UpDropEventReferenceSHAPrevEvents migration (#3087)
... as they could fail if there are duplicate events in
`roomserver_previous_events`.
This fixes the migration by trying to combine the `event_nids` if
possible (same room) as mentioned by @kegsay in
https://github.com/matrix-org/dendrite/pull/3083#discussion_r1195508963
2023-05-30 18:05:48 +02:00
Till
11b557097c
Drop reference_sha column (#3083)
Companion PR to https://github.com/matrix-org/gomatrixserverlib/pull/383
2023-05-24 12:14:42 +02:00
kegsay
b189edf4f4
Remove gmsl.HeaderedEvent (#3068)
Replaced with types.HeaderedEvent _for now_. In reality we want to move
them all to gmsl.Event and only use HeaderedEvent when we _need_ to
bundle the version/event ID with the event (seriailsation boundaries,
and even then only when we don't have the room version).

Requires https://github.com/matrix-org/gomatrixserverlib/pull/373
2023-04-27 12:54:20 +01:00
kegsay
72285b2659
refactor: update GMSL (#3058)
Sister PR to https://github.com/matrix-org/gomatrixserverlib/pull/364

Read this commit by commit to avoid going insane.
2023-04-19 15:50:33 +01:00
Till
5579121c6f
Preparations for removing BaseDendrite (#3016)
Preparations to actually remove/replace `BaseDendrite`.
Quite a few changes:
- SyncAPI accepts an `fulltext.Indexer` interface (fulltext is removed
from `BaseDendrite`)
- Caches are removed from `BaseDendrite`
- Introduces a `Router` struct (likely to change)
  - also fixes #2903
- Introduces a `sqlutil.ConnectionManager`, which should remove
`base.DatabaseConnection` later on
- probably more
2023-03-17 11:09:45 +00:00
Till
6c20f8f742
Refactor StoreEvent, add MaybeRedactEvent, create an EventDatabase (#2989)
This PR changes the following:
- `StoreEvent` now only stores an event (and possibly prev event),
instead of also doing redactions
- Adds a `MaybeRedactEvent` (pulled out from `StoreEvent`), which should
be called after storing events
- a few other things
2023-03-01 17:06:47 +01:00
Till
ad07b169b8
Refactor StoreEvent and create a new RoomDatabase interface (#2985)
This PR changes a few things:
- It pulls out the creation of several NIDs from the `StoreEvent`
function to make the functions more reusable
- Uses more caching when using those NIDs to avoid DB round trips
2023-02-24 09:40:20 +01:00
Till
eb29a31550
Optimize /sync and history visibility (#2961)
Should fix the following issues or make a lot less worse when using
Postgres:

The main issue behind #2911: The client gives up after a certain time,
causing a cascade of context errors, because the response couldn't be
built up fast enough. This mostly happens on accounts with many rooms,
due to the inefficient way we're getting recent events and current state

For #2777: The queries for getting the membership events for history
visibility were being executed for each room (I think 185?), resulting
in a whooping 2k queries for membership events. (Getting the
statesnapshot -> block nids -> actual wanted membership event)

Both should now be better by:
- Using a LATERAL join to get all recent events for all joined rooms in
one go (TODO: maybe do the same for room summary and current state etc)
- If we're lazy loading on initial syncs, we're now not getting the
whole current state, just to drop the majority of it because we're lazy
loading members - we add a filter to exclude membership events on the
first call to `CurrentState`.
- Using an optimized query to get the membership events needed to
calculate history visibility

---------

Co-authored-by: kegsay <kegan@matrix.org>
2023-02-07 14:31:23 +01:00
Neil
738686ae68
Add /_dendrite/admin/purgeRoom/{roomID} (#2662)
This adds a new admin endpoint `/_dendrite/admin/purgeRoom/{roomID}`. It
completely erases all database entries for a given room ID.

The roomserver will start by clearing all data for that room and then
will generate an output event to notify downstream components (i.e. the
sync API and federation API) to do the same.

It does not currently clear media and it is currently not implemented
for SQLite since it relies on SQL array operations right now.

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: Till Faelligen <2353100+S7evinK@users.noreply.github.com>
2023-01-19 21:02:32 +01:00
Till
7d2344049d
Cleanup stale device lists for users we don't share a room with anymore (#2857)
The stale device lists table might contain entries for users we don't
share a room with anymore. This now asks the roomserver about left users
and removes those entries from the table.

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-12-12 08:20:59 +01:00
Neil Alexander
6663728eb1
Fix SQLite roomserver_published migration 2022-11-01 16:08:13 +00:00
Till
2acc1d65fb
Optimize history visibility checks (#2848)
This optimizes history visibility checks by (mostly) avoiding database
hits.
Possibly solves https://github.com/matrix-org/dendrite/issues/2777

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-11-01 15:07:17 +00:00
Till
444b4bbdb8
Add AS specific public room list endpoints (#2836)
Adds `PUT
/_matrix/client/v3/directory/list/appservice/{networkId}/{roomId}` and
`DELTE
/_matrix/client/v3/directory/list/appservice/{networkId}/{roomId}`
support, as well as the ability to filter `/publicRooms` on networkID
and including all networks.
2022-10-27 14:40:35 +02:00
Till
1ca3f3efb5
Fix issue with DMs shown as normal rooms (#2776)
Fixes #2121, test added in
https://github.com/matrix-org/complement/pull/494
2022-10-07 16:00:12 +02:00
Neil Alexander
c85bc3434f
Optimise QuerySharedUsers so that we can only work on local users (#2766)
Otherwise the sync API key change consumer wastes a lot of time trying
to wake up the notifiers for non-local users.
2022-10-05 12:47:53 +01:00
Till
100fa9b235
Check unique constraint errors when manually inserting migrations (#2712)
This should avoid unnecessary logging on startup if the migration (were
we need `InsertMigration`) was already executed.
This now checks for "unique constraint errors" for SQLite and Postgres
and fails the startup process if the migration couldn't be manually
inserted for some other reason.
2022-09-13 08:07:43 +02:00
Till
8196b29657
Change detection of already executed migrations (#2665)
This changes the detection of already executed migrations for the
roomserver state block and keychange refactor. It now uses schema tables
provided by the database engine to check if the column was already
removed. We now also store the migration in the migrations table.

This should stop e.g. Postgres from logging errors like `ERROR: column
"event_nid" does not exist at character 8`.
2022-09-09 13:14:52 +01:00
Neil Alexander
522bd2999f
Allow un-rejecting events on reprocessing 2022-08-24 14:03:06 +01:00
Neil Alexander
14fea600bb
Detect types.MissingStateError in CheckServerAllowedToSeeEvent (#2667)
This will hopefully stop some 500 errors on `/event` where there is no state-before known.
2022-08-23 13:57:11 +01:00
Neil Alexander
6b48ce0d75
State handling tweaks (#2652)
This tweaks how rejected events are handled in room state and also to not apply checks we can't complete to outliers.
2022-08-18 17:06:13 +01:00
Neil Alexander
59bc0a6f4e
Reprocess rejected input events (#2647)
* Reprocess outliers that were previously rejected

* Might as well do all events this way

* More useful errors

* Fix queries

* Tweak condition

* Don't wrap errors

* Report more useful error

* Flatten error on `r.Queryer.QueryStateAfterEvents`

* Some more debug logging

* Flatten error in `QueryRestrictedJoinAllowed`

* Revert "Flatten error in `QueryRestrictedJoinAllowed`"

This reverts commit 1238b4184c30e0c31ffb0f364806fa1275aba483.

* Tweak `QueryStateAfterEvents`

* Handle MissingStateError too

* Scope to room

* Clean up

* Fix the error

* Only apply rejection check to outliers
2022-08-18 10:37:47 +01:00
Till
03ddd98f5e
Fix issues with migrations not getting executed (#2628)
* Fix issues with migrations not getting executed

* Check actual postgres error

* Return error if it's not "column does not exist"
2022-08-08 10:18:57 +02:00
Neil Alexander
119cde3766
De-race types.RoomInfo (#2600) 2022-08-01 15:29:19 +01:00
Neil Alexander
05c83923e3
Optimise checking other servers allowed to see events (#2596)
* Try optimising checking if server is allowed to see event

* Fix error

* Handle case where snapshot NID is 0

* Fix query

* Update SQL

* Clean up `CheckServerAllowedToSeeEvent`

* Not supported on SQLite

* Maybe placate the unit tests

* Review comments
2022-08-01 14:11:00 +01:00
Till
081f5e7226
Update database migrations, remove goose (#2264)
* Add new db migration

* Update migrations
Remove goose

* Add possibility to test direct upgrades

* Try to fix WASM test

* Add checks for specific migrations

* Remove AddMigration
Use WithTransaction
Add Dendrite version to table

* Fix linter issues

* Update tests

* Update comments, outdent if

* Namespace migrations

* Add direct upgrade tests, skipping over one version

* Split migrations

* Update go version in CI

* Fix copy&paste mistake

* Use contexts in migrations

Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
2022-07-25 10:39:22 +01:00
Neil Alexander
c7d978274d
Try to fix HTTP 500s on /members (#2581) 2022-07-22 19:43:48 +01:00
Neil Alexander
f0c8a03649
Membership updater refactoring (#2541)
* Membership updater refactoring

* Pass in membership state

* Use membership check rather than referring to state directly

* Delete irrelevant membership states

* We don't need the leave event after all

* Tweaks

* Put a log entry in that I might stand a chance of finding

* Be less panicky

* Tweak invite handling

* Don't freak if we can't find the event NID

* Use event NID from `types.Event`

* Clean up

* Better invite handling

* Placate the almighty linter

* Blacklist a Sytest which is otherwise fine under Complement for reasons I don't understand

* Fix the sytest after all (thanks @S7evinK for the spot)
2022-07-22 14:44:04 +01:00
Till
5087b36af0
Fix QuerySharedUsers for the SyncAPI keychange consumer (#2554)
* Make more use of base.BaseDendrite

* Fix QuerySharedUsers if no UserIDs are supplied
2022-07-05 14:50:56 +02:00
Till
05607d6b87
Add roomserver tests (3/4) (#2447)
* Add Room Aliases tests

* Add Rooms table test

* Move StateKeyTuplerSorter to the types package

* Add StateBlock tests
Some optimizations

* Add State Snapshot tests
Some optimization

* Return []int64 and convert to pq.Int64Array for postgres

* Move []types.EventNID back to rows.Next()

* Update tests, rename SelectRoomIDs
2022-05-16 19:33:16 +02:00
Till
6db08b2874
Add roomserver tests (2/?) (#2445)
* Add invite table tests; move variable declarations

* Add Membership table tests

* Move variable declarations

* Add PrevEvents table tests

* Add Published table test

* Add Redactions tests
Fix bug in SQLite markRedactionValidatedSQL

* PR comments, better readability for invite tests
2022-05-10 14:41:12 +02:00
Till
f69ebc6af2
Add roomserver tests (1/?) (#2434)
* Add EventJSONTable tests

* Add eventJSON tests

* Add EventStateKeysTable tests

* Add EventTypesTable tests

* Add Events Table tests
Move variable declaration outside loops
Switch to testify/assert for tests

* Move variable declaration outside loop

* Remove random data

* Fix issue where the EventReferenceSHA256 is not set

* Add more tests

* Revert "Fix issue where the EventReferenceSHA256 is not set"

This reverts commit 8ae34c4e5f78584f0edb479f5a893556d2b95d19.

* Update GMSL

* Add tests for duplicate entries

* Test what happens if we select non-existing NIDs

* Add test for non-existing eventType

* Really update GMSL
2022-05-09 15:30:32 +02:00
Neil Alexander
4ad5f9c982
Global database connection pool (for monolith mode) (#2411)
* Allow monolith components to share a single database pool

* Don't yell about missing connection strings

* Rename field

* Setup tweaks

* Fix panic

* Improve configuration checks

* Update config

* Fix lint errors

* Update comments
2022-05-03 16:35:06 +01:00
Neil Alexander
d983d17355
Fix lint errors 2022-03-24 10:03:22 +00:00
Neil Alexander
191486438c
Assign room NIDs, event type NIDs and state key NIDs outside of database transactions (#2265)
* Assign room NIDs and state key NIDs outside of database transactions

* In roomserver storage package too

* Don't take a `txn` parameter, clean up SQLite
2022-03-17 18:24:27 +00:00
Neil Alexander
4e64c270db
Various bug fixes and tweaks around invites and membership 2022-03-17 17:05:21 +00:00
Neil Alexander
22a034dcba
Fix memory leaks with SQLite prepared statements (#2253) 2022-03-04 15:05:42 +00:00
Neil Alexander
530f05885d
Limit JoinedUsersSetInRooms to interested users (#2234)
* Limit database work in `JoinedUsersSetInRooms` to changed user IDs only

* Comments

* Fix variadic params for SQLite, update comments
2022-03-01 13:01:38 +00:00
Neil Alexander
aa6bbf484a
Return ErrRoomNoExists if insufficient state is available for a buildEvent to succeed when joining a room (#2210)
This may help cases like #2206, since it should prompt us to try a federated join again instead.
2022-02-21 16:22:29 +00:00
Neil Alexander
a02dd7721d
Reset invalid state snapshots for events during state storage refactor migration (#2209)
This should help with #2204. We can't do this for rooms, only events.
2022-02-21 15:25:54 +00:00
Neil Alexander
7dfc7c3d70
Don't re-send sent events in add_state_events (#2195)
* Only add events to `add_state_events` that haven't already been sent to the roomserver output before

* Filter on event NIDs instead, hopefully bring joy to SQLite

* UnsentFilter, review comments
2022-02-17 13:53:48 +00:00
Neil Alexander
0e26662a55
Allow events to be un-rejected (#2159)
* Allow un-rejecting an event later

* SQL

* Only un-reject, don't re-reject

* Clarify ambiguous column reference
2022-02-08 13:45:48 +00:00
Neil Alexander
eb352a5f6b
Full roomserver input transactional isolation (#2141)
* Add transaction to all database tables in roomserver, rename latest events updater to room updater, use room updater for all RS input

* Better transaction management

* Tweak order

* Handle cases where the room does not exist

* Other fixes

* More tweaks

* Fill some gaps

* Fill in the gaps

* good lord it gets worse

* Don't roll back transactions when events rejected

* Pass through errors properly

* Fix bugs

* Fix incorrect error check

* Don't panic on nil txns

* Tweaks

* Hopefully fix panics for good in SQLite this time

* Fix rollback

* Minor bug fixes with latest event updater

* Some review comments

* Revert "Some review comments"

This reverts commit 0caf8cf53e62c33f7b83c52e9df1d963871f751e.

* Fix a couple of bugs

* Clearer commit and rollback results

* Remove unnecessary prepares
2022-02-04 10:39:34 +00:00
Neil Alexander
a763cbb0e1
Roomserver/federation input refactor (#2104)
* Put federation client functions into their own file

* Look for missing auth events in RS input

* Remove retrieveMissingAuthEvents from federation API

* Logging

* Sorta transplanted the code over

* Use event origin failing all else

* Don't get stuck on mutexes:

* Add verifier

* Don't mark state events with zero snapshot NID as not existing

* Check missing state if not an outlier before storing the event

* Reject instead of soft-fail, don't copy roominfo so much

* Use synchronous contexts, limit time to fetch missing events

* Clean up some commented out bits

* Simplify `/send` endpoint significantly

* Submit async

* Report errors on sending to RS input

* Set max payload in NATS to 16MB

* Tweak metrics

* Add `workerForRoom` for tidiness

* Try skipping unmarshalling errors for RespMissingEvents

* Track missing prev events separately to avoid calculating state when not possible

* Tweak logic around checking missing state

* Care about state when checking missing prev events

* Don't check missing state for create events

* Try that again

* Handle create events better

* Send create room events as new

* Use given event kind when sending auth/state events

* Revert "Use given event kind when sending auth/state events"

This reverts commit 089d64d271.

* Only search for missing prev events or state for new events

* Tweaks

* We only have missing prev if we don't supply state

* Room version tweaks

* Allow async inputs again

* Apply backpressure to consumers/synchronous requests to hopefully stop things being overwhelmed

* Set timeouts on roomserver input tasks (need to decide what timeout makes sense)

* Use work queue policy, deliver all on restart

* Reduce chance of duplicates being sent by NATS

* Limit the number of servers we attempt to reduce backpressure

* Some review comment fixes

* Tidy up a couple things

* Don't limit servers, randomise order using map

* Some context refactoring

* Update gmsl

* Don't resend create events

* Set stateIDs length correctly or else the roomserver thinks there are missing events when there aren't

* Exclude our own servername

* Try backing off servers

* Make excluding self behaviour optional

* Exclude self from g_m_e

* Update sytest-whitelist

* Update consumers for the roomserver output stream

* Remember to send outliers for state returned from /gme

* Make full HTTP tests less upsetti

* Remove 'If a device list update goes missing, the server resyncs on the next one' from the sytest blacklist

* Remove debugging test

* Fix blacklist again, remove unnecessary duplicate context

* Clearer contexts, don't use background in case there's something happening there

* Don't queue up events more than once in memory

* Correctly identify create events when checking for state

* Fill in gaps again in /gme code

* Remove `AuthEventIDs` from `InputRoomEvent`

* Remove stray field

Co-authored-by: Kegan Dougal <kegan@matrix.org>
2022-01-27 14:29:14 +00:00