`If a device list update goes missing, the server resyncs on the next
one` was failing because a previous test would receive a `waitTime` of
1h, resulting in the test timing out.
This now tries to handle the returned errors differently, e.g. by using
the default `waitTime` of 2s. Also doesn't try further users in the
list, if one of the errors would cause a longer `waitTime`.
This makes the following changes:
* The various `Defaults` functions are now responsible for setting sane defaults if `generate` is specified, rather than hiding them in `generate-config`
* Some configuration options have been marked as `omitempty` so that they don't appear in generated configs unnecessarily (monolith-specific vs. polylith-specific options)
* A new option `-polylith` has been added to `generate-config` to create a config that makes sense for polylith deployments (i.e. including the internal/external API listeners and per-component database sections)
* A new option `-normalise` has been added to `generate-config` to take an existing file and add any missing options and/or defaults
In some conditions (fast CPUs), this test would race the clock for EDU expiration when all we want to make sure of is that the expired EDUs are properly deleted. Given this, we set the expiry time to 0 so the specified EDUs are always deleted when DeleteExpiredEDUs is called.
Fixes#2650.
Signed-off-by: Winter <winter@winter.cafe>
* Generic-based internal HTTP API (tested out on a few endpoints in the federation API)
* Add `PerformInvite`
* More tweaks
* Fix metric name
* Fix LookupStateIDs
* Lots of changes to clients
* Some serverside stuff
* Some error handling
* Use paths as metric names
* Revert "Use paths as metric names"
This reverts commit a9323a6a343f5ce6461a2e5bd570fe06465f1b15.
* Namespace metric names
* Remove duplicate entry
* Remove another duplicate entry
* Tweak error handling
* Some more tweaks
* Update error behaviour
* Some more error tweaking
* Fix API path for `PerformDeleteKeys`
* Fix another path
* Tweak federation client proxying
* Fix another path
* Don't return typed nils
* Some more tweaks, not that it makes any difference
* Tweak federation client proxying
* Maybe fix the key backup test
* Add housekeeping function to delete old/expired EDUs
* Add migrations
* Evict EDUs from cache
* Fix queries
* Fix upgrade
* Use map[string]time.Duration to specify different expiry times
* Fix copy & paste mistake
* Set expires_at to tomorrow
* Don't allow NULL
* Add comment
* Add tests
* Use new testrig package
* Fix migrations
* Never expire m.direct_to_device
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
Co-authored-by: kegsay <kegan@matrix.org>
* Add race testing to tests, and fix a few small race conditions in the tests
* Enable run-sytest on MacOS
* Remove deadlock detecting mutex, per code review feedback
* Remove autoformatting related changes and a closure that is not needed
* Adjust to importing nats client as 'natsclient'
Signed-off-by: Brian Meek <brian@hntlabs.com>
* Clarify the use of gooseMutex to proect goose internal state
Signed-off-by: Brian Meek <brian@hntlabs.com>
* Remove no longer needed mutex for guarding goose
Signed-off-by: Brian Meek <brian@hntlabs.com>
* Add new db migration
* Update migrations
Remove goose
* Add possibility to test direct upgrades
* Try to fix WASM test
* Add checks for specific migrations
* Remove AddMigration
Use WithTransaction
Add Dendrite version to table
* Fix linter issues
* Update tests
* Update comments, outdent if
* Namespace migrations
* Add direct upgrade tests, skipping over one version
* Split migrations
* Update go version in CI
* Fix copy&paste mistake
* Use contexts in migrations
Co-authored-by: kegsay <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Membership updater refactoring
* Pass in membership state
* Use membership check rather than referring to state directly
* Delete irrelevant membership states
* We don't need the leave event after all
* Tweaks
* Put a log entry in that I might stand a chance of finding
* Be less panicky
* Tweak invite handling
* Don't freak if we can't find the event NID
* Use event NID from `types.Event`
* Clean up
* Better invite handling
* Placate the almighty linter
* Blacklist a Sytest which is otherwise fine under Complement for reasons I don't understand
* Fix the sytest after all (thanks @S7evinK for the spot)
* Try Ristretto cache
* Tweak
* It's beautiful
* Update GMSL
* More strict keyable interface
* Fix that some more
* Make less panicky
* Don't enforce mutability checks for now
* Determine mutability using deep equality
* Tweaks
* Namespace keys
* Make federation caches mutable
* Update cost estimation, add metric
* Update GMSL
* Estimate cost for metrics better
* Reduce counters a bit
* Try caching events
* Some guards
* Try again
* Try this
* Use separate caches for hopefully better hash distribution
* Fix bug with admitting events into cache
* Try to fix bugs
* Check nil
* Try that again
* Preserve order jeezo this is messy
* thanks VS Code for doing exactly the wrong thing
* Try this again
* Be more specific
* aaaaargh
* One more time
* That might be better
* Stronger sorting
* Cache expiries, async publishing of EDUs
* Put it back
* Use a shared cache again
* Cost estimation fixes
* Update ristretto
* Reduce counters a bit
* Clean up a bit
* Update GMSL
* 1GB
* Configurable cache sizees
* Tweaks
* Add `config.DataUnit` for specifying friendly cache sizes
* Various tweaks
* Update GMSL
* Add back some lazy loading caching
* Include key in cost
* Include key in cost
* Tweak max age handling, config key name
* Only register prometheus metrics if requested
* Review comments @S7evinK
* Don't return errors when creating caches (it is better just to crash since otherwise we'll `nil`-pointer exception everywhere)
* Review comments
* Update sample configs
* Update GHA Workflow
* Update Complement images to Go 1.18
* Remove the cache test from the federation API as we no longer guarantee immediate cache admission
* Don't check the caches in the renewal test
* Possibly fix the upgrade tests
* Update to matrix-org/gomatrixserverlib#322
* Update documentation to refer to Go 1.18
This should avoid coercions between signed and unsigned ints which might fix problems like `sql: converting argument $5 type: uint64 values with high bit set are not supported`.
* Add `QueryRestrictedJoinAllowed`
* Add `Resident` flag to `QueryRestrictedJoinAllowedResponse`
* Check restricted joins on federation API
* Return `Restricted` to determine if the room was restricted or not
* Populate `AuthorisedVia` properly
* Sign the event on `/send_join`, return it in the `/send_join` response in the `"event"` key
* Kick back joins with invalid authorising user IDs, use event from `"event"` key if returned in `RespSendJoin`
* Use invite helper in `QueryRestrictedJoinAllowed`
* Only use users with the power to invite, change error bubbling a bit
* Placate the almighty linter
One day I will nuke `gocyclo` from orbit and everything in the world will be much better for it.
* Review comments
* Fix flakey sytest 'Local device key changes get to remote servers'
* Debug logs
* Remove internal/test and use /test only
Remove a lot of ancient code too.
* Use FederationRoomserverAPI in more places
* Use more interfaces in federationapi; begin adding regression test
* Linting
* Add regression test
* Unbreak tests
* ALL THE LOGS
* Fix a race condition which could cause events to not be sent to servers
If a new room event which rewrites state arrives, we remove all joined hosts
then re-calculate them. This wasn't done in a transaction so for a brief period
we would have no joined hosts. During this interim, key change events which arrive
would not be sent to destination servers. This would sporadically fail on sytest.
* Unbreak new tests
* Linting
* Add very basic syncapi tests
* Add a way to inject jetstream messages
* implement add_state_ids
* bugfixes
* Unbreak tests
* Remove now un-needed API call
* Linting
* Don't ask roomserver for events we already have in federation API
* Check number of events returned is as expected
* Preallocate array
* Improve shape a bit
* tidy up interfaces
* remove unused GetCreatorIDForAlias
* Add RoomserverUserAPI interface
* Define more interfaces
* Use AppServiceInternalAPI for consistent naming
* clean up federationapi constructor a bit
* Fix monolith in -http mode
* Specify interfaces used by appservice, do half of clientapi
* convert more deps of clientapi to finer-grained interfaces
* Convert mediaapi and rest of clientapi
* Somehow this got missed
* Simplify federation API `AddPublicRoutes`
* Simplify client API `AddPublicRoutes`
* Simplify media API `AddPublicRoutes`
* Simplify sync API `AddPublicRoutes`
* Simplify `AddAllPublicRoutes`
* Fix retrieving cross-signing signatures in `/user/devices/{userId}`
We need to know the target device IDs in order to get the signatures and we weren't populating those.
* Fix up signature retrieval
* Fix SQLite
* Always include the target's own signatures as well as the requesting user
* Add response size and requests total to internal handler
* Move MustRegister calls to New* funcs
* Move MustRegister back to init
* Init at some place, minimize changes
* Move receipt sending to own JetStream producer
* Move SendToDevice to producer
* Remove most parts of the EDU server
* Fix SendToDevice & copyrights
* Move structs, cleanup EDU Server traces
* Use HeadersOnly subscription
* Missing file
* Fix linter issues
* Move consumers to own files
* Rename durable consumer; Consumer cleanup
* Docs/config cleanup
* Roomserver input refactoring — again!
* Ensure the actor runs again
* Preserve consumer after unsubscribe
* Another sprinkling of magic
* Rename `TopicFor` to `Prefixed`
* Recreate the stream if the config is bad
* Check streams too
* Prefix subjects, preserve inboxes
* Recreate if subjects wrong
* Remove stream subject
* Reconstruct properly
* Fix mutex unlock
* Comments
* Fix tests
* Don't drop events
* Review comments
* Separate `queueInputRoomEvents` function
* Re-jig control flow a bit
* Don't send `adds_state_events` in roomserver output events anymore
* Set `omitempty` on some output fields that aren't always set
* Add `AddsState` helper function
* No-op if no added state event IDs
* Revert "No-op if no added state event IDs"
This reverts commit 71a0ef3df10e0d94234d916246c30b0a4e82b26e.
* Revert "Add `AddsState` helper function"
This reverts commit c9fbe45475eb12ae44d2a8da7c0fc3a002ad9819.
* Initial cut at fixing up MSC2946 to work with latest spec
* bugfix: send response back correctly
* Initial working version of MSC2946
* msc2946: handle suggested_only; remove custom database
As the MSC doesn't require reverse lookups, we can just pull
the room state and inspect via the roomserver database. To
handle this, expand QueryCurrentState to support wildcards.
Use all this and handle `?suggested_only`.
* Sort child rooms
* msc2946: Make TestClientSpacesSummary pass
* msc2946: allow invited rooms to be spidered
* msc2946: support basic federation requests
* fix up go mod
* Ensure the input API only uses a single transaction
* Remove more of the dead query API call
* Tidy up
* Fix tests hopefully
* Don't do unnecessary work for rooms that don't exist
* Improve error, fix another case where transaction wasn't used properly
* Add a unit test for checking single transaction on RS input API
* Fix logic oops when deciding whether to use a transaction in storeEvent
* Use new event json types in gmsl
* Fix EventJSON to actually unmarshal events
* Update GMSL
* Bump GMSL and improve error messages
* Send back the correct RespState
* Update GMSL
* Remove unneeded logging
* Add MasterKey & SelfSigningKey to update
Avoid panic if signatures are not present
* Add passing test
* Revert "Add MasterKey & SelfSigningKey to update"
This reverts commit 2c81b34884be8b5b875a33420c0f985b578d3fb8.
* Send MasterKey & SelfSigningKey with update
* Debugging
* Remove delete() so we also query signingkeys
* Remove dependency on saramajetstream & sarama
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Remove internal.ContinualConsumer from federationapi
* Remove internal.ContinualConsumer from syncapi
* Remove internal.ContinualConsumer from keyserver
* Move to new Prepare function
* Remove saramajetstream & sarama dependency
* Delete unneeded file
* Remove duplicate import
* Log error instead of silently irgnoring it
* Move `OffsetNewest` and `OffsetOldest` into keyserver types, change them to be more sane values
* Fix comments
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Put federation client functions into their own file
* Look for missing auth events in RS input
* Remove retrieveMissingAuthEvents from federation API
* Logging
* Sorta transplanted the code over
* Use event origin failing all else
* Don't get stuck on mutexes:
* Add verifier
* Don't mark state events with zero snapshot NID as not existing
* Check missing state if not an outlier before storing the event
* Reject instead of soft-fail, don't copy roominfo so much
* Use synchronous contexts, limit time to fetch missing events
* Clean up some commented out bits
* Simplify `/send` endpoint significantly
* Submit async
* Report errors on sending to RS input
* Set max payload in NATS to 16MB
* Tweak metrics
* Add `workerForRoom` for tidiness
* Try skipping unmarshalling errors for RespMissingEvents
* Track missing prev events separately to avoid calculating state when not possible
* Tweak logic around checking missing state
* Care about state when checking missing prev events
* Don't check missing state for create events
* Try that again
* Handle create events better
* Send create room events as new
* Use given event kind when sending auth/state events
* Revert "Use given event kind when sending auth/state events"
This reverts commit 089d64d271.
* Only search for missing prev events or state for new events
* Tweaks
* We only have missing prev if we don't supply state
* Room version tweaks
* Allow async inputs again
* Apply backpressure to consumers/synchronous requests to hopefully stop things being overwhelmed
* Set timeouts on roomserver input tasks (need to decide what timeout makes sense)
* Use work queue policy, deliver all on restart
* Reduce chance of duplicates being sent by NATS
* Limit the number of servers we attempt to reduce backpressure
* Some review comment fixes
* Tidy up a couple things
* Don't limit servers, randomise order using map
* Some context refactoring
* Update gmsl
* Don't resend create events
* Set stateIDs length correctly or else the roomserver thinks there are missing events when there aren't
* Exclude our own servername
* Try backing off servers
* Make excluding self behaviour optional
* Exclude self from g_m_e
* Update sytest-whitelist
* Update consumers for the roomserver output stream
* Remember to send outliers for state returned from /gme
* Make full HTTP tests less upsetti
* Remove 'If a device list update goes missing, the server resyncs on the next one' from the sytest blacklist
* Remove debugging test
* Fix blacklist again, remove unnecessary duplicate context
* Clearer contexts, don't use background in case there's something happening there
* Don't queue up events more than once in memory
* Correctly identify create events when checking for state
* Fill in gaps again in /gme code
* Remove `AuthEventIDs` from `InputRoomEvent`
* Remove stray field
Co-authored-by: Kegan Dougal <kegan@matrix.org>
* Use named NATS durable consumers
* Build fixes
* Remove dupe call to SetFederationAPI
* Use namespaced consumer name
* Fix namespacing
* Fix unit tests hopefully
* Add NATS JetStream support
Update shopify/sarama
* Fix addresses
* Don't change Addresses in Defaults
* Update saramajetstream
* Add missing error check
Keep typing events for at least one minute
* Use all configured NATS addresses
* Update saramajetstream
* Try setting up with NATS
* Make sure NATS uses own persistent directory (TODO: make this configurable)
* Update go.mod/go.sum
* Jetstream package
* Various other refactoring
* Build fixes
* Config tweaks, make random jetstream storage path for CI
* Disable interest policies
* Try to sane default on jetstream base path
* Try to use in-memory for CI
* Restore storage/retention
* Update nats.go dependency
* Adapt changes to config
* Remove unneeded TopicFor
* Dep update
* Revert "Remove unneeded TopicFor"
This reverts commit f5a4e4a339.
* Revert changes made to streams
* Fix build problems
* Update nats-server
* Update go.mod/go.sum
* Roomserver input API queuing using NATS
* Fix topic naming
* Prometheus metrics
* More refactoring to remove saramajetstream
* Add missing topic
* Don't try to populate map that doesn't exist
* Roomserver output topic
* Update go.mod/go.sum
* Message acknowledgements
* Ack tweaks
* Try to resume transaction re-sends
* Try to resume transaction re-sends
* Update to matrix-org/gomatrixserverlib@91dadfb
* Remove internal.PartitionStorer from components that don't consume keychanges
* Try to reduce re-allocations a bit in resolveConflictsV2
* Tweak delivery options on RS input
* Publish send-to-device messages into correct JetStream subject
* Async and sync roomserver input
* Update dendrite-config.yaml
* Remove roomserver tests for now (they need rewriting)
* Remove roomserver test again (was merged back in)
* Update documentation
* Docker updates
* More Docker updates
* Update Docker readme again
* Fix lint issues
* Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)
* Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that
* Go 1.16 instead of Go 1.13 for upgrade tests and Complement
* Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"
This reverts commit 368675283f.
* Don't report any errors on `/send` to see what fun that creates
* Fix panics on closed channel sends
* Enforce state key matches sender
* Do the same for leave
* Various tweaks to make tests happier
Squashed commit of the following:
commit 13f9028e7a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 15:47:14 2022 +0000
Do the same for leave
commit e6be7f05c3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 15:33:42 2022 +0000
Enforce state key matches sender
commit 85ede6d64b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 14:07:04 2022 +0000
Fix panics on closed channel sends
commit 9755494a98
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 13:38:22 2022 +0000
Don't report any errors on `/send` to see what fun that creates
commit 3bb4f87b5d
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 13:00:26 2022 +0000
Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"
This reverts commit 368675283f.
commit fe2673ed7b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 12:09:34 2022 +0000
Go 1.16 instead of Go 1.13 for upgrade tests and Complement
commit 368675283f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 11:51:45 2022 +0000
Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that
commit b028dfc085
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Tue Jan 4 10:29:08 2022 +0000
Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)
* Merge in NATS Server v2.6.6 and nats.go v1.13 into the in-process connection fork
* Add `jetstream.WithJetStreamMessage` to make ack/nak-ing less messy, use process context in consumers
* Fix consumer component name in federation API
* Add comment explaining where streams are defined
* Tweaks to roomserver input with comments
* Finish that sentence that I apparently forgot to finish in INSTALL.md
* Bump version number of config to 2
* Add comments around asynchronous sends to roomserver in processEventWithMissingState
* More useful error message when the config version does not match
* Set version in generate-config
* Fix version in config.Defaults
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Initial federation sender -> federation API refactoring
* Move base into own package, avoids import cycle
* Fix build errors
* Fix tests
* Add signing key server tables
* Try to fold signing key server into federation API
* Fix dendritejs builds
* Update embedded interfaces
* Fix panic, fix lint error
* Update configs, docker
* Rename some things
* Reuse same keyring on the implementing side
* Fix federation tests, `NewBaseDendrite` can accept freeform options
* Fix build
* Update create_db, configs
* Name tables back
* Don't rename federationsender consumer for now
* Add more logs
To help debug the migration issue in #1924 along with manual data-loss-inducing fixes.
Also log the origin server on processed txns to help debug buggy server origins.
* Fix query
* Handle other signatures
* Decorate key ID properly
* Match by key IDs
* Tweaks
* Fixes
* Fix /user/keys/query bug, review comments, update sytest-whitelist
* Various wtweaks
* Fix wiring for keyserver in API mode
* Additional fixes
* Enable unstable feature again
* Try to verify when a device signs a key
* Try to verify when a key signs a device
* It's the self-signing key, not the master key
* Fix error
* Try to verify master key uploads
* Actually we can't guarantee we can do that so nevermind
* Add signatures into /devices/list request
* Fix nil pointer
* Reprioritise map creation
* Don't skip devices that don't have signatures
* Add some debug logging
* Fix logic error in QuerySignatures
* Fix bugs
* Expose master and self-signing keys on /devices/list hopefully
* maps are tedious
* Expose signatures via /keys/query
* Upload signatures when uploading keys
* Fixes
* Disable the feature again
* Check for missing state keys to avoid panicking
* Check for not allowed errors on send_leave
* More logging
* handle send_join errors too
* Additional send_join checks
* s/join/gmsl.json/
* Add notary server tables for postgres
* Add sqlite tables
* fedsender: GetServerKeys -> QueryServerKeys
As it now checks a cache and can return multiple responses
* Add more optimised code path for checking if we're in a room
* Fix database queries
* Fix federation API test
* Fix logging
* Review comments
* Make separate API call for room membership
* Ensure worker has work before starting goroutine
* Revert "Remove processEventWithMissingStateMutex"
This reverts commit 7f02eab47d.
* Use request context when processing transactions
* Keep goroutine count down by not starting work for things where the caller gave up
* Remove mutex, start workers at correct time
* Try to process rooms concurrently in FS /send
* Clean up
* Use request context so that dead things don't linger for so long
* Remove mutex
* Free up pdus slice so only references remaining are in channel
* Revert "Remove mutex"
This reverts commit 8558075e8c.
* Process EDUs in parallel
* Try refactoring /send concurrency
* Fix waitgroup
* Release on waitgroup
* Respond to transaction
* Reduce CPU usage, fix unit tests
* Tweaks
* Move into one file
* More aggressive event caching
* Deduplicate /state results
* Deduplicate more
* Ensure we use the correct list of events when excluding repeated state
* Fixes
* Ensure we track all events we already knew about properly
Squashed commit of the following:
commit 7fad77c10e
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 15:06:52 2021 +0100
Fix processEventWithMissingStateMutexes
commit 138cddcac7
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 13:59:44 2021 +0100
Use internal.MutexByRoom
commit 6e6f026cfa
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 13:50:18 2021 +0100
Try to slow things down per room
commit b97d406dff
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 13:41:27 2021 +0100
Try to slow things down
commit 8866120ebf
Merge: 9f2de8a24a37b19a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 13:40:33 2021 +0100
Merge branch 'neilalexander/rsinputfifo' into neilalexander/rsinputfifo2
commit 4a37b19a8f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 13:34:54 2021 +0100
Add comments
commit f9ab3f4b81
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 13:31:21 2021 +0100
Tweaks
commit 9f2de8a29c
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 13:15:59 2021 +0100
Ask origin only for missing things for now
commit 8fd878c75a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 11:18:11 2021 +0100
Make sure someone wakes up
commit b63f699f1b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date: Mon Jun 28 11:12:58 2021 +0100
Use a FIFO queue instead of a channel to reduce backpressure
* Implement OpenID module (#599)
- Unrelated: change Riot references to Element in client API routing
Signed-off-by: Bruce MacDonald <contact@bruce-macdonald.com>
* OpenID module tweaks (#599)
- specify expiry is ms rather than vague ts
- add OpenID token lifetime to configuration
- use Go naming conventions for the path params
- store plaintext token rather than hash
- remove openid table sqllite mutex
* Add default OpenID token lifetime (#599)
* Update dendrite-config.yaml
Co-authored-by: Kegsay <kegsay@gmail.com>
Co-authored-by: Kegsay <kegan@matrix.org>
* Add a per-room mutex to federationapi when processing transactions
This has numerous benefits:
- Prevents us doing lots of state resolutions in busy rooms. Previously, room forks would always result
in a state resolution being performed immediately, without checking if we were already doing this in
a different transaction. Now they will queue up, resulting in fewer calls to `/state_ids`, `/g_m_e`, etc.
- Prevents memory usage from growing too large as a result and potentially OOMing.
And costs:
- High traffic rooms will be slightly slower due to head-of-line blocking from other servers,
though this has always been an issue as roomserver has a per-room mutex already.
* Fix unit tests
* Correct mutex lock ordering
* Check membership of room
* Use QueryStateAfterEventsResponse
* Fix complexity
* Add field ShouldHitAppservice to GetRoomIDForAlias
* Hit appservice when trying to join a non-existent alias
* remove unused
* Changes that I made a long time ago
* Rename to appserviceJoinedAtEvent
* Check membership in GetMemberships
* Update QueryMembershipsForRoom
* Tweaks in client API
* Update appserviceJoinedAtEvent
* Comments
* Try QueryMembershipForUser instead
* Undo some changes to client API that shouldn't be needed
* More /event tweaks
* Refactor /event bit
* Go back to QueryMembershipsForRoom because appservices are hard
* Fix bugs in onMessage
* Add comments
* More logical naming, clean up a bit
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Look up servers less often, don't hit API for missing auth events unless there are actually missing auth events
* Remove ResolveConflictsAdhoc (since it is already in GMSL), other tweaks
* Update gomatrixserverlib to matrix-org/gomatrixserverlib#254
* Fix resolve-state
* Initialise t.servers on first use
* a very very WIP first cut of peeking via MSC2753.
doesn't yet compile or work.
needs to actually add the peeking block into the sync response.
checking in now before it gets any bigger, and to gather any initial feedback on the vague shape of it.
* make PeekingDeviceSet private
* add server_name param
* blind stab at adding a `peek` section to /sync
* make it build
* make it launch
* add peeking to getResponseWithPDUsForCompleteSync
* cancel any peeks when we join a room
* spell out how to runoutside of docker if you want speed
* fix SQL
* remove unnecessary txn for SelectPeeks
* fix s/join/peek/ cargocult fail
* HACK: Track goroutine IDs to determine when we write by the wrong thread
To use: set `DENDRITE_TRACE_SQL=1` then grep for `unsafe`
* Track partition offsets and only log unsafe for non-selects
* Put redactions in the writer goroutine
* Update filters on writer goroutine
* wrap peek storage in goid hack
* use exclusive writer, and MarkPeeksAsOld more efficiently
* don't log ascii in binary at sql trace...
* strip out empty roomd deltas
* re-add txn to SelectPeeks
* re-add accidentally deleted field
* reject peeks for non-worldreadable rooms
* move perform_peek
* fix package
* correctly refactor perform_peek
* WIP of implementing MSC2444
* typo
* Revert "Merge branch 'kegan/HACK-goid-sqlite-db-is-locked' into matthew/peeking"
This reverts commit 3cebd8dbfb, reversing
changes made to ed4b3a58a7.
* (almost) make it build
* clean up bad merge
* support SendEventWithState with optional event
* fix build & lint
* fix build & lint
* reinstate federated peeks in the roomserver (doh)
* fix sql thinko
* todo for authenticating state returned by /peek
* support returning current state from QueryStateAndAuthChain
* handle SS /peek
* reimplement SS /peek to prod the RS to tell the FS about the peek
* rename RemotePeeks as OutboundPeeks
* rename remote_peeks_table as outbound_peeks_table
* add perform_handle_remote_peek.go
* flesh out federation doc
* add inbound peeks table and hook it up
* rename ambiguous RemotePeek as InboundPeek
* rename FSAPI's PerformPeek as PerformOutboundPeek
* setup inbound peeks db correctly
* fix api.SendEventWithState with no event
* track latestevent on /peek
* go fmt
* document the peek send stream race better
* fix SendEventWithRewrite not to bail if handed a non-state event
* add fixme
* switch SS /peek to use SendEventWithRewrite
* fix comment
* use reverse topo ordering to find latest extrem
* support postgres for federated peeking
* go fmt
* back out bogus go.mod change
* Fix performOutboundPeekUsingServer
* Fix getAuthChain -> GetAuthChain
* Fix build issues
* Fix build again
* Fix getAuthChain -> GetAuthChain
* Don't repeat outbound peeks for the same room ID to the same servers
* Fix lint
* Don't omitempty to appease sytest
Co-authored-by: Kegan Dougal <kegan@matrix.org>
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Update GMSL
* Add MSC2836EventRelationships to fedsender
* Call MSC2836EventRelationships in reqCtx
* auth remote servers
* Extract room ID and servers from previous events; refactor a bit
* initial cut of federated threading
* Use the right client/fed struct in the response
* Add QueryAuthChain for use with MSC2836
* Add auth chain to federated response
* Fix pointers
* under CI: more logging and enable mscs, nil fix
* Handle direction: up
* Actually send message events to the roomserver..
* Add children and children_hash to unsigned, with tests
* Add logic for exploring threads and tracking children; missing storage functions
* Implement storage functions for children
* Add fetchUnknownEvent
* Do federated hits for include_children if we have unexplored children
* Use /ev_rel rather than /event as the former includes child metadata
* Remove cross-room threading impl
* Enable MSC2836 in the p2p demo
* Namespace mscs db
* Enable msc2836 for ygg
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* fix conversion from int to string yields a string of one rune, not a string of digits
* Add receipts table to syncapi
* Use StreamingToken as the since value
* Add required method to testEDUProducer
* Make receipt json creation "easier" to read
* Add receipts api to the eduserver
* Add receipts endpoint
* Add eduserver kafka consumer
* Add missing kafka config
* Add passing tests to whitelist
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Fix copy & paste error
* Fix column count error
* Make outbound federation receipts pass
* Make "Inbound federation rejects receipts from wrong remote" pass
* Don't use errors package
* - Add TODO for batching requests
- Rename variable
* Return a better error message
* - Use OutputReceiptEvent instead of InputReceiptEvent as result
- Don't use the errors package for errors
- Defer CloseAndLogIfError to close rows
- Fix Copyright
* Better creation/usage of JoinResponse
* Query all joined rooms instead of just one
* Update gomatrixserverlib
* Add sqlite3 migration
* Add postgres migration
* Ensure required sequence exists before running migrations
* Clarification on comment
* - Fix a bug when creating client receipts
- Use concrete types instead of interface{}
* Remove dead code
Use key for timestamp
* Fix postgres query...
* Remove single purpose struct
* Use key/value directly
* Only apply receipts on initial sync or if edu positions differ,
otherwise we'll be sending the same receipts over and over again.
* Actually update the id, so it is correctly send in syncs
* Set receipt on request to /read_markers
* Fix issue with receipts getting overwritten
* Use fmt.Errorf instead of pkg/errors
* Revert "Add postgres migration"
This reverts commit 722fe5a04628882b787d096942459961db159b06.
* Revert "Add sqlite3 migration"
This reverts commit d113b03f6495a4b8f8bcf158a3d00b510b4240cc.
* Fix selectRoomReceipts query
* Make golangci-lint happy
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add basic storage methods
* Add internal api handler
* Add check for forgotten room
* Add /rooms/{roomID}/forget endpoint
* Add missing rsAPI method
* Remove unused parameters
* Add passing tests
Signed-off-by: Till Faelligen <tfaelligen@gmail.com>
* Add missing file
* Add postgres migration
* Add sqlite migration
* Use Forgetter to forget room
* Remove empty line
* Update HTTP status codes
It looks like the spec calls for these to be 400, rather than 403: https://matrix.org/docs/spec/client_server/r0.6.1#post-matrix-client-r0-rooms-roomid-forget
Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
* Add KindOld
* Don't process latest events/memberships for old events
* Allow federationsender to ignore duplicate key entries when LatestEventIDs is duplicated by RS output events
* Signal to downstream components if an event has become a forward extremity
* Don't exclude from sync
* Soft-fail checks on KindNew
* Don't run the latest events updater at all for KindOld
* Don't make federation sender change after all
* Kind in federation sender join
* Don't send isForwardExtremity
* Fix syncapi
* Update comments
* Fix SendEventWithState
* Update sytest-whitelist
* Generate old output events
* Sync API consumes old room events
* Update comments
* Capture errors
* Don't request only state key tuples needed for auth (we end up discarding room state this way)
* QueryStateAfterEvent returns all state when no tuples supplied
* Resolve state
* Comments
* Recursively fetch auth events if needed
* Fix processEvent call
* Ask more servers in lookupEvent
* Don't panic!
* Panic at the Disco
* Find servers more aggressively
* Add getServers
* Fix number of servers to 5, don't bail making RespState if auth events missing
* Fix panic
* Ignore missing state events too
* Report number of servers correctly
* Don't reuse request context for /send_join
* Update federation API tests
* Don't recurse processEvents
* Implement getEvents differently
* Adjust backfill to send backward extremity with state before other backfilled events, include prev_events with no state amongst missing events
* Not finished refactor
* Fix test
* Remove isInboundTxn
* Remove debug logging
* Try to ask other servers in the room for missing events if the origin won't provide them
* Logging
* More logging
* Implement QueryMissingAuthPrevEvents
* Try to get missing auth events badly
* Use processEvent
* Logging
* Update QueryMissingAuthPrevEvents
* Try to find missing auth events
* Patchy fix for test
* Logging tweaks
* Send auth events as outliers
* Update check in QueryMissingAuthPrevEvents
* Error responses
* More return codes
* Don't return error on reject/soft-fail since it was ultimately handled
* More tweaks
* More error tweaks
* Sanity-check room version on RS event input
* Update gomatrixserverlib
* Reject make_join when no room members are left
* Revert some changes from wrong branch
* Distinguish between room not existing and room being abandoned on this server
* nolint