Add NATS JetStream support (#1866)

* Add NATS JetStream support
Update shopify/sarama

* Fix addresses

* Don't change Addresses in Defaults

* Update saramajetstream

* Add missing error check

Keep typing events for at least one minute

* Use all configured NATS addresses

* Update saramajetstream

* Try setting up with NATS

* Make sure NATS uses own persistent directory (TODO: make this configurable)

* Update go.mod/go.sum

* Jetstream package

* Various other refactoring

* Build fixes

* Config tweaks, make random jetstream storage path for CI

* Disable interest policies

* Try to sane default on jetstream base path

* Try to use in-memory for CI

* Restore storage/retention

* Update nats.go dependency

* Adapt changes to config

* Remove unneeded TopicFor

* Dep update

* Revert "Remove unneeded TopicFor"

This reverts commit f5a4e4a339.

* Revert changes made to streams

* Fix build problems

* Update nats-server

* Update go.mod/go.sum

* Roomserver input API queuing using NATS

* Fix topic naming

* Prometheus metrics

* More refactoring to remove saramajetstream

* Add missing topic

* Don't try to populate map that doesn't exist

* Roomserver output topic

* Update go.mod/go.sum

* Message acknowledgements

* Ack tweaks

* Try to resume transaction re-sends

* Try to resume transaction re-sends

* Update to matrix-org/gomatrixserverlib@91dadfb

* Remove internal.PartitionStorer from components that don't consume keychanges

* Try to reduce re-allocations a bit in resolveConflictsV2

* Tweak delivery options on RS input

* Publish send-to-device messages into correct JetStream subject

* Async and sync roomserver input

* Update dendrite-config.yaml

* Remove roomserver tests for now (they need rewriting)

* Remove roomserver test again (was merged back in)

* Update documentation

* Docker updates

* More Docker updates

* Update Docker readme again

* Fix lint issues

* Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

* Go 1.16 instead of Go 1.13 for upgrade tests and Complement

* Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

This reverts commit 368675283f.

* Don't report any errors on `/send` to see what fun that creates

* Fix panics on closed channel sends

* Enforce state key matches sender

* Do the same for leave

* Various tweaks to make tests happier

Squashed commit of the following:

commit 13f9028e7a
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:47:14 2022 +0000

    Do the same for leave

commit e6be7f05c3
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 15:33:42 2022 +0000

    Enforce state key matches sender

commit 85ede6d64b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 14:07:04 2022 +0000

    Fix panics on closed channel sends

commit 9755494a98
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:38:22 2022 +0000

    Don't report any errors on `/send` to see what fun that creates

commit 3bb4f87b5d
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 13:00:26 2022 +0000

    Revert "Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that"

    This reverts commit 368675283f.

commit fe2673ed7b
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 12:09:34 2022 +0000

    Go 1.16 instead of Go 1.13 for upgrade tests and Complement

commit 368675283f
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 11:51:45 2022 +0000

    Don't report event rejection errors via `/send`, since apparently this is upsetting tests that don't expect that

commit b028dfc085
Author: Neil Alexander <neilalexander@users.noreply.github.com>
Date:   Tue Jan 4 10:29:08 2022 +0000

    Send final event in `processEvent` synchronously (since this might stop Sytest from being so upset)

* Merge in NATS Server v2.6.6 and nats.go v1.13 into the in-process connection fork

* Add `jetstream.WithJetStreamMessage` to make ack/nak-ing less messy, use process context in consumers

* Fix consumer component name in  federation API

* Add comment explaining where streams are defined

* Tweaks to roomserver input with comments

* Finish that sentence that I apparently forgot to finish in INSTALL.md

* Bump version number of config to 2

* Add comments around asynchronous sends to roomserver in processEventWithMissingState

* More useful error message when the config version does not match

* Set version in generate-config

* Fix version in config.Defaults

Co-authored-by: Neil Alexander <neilalexander@users.noreply.github.com>
This commit is contained in:
S7evinK 2022-01-05 18:44:49 +01:00 committed by GitHub
parent a47b12dc7d
commit 161f145176
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
75 changed files with 1317 additions and 1696 deletions

View file

@ -2,21 +2,23 @@
Dendrite can be run in one of two configurations:
* **Polylith mode**: A cluster of individual components, dealing with different
aspects of the Matrix protocol (see [WIRING.md](WIRING-Current.md)). Components communicate
with each other using internal HTTP APIs and [Apache Kafka](https://kafka.apache.org).
This will almost certainly be the preferred model for large-scale deployments.
* **Monolith mode**: All components run in the same process. In this mode,
Kafka is completely optional and can instead be replaced with an in-process
lightweight implementation called [Naffka](https://github.com/matrix-org/naffka). This
will usually be the preferred model for low-volume, low-user or experimental deployments.
it is possible to run an in-process [NATS Server](https://github.com/nats-io/nats-server)
instead of running a standalone deployment. This will usually be the preferred model for
low-to-mid volume deployments, providing the best balance between performance and resource usage.
For most deployments, it is **recommended to run in monolith mode with PostgreSQL databases**.
* **Polylith mode**: A cluster of individual components running in their own processes, dealing
with different aspects of the Matrix protocol (see [WIRING.md](WIRING-Current.md)). Components
communicate with each other using internal HTTP APIs and [NATS Server](https://github.com/nats-io/nats-server).
This will almost certainly be the preferred model for very large deployments but scalability
comes with a cost. API calls are expensive and therefore a polylith deployment may end up using
disproportionately more resources for a smaller number of users compared to a monolith deployment.
In almost all cases, it is **recommended to run in monolith mode with PostgreSQL databases**.
Regardless of whether you are running in polylith or monolith mode, each Dendrite component that
requires storage has its own database. Both Postgres and SQLite are supported and can be
mixed-and-matched across components as needed in the configuration file.
requires storage has its own database connections. Both Postgres and SQLite are supported and can
be mixed-and-matched across components as needed in the configuration file.
Be advised that Dendrite is still in development and it's not recommended for
use in production environments just yet!
@ -26,13 +28,11 @@ use in production environments just yet!
Dendrite requires:
* Go 1.15 or higher
* Postgres 9.6 or higher (if using Postgres databases, not needed for SQLite)
* PostgreSQL 12 or higher (if using PostgreSQL databases, not needed for SQLite)
If you want to run a polylith deployment, you also need:
* Apache Kafka 0.10.2+
Please note that Kafka is **not required** for a monolith deployment.
* A standalone [NATS Server](https://github.com/nats-io/nats-server) deployment with JetStream enabled
## Building Dendrite
@ -49,40 +49,18 @@ Then build it:
./build.sh
```
## Install Kafka (polylith only)
## Install NATS Server
Install and start Kafka (c.f. [scripts/install-local-kafka.sh](scripts/install-local-kafka.sh)):
Follow the [NATS Server installation instructions](https://docs.nats.io/running-a-nats-service/introduction/installation) and then [start your NATS deployment](https://docs.nats.io/running-a-nats-service/introduction/running).
```bash
KAFKA_URL=http://archive.apache.org/dist/kafka/2.1.0/kafka_2.11-2.1.0.tgz
# Only download the kafka if it isn't already downloaded.
test -f kafka.tgz || wget $KAFKA_URL -O kafka.tgz
# Unpack the kafka over the top of any existing installation
mkdir -p kafka && tar xzf kafka.tgz -C kafka --strip-components 1
# Start the zookeeper running in the background.
# By default the zookeeper listens on localhost:2181
kafka/bin/zookeeper-server-start.sh -daemon kafka/config/zookeeper.properties
# Start the kafka server running in the background.
# By default the kafka listens on localhost:9092
kafka/bin/kafka-server-start.sh -daemon kafka/config/server.properties
```
On macOS, you can use [Homebrew](https://brew.sh/) for easier setup of Kafka:
```bash
brew install kafka
brew services start zookeeper
brew services start kafka
```
JetStream must be enabled, either by passing the `-js` flag to `nats-server`,
or by specifying the `store_dir` option in the the `jetstream` configuration.
## Configuration
### PostgreSQL database setup
Assuming that PostgreSQL 9.6 (or later) is installed:
Assuming that PostgreSQL 12 (or later) is installed:
* Create role, choosing a new password when prompted:
@ -109,7 +87,7 @@ On macOS, omit `sudo -u postgres` from the below commands.
* If you want to run each Dendrite component with its own database:
```bash
for i in mediaapi syncapi roomserver signingkeyserver federationsender appservice keyserver userapi_accounts userapi_devices naffka; do
for i in mediaapi syncapi roomserver federationapi appservice keyserver userapi_accounts userapi_devices; do
sudo -u postgres createdb -O dendrite dendrite_$i
done
```
@ -163,7 +141,11 @@ Create config file, based on `dendrite-config.yaml`. Call it `dendrite.yaml`. Th
* `postgres://dendrite:password@localhost/dendrite_userapi_account?sslmode=disable` to connect to PostgreSQL without SSL/TLS
* For SQLite on disk: `file:component.db` or `file:///path/to/component.db`, e.g. `file:userapi_account.db`
* Postgres and SQLite can be mixed and matched on different components as desired.
* The `use_naffka` option if using Naffka in a monolith deployment
* Either one of the following in the `jetstream` configuration section:
* The `addresses` option — a list of one or more addresses of an external standalone
NATS Server deployment
* The `storage_path` — where on the filesystem the built-in NATS server should
store durable queues, if using the built-in NATS server
There are other options which may be useful so review them all. In particular,
if you are trying to federate from your Dendrite instance into public rooms
@ -177,11 +159,6 @@ using SQLite, all components **MUST** use their own database file.
## Starting a monolith server
It is possible to use Naffka as an in-process replacement to Kafka when using
the monolith server. To do this, set `use_naffka: true` in your `dendrite.yaml`
configuration and uncomment the relevant Naffka line in the `database` section.
Be sure to update the database username and password if needed.
The monolith server can be started as shown below. By default it listens for
HTTP connections on port 8008, so you can configure your Matrix client to use
`http://servername:8008` as the server:
@ -197,6 +174,10 @@ for HTTPS connections on port 8448:
./bin/dendrite-monolith-server --tls-cert=server.crt --tls-key=server.key
```
If the `jetstream` section of the configuration contains no `addresses` but does
contain a `store_dir`, Dendrite will start up a built-in NATS JetStream node
automatically, eliminating the need to run a separate NATS server.
## Starting a polylith deployment
The following contains scripts which will run all the required processes in order to point a Matrix client at Dendrite.
@ -263,15 +244,6 @@ This is what implements the room DAG. Clients do not talk to this.
./bin/dendrite-polylith-multi --config=dendrite.yaml roomserver
```
#### Federation sender
This sends events from our users to other servers. This is only required if
you want to support federation.
```bash
./bin/dendrite-polylith-multi --config=dendrite.yaml federationsender
```
#### Appservice server
This sends events from the network to [application
@ -291,14 +263,6 @@ This manages end-to-end encryption keys for users.
./bin/dendrite-polylith-multi --config=dendrite.yaml keyserver
```
#### Signing key server
This manages signing keys for servers.
```bash
./bin/dendrite-polylith-multi --config=dendrite.yaml signingkeyserver
```
#### EDU server
This manages processing EDUs such as typing, send-to-device events and presence. Clients do not talk to