Bran Migration
This document is a step-by-step guide to migrating bran so that it meets High Availability requirements.
Introduction
The bran service (a backend for Threads) now creates and uses replicated tables, which is the starting point for HA (High Availability) of this service.
Note: existing installations require a manual migration. The migration can only be executed if the installation has 2 ClickHouse instances and 1-3 ZooKeeper instances properly configured.
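The prerequisite above can be checked before starting. The following is a minimal sketch, not part of the original procedure: it assumes clickhouse-client is installed on the machine running it, and the function names, the CH_CLIENT variable and the host names are hypothetical. It relies on the fact that the system.zookeeper table is only available when ZooKeeper (or ClickHouse Keeper) is configured, and that queries against it must filter on path.

```shell
#!/bin/sh
# Hypothetical pre-flight check; adjust hosts and credentials to your installation.
# CH_CLIENT can be overridden, e.g. to add --user/--password through a wrapper script.
CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

# Succeeds if the ClickHouse instance on host $1 answers and can query ZooKeeper.
check_instance() {
    host="$1"
    # The query fails if the server is unreachable or ZooKeeper is not configured.
    if "$CH_CLIENT" --host "$host" -q "SELECT count() FROM system.zookeeper WHERE path = '/'" >/dev/null 2>&1; then
        echo "$host: ok"
    else
        echo "$host: FAILED"
        return 1
    fi
}

# Checks every host passed as an argument; returns non-zero if any check fails.
preflight() {
    status=0
    for host in "$@"; do
        check_instance "$host" || status=1
    done
    return "$status"
}
```

For example, `preflight ch-1.example ch-2.example` (placeholder host names) prints one line per instance and returns non-zero if either instance fails the check. It does not verify how many ZooKeeper nodes exist, only that each ClickHouse instance can reach ZooKeeper.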
Migration plan
- Create a remote backup of /mnt/clickhouse. It contains the databases, the tables, their metadata and their data. You can mount a new disk, create the backup on it and unmount it (check that it is not removed and has a clear name, e.g. clickhouse_backup_before_migration_delete_after_date).
- Check that the policy for the bran* queues in RabbitMQ can hold all messages received during the downtime. Something like max-length: 3000000 max-length-bytes: 8388608000 should be enough. Your mileage may vary.
- Stop everything that writes into bran's database (this appears to be bran-write only). NOTE: you may get a duplicate message on restart, because bran may process a message but not call ack before shutdown.
- Rename the current tables:
    RENAME TABLE
        bran.messages TO bran.messages_standalone,
        bran.events TO bran.events_standalone,
        bran.containers TO bran.containers_standalone,
        bran.threads_view_list TO bran.threads_view_list_standalone,
        bran.threads_view_by_id TO bran.threads_view_by_id_standalone
- Create the new replicated tables by running the following locally:
    LOG_LEVEL=info AMQP_URI= MONGO_URI= BRAN_CLICKHOUSE_URI="http://localhost:8123/bran" npm run dev:migrate
- Move the parts from the old tables to a special folder (detached) of the new ones:
    rsync -a --remove-source-files --exclude "format_version.txt" --exclude "detached" /mnt/clickhouse/data/bran/messages_standalone/ /mnt/clickhouse/data/bran/messages/detached/
    rsync -a --remove-source-files --exclude "format_version.txt" --exclude "detached" /mnt/clickhouse/data/bran/events_standalone/ /mnt/clickhouse/data/bran/events/detached/
    rsync -a --remove-source-files --exclude "format_version.txt" --exclude "detached" /mnt/clickhouse/data/bran/%2Einner%2Econtainers_standalone/ /mnt/clickhouse/data/bran/%2Einner%2Econtainers/detached/
    rsync -a --remove-source-files --exclude "format_version.txt" --exclude "detached" /mnt/clickhouse/data/bran/%2Einner%2Ethreads_view_by_id_standalone/ /mnt/clickhouse/data/bran/%2Einner%2Ethreads_view_by_id/detached/
    rsync -a --remove-source-files --exclude "format_version.txt" --exclude "detached" /mnt/clickhouse/data/bran/%2Einner%2Ethreads_view_list_standalone/ /mnt/clickhouse/data/bran/%2Einner%2Ethreads_view_list/detached/
- Start the services that were stopped in step 3. Old data will be temporarily unavailable in the UI, but at least the queued data will be sent to the new ClickHouse tables.
- Check that the Executions page in the UI works and lists new threads.
- Attach the moved parts to the new tables (ClickHouse/ClickHouse#8183). Add the --host, --user and --password parameters to clickhouse-client as needed:
    clickhouse-client --format=TSVRaw -q"select 'ALTER TABLE ' || database || '.\"' || table || '\" ATTACH PARTITION ID \'' || partition_id || '\';\n' from system.detached_parts where table != 'messages' group by database, table, partition_id order by database, table, partition_id;" | clickhouse-client -nm
- Restore the RabbitMQ policy if it was changed in step 2.
- Check that the data was really restored (ask QA).
- Repeat step 5 (creating the new replicated tables) for the second ClickHouse instance.
- Drop the old tables:
    DROP TABLE bran.messages_standalone;
    DROP TABLE bran.events_standalone;
    DROP TABLE bran.containers_standalone;
    DROP TABLE bran.threads_view_list_standalone;
    DROP TABLE bran.threads_view_by_id_standalone;
- Schedule the removal of the backup made in step 1.
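The part-moving step in the plan above repeats the same rsync command once per table, which is easy to mistype. As a rough sketch, the commands can be generated from a list of directory names and reviewed before anything is moved. The paths and rsync flags are copied from this document. The function names and the DATA_DIR and DRY_RUN variables are illustrative assumptions, not part of the bran tooling.

```shell
#!/bin/sh
# Sketch only: prints the rsync commands by default (DRY_RUN=1), runs them when DRY_RUN=0.
DATA_DIR="${DATA_DIR:-/mnt/clickhouse/data/bran}"
DRY_RUN="${DRY_RUN:-1}"

# Directory names of the new tables, exactly as they appear in the rsync commands of this document.
TABLE_DIRS="messages events %2Einner%2Econtainers %2Einner%2Ethreads_view_by_id %2Einner%2Ethreads_view_list"

# Prints the rsync command that moves the parts of one old (*_standalone) table
# into the detached/ folder of the corresponding new table.
move_parts_cmd() {
    printf 'rsync -a --remove-source-files --exclude "format_version.txt" --exclude "detached" %s/%s_standalone/ %s/%s/detached/\n' \
        "$DATA_DIR" "$1" "$DATA_DIR" "$1"
}

# Prints every command and executes it only when DRY_RUN=0.
move_all_parts() {
    for dir in $TABLE_DIRS; do
        cmd="$(move_parts_cmd "$dir")"
        echo "$cmd"
        if [ "$DRY_RUN" = "0" ]; then
            eval "$cmd"
        fi
    done
}
```

Calling `move_all_parts` with the defaults only prints the five commands, so they can be compared with the list above. Setting DRY_RUN=0 in the environment makes the same call execute them on the ClickHouse host.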