The Problem
Periodically, the Chainflip Labs team will need to perform a Runtime Upgrade. You can read more about Runtime Upgrades here.
Unfortunately, it is very difficult to make the `chainflip-engine` backwards compatible with older versions of the runtime. Therefore, we require operators to run both the new and the old version of the engine simultaneously.
This guide walks through doing this on a VPS, on Kubernetes, and with Docker Compose.
VPS
Upgrading on a VPS is straightforward: we supply the new binaries as an entirely separate package. The following shows how you would upgrade from `1.0.0` to `1.1.0`.
Before the Runtime Upgrade
When we announce that a new version of the software is available, you can prepare by installing and running the second version of the engine alongside the current one:
```bash
sudo apt update
sudo apt --only-upgrade install chainflip-node
sudo apt install chainflip-engine1.1
```
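At this point both engines should be running side by side. You can double-check this via systemd (a quick sketch; the unit names are assumed to follow the versioned package names above):

```bash
# chainflip-engine1.0 keeps validating; chainflip-engine1.1 waits
# for a compatible runtime
systemctl status chainflip-engine1.0 chainflip-engine1.1 --no-pager
```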
You should see the following logs from the new engine:

```bash
journalctl -f -u chainflip-engine1.1
```

```json
{"timestamp":"2023-12-11T16:26:38.381549Z","level":"INFO","fields":{"message":"This version '1.1.0' is incompatible with the current release '1.0.0' at block: 0x4d66…fb38. WAITING for a compatible release version."},"target":"chainflip_engine::state_chain_observer::client"}
```
After the Runtime Upgrade
Check the logs of the new `chainflip-engine1.1`; you should see that it is now processing blocks. You can now safely disable and remove the old `chainflip-engine`:
```bash
sudo systemctl disable --now chainflip-engine1.0.service
sudo apt purge chainflip-engine1.0
```
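To confirm that only the new engine remains, you can list the matching units (a sketch using the unit names from this guide):

```bash
# Only chainflip-engine1.1 should be listed as active
systemctl list-units 'chainflip-engine*' --no-pager
```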
Kubernetes
While we don't officially support running Validators on Kubernetes, we understand its appeal.
The main difficulty is having two containers that can both access the `data.db` directory. Sadly, most cloud providers don't support `ReadWriteMany` on volumes, so the next best thing would be to run two containers in a single pod. The issue here is that there is no easy way to route traffic from the `Service` to the correct pod at the moment the Runtime Upgrade is pushed. It would require the operator to be available at the time of the upgrade, which can be difficult considering we are a global network.
The best solution we can offer is to build a custom image in which both versions of the `chainflip-engine` run simultaneously. Docker (while making it clear that this isn't best practice) describes several ways to run multiple processes in a container on their website. Of the listed solutions, using Supervisord suits us best.
Building the Image
Below is an example Dockerfile that pulls both versions `1.0.0` and `1.1.0` of the `chainflip-engine`. We then load in a `supervisord` config that runs both processes simultaneously, writing the logs of both to `stdout`.
```dockerfile
FROM chainfliplabs/chainflip-engine:perseverance-1.0.0 as chainflip-engine1-0-0
FROM chainfliplabs/chainflip-engine:perseverance-1.1.0 as chainflip-engine1-1-0

FROM ubuntu:22.04
USER root

RUN apt-get update && apt-get install -y --no-install-recommends \
    ca-certificates supervisor && \
    rm -rf /var/lib/apt/lists/*

WORKDIR /etc/chainflip

# Pull both engine binaries in under versioned names
COPY --from=chainflip-engine1-0-0 --chown=1000:1000 /usr/local/bin/chainflip-engine /usr/local/bin/chainflip-engine1.0
COPY --from=chainflip-engine1-1-0 --chown=1000:1000 /usr/local/bin/chainflip-engine /usr/local/bin/chainflip-engine1.1

# The chmod must come after the binaries have been copied in
RUN chmod +x /usr/local/bin/chainflip-engine1.0 /usr/local/bin/chainflip-engine1.1 \
    && useradd -m -u 1000 -U -s /bin/sh -d /flip flip \
    && chown -R 1000:1000 /etc/chainflip \
    && rm -rf /sbin /usr/sbin /usr/local/sbin

USER flip

COPY --chown=1000:1000 supervisord.conf .

ENTRYPOINT ["supervisord", "-c", "supervisord.conf"]
```
And the accompanying `supervisord.conf`:

```ini
[supervisord]
nodaemon=true
logfile=/dev/null
logfile_maxbytes=0

[program:engine_1]
command=/usr/local/bin/chainflip-engine1.1
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
startretries=10
startsecs=10
autorestart=true
priority=1

[program:engine_2]
command=/usr/local/bin/chainflip-engine1.0
stdout_logfile=/dev/fd/1
stdout_logfile_maxbytes=0
redirect_stderr=true
startretries=10
startsecs=10
autorestart=true
priority=2

; The sections below wire up supervisorctl so you can inspect both
; processes from inside the container with `supervisorctl status`
[unix_http_server]
file=/tmp/supervisor.sock

[rpcinterface:supervisor]
supervisor.rpcinterface_factory = supervisor.rpcinterface:make_main_rpcinterface

[supervisorctl]
serverurl=unix:///tmp/supervisor.sock
```
To build the image, simply run:

```bash
docker buildx build -t my_repo/my_image:tag .
```
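If you'd like to sanity-check the image locally before deploying it, you can run it with the same mounts the Docker Compose example below uses (a sketch; `my_repo/my_image:tag` is a placeholder for your own tag):

```bash
# supervisord should start and launch both engine programs; without a
# valid Settings.toml and keys the engines won't get much further
docker run --rm -it \
  -v "$(pwd)/Settings.toml":/etc/chainflip/config/Settings.toml \
  -v "$(pwd)/keys":/etc/chainflip/keys \
  my_repo/my_image:tag
```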
Before the Runtime Upgrade
Make sure you are running your custom image once the team has made an announcement. The new engine will register itself on-chain, thereby signalling its readiness for the upgrade.
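To confirm the new engine is registered and waiting, you can tail the container's logs and look for the same "WAITING for a compatible release version" line shown in the VPS section (a sketch; the workload name is an assumption about your deployment):

```bash
kubectl logs -f statefulset/chainflip-engine
```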
After the Runtime Upgrade
The old engine will now sit in Idle mode. Until the next upgrade, you don't need to do anything. It's also possible to switch your image back to the official one after this.
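If you built the image with the `supervisorctl` sections shown above, you can also inspect both processes from inside the pod (a sketch; the pod name is an assumption):

```bash
# Both programs should show RUNNING; engine_2 (1.0) is simply idle
kubectl exec -it chainflip-engine-0 -- supervisorctl -c /etc/chainflip/supervisord.conf status
```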
Docker Compose
If you're running with `docker compose`, for example, the setup is also straightforward. Below is an example `docker-compose.yml` file; notice the use of the multi-version image for the engine:
version: "3"
services:
chainflip-node:
image: chainfliplabs/chainflip-node:perseverance-1.1.0
container_name: chainflip-node
platform: linux/amd64
volumes:
- ./state-chain/node/chainspecs/berghain.chainspec.raw.json:/etc/chainflip/berghain.chainspec.json
- ./keys:/etc/chainflip/keys
entrypoint:
- /bin/sh
- -c
command:
- chainflip-node --chain /etc/chainflip/berghain.chainspec.json --sync=warp --rpc-external --rpc-cors='*' --unsafe-rpc-external --rpc-methods=unsafe
ports:
- "30333:30333"
- "9944:9944"
chainflip-engine:
image: chainflip-engine:berghain-1.0.0-1.1.0 # This would be a custom image tag
container_name: chainflip-engine
platform: linux/amd64
restart: always
depends_on:
- chainflip-node
ports:
- "8078:8078"
volumes:
- ./Settings.toml:/etc/chainflip/config/Settings.toml
- ./keys:/etc/chainflip/keys
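Bringing the stack up and following the engines' logs is then (assuming the file above is saved as `docker-compose.yml` in the current directory):

```bash
docker compose up -d
docker compose logs -f chainflip-engine
```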
Before the Runtime Upgrade
Just like with Kubernetes, make sure you are running your custom image once the team has made an announcement.
After the Runtime Upgrade
The old engine will now sit in Idle mode. Until the next upgrade, you don't need to do anything. It's also possible to switch your image back to the official one after this.
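If you later switch back to the official image, point the engine service at the official 1.1.0 tag and recreate the container (a sketch; the exact tag depends on the network you run):

```bash
# Edit docker-compose.yml first, e.g.:
#   image: chainfliplabs/chainflip-engine:perseverance-1.1.0
docker compose up -d chainflip-engine
```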