Hardening for 1.3 Release
Had an awesome chat with @nguyer and came up with what feels like the remaining items required for hardening, so that we have confidence in FireFly 1.3.
Pre-requisite: We have an RC for 1.3. Realistically, it looks like we might be a couple of weeks away from that due to a number of in-flight PRs requiring review from specific people.
High-level items:
- Migration Testing 1.2.2 -> 1.3 RC
- Major functionality testing
- Performance Testing
Migration 1.2.2 -> 1.3 RC
One of the largest items in this release is the change from having a single event stream per plugin (even across multiple namespaces) to having a single event stream per namespace in the network. There has been a whole bunch of testing from branches to verify that the migration should work as expected, but there are gaps in the coverage and it's worth re-testing anyway.
In discussion w/ @nguyer we think the areas to be covered here are:
- Full end-to-end testing with the RC
- Testing in multi-party and gateway mode
- PostgreSQL as DB (+ permutations involving other configurations)
- ETHConnect as blockchain connector
Existing testing has covered:
- ERC20/721/1155 tokens connectors
- EVM Connector as blockchain connector
- Fabric
Script for contract migration
Contract migration can be performed using this script (note: this only works on stacks created using the FireFly ff CLI).
#!/bin/bash
REQUIRED_TOOLS=("jq" "yq")
for TOOL in "${REQUIRED_TOOLS[@]}"
do
if ! [ -x "$(command -v ${TOOL})" ]; then
echo "Error: ${TOOL} is not installed." >&2
exit 1
fi
done
STACK_NAME="${1}"
CONFIG_FILE_LOCATION="${2}"
FIREFLY_REPO_LOCATION="${3}"
if [ -z "$STACK_NAME" ]; then
echo "Error: Name of the stack to migrate was not provided." >&2
# TODO: Best-effort try to find out if the name of the stack exists
exit 1
fi
if [ -z "$CONFIG_FILE_LOCATION" ]; then
echo "Error: Folder containing FireFly configuration files was not provided." >&2
# TODO: Test if the configuration files exist
exit 1
fi
if [ -z "$FIREFLY_REPO_LOCATION" ]; then
echo "Error: FireFly core repository location was not provided." >&2
# TODO: Test if the repository exists
exit 1
fi
echo ""
echo "-----------------------------------------------"
echo "----- FireFly contract migration starting -----"
echo "-----------------------------------------------"
echo ""
echo "Target stack name: ${STACK_NAME}"
echo "Configuration hosted in: ${CONFIG_FILE_LOCATION}"
echo "FireFly core repository location: ${FIREFLY_REPO_LOCATION}"
echo ""
getCurrentContractAddress () {
ADDRESS=$(curl --silent -X 'GET' \
'http://127.0.0.1:5000/api/v1/status' \
-H 'accept: application/json' \
-H 'Request-Timeout: 2m0s' | jq -r '.multiparty.contract.active.location.address')
echo "${ADDRESS}"
}
TMP_DIR=/tmp/firefly-contract-upgrade
mkdir -p "$TMP_DIR"
echo -ne "🖊️ Compiling the multi-party contract...\r"
solc --overwrite --evm-version paris --bin ${FIREFLY_REPO_LOCATION}/smart_contracts/ethereum/solidity_firefly/contracts/Firefly.sol -o $TMP_DIR >/dev/null 2>&1
solc --overwrite --evm-version paris --abi ${FIREFLY_REPO_LOCATION}/smart_contracts/ethereum/solidity_firefly/contracts/Firefly.sol -o $TMP_DIR >/dev/null 2>&1
echo -e "✅ Compiling the multi-party contract (Done)\r"
CURRENT_CONTRACT_ADDRESS=$(getCurrentContractAddress)
CONTRACT_BIN=$(cat $TMP_DIR/Firefly.bin)
CONTRACT_ABI=$(cat $TMP_DIR/Firefly.abi)
PAYLOAD=$(jq -n \
--arg bin "${CONTRACT_BIN}" \
--argjson abi "${CONTRACT_ABI}" \
'{contract: $bin, definition: $abi, input: []}')
echo -ne "🚚 Deploying the contract...\r"
curl --silent -X POST -H "content-type: application/json" -d "${PAYLOAD}" \
"http://localhost:5000/api/v1/namespaces/default/contracts/deploy?confirm=true" > $TMP_DIR/deploy-operation.json
echo -e "✅ Deployed the new contract\r"
echo -ne "🔍 Extracting transaction and address information...\r"
TRANSACTION_ID=$(cat $TMP_DIR/deploy-operation.json | jq -r '.output.headers.requestId')
CONTRACT_ADDRESS=$(cat $TMP_DIR/deploy-operation.json | jq -r '.output.contractLocation.address')
echo -e "✅ Transaction ID is ${TRANSACTION_ID} and address is ${CONTRACT_ADDRESS}\r"
echo -ne "🔍 Getting the transaction receipt...\r"
curl --silent -X 'GET' "http://localhost:5102/transactions/${TRANSACTION_ID}" \
-H 'accept: application/json' \
-H 'Request-Timeout: 0s' > $TMP_DIR/transaction-receipt.json
echo -e "✅ Got the transaction receipt\r"
echo -ne "#️⃣ Getting the block number...\r"
BLOCK_NUMBER=$(cat $TMP_DIR/transaction-receipt.json | jq -r '.receipt.blockNumber')
echo -e "✅ Contract deployed in block ${BLOCK_NUMBER}\r"
echo -ne "⬆️ Updating stack $STACK_NAME with the new contract...\r"
SAVEIFS=$IFS
IFS=$'\n'
files=$(find ~/.firefly/stacks/$STACK_NAME/runtime/config | grep -E 'firefly_core_[0-9].yml')
files=($files)
for file in "${files[@]}"
do
CURRENT_NAMESPACE=$(cat $file | yq '.namespaces.default')
# jq > yq
FILE_COPY="${file}.json"
cat ${file} | yq --output-format json > "${FILE_COPY}"
NEW_CONTRACT_ENTRY=$(jq -n \
--arg blocknumber "${BLOCK_NUMBER}" \
--arg contractaddress "${CONTRACT_ADDRESS}" \
'{firstEvent: $blocknumber, location: { address: $contractaddress }, options: {}}')
EDITED_FILE=$(jq \
--arg NAMESPACE "${CURRENT_NAMESPACE}" \
--argjson NEW_CONTRACT_ENTRY "${NEW_CONTRACT_ENTRY}" \
'(.namespaces.predefined[] | select(.name == $NAMESPACE) | .multiparty.contract) |= .+[$NEW_CONTRACT_ENTRY]' "${FILE_COPY}")
echo "$EDITED_FILE" | yq -P > ${file}
echo "✅ Updated ${file}"
rm "${FILE_COPY}"
done
IFS=$SAVEIFS
echo -e "✅ Stack ${STACK_NAME} updated!\r"
echo -ne "🔫 Restarting FireFly docker containers...\r"
SAVEIFS=$IFS
IFS=$'\n'
containers=$(docker container ls | grep -E 'firefly_core')
containers=($containers)
for container in "${containers[@]}"
do
ID=$(echo $container | awk '{print $1}')
docker restart $ID >/dev/null 2>&1
done
IFS=$SAVEIFS
echo -e "✅ Containers restarted\r"
echo -ne "⏱️ Waiting for the new containers to come up\r"
sleep 10
echo -e "✅ They're probably up by now...\r"
echo -ne "🖊️ Terminating use of current contract...\r"
curl --silent -X 'POST' \
'http://127.0.0.1:5000/api/v1/network/action' \
-H 'accept: application/json' \
-H 'Request-Timeout: 2m0s' \
-H 'Content-Type: application/json' \
-d '{
"type": "terminate"
}' >/dev/null 2>&1
echo -e "✅ Terminated use of current contract\r"
echo -ne "⏱️ Waiting before verifying the new contract is in use\r"
sleep 10
DISCOVERED_CONTRACT_ADDRESS=$(getCurrentContractAddress)
echo -e "✅ Got the current contract\r"
echo ""
echo "-----------------------------------------------"
echo "----- FireFly contract migration is done! -----"
echo "-----------------------------------------------"
echo ""
echo "Old contract location: ${CURRENT_CONTRACT_ADDRESS}"
echo "New contract location: ${DISCOVERED_CONTRACT_ADDRESS}"
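The script above relies on fixed `sleep 10` pauses and hopes the containers are up by then. A minimal sketch of a polling alternative is below. `wait_until` and `firefly_is_up` are hypothetical helpers, not part of the original script, and the probe assumes the same status endpoint the script already calls.

```shell
#!/bin/bash
# Poll a command until it succeeds or a timeout elapses.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command...>
# Both timeout and interval are assumed to be whole seconds.
wait_until() {
  local timeout="$1" interval="$2"
  shift 2
  local waited=0
  until "$@" >/dev/null 2>&1; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1   # gave up: the command never succeeded in time
    fi
    sleep "$interval"
    waited=$((waited + interval))
  done
  return 0
}

# Hypothetical probe: succeeds once the status endpoint answers with a 2xx.
# Assumption: FireFly core is reachable on 127.0.0.1:5000 as in the script.
firefly_is_up() {
  curl --silent --fail 'http://127.0.0.1:5000/api/v1/status' \
    -H 'accept: application/json'
}
```

In the script, each `sleep 10` could then become something like `wait_until 60 2 firefly_is_up || exit 1`, which fails loudly instead of carrying on against a node that never came back.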
Major functionality check
There's a draft PR here https://github.com/hyperledger/firefly/pull/1461 with some draft release notes covering the major features being added in this release. As part of hardening, we should go through the release candidate and check that all of the major functional areas have been covered by some testing:
- New Multi-party contract
- Rescue APIs (when available in RC)
- Definition/publication APIs
- Batching events for delivery over Websockets
Performance testing
Once we have a new RC available we can start to conduct performance regression testing against the previous release. Some very preliminary testing has been done under this issue https://github.com/hyperledger/firefly/issues/1465 so we should be able to use the same configuration for the testing there.
RC1 - Migration Testing
Some pre-RC testing has already been done in this space, so the aim of this testing is to double check the existing testing and to cover other permutations of FireFly configurations which have not yet been checked. From the original comment we know we don't have coverage for migration testing in these areas:
- Testing in multi-party and gateway mode
- PostgreSQL as DB
- ETHConnect as blockchain connector
Additionally, we'll also need to do migration testing in the areas we have covered pre-RC:
- ERC20/721/1155 tokens connectors
- EVM Connector as blockchain connector
- Fabric
Using this issue to track permutations of testing...
General steps to migrate a FireFly stack
- Build the CLI fresh from source (to ensure you have the latest commits)
- Create and run a normal stack
- Run E2E tests (using tests from the commit your stack is based on)
- Build new images using commits from the release you want to move to
- Update the Docker Compose Override file with the new images
- Upgrade the batch pin contract (see below)
- Restart the containers
- Run the E2E tests (using tests from the commit you've moved to)
- Verify that all data from the previous run of the E2E is still available
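For the last step, one way to make "all data is still available" checkable is to snapshot record counts before and after the upgrade and compare them. The sketch below assumes a made-up snapshot format of one `<resource> <count>` pair per line; how those counts are gathered (API queries, SQL, etc.) is left out, and `compare_counts` is a hypothetical helper.

```shell
#!/bin/bash
# compare_counts BEFORE_FILE AFTER_FILE
# Each file holds one "<resource> <count>" pair per line (hypothetical format).
# Succeeds only if every resource in BEFORE_FILE is present in AFTER_FILE
# with a count that is at least as large.
compare_counts() {
  awk '
    FILENAME == ARGV[1] { before[$1] = $2; next }  # first file: pre-upgrade counts
    { after[$1] = $2 }                             # second file: post-upgrade counts
    END {
      status = 0
      for (name in before) {
        if (!(name in after)) {
          printf "MISSING: %s (before=%s, absent after upgrade)\n", name, before[name]
          status = 1
        } else if (after[name] + 0 < before[name] + 0) {
          printf "SHRUNK: %s (before=%s, after=%s)\n", name, before[name], after[name]
          status = 1
        }
      }
      exit status
    }
  ' "$1" "$2"
}
```

Counts are allowed to grow, since the second E2E run adds new data; only a missing resource or a lower count is treated as a failure.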
To upgrade the batch pin contract:
- Deploy the new contract
- Update the config file for each node with the new address and block number
- Restart the nodes
- POST to /network/action with payload {"type": "terminate"}
- Verify with a GET /status call that the new contract is in use
...or alternatively use the script from the original comment to automate this process.
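When doing the final verification by hand, it helps to compare the address from the status call against the one you deployed rather than eyeballing it. Hex addresses can come back with different casing, so a plain string comparison may give a false mismatch. A small sketch with hypothetical helper names:

```shell
#!/bin/bash
# Lower-case a hex address and drop an optional 0x/0X prefix.
normalise_address() {
  printf '%s' "$1" | tr 'A-F' 'a-f' | sed 's/^0[xX]//'
}

# addresses_match ACTUAL EXPECTED
# Succeeds when both refer to the same address. An empty value or the
# literal "null" (what jq -r prints for a missing field) never matches.
addresses_match() {
  [ -n "$1" ] && [ "$1" != "null" ] &&
    [ "$(normalise_address "$1")" = "$(normalise_address "$2")" ]
}
```

With the migration script from the original comment this could be used as `addresses_match "$(getCurrentContractAddress)" "$CONTRACT_ADDRESS" || echo "new contract is not active yet" >&2`.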
Very quick and dirty migration script
This is thrown-together hackiness for now, and I'll contribute it formally as a script somewhere once it's cleaned up, but it does a semi-automatic migration.
#!/bin/bash
export STACK_NAME=migration
export CREATE_STACK=false
ff init migration
ff start migration
sleep 10
cd ./firefly
make e2e
cd ..
ff stop migration
cat >>~/.firefly/stacks/migration <<EOL
...
EOL
services:
  dataexchange_0:
    image: localhost:7000/firefly-dataexchange-https:latest
  dataexchange_1:
    image: localhost:7000/firefly-dataexchange-https:latest
  evmconnect_0:
    image: localhost:7000/firefly-evmconnect:latest
  evmconnect_1:
    image: localhost:7000/firefly-evmconnect:latest
  firefly_core_0:
    image: localhost:7000/firefly:latest
  firefly_core_1:
    image: localhost:7000/firefly:latest
  tokens_0_0:
    image: localhost:7000/firefly-tokens-erc20-erc721:latest
  tokens_1_0:
    image: localhost:7000/firefly-tokens-erc20-erc721:latest
ff start migration
./contract-migration migration ~/.firefly/stacks ~/firefly
cd ~/firefly
git checkout main
git pull
make e2e
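The image override in the script above is written out by hand for a two-member stack. A rough sketch of generating it for any number of members follows. `gen_override` is a hypothetical helper, the service naming pattern is inferred from the example above, and it assumes an EVMConnect stack with the ERC20/721 tokens connector; other permutations would need different service and image names.

```shell
#!/bin/bash
# gen_override MEMBERS [REGISTRY] [TAG]
# Prints a Docker Compose override that points every service of a
# MEMBERS-node stack at locally built images.
gen_override() {
  local members="$1" registry="${2:-localhost:7000}" tag="${3:-latest}"
  local i=0 entry
  echo "services:"
  while [ "$i" -lt "$members" ]; do
    # Each entry is "<service name>=<image name>".
    for entry in \
      "dataexchange_${i}=firefly-dataexchange-https" \
      "evmconnect_${i}=firefly-evmconnect" \
      "firefly_core_${i}=firefly" \
      "tokens_${i}_0=firefly-tokens-erc20-erc721"
    do
      printf '  %s:\n    image: %s/%s:%s\n' \
        "${entry%%=*}" "$registry" "${entry#*=}" "$tag"
    done
    i=$((i + 1))
  done
}
```

`gen_override 2` prints the same eight services as the example, grouped by member rather than by service type, and the output can be redirected into the stack's Docker Compose override file.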
Multi-party | DB Provider | Blockchain Connector | Tokens Connector | Passed? | Tested by |
---|---|---|---|---|---|
N | PostgreSQL | EVMConnect | None | ✅ | @nguyer |
Y | PostgreSQL | EVMConnect | None | ✅ | @nguyer |
N | PostgreSQL | EVMConnect | ERC20/721 | ✅ | @nguyer |
Y | PostgreSQL | EVMConnect | ERC20/721 | ✅ | @nguyer |
Y | PostgreSQL | EVMConnect | ERC1155 | ✅ | @nguyer |
N | PostgreSQL | EVMConnect | ERC1155 | ✅ | @nguyer |
N | PostgreSQL | Fabconnect | None | ✅ | @nguyer |
Y | PostgreSQL | Fabconnect | None | ✅ | @nguyer |
RC1 - Functionality Check
RC1 - Performance Testing
Will start looking at the performance of RC1 soon. In the meantime, I've kicked off a test of 1.2.2 to gather some performance metrics as a reference point; I'll put the configuration and results below.
1.2.2 Release Commit 1.3-rc1 Release Commit
Reference performance testing for 1.2.2
nohup ./start.sh &> ffperf.log &
core-config.yml
log:
  level: debug
broadcast:
  batch:
    size: 200
    timeout: 1s
privatemessaging:
  batch:
    size: 200
    timeout: 1s
message:
  writer:
    count: 5
download:
  worker:
    count: 100
publicstorage:
  ipfs:
    api:
      requestTimeout: 2s
    gateway:
      requestTimeout: 2s
ethconnect.yml
rest:
  rest-gateway:
    maxTXWaitTime: 120
    maxInFlight: 200
    alwaysManageNonce: true
    attemptGapFill: true
    sendConcurrency: 3
    gasEstimationFactor: 2.0
confirmations:
  required: 5
debug:
  port: 6000
instances.yml
stackJSONPath: /home/ubuntu/.firefly/stacks/1-2-2-perf-test/stack.json
wsConfig:
  wsPath: /ws
  readBufferSize: 16000
  writeBufferSize: 16000
  initialDelay: 250ms
  maximumDelay: 30s
  initialConnectAttempts: 5
  heartbeatInterval: 5s
instances:
  - name: long-run
    tests: [{"name": "msg_broadcast", "workers":50},{"name": "msg_private", "workers":50},{"name": "blob_broadcast", "workers":30},{"name": "blob_private", "workers":30},{"name": "custom_ethereum_contract", "workers":20},{"name": "token_mint", "workers":10}]
    length: 500h
    sender: 0
    recipient: 1
    messageOptions:
      longMessage: false
    tokenOptions:
      tokenType: fungible
    contractOptions: {"address": "0xfe1a8867fc460fe5696cb316b2649788b74ec46d"}
FireFly git commit:
d0fb82d64cfeb2848b0a32a6bc286d5b9ade87ea
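To get a feel for the load this configuration puts on the stack, it can help to total the workers across the test mix. The sketch below just sums the "workers" values from the tests line; `total_workers` is a hypothetical helper, and treating the sum as the number of concurrent workers assumes each test runs its workers in parallel.

```shell
#!/bin/bash
# total_workers: read an instances.yml-style "tests:" line on stdin and
# print the sum of all "workers" values found in it.
total_workers() {
  grep -o '"workers" *: *[0-9][0-9]*' |
    grep -o '[0-9][0-9]*$' |
    awk '{ sum += $1 } END { print sum + 0 }'
}

# The tests line from the configuration above.
tests_line='tests: [{"name": "msg_broadcast", "workers":50},{"name": "msg_private", "workers":50},{"name": "blob_broadcast", "workers":30},{"name": "blob_private", "workers":30},{"name": "custom_ethereum_contract", "workers":20},{"name": "token_mint", "workers":10}]'

printf '%s\n' "$tests_line" | total_workers   # prints 190
```

Running the same helper over the instances.yml of a later run (`grep 'tests:' instances.yml | total_workers`) is a quick way to confirm two runs used the same worker mix before comparing their results.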
Steps that I am following for validating migration scenarios:
Multiparty Tests
- [ ] Deploy a v1.2.2 stack
- [ ] Enable multiparty mode (run the ./hack/multiparty.sh script)
- [ ] Send a broadcast (using the sandbox)
- [ ] Send a private message
- [ ] Upgrade to v1.3.0
- [ ] Send a broadcast (still using the old batch pin contract)
- [ ] Upgrade the contract (run ./hack/multiparty.sh again)
- [ ] Send a broadcast
- [ ] Send a private message
- [ ] Verify transactions in FireFly UI
Tokens Tests
- [ ] After installing, deploy a token contract (via FF API)
- [ ] Mint some tokens
- [ ] Transfer one
- [ ] Burn one
- [ ] Transfer one with a message attached
- [ ] Verify transactions and balances in FireFly UI
- [ ] If using ERC20/721 connector, repeat for each
RC4 - Performance Testing
Have started an RC4 performance test with the options below.
Reference Performance test options
nohup ./start.sh &> ffperf.log &
core-config.yml
log:
  level: debug
broadcast:
  batch:
    size: 200
    timeout: 1s
privatemessaging:
  batch:
    size: 200
    timeout: 1s
message:
  writer:
    count: 5
download:
  worker:
    count: 100
publicstorage:
  ipfs:
    api:
      requestTimeout: 2s
    gateway:
      requestTimeout: 2s
ethconnect.yml
rest:
  rest-gateway:
    maxTXWaitTime: 120
    maxInFlight: 200
    alwaysManageNonce: true
    attemptGapFill: true
    sendConcurrency: 3
    gasEstimationFactor: 2.0
confirmations:
  required: 5
debug:
  port: 6000
instances.yml
stackJSONPath: /home/ubuntu/.firefly/stacks/enrique-test/stack.json
wsConfig:
  wsPath: /ws
  readBufferSize: 16000
  writeBufferSize: 16000
  initialDelay: 250ms
  maximumDelay: 30s
  initialConnectAttempts: 5
  heartbeatInterval: 5s
instances:
  - name: long-run
    tests: [{"name": "msg_broadcast", "workers":50},{"name": "msg_private", "workers":50},{"name": "blob_broadcast", "workers":30},{"name": "blob_private", "workers":30},{"name": "custom_ethereum_contract", "workers":20},{"name": "token_mint", "workers":10}]
    length: 500h
    sender: 0
    recipient: 1
    messageOptions:
      longMessage: false
    tokenOptions:
      tokenType: fungible
    contractOptions: {"address": "0xf4e5e921cf78de3c503623bb91230c4e54cf91cb"}
FireFly git commit:
577e8c47680c6230209a74829921a9c427766af8
Run Report RC4
Started: 10/04/24
Duration: ~5 hours
Git commit: https://github.com/hyperledger/firefly/commit/49410c52653e143e8a17bd9ab58ba2423f564714
Node Configuration
- 2 FireFly nodes on one virtual server (EC2 m4.xlarge)
- Entire FireFly stack is local to the server (ie both blockchains, Postgres databases, etc)
- Single geth node with 2 instances of ethconnect
- Maximum time to confirm before considering failure = 1 minute
Reference Performance test options
core-config.yml
log:
  level: debug
broadcast:
  batch:
    size: 200
    timeout: 1s
privatemessaging:
  batch:
    size: 200
    timeout: 1s
message:
  writer:
    count: 5
download:
  worker:
    count: 100
publicstorage:
  ipfs:
    api:
      requestTimeout: 2s
    gateway:
      requestTimeout: 2s
ethconnect.yml
rest:
  rest-gateway:
    maxTXWaitTime: 120
    maxInFlight: 200
    alwaysManageNonce: true
    attemptGapFill: true
    sendConcurrency: 3
    gasEstimationFactor: 2.0
debug:
  port: 6000
instances.yml
stackJSONPath: /home/ubuntu/.firefly/stacks/latest/stack.json
wsConfig:
  wsPath: /ws
  readBufferSize: 16000
  writeBufferSize: 16000
  initialDelay: 250ms
  maximumDelay: 30s
  initialConnectAttempts: 5
  heartbeatInterval: 5s
instances:
  - name: long-run
    tests: [{"name": "msg_broadcast", "workers":50},{"name": "msg_private", "workers":50},{"name": "blob_broadcast", "workers":30},{"name": "blob_private", "workers":30},{"name": "custom_ethereum_contract", "workers":20},{"name": "token_mint", "workers":10}]
    length: 500h
    sender: 0
    recipient: 1
    messageOptions:
      longMessage: false
    tokenOptions:
      tokenType: fungible
    contractOptions: {"address": "0x528adc5c826721ba6a40342ad5918a3499f9663c"}
FireFly git commit:
49410c52653e143e8a17bd9ab58ba2423f564714
NOTE: confirmations set to 0
Results
- Broadcast messages: 199,448
- Private messages: 235,242
- Token mints: 18,443
- Transactions: 89,198
- No errors
Summary result:
INFO[2024-04-09T16:32:43.924] Shutdown summary:
INFO[2024-04-09T16:32:43.924] - Prometheus metric sent_mints_total = 18447.000000
INFO[2024-04-09T16:32:43.924] - Prometheus metric sent_mint_errors_total = 0.000000
INFO[2024-04-09T16:32:43.924] - Prometheus metric mint_token_balance = 0.000000
INFO[2024-04-09T16:32:43.924] - Prometheus metric received_events_total = 1097482.000000
INFO[2024-04-09T16:32:43.924] - Prometheus metric incomplete_events_total = 0.000000
INFO[2024-04-09T16:32:43.924] - Prometheus metric delinquent_msgs_total = 0.000000
INFO[2024-04-09T16:32:43.924] - Prometheus metric actions_submitted_total = 530299.000000
INFO[2024-04-09T16:32:43.924] - Test duration: 4h58m57.296804144s
INFO[2024-04-09T16:32:43.924] - Measured actions: 1097105
INFO[2024-04-09T16:32:43.924] - Measured send TPS: 61.167763
INFO[2024-04-09T16:32:43.924] - Measured throughput: 61.163341
INFO[2024-04-09T16:32:43.924] - Measured send duration: min: 11.556354ms, max: 1.485517803s, avg: 151ms
INFO[2024-04-09T16:32:43.924] - Measured event receiving duration: min: 2.030708026s, max: 1m4.571629481s, avg: 6.414s
INFO[2024-04-09T16:32:43.924] - Measured total duration: min: 2.030708026s, max: 1m4.571629481s, avg: 6.414s
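As a sanity check on the summary above, the measured throughput appears to be simply measured actions divided by test duration. The sketch below redoes that arithmetic from the log values. Both function names are hypothetical, and the duration parser only understands the h/m/s form used in the "Test duration" line (not values such as 151ms).

```shell
#!/bin/bash
# Convert a Go-style duration such as "4h58m57.296804144s" to seconds.
# Only hours, minutes and seconds are handled; sub-second units are not.
duration_to_seconds() {
  printf '%s\n' "$1" | awk '{
    rest = $0; total = 0
    if (match(rest, /^[0-9.]+h/)) { total += substr(rest, 1, RLENGTH - 1) * 3600; rest = substr(rest, RLENGTH + 1) }
    if (match(rest, /^[0-9.]+m/)) { total += substr(rest, 1, RLENGTH - 1) * 60;   rest = substr(rest, RLENGTH + 1) }
    if (match(rest, /^[0-9.]+s/)) { total += substr(rest, 1, RLENGTH - 1) }
    printf "%.6f\n", total
  }'
}

# throughput ACTIONS DURATION -> actions per second, three decimal places.
throughput() {
  awk -v actions="$1" -v seconds="$(duration_to_seconds "$2")" \
    'BEGIN { if (seconds <= 0) exit 1; printf "%.3f\n", actions / seconds }'
}

# Values taken from the shutdown summary above.
throughput 1097105 4h58m57.296804144s
```

The result should land very close to the "Measured throughput" line in the log, which gives some confidence that the figure is an average over the whole run rather than a peak.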
Grafana results:
(Note I modified the Grafana dashboard to add the transfer submitted to the broadcast submitted ahead of https://github.com/hyperledger/firefly/pull/1490 getting merged)
I find the heatmap not particularly useful, so this is a view with histograms to see on average how long it takes to confirm:
Compared to the testing from 1.2, the numbers look slightly better based on the above testing.
I did notice that the TPS and time to confirm grow over time.
Run Report RC2
Started: 24/05/24
Duration: ~27 hours
Git commit: https://github.com/hyperledger/firefly/commit/b2f86880a109d17751c3481ea1a72c9a2e94dd28
Node Configuration
- 2 FireFly nodes on one virtual server (EC2 m4.xlarge)
- Entire FireFly stack is local to the server (ie both blockchains, Postgres databases, etc)
- Single geth node with 2 instances of ethconnect
- Maximum time to confirm before considering failure = 1 minute
Reference Performance test options
core-config.yml
log:
  level: debug
broadcast:
  batch:
    size: 200
    timeout: 1s
privatemessaging:
  batch:
    size: 200
    timeout: 1s
message:
  writer:
    count: 5
download:
  worker:
    count: 100
publicstorage:
  ipfs:
    api:
      requestTimeout: 2s
    gateway:
      requestTimeout: 2s
ethconnect.yml
rest:
  rest-gateway:
    maxTXWaitTime: 120
    maxInFlight: 200
    alwaysManageNonce: true
    attemptGapFill: true
    sendConcurrency: 3
    gasEstimationFactor: 2.0
debug:
  port: 6000
instances.yml
stackJSONPath: /home/ubuntu/.firefly/stacks/rc2-latest/stack.json
wsConfig:
  wsPath: /ws
  readBufferSize: 16000
  writeBufferSize: 16000
  initialDelay: 250ms
  maximumDelay: 30s
  initialConnectAttempts: 5
  heartbeatInterval: 5s
instances:
  - name: long-run
    tests: [{"name": "msg_broadcast", "workers":50},{"name": "msg_private", "workers":50},{"name": "blob_broadcast", "workers":30},{"name": "blob_private", "workers":30},{"name": "custom_ethereum_contract", "workers":20},{"name": "token_mint", "workers":10}]
    length: 500h
    sender: 0
    recipient: 1
    messageOptions:
      longMessage: false
    tokenOptions:
      tokenType: fungible
    contractOptions: {"address": "0x4c05f4e749304da29017e4eb3e5e2a4aaa84e637"}
    subscriptionOptions:
      batch: true
      batchTimeout: 250ms
      readAhead: 50
FireFly git commit:
b2f86880a109d17751c3481ea1a72c9a2e94dd28
NOTE: confirmations set to 0
Results
- Broadcast messages: 1,000,039
- Private messages: 1,424,155
- Token mints: 96,353
- Transactions: 540K
- No errors
Summary result:
INFO[2024-04-25T13:57:39.019] Shutdown summary:
INFO[2024-04-25T13:57:39.020] - Prometheus metric sent_mints_total = 96373.000000
INFO[2024-04-25T13:57:39.020] - Prometheus metric sent_mint_errors_total = 0.000000
INFO[2024-04-25T13:57:39.020] - Prometheus metric mint_token_balance = 0.000000
INFO[2024-04-25T13:57:39.020] - Prometheus metric received_events_total = 6008344.000000
INFO[2024-04-25T13:57:39.020] - Prometheus metric incomplete_events_total = 0.000000
INFO[2024-04-25T13:57:39.020] - Prometheus metric delinquent_msgs_total = 0.000000
INFO[2024-04-25T13:57:39.020] - Prometheus metric actions_submitted_total = 2907814.000000
INFO[2024-04-25T13:57:39.020] - Test duration: 27h32m18.966742004s
INFO[2024-04-25T13:57:39.020] - Measured actions: 6008025
INFO[2024-04-25T13:57:39.020] - Measured send TPS: 60.604479
INFO[2024-04-25T13:57:39.020] - Measured throughput: 60.602054
INFO[2024-04-25T13:57:39.020] - Measured send duration: min: 8.687266ms, max: 6.013594408s, avg: 174ms
INFO[2024-04-25T13:57:39.021] - Measured event receiving duration: min: 2.007866751s, max: 54.132299131s, avg: 6.473s
INFO[2024-04-25T13:57:39.021] - Measured total duration: min: 2.007866751s, max: 54.132299131s, avg: 6.473s
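For long runs like this one it is easier to pull the counters out of the shutdown summary than to read them off by eye. A small sketch that parses the "Prometheus metric" lines shown above and checks that the error-style counters are all zero; the helper names are hypothetical, and the log format is assumed to match the lines in this comment.

```shell
#!/bin/bash
# summary_metrics: read perf test log lines on stdin and print
# "<name> <value>" for every "Prometheus metric <name> = <value>" entry.
summary_metrics() {
  sed -n 's/.*Prometheus metric \([A-Za-z0-9_]*\) = \([0-9.]*\).*/\1 \2/p'
}

# metric_value NAME: print the numeric value of a single metric from stdin.
metric_value() {
  summary_metrics | awk -v name="$1" '$1 == name { print $2 + 0 }'
}

# run_is_clean: succeed only if every error, incomplete-event and
# delinquent-message counter in the summary is zero.
run_is_clean() {
  summary_metrics | awk '
    $1 ~ /(errors|incomplete_events|delinquent_msgs)/ && $2 + 0 != 0 { bad = 1 }
    END { exit bad + 0 }
  '
}
```

Assuming the run was started with the `nohup ./start.sh &> ffperf.log &` command used earlier, `run_is_clean < ffperf.log && echo "No errors"` gives a quick pass/fail for the "No errors" line in the results.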
Grafana results:
I find the heatmap not particularly useful, so this is a view with histograms to see on average how long it takes to confirm:
This is great! Thank you so much for all the work on this, @EnriqueL8