bug: concurrent map read and map write
Expected Behavior
I ran into a weird error in a test run while using the temporal auto-setup Docker image, version 1.19.1. I don't think I'm doing anything special, just starting some basic containers.
Actual Behavior
The following logs show how temporal failed to start: https://gist.github.com/vikstrous2/7d016b5562903b723d93b6a403589620
Steps to Reproduce the Problem
Start temporal from this docker-compose file over and over again until this error triggers:
version: '3.4'
services:
  temporal-db:
    image: postgres:9.6.24-alpine@sha256:8342bcb43446694428ec6594e72e4299692854f0fc3aca090b0ab46f4c7f32a1
    restart: unless-stopped
    environment:
      POSTGRES_PASSWORD: temporal
      POSTGRES_USER: temporal
    ports:
      - 5434:5432
    healthcheck:
      interval: 1000h
      test: 'true'
  temporal:
    image: temporalio/auto-setup:1.19.1@sha256:3b582c47c354e7f9958c098f168ceb514766ab93526e9be1d772179663710d0f
    restart: unless-stopped
    depends_on:
      - temporal-db
    environment:
      - DB=postgresql
      - DB_PORT=5432
      - POSTGRES_USER=temporal
      - POSTGRES_PWD=temporal
      - POSTGRES_SEEDS=temporal-db
    ports:
      - 7233:7233
    healthcheck:
      interval: 1000h
      test: 'true'
Specifications
- Version: 1.19.1
- Platform: docker
This seems to come from ringpop (which uses tchannel). We plan to deprecate ringpop and replace it with a better-maintained membership library.
This regularly crashes our CI environments, in roughly 1 of every 20 pipeline executions.
Crashing our CI environments as well. More detail here -> https://github.com/temporalio/cli/issues/212
It also crashed our CI a few times per week. I wish there was a way to control/catch the error before it happens.
What's the progress on this?
bump