Very slow performance (~2MB/s), process eventually killed
Hi there,
I'm using 0.10.0 with the following config:
# encryption_key: $MY_PRIVATE_ENC_KEY # optional - encrypt data on datastore
source:
  connection_uri: postgres://XXXX:[email protected]:5432/XXXX # you can use $DATABASE_URL
datastore:
  gcp:
    bucket: XXXX
    region: us-east1
    access_key: XXXX
    secret: XXXX
destination:
  connection_uri: postgres://XXXX:[email protected]:3000/XXXX # you can use $DATABASE_URL
The source database is remote (hosted on GCP Cloud SQL); the destination database is running locally in Docker.
The source database is approximately 15GB on disk and runs at under 5% CPU utilization. While replibyte is running, I see network activity below 2MB/s. Eventually, the process is killed:
[1] 28147 killed replibyte -c replibyte.conf.yaml dump create
Hi, I am working on improving the overall performance. Watch #257
Confirming the same issue. It doesn't seem to matter where replibyte is running: I also get ~2MB/s, and then the process is killed. FWIW, I'm seeing this on MySQL.
Hi @ikegentz and @samcfinan, FYI, I'm allocating some time now to improve performance. I found the cause: the main issue is how the parsers work. I'm currently experimenting with nom, and I'm getting more than 600MB/s of IO read. I'll keep you posted.
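The thread doesn't show the parser changes themselves. As a rough illustration of the general idea of parsing a dump incrementally instead of holding it all in memory, here is a std-only Rust sketch. It is not Replibyte's code and it does not use nom. `for_each_statement` is a hypothetical helper name, and the SQL handling is deliberately simplified: it assumes UTF-8 input and only tracks single-quoted strings.

```rust
use std::io::{self, BufRead};

/// Calls `on_statement` once for every `;`-terminated statement read from
/// `reader` and returns how many statements were seen.
///
/// Hypothetical helper for illustration only. Simplifications: input must be
/// valid UTF-8, and only single-quoted strings (including `''` escapes) are
/// tracked; comments, dollar quoting and backslash escapes are not handled.
pub fn for_each_statement<R: BufRead, F: FnMut(&str)>(
    mut reader: R,
    mut on_statement: F,
) -> io::Result<usize> {
    let mut line = String::new(); // reused buffer for the current line
    let mut statement = String::new(); // only the statement being assembled
    let mut in_string = false;
    let mut count = 0;

    loop {
        line.clear();
        if reader.read_line(&mut line)? == 0 {
            break; // end of input
        }
        for ch in line.chars() {
            statement.push(ch);
            match ch {
                // A doubled quote ('') toggles twice, so we stay inside the string.
                '\'' => in_string = !in_string,
                // A semicolon outside a string literal ends the statement.
                ';' if !in_string => {
                    on_statement(statement.trim());
                    statement.clear();
                    count += 1;
                }
                _ => {}
            }
        }
    }

    // Emit a trailing statement that has no terminating semicolon.
    if !statement.trim().is_empty() {
        on_statement(statement.trim());
        count += 1;
    }
    Ok(count)
}
```

To try it on a real dump, wrap a `std::fs::File` in `std::io::BufReader` and pass it as `reader`. Memory use is then bounded by the longest single statement rather than by the size of the dump.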
That's awesome!! Have you considered pest as an alternative?
pest is a good option as well, but:
- I haven't read good things about pest's performance.
- They admit that their performance is lower :D
Performance is key for Replibyte.
Yes, that's true. A highly optimized nom parser would be great 🙌. Ping me if I can help with the implementation.
Thx!