WARNING:pgsync.sync: Checkpoint file not found
PGSync version: Latest git branch 2.5.0
Postgres version: 13
Elasticsearch version: 8.9.2
Redis version: docker redis:alpine
Python version: docker python:3.11-slim
Problem Description: Checkpoint file keeps getting deleted.
Error Message (if any): 2023-09-09 06:57:28.047:WARNING:pgsync.sync: Checkpoint file not found: /usr/src/app/.fusionpbx_v_xml_cdr
Dockerfile
FROM python:3.11-slim
WORKDIR /usr/src/app
RUN apt update \
&& apt install -y moreutils \
&& apt install -y jq \
&& apt install -y git \
&& apt install -y wait-for-it
RUN pip install git+https://github.com/toluaina/pgsync.git
ENTRYPOINT ["bash", "./entrypoint.sh"]
docker-compose.yml
version: '3.8'
services:
  elasticsearch:
    image: elasticsearch:8.9.2
    container_name: elasticsearch
    environment:
      - xpack.security.enabled=false
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - ES_JAVA_OPTS=-Xms512m -Xmx512m
    ulimits:
      memlock:
        soft: -1
        hard: -1
    ports:
      - '9200:9200'
    volumes:
      # mkdir elasticsearch && chown -R 1000:1000 elasticsearch
      - ./elasticsearch:/usr/share/elasticsearch/data
  redis:
    image: 'redis:alpine'
    container_name: redis
  pgsync:
    build:
      context: .
      dockerfile: Dockerfile
    container_name: pgsync
    sysctls:
      - net.ipv4.tcp_keepalive_time=200
      - net.ipv4.tcp_keepalive_intvl=200
      - net.ipv4.tcp_keepalive_probes=5
    volumes:
      # chown 1000:1000 pgsync
      - ./pgsync:/usr/src/app
    environment:
      - PG_USER=fusionpbx
      - PG_PASSWORD=XXX
      - PG_DATABASE=fusionpbx
      - PG_HOST=XXX
      - PG_PORT=5432
      - ELASTICSEARCH_HOST=elasticsearch
      - ELASTICSEARCH_PORT=9200
      - REDIS_HOST=redis
      - REDIS_PORT=6379
      - ELASTICSEARCH=true
      - OPENSEARCH=false
      - CHECKPOINT_PATH=/usr/src/app
    depends_on:
      - redis
      - elasticsearch
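A note on the config above: CHECKPOINT_PATH points at /usr/src/app, which is bind-mounted from ./pgsync on the host, so the checkpoint file is expected to survive container restarts. A quick way to confirm that from the host (paths taken from this compose file) is something like:

# the checkpoint file should be visible on the host via the bind mount
ls -la ./pgsync/.fusionpbx_v_xml_cdr
docker compose down
# ...and should still be there after the stack is stopped
ls -la ./pgsync/.fusionpbx_v_xml_cdr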
entrypoint.sh
#!/usr/bin/env bash
wait-for-it $PG_HOST:$PG_PORT -t 60
wait-for-it $REDIS_HOST:$REDIS_PORT -t 60
wait-for-it $ELASTICSEARCH_HOST:$ELASTICSEARCH_PORT -t 60
jq '.[].database = env.PG_DATABASE' schema.json | sponge schema.json
bootstrap -c ./schema.json -t
bootstrap --config ./schema.json
pgsync --config ./schema.json -d
schema.json
[
  {
    "database": "fusionpbx",
    "index": "v_xml_cdr",
    "nodes": {
      "table": "v_xml_cdr",
      "schema": "public",
      "columns": [
        "network_addr",
        "billsec",
        "hangup_cause",
        "domain_name",
        "domain_uuid",
        "start_epoch",
        "start_stamp",
        "end_epoch",
        "end_stamp",
        "direction",
        "extension_uuid",
        "source_number",
        "caller_id_number",
        "caller_id_name",
        "destination_number"
      ],
      "children": [
        {
          "table": "v_extensions",
          "columns": [
            "effective_caller_id_name",
            "effective_caller_id_number",
            "outbound_caller_id_name",
            "outbound_caller_id_number"
          ],
          "relationship": {
            "variant": "object",
            "type": "one_to_one",
            "foreign_key": {
              "child": [
                "extension_uuid"
              ],
              "parent": [
                "extension_uuid"
              ]
            }
          }
        },
        {
          "table": "v_domains",
          "columns": [
            "domain_description"
          ],
          "relationship": {
            "variant": "object",
            "type": "one_to_one",
            "foreign_key": {
              "child": [
                "domain_uuid"
              ],
              "parent": [
                "domain_uuid"
              ]
            }
          }
        }
      ]
    }
  }
]
docker compose up brings up the stack and syncs fine; I just want it to keep working after reboots without resyncing the entire dataset again.
After the first startup it takes around 10 minutes to sync (lots of data) and creates a checkpoint file, .fusionpbx_v_xml_cdr, which I can see at the mount point.
After docker compose down the checkpoint file still exists, but on the next startup pgsync deletes this file and starts the sync from scratch, even duplicating data, meaning that the data in Elasticsearch is unreliable and incorrect.
http://localhost:9200/_cat/indices
yellow open v_xml_cdr p1umAsPtRxSmS-67EZqlSQ 1 1 3302175 0 937.5mb 937.5mb
After a second run, this grows to roughly double the size...
What is your best advice?
After first run and before shutdown:
After second startup:
Thanks in advance!!
May I stress, it's working perfectly, and I'm using the data in Grafana to build dashboards;
it's just that restarting the stack causes duplication of data.
Great project! Big fan 👍
I checked the indices after about 10 minutes:
yellow open v_xml_cdr p1umAsPtRxSmS-67EZqlSQ 1 1 3322957 257988 1023.5mb 1023.5mb
It seems to have shrunk back down, but there still appears to be duplicate data, as the original sync wasn't that big.
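For what it's worth, one way to confirm whether there really are duplicates is to compare the index document count with the source row count (index and table names taken from the schema above; adjust the connection details to your environment):

# document count reported by Elasticsearch
curl -s 'http://localhost:9200/v_xml_cdr/_count' | jq '.count'
# row count in the source table
psql -h "$PG_HOST" -U "$PG_USER" -d "$PG_DATABASE" -c 'SELECT count(*) FROM public.v_xml_cdr;'

If the document count is much higher than the row count, there really are duplicate documents in the index.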
- I think it's hard to tell from the data above why the index is so large.
- There is some denormalization happening when you go from relational to Elasticsearch/OpenSearch, so that is probably why data is getting duplicated.
- Regarding the checkpoint: can you ensure there is a file named /usr/src/app/.fusionpbx_v_xml_cdr that is accessible from the container? (For example, with the quick check below.)
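For example (service name taken from the compose file above):

# check that the checkpoint file exists and is readable inside the running container
docker compose exec pgsync ls -la /usr/src/app/.fusionpbx_v_xml_cdr
docker compose exec pgsync cat /usr/src/app/.fusionpbx_v_xml_cdr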
Sorry about the delay.
Hi @toluaina,
I'm experiencing similar issues. PGSync itself works great, but any restart causes a full sync, which takes some time; with larger datasets even up to 3+ hours.
When I test it locally:
- if there is no checkpoint file, pgsync logs it and creates one
- if the files are there, it deletes them and performs a full sync anyway
- I can see the files in the container and can read them
How can I verify if it actually starts from where it left off?
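One rough way to check, assuming a setup like the compose file above, is to look at the checkpoint file and the logs around a restart:

# note the checkpoint file before restarting
docker compose exec pgsync ls -la /usr/src/app/.fusionpbx_v_xml_cdr
docker compose restart pgsync
# if pgsync resumed from the checkpoint, the file is still there and the logs
# contain no "Checkpoint file not found" warning; a full re-sync deletes and
# recreates it instead
docker compose exec pgsync ls -la /usr/src/app/.fusionpbx_v_xml_cdr
docker compose logs pgsync | grep -i checkpoint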
I just found out that calling the bootstrap process deletes the checkpoint files. Should bootstrap be called only on demand, when new indices are added or the schema is changed? If I run just the pgsync command without calling bootstrap first, it immediately starts a continuous sync.
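If that turns out to be the cause, one possible workaround (just a sketch, using the checkpoint filename from the warning above) is to guard the bootstrap call in the entrypoint so it only runs when no checkpoint exists yet, and to run it manually whenever the schema actually changes:

#!/usr/bin/env bash
# only bootstrap on the very first start (no checkpoint yet); otherwise go
# straight to pgsync so it can resume from the stored checkpoint
if [ ! -f /usr/src/app/.fusionpbx_v_xml_cdr ]; then
  bootstrap --config ./schema.json
fi
pgsync --config ./schema.json -d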