[Feature Request]: PostgreSQL Bulk Loader needs commit size?

Open • zamanganji opened this issue 1 month ago • 1 comment

What would you like to happen?

We use an MQTT broker to ingest sensor data into a new PostgreSQL database. The Apache Hop (2.9) pipeline runs continuously, and I monitor the incoming data by running a query. While the pipeline is running, newly ingested rows are not visible; they only appear once the pipeline is stopped. I worked around this by replacing the PostgreSQL Bulk Loader with the Table Output step, using a commit size of 10. It would be better if the PostgreSQL Bulk Loader had a commit size option as well.

Issue Priority

Priority: 3

Issue Component

Component: Database

Nov 27 '25 12:11 zamanganji

Are you expecting a volume that needs a bulk loader? The bulk loader creates a file in the backend and then loads it with the PostgreSQL COPY command. The command is executed only once the input stream finishes and the file is fully created. Because your pipeline runs continuously, the stream never finishes, which is why you are not seeing any records in the DB until you stop it.

In theory, we could add a splitting option that executes the COPY command and starts a new file each time a certain "commit size" is reached. As a workaround, you could add a Pipeline Executor that executes for every X rows and use the Bulk Loader inside the pipeline it executes. That pipeline then runs once per batch of received records.
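To make the splitting idea concrete, here is a minimal Python sketch of the general technique, not of how Hop implements the Bulk Loader. It groups incoming rows into batches of `commit_size` and hands each batch, serialized as CSV, to a callback that would run one COPY and commit. The names `batched`, `to_csv`, `load_in_batches`, and `copy_batch` are hypothetical and chosen only for illustration.

```python
import csv
import io
from itertools import islice
from typing import Callable, Iterable, Iterator, List, Sequence

Row = Sequence[object]


def batched(rows: Iterable[Row], size: int) -> Iterator[List[Row]]:
    """Yield consecutive lists of at most `size` rows from `rows`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def to_csv(batch: List[Row]) -> io.StringIO:
    """Serialize one batch as CSV text, rewound so it can be read from the start."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(batch)
    buf.seek(0)
    return buf


def load_in_batches(
    rows: Iterable[Row],
    commit_size: int,
    copy_batch: Callable[[io.StringIO], None],
) -> int:
    """Call `copy_batch` once per batch of `commit_size` rows; return the row count.

    `copy_batch` is a hypothetical hook: it is expected to run one COPY
    and commit, so each batch becomes visible to other sessions
    without waiting for the input stream to end.
    """
    total = 0
    for batch in batched(rows, commit_size):
        copy_batch(to_csv(batch))
        total += len(batch)
    return total
```

With psycopg2, for example, `copy_batch` could call `cursor.copy_expert("COPY sensor_data FROM STDIN WITH (FORMAT csv)", buf)` followed by `connection.commit()`, where `sensor_data` is an assumed table name. A smaller `commit_size` makes rows visible sooner at the cost of more COPY round trips.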

Nov 27 '25 12:11 hansva