clickhouse-docs
Suggestion for "Bulk loading data..." section in inserting-data.md
I was reading through your docs and had some suggestions for improving the PeerDB addition to /en/guides/inserting-data.md. I tried to create a PR to fix this, but it doesn't look like ClickHouse accepts external PRs on the docs. It looks like @mshustov made these changes (commit).
Small change / suggestion, but here's the markdown and file. Hope this is helpful!
## Bulk loading data from PostgreSQL
For bulk loading data from PostgreSQL, users can use:
- [PeerDB by ClickHouse](/en/integrations/postgresql#using-peerdb-by-clickhouse), an ETL tool specifically designed for PostgreSQL database replication to both self-hosted ClickHouse and ClickHouse Cloud. To get started, create an account on [PeerDB Cloud](https://www.peerdb.io/) and refer to [the documentation](https://docs.peerdb.io/connect/clickhouse/clickhouse-cloud) for setup instructions.
- The [PostgreSQL table engine](/en/integrations/postgresql#using-the-postgresql-table-engine) to read data directly, as shown in previous examples. This is typically appropriate if batch replication based on a known watermark, e.g., a timestamp, is sufficient, or if it's a one-off migration. This approach can scale to tens of millions of rows. Users looking to migrate larger datasets should consider multiple requests, each dealing with a chunk of the data. Staging tables can be used for each chunk prior to its partitions being moved to a final table; this allows failed requests to be retried (a minimal sketch is shown after this list). For further details on this bulk-loading strategy, see here.
- Data can be exported from PostgreSQL in CSV format. This can then be inserted into ClickHouse from either local files or via object storage using table functions (see the CSV sketch below).
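
A minimal sketch of the chunked staging approach is below, assuming a hypothetical `orders` table in PostgreSQL, an existing ClickHouse target table `orders_final` partitioned by month, and placeholder connection details; none of these names come from the original guide:

```sql
-- Hypothetical chunked bulk load via the postgresql() table function.
-- Host, credentials, table names, and the watermark range are placeholders.
-- Assumes orders_final already exists, partitioned by toYYYYMM(created_at).
CREATE TABLE orders_staging AS orders_final;

-- Load one watermark-bounded chunk into the staging table; if this fails,
-- it can be retried without touching the final table.
INSERT INTO orders_staging
SELECT *
FROM postgresql('postgres-host:5432', 'mydb', 'orders', 'pg_user', 'pg_password')
WHERE created_at >= '2024-01-01' AND created_at < '2024-02-01';

-- Once the chunk is verified, move its partition into the final table.
ALTER TABLE orders_staging MOVE PARTITION 202401 TO TABLE orders_final;
```

For the CSV route, a sketch assuming the export has been uploaded to an S3 bucket (bucket URL, file layout, and format are placeholders; a private bucket would also need credentials passed to the `s3` function):

```sql
-- Hypothetical CSV import from object storage using the s3 table function.
INSERT INTO orders_final
SELECT *
FROM s3('https://my-bucket.s3.amazonaws.com/exports/orders_*.csv', 'CSVWithNames');
```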
:::note Need help inserting large datasets?
If you need help inserting large datasets or encounter any errors when importing data into ClickHouse Cloud, please contact us at [email protected] and we can assist.
:::