transactional-datalake-using-apache-iceberg-on-aws-glue
How to handle multiple tables from database source
Hi, any tips for using this example in a real-world case where we have multiple tables to synchronize from the database source? Thanks for your help. Benoit COLAS
Can someone please respond to this man?
There are two ways to handle multiple tables from a database source. First, you can replicate this data pipeline for each table. The other way is to set up AWS DMS to read CDC data from multiple tables (for more information, see AWS DMS - Wildcards in table mapping) and have a single AWS Glue streaming job upsert the streaming CDC data into multiple Apache Iceberg tables.
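For the second option, here is a minimal sketch of a DMS table mapping that uses a wildcard to select every table in one schema. The schema name `testdb` is an assumption for illustration; replace it with your own. The resulting JSON string is what you would supply as the table mappings of the replication task.

```python
import json

# Hypothetical source schema name; replace with your own database/schema.
SOURCE_SCHEMA = "testdb"

# A single selection rule; "%" is the DMS wildcard that matches any table name.
table_mappings = {
    "rules": [
        {
            "rule-type": "selection",
            "rule-id": "1",
            "rule-name": "include-all-tables",
            "object-locator": {
                "schema-name": SOURCE_SCHEMA,
                "table-name": "%",
            },
            "rule-action": "include",
        }
    ]
}

# Serialized form to use as the task's table mappings.
table_mappings_json = json.dumps(table_mappings, indent=2)
```

To limit replication to a subset of tables, you could narrow the pattern (for example a common prefix followed by `%`) or add one selection rule per table.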
AWS DMS can capture binlog changes from many tables in a single database. If the Glue streaming job script then repeats the upsert for each source table, one job can handle multiple tables.
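The per-table upsert described above could be sketched roughly as follows. This is plain Python, not the script from this repository: it assumes each CDC record has the DMS streaming shape with a `data` object and a `metadata` object carrying `record-type`, `operation`, and `table-name`. The table names, catalog name, keys, and columns in `TABLE_CONFIG`, and the helper names, are hypothetical.

```python
from collections import defaultdict

# Hypothetical per-table settings: target Iceberg table, primary key, columns.
TABLE_CONFIG = {
    "orders": {
        "target": "glue_catalog.cdc_db.orders",
        "key": "order_id",
        "columns": ["order_id", "status"],
    },
    "customers": {
        "target": "glue_catalog.cdc_db.customers",
        "key": "customer_id",
        "columns": ["customer_id", "name"],
    },
}


def group_by_table(records):
    """Split one micro-batch of CDC records into per-table row lists."""
    grouped = defaultdict(list)
    for rec in records:
        meta = rec.get("metadata", {})
        # Skip control records; only data records carry row changes.
        if meta.get("record-type") != "data":
            continue
        table = meta.get("table-name")
        if table not in TABLE_CONFIG:
            continue
        row = dict(rec["data"])
        row["_op"] = meta.get("operation")  # insert / update / delete / load
        grouped[table].append(row)
    return dict(grouped)


def build_merge_sql(table, source_view):
    """Build a Spark SQL MERGE INTO statement for one Iceberg table."""
    cfg = TABLE_CONFIG[table]
    key = cfg["key"]
    cols = cfg["columns"]
    set_clause = ", ".join(f"t.{c} = s.{c}" for c in cols if c != key)
    insert_cols = ", ".join(cols)
    insert_vals = ", ".join(f"s.{c}" for c in cols)
    return (
        f"MERGE INTO {cfg['target']} t USING {source_view} s "
        f"ON t.{key} = s.{key} "
        "WHEN MATCHED AND s._op = 'delete' THEN DELETE "
        f"WHEN MATCHED THEN UPDATE SET {set_clause} "
        "WHEN NOT MATCHED AND s._op != 'delete' "
        f"THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )
```

In the Glue streaming job, the idea would be to loop over the tables inside the micro-batch handler: filter the batch to one table, register it as a temporary view, and run the generated statement with `spark.sql(...)`. If a batch can contain several changes for the same key, keep only the latest change per key before merging, since a MERGE generally fails when more than one source row matches a single target row.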