[Question] What are the risks associated with the Java API?
Search before asking
- [X] I searched in the issues and found nothing similar.
Motivation
https://paimon.apache.org/docs/0.8/program-api/java-api/ comes with a warning at the top:
> We do not recommend using the Paimon API naked, unless you are a professional downstream ecosystem developer, and even if you do, there will be significant difficulties. If you are only using Paimon, we strongly recommend using computing engines such as Flink SQL or Spark SQL. The following documents are not detailed and are for reference only.
Can you elaborate on the difficulties that will be encountered?
Solution
No response
Anything else?
No response
Are you willing to submit a PR?
- [x] I'm willing to submit a PR!
The main difficulty is deciding where each class should be used and where each method should be called.
For example, consider a distributed system with one master node and several worker nodes. TableScan should only be used on the master, while TableRead and TableWrite should only be used on the workers. You also need to design how the Splits generated by TableScan are distributed to the workers. Finally, you need to be careful with TableCommit, because it can only run with a parallelism of 1 (otherwise the consistency guarantee is broken).
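The division of roles described above can be sketched roughly as follows. This is a toy model in plain Java, not Paimon code: `RoleSeparationSketch`, `assignRoundRobin`, and `processOnWorker` are hypothetical names, the splits and commit messages are plain strings, and round-robin is just one possible assignment strategy. It only illustrates the shape of the problem: plan once on the master, fan the splits out to the workers, and collect the results back into a single committer.

```java
import java.util.ArrayList;
import java.util.List;

public class RoleSeparationSketch {

    /** Master side (hypothetical): spread the planned splits over the workers, round-robin. */
    public static <T> List<List<T>> assignRoundRobin(List<T> splits, int workerCount) {
        if (workerCount <= 0) {
            throw new IllegalArgumentException("workerCount must be positive");
        }
        List<List<T>> perWorker = new ArrayList<>();
        for (int w = 0; w < workerCount; w++) {
            perWorker.add(new ArrayList<>());
        }
        for (int i = 0; i < splits.size(); i++) {
            perWorker.get(i % workerCount).add(splits.get(i));
        }
        return perWorker;
    }

    /** Worker side (hypothetical): handle the assigned splits and return one commit message. */
    public static String processOnWorker(int workerId, List<String> assigned) {
        return "worker-" + workerId + ":" + assigned.size();
    }

    public static void main(String[] args) {
        // Master: stands in for the list of splits a scan would plan.
        List<String> splits = List.of("s0", "s1", "s2", "s3", "s4");
        List<List<String>> perWorker = assignRoundRobin(splits, 2);

        // Workers: in a real system each iteration would run on a different node.
        List<String> commitMessages = new ArrayList<>();
        for (int w = 0; w < perWorker.size(); w++) {
            commitMessages.add(processOnWorker(w, perWorker.get(w)));
        }

        // Single committer: all messages are gathered and committed in one place.
        System.out.println("committing " + commitMessages);
    }
}
```

In a real deployment, the parts that this sketch skips (serializing splits and commit messages across the network, retrying failed workers, and making sure only one committer is ever active) are where most of the effort goes.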
All in all, these are exactly the things you need to be concerned with when designing a distributed system.