gobblin
Gobblin 1320 add iceberg writer
Dear Gobblin maintainers,
This PR is still under development. Any comments and advice are welcome!
JIRA
- [ ] My PR addresses the following Gobblin JIRA issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
- https://issues.apache.org/jira/browse/GOBBLIN-1320
Description
- [ ] Here are some details about my PR, including screenshots (if applicable): This PR adds an Iceberg module that enables Gobblin to write data in the Iceberg table format. It wraps the Iceberg task writer in an FsDataWriter and accepts the Avro, ORC, and Parquet formats. This PR only addresses the writer part of the Iceberg module; the source part will follow in another PR.
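The wrapping approach described above can be illustrated with a minimal adapter sketch. This is not the code in this PR: `TaskWriter`, `DataWriter`, `WrappingDataWriter`, `InMemoryTaskWriter`, and `FileFormat` are hypothetical stand-ins defined here for illustration only, and they do not mirror the actual Gobblin or Iceberg signatures.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** The three file formats the description says the writer accepts. */
enum FileFormat {
  AVRO, ORC, PARQUET;

  /** Parses a config value such as "parquet"; rejects anything else. */
  static FileFormat fromConfig(String value) {
    try {
      return valueOf(value.trim().toUpperCase(Locale.ROOT));
    } catch (IllegalArgumentException e) {
      throw new IllegalArgumentException("Unsupported format: " + value, e);
    }
  }
}

/** Hypothetical stand-in for a table-format task writer (assumption). */
interface TaskWriter<T> {
  void write(T record) throws IOException;

  /** Finishes writing and returns the paths of the completed data files. */
  List<String> complete() throws IOException;
}

/** Hypothetical stand-in for the framework-facing writer contract (assumption). */
interface DataWriter<T> {
  void write(T record) throws IOException;

  long recordsWritten();

  void commit() throws IOException;
}

/** Adapter that exposes a TaskWriter through the DataWriter contract. */
final class WrappingDataWriter<T> implements DataWriter<T> {
  private final TaskWriter<T> delegate;
  private final List<String> committedFiles = new ArrayList<>();
  private long count;

  WrappingDataWriter(TaskWriter<T> delegate) {
    this.delegate = delegate;
  }

  @Override
  public void write(T record) throws IOException {
    delegate.write(record); // forward the record to the wrapped writer
    count++;                // count only records the delegate accepted
  }

  @Override
  public long recordsWritten() {
    return count;
  }

  @Override
  public void commit() throws IOException {
    // Completing the delegate yields the data files to be committed.
    committedFiles.addAll(delegate.complete());
  }

  List<String> committedFiles() {
    return Collections.unmodifiableList(committedFiles);
  }
}

/** In-memory TaskWriter used only to exercise the adapter. */
final class InMemoryTaskWriter implements TaskWriter<String> {
  private final FileFormat format;
  private final List<String> buffer = new ArrayList<>();

  InMemoryTaskWriter(FileFormat format) {
    this.format = format;
  }

  @Override
  public void write(String record) {
    buffer.add(record);
  }

  @Override
  public List<String> complete() {
    // Pretend one data file was produced, named after the chosen format.
    return Collections.singletonList(
        "data-00000." + format.name().toLowerCase(Locale.ROOT));
  }
}
```

Keeping the format choice in an enum means an unsupported value fails fast at configuration time instead of partway through a write.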
Tests
- [ ] My PR adds the following unit tests OR does not need testing for this extremely good reason:
Commits
- [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
- Subject is separated from body by a blank line
- Subject is limited to 50 characters
- Subject does not end with a period
- Subject uses the imperative mood ("add", not "adding")
- Body wraps at 72 characters
- Body explains "what" and "why", not "how"
Codecov Report
Merging #3184 (579aadc) into master (fb5e40f) will increase coverage by 0.24%. The diff coverage is 29.62%.
@@ Coverage Diff @@
## master #3184 +/- ##
============================================
+ Coverage 45.94% 46.18% +0.24%
- Complexity 9602 9762 +160
============================================
Files 1997 2019 +22
Lines 76097 77180 +1083
Branches 8469 8563 +94
============================================
+ Hits 34960 35648 +688
- Misses 37873 38226 +353
- Partials 3264 3306 +42
| Impacted Files | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| ...pache.gobblin/writer/IcebergDataWriterBuilder.java | 0.00% <0.00%> (ø) | 0.00 <0.00> (?) | |
| ...in/java/org.apache.gobblin/writer/IcebergUtil.java | 0.00% <0.00%> (ø) | 0.00 <0.00> (?) | |
| .../java/org.apache.gobblin/writer/IcebergWriter.java | 0.00% <0.00%> (ø) | 0.00 <0.00> (?) | |
| ...che.gobblin/writer/IcebergFileAppenderFactory.java | 32.25% <32.25%> (ø) | 2.00 <2.00> (?) | |
| ...pache.gobblin/writer/IcebergTaskWriterFactory.java | 87.50% <87.50%> (ø) | 3.00 <3.00> (?) | |
| ...apache/gobblin/runtime/api/JobCatalogListener.java | 76.92% <0.00%> (-23.08%) | 0.00% <0.00%> (ø%) | |
| ...n/runtime/job_catalog/JobCatalogListenersList.java | 63.63% <0.00%> (-10.05%) | 10.00% <0.00%> (ø%) | |
| ...pache/gobblin/cluster/JobConfigurationManager.java | 81.39% <0.00%> (-6.11%) | 10.00% <0.00%> (ø%) | |
| .../apache/gobblin/runtime/api/MutableJobCatalog.java | 81.25% <0.00%> (-5.42%) | 0.00% <0.00%> (ø%) | |
| ...ache/gobblin/cluster/GobblinHelixJobScheduler.java | 34.48% <0.00%> (-4.74%) | 6.00% <0.00%> (ø%) | |
| ... and 54 more | | | |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data.
Last update fb5e40f...579aadc.
Also, there's a Travis failure.
@autumnust, @sv2000, any plans to continue this PR?
@jhsenjaliya - It would be good to revive this PR. Are you interested in taking it up? Happy to discuss how this PR can be improved.
Sure, let's talk about it next week.