gobblin
[GOBBLIN-2147] Added lookback time fetch in partitioned filesource
Dear Gobblin maintainers,
Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!
JIRA
- [✅] My PR addresses the following Gobblin JIRA issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
- https://issues.apache.org/jira/browse/GOBBLIN-2147
Description
- [✅] Here are some details about my PR, including screenshots (if applicable):
- In partitioned-file-source-based copy, the copy.lookbackTime property was ignored even when it was set in the config. Files were processed starting from the lowest watermark (if one was passed; otherwise the default value was used), which can cause far too many files to be processed when the granularity isn't configured properly.
- With this change, the user can pass either "copy.lookbackTime" or "date.partitioned.source.lookback.time".
- If the user doesn't pass a value, the value is invalid, or an exception is thrown while parsing the property value, the source falls back to the watermark values.
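For reviewers, the fallback behaviour described above can be sketched roughly as follows. This is an illustrative sketch only, not the code in this PR: the class and method names (`LookbackResolver`, `parseLookback`, `resolveStartMillis`) are hypothetical, the "number plus `d`/`h` suffix" value format is an assumption, and so is the precedence of "copy.lookbackTime" over "date.partitioned.source.lookback.time". Only the two property names come from the description.

```java
import java.time.Duration;
import java.util.Properties;

// Hypothetical helper, not part of Gobblin: resolves where file discovery should start.
final class LookbackResolver {
    static final String COPY_LOOKBACK_KEY = "copy.lookbackTime";
    static final String SOURCE_LOOKBACK_KEY = "date.partitioned.source.lookback.time";

    // Assumed value format: a positive integer followed by 'd' (days) or 'h' (hours), e.g. "2d".
    static Duration parseLookback(String value) {
        String v = value.trim().toLowerCase();
        if (v.length() < 2) {
            throw new IllegalArgumentException("Lookback too short: " + value);
        }
        // Long.parseLong throws NumberFormatException on a non-numeric prefix.
        long amount = Long.parseLong(v.substring(0, v.length() - 1));
        if (amount <= 0) {
            throw new IllegalArgumentException("Lookback must be positive: " + value);
        }
        switch (v.charAt(v.length() - 1)) {
            case 'd': return Duration.ofDays(amount);
            case 'h': return Duration.ofHours(amount);
            default: throw new IllegalArgumentException("Unknown lookback unit: " + value);
        }
    }

    // Returns now minus the lookback when a usable value is configured,
    // otherwise the low watermark (the pre-existing behaviour).
    static long resolveStartMillis(Properties props, long lowWatermarkMillis, long nowMillis) {
        // Assumption: the first key wins when both are set.
        String raw = props.getProperty(COPY_LOOKBACK_KEY, props.getProperty(SOURCE_LOOKBACK_KEY));
        if (raw == null) {
            return lowWatermarkMillis; // property not passed
        }
        try {
            return nowMillis - parseLookback(raw).toMillis();
        } catch (RuntimeException e) {
            return lowWatermarkMillis; // invalid value or parse failure
        }
    }
}
```

With a sketch like this, a misconfigured lookback never fails the job; it only reverts to the watermark-based start, which matches the fallback described in the last bullet.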
Tests
- [✖️] My PR adds the following unit tests OR does not need testing for this extremely good reason:
Commits
- [✅] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
- Subject is separated from body by a blank line
- Subject is limited to 50 characters
- Subject does not end with a period
- Subject uses the imperative mood ("add", not "adding")
- Body wraps at 72 characters
- Body explains "what" and "why", not "how"
Codecov Report
All modified and coverable lines are covered by tests :white_check_mark:
Project coverage is 43.11%. Comparing base (adef734) to head (876246f). Report is 18 commits behind head on master.
Additional details and impacted files
@@ Coverage Diff @@
## master #4044 +/- ##
============================================
- Coverage 45.38% 43.11% -2.28%
+ Complexity 3192 2468 -724
============================================
Files 696 511 -185
Lines 26628 21501 -5127
Branches 2655 2457 -198
============================================
- Hits 12085 9270 -2815
+ Misses 13542 11286 -2256
+ Partials 1001 945 -56