GOBBLIN-759: Added feature to support DistCP to copy files that were modified in last n days

Open amarnathkarthik opened this issue 6 years ago • 6 comments

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

  • [x] My PR addresses the following Gobblin JIRA issues and references them in the PR title. For example, "[GOBBLIN-759] Added feature to support DistCP to copy files modified in last n days"
    • https://issues.apache.org/jira/browse/GOBBLIN-759

Description

  • [x] Here are some details about my PR, including screenshots (if applicable):
  1. Added a feature to DistCP only the files that were modified in the last n days, within the lookback period.
  2. This feature copies only the modified files, even when the non-modified files are not present at the destination.
  3. Leverages the existing TimestampBasedCopyableDataset to find the dataset and uses SelectBtwModDataTimeBasedCopyableFileFilter, a CopyableFilter implementation, to select the files that were modified in the last n days.
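
For readers unfamiliar with the idea, here is a minimal sketch of a "modified in the last n days" filter in plain Java. It does not use the Gobblin API: `FileEntry`, `selectModifiedWithin`, and the inclusive handling of the cutoff are assumptions made for illustration, and the actual SelectBtwModDataTimeBasedCopyableFileFilter may behave differently.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ModifiedWithinLookbackSketch {

  /** Hypothetical stand-in for a copyable file: a path plus its modification time. */
  static final class FileEntry {
    final String path;
    final Instant modifiedAt;

    FileEntry(String path, Instant modifiedAt) {
      this.path = path;
      this.modifiedAt = modifiedAt;
    }
  }

  /** Keeps only the entries modified within the last {@code days} days, counted back from {@code now}. */
  static List<FileEntry> selectModifiedWithin(List<FileEntry> files, int days, Instant now) {
    Instant cutoff = now.minus(Duration.ofDays(days));
    List<FileEntry> selected = new ArrayList<>();
    for (FileEntry f : files) {
      // Assumption: the window is inclusive at both ends, so future timestamps are skipped.
      if (!f.modifiedAt.isBefore(cutoff) && !f.modifiedAt.isAfter(now)) {
        selected.add(f);
      }
    }
    return selected;
  }

  public static void main(String[] args) {
    Instant now = Instant.parse("2019-05-14T00:00:00Z");
    List<FileEntry> files = Arrays.asList(
        new FileEntry("/data/recent/part-0", now.minus(Duration.ofDays(1))),
        new FileEntry("/data/old/part-0", now.minus(Duration.ofDays(13))));
    // With a 2-day window, only the file modified one day ago is selected.
    for (FileEntry f : selectModifiedWithin(files, 2, now)) {
      System.out.println(f.path);
    }
  }
}
```

The selection depends only on the source file's modification time, not on what already exists at the destination, which matches point 2 above: an unmodified file is skipped even if it has never been copied.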

Tests

  • [x] My PR adds the following unit tests OR does not need testing for this extremely good reason:
  1. Added the TimestampBasedCopyableDatasetTest.testCopyWithFilter test case, which covers a scenario with one modified and one non-modified file.

Commits

  • [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

amarnathkarthik avatar May 14 '19 00:05 amarnathkarthik

@sv2000 @htran1 @jhsenjaliya created New PR. Please review

amarnathkarthik avatar May 14 '19 05:05 amarnathkarthik

will continue review tomorrow....

jhsenjaliya avatar May 29 '19 06:05 jhsenjaliya

@jhsenjaliya Pushed the changes, please review

amarnathkarthik avatar Jun 13 '19 00:06 amarnathkarthik

Codecov Report

No coverage uploaded for pull request base (master@bca2e1f). The diff coverage is 0%.

@@           Coverage Diff            @@
##             master   #2633   +/-   ##
========================================
  Coverage          ?   4.13%           
  Complexity        ?     751           
========================================
  Files             ?    1937           
  Lines             ?   72988           
  Branches          ?    8051           
========================================
  Hits              ?    3017           
  Misses            ?   69652           
  Partials          ?     319

Impacted Files | Coverage Δ | Complexity Δ
...sion/finder/HdfsModifiedTimeHiveVersionFinder.java | 23.07% <ø> (ø) | 1 <0> (?)
...writer/partitioner/TimeBasedWriterPartitioner.java | 0% <ø> (ø) | 0 <0> (?)
...he/gobblin/cluster/TaskRunnerSuiteThreadModel.java | 0% <ø> (ø) | 0 <0> (?)
.../java/org/apache/gobblin/hive/HiveLockFactory.java | 0% <ø> (ø) | 0 <0> (?)
...lin/hive/metastore/HiveMetaStoreBasedRegister.java | 0% <ø> (ø) | 0 <0> (?)
...pache/gobblin/configuration/ConfigurationKeys.java | 0% <ø> (ø) | 0 <0> (?)
.../org/apache/gobblin/hive/HiveRegistrationUnit.java | 0% <ø> (ø) | 0 <0> (?)
.../org/apache/gobblin/service/ServiceConfigKeys.java | 0% <ø> (ø) | 0 <0> (?)
...ain/java/org/apache/gobblin/writer/DataWriter.java | 0% <ø> (ø) | 0 <0> (?)
...ain/java/org/apache/gobblin/hive/HiveLockImpl.java | 0% <ø> (ø) | 0 <0> (?)
... and 129 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update bca2e1f...c3dc277.

codecov-io avatar Feb 15 '20 19:02 codecov-io

@sv2000 Please review

amarnathkarthik avatar Feb 15 '20 19:02 amarnathkarthik

+1 LGTM

arjun4084346 avatar Feb 29 '20 01:02 arjun4084346