
[GOBBLIN-425] Partitionaware Hive Registration Policy

Open treff7es opened this issue 6 years ago • 4 comments

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

JIRA

  • [ ] My PR addresses the following Gobblin JIRA issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    • https://issues.apache.org/jira/browse/GOBBLIN-425

Description

This is a partition-aware implementation of HiveRegistrationPolicyBase. You can specify hive.partition.regex, where the first capture group matches the table location and the remaining capture groups match the partition values. The partition names are specified with the hive.table.partition.keys property. The order of the partition names must match the order of the capture groups in the hive.partition.regex expression.

For example, for the path s3://testbucket/myawesomelogs/compacted/dt=20170101/hr=22/ the properties would look like: hive.partition.regex=(s3://testbucket/myawesomelogs/compacted/)dt=(.*)/hr=(.*) and hive.table.partition.keys=dt,hr
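To illustrate how the capture groups line up with the partition keys, here is a minimal standalone sketch using only java.util.regex. It is not the code from this PR: the class PartitionRegexSketch, the Parsed holder, and the parse method are hypothetical names made up for illustration. It also assumes the character class [^/]+ for the partition values, so that the trailing slash in the sample path is not captured as part of the hr value.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of Gobblin or of this PR.
public class PartitionRegexSketch {

    /** Hypothetical holder for the table location and the partition key/value pairs. */
    public static final class Parsed {
        public final String tableLocation;
        public final Map<String, String> partitions;

        Parsed(String tableLocation, Map<String, String> partitions) {
            this.tableLocation = tableLocation;
            this.partitions = partitions;
        }
    }

    /**
     * Group 1 of the regex is taken as the table location; groups 2..n are
     * paired, in order, with the comma-separated partition keys.
     */
    public static Parsed parse(String regex, String partitionKeys, String path) {
        List<String> keys = Arrays.asList(partitionKeys.split(","));
        Matcher m = Pattern.compile(regex).matcher(path);
        if (!m.find()) {
            throw new IllegalArgumentException("Path does not match regex: " + path);
        }
        // One group for the location plus one group per partition key.
        if (m.groupCount() != keys.size() + 1) {
            throw new IllegalArgumentException(
                "Expected " + (keys.size() + 1) + " capture groups, found " + m.groupCount());
        }
        Map<String, String> partitions = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            partitions.put(keys.get(i).trim(), m.group(i + 2));
        }
        return new Parsed(m.group(1), partitions);
    }

    public static void main(String[] args) {
        Parsed p = parse(
            "(s3://testbucket/myawesomelogs/compacted/)dt=([^/]+)/hr=([^/]+)",
            "dt,hr",
            "s3://testbucket/myawesomelogs/compacted/dt=20170101/hr=22/");
        System.out.println(p.tableLocation); // s3://testbucket/myawesomelogs/compacted/
        System.out.println(p.partitions);    // {dt=20170101, hr=22}
    }
}
```

The group-count check makes a mismatch between the regex and the key list fail fast instead of silently dropping or misaligning partitions.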

Tests

  • [ ] My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

treff7es • Mar 12 '18 15:03

@treff7es @autumnust ^^

abti • May 04 '18 18:05

@autumnust @deepak-batra Sorry for the long radio silence; I have finally updated the pull request. Could you please check it again?

treff7es • Jul 23 '18 14:07

@abti FYI, I have updated this pull request. Sorry for the delay.

treff7es • Jul 25 '18 05:07

@abti ping :)

treff7es • Nov 08 '18 13:11