Change the way to calculate content md5 in localUnderFileSystem

Open Jackson-Wang-7 opened this issue 1 year ago • 3 comments

What changes are proposed in this pull request?

Change how localUnderFileSystem calculates the content hash of a file so that it returns the MD5 of the file content.

Why are the changes needed?

It is better to compare a file's MD5 in hexadecimal format than in Base64 format.
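
To illustrate the difference between the two formats, here is a minimal sketch that computes an MD5 digest with the JDK's `MessageDigest` and renders the same 16 bytes as hex and as Base64. It is not the code from this PR; the class name `Md5Encodings` and its helper methods are hypothetical and exist only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Hypothetical helper for illustration only; not part of Alluxio.
final class Md5Encodings {
  private Md5Encodings() {}

  // Returns the raw 16-byte MD5 digest of the given content.
  static byte[] md5(byte[] content) {
    try {
      return MessageDigest.getInstance("MD5").digest(content);
    } catch (NoSuchAlgorithmException e) {
      // Every Java platform is required to support MD5.
      throw new IllegalStateException(e);
    }
  }

  // Renders digest bytes as lowercase hexadecimal (two characters per byte).
  static String toHex(byte[] digest) {
    StringBuilder sb = new StringBuilder(digest.length * 2);
    for (byte b : digest) {
      sb.append(String.format("%02x", b & 0xff));
    }
    return sb.toString();
  }

  // Renders the same digest bytes as standard Base64.
  static String toBase64(byte[] digest) {
    return Base64.getEncoder().encodeToString(digest);
  }

  public static void main(String[] args) {
    byte[] digest = md5("hello".getBytes(StandardCharsets.UTF_8));
    System.out.println(toHex(digest));
    System.out.println(toBase64(digest));
  }
}
```

Both strings describe the same bytes, so a string comparison between a hex value and a Base64 value fails even when the underlying content is identical.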

Does this PR introduce any user facing changes?

no

Jackson-Wang-7 avatar Jul 07 '23 08:07 Jackson-Wang-7

Automated checks report:

  • PR title follows the conventions: FAIL
    • The title of the PR does not pass all the checks. Please fix the following issues:
      • First word must be capitalized
  • Commits associated with Github account: PASS

Some checks failed. Please fix the reported issues and reply 'alluxio-bot, check this please' to re-run checks.

alluxio-bot avatar Jul 07 '23 08:07 alluxio-bot

Automated checks report:

  • PR title follows the conventions: PASS
  • Commits associated with Github account: PASS

All checks passed!

alluxio-bot avatar Jul 07 '23 08:07 alluxio-bot

Different storage systems encode the content hash differently (e.g., S3 uses hex while GCS uses Base64), so I recommend changing the content hash to a proto CheckSum('AlgorithmName', 'bytes') so that we can compare hashes when necessary. I can draft the code if you guys are OK with that, @Jackson-Wang-7 @elega.

jja725 avatar Jul 07 '23 18:07 jja725
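
The idea in the comment above, storing the algorithm name together with the raw digest bytes, can be sketched in plain Java as follows. This is only an illustration under assumptions: the `Checksum` class, its `fromHex` and `fromBase64` factories, and the case-insensitive algorithm match are hypothetical, and this is not the proposed proto message or any existing Alluxio API.

```java
import java.util.Arrays;
import java.util.Base64;
import java.util.Locale;

// Hypothetical value type for illustration: algorithm name plus raw digest bytes.
final class Checksum {
  private final String algorithm;
  private final byte[] bytes;

  Checksum(String algorithm, byte[] bytes) {
    // Normalize the name so "md5" and "MD5" compare equal (an assumption of this sketch).
    this.algorithm = algorithm.toUpperCase(Locale.ROOT);
    this.bytes = bytes.clone();
  }

  // Builds a checksum from a hex string, the encoding the comment attributes to S3.
  static Checksum fromHex(String algorithm, String hex) {
    if (hex.length() % 2 != 0) {
      throw new IllegalArgumentException("Hex string must have even length");
    }
    byte[] out = new byte[hex.length() / 2];
    for (int i = 0; i < out.length; i++) {
      int hi = Character.digit(hex.charAt(2 * i), 16);
      int lo = Character.digit(hex.charAt(2 * i + 1), 16);
      if (hi < 0 || lo < 0) {
        throw new IllegalArgumentException("Invalid hex character");
      }
      out[i] = (byte) ((hi << 4) | lo);
    }
    return new Checksum(algorithm, out);
  }

  // Builds a checksum from a Base64 string, the encoding the comment attributes to GCS.
  static Checksum fromBase64(String algorithm, String base64) {
    return new Checksum(algorithm, Base64.getDecoder().decode(base64));
  }

  // Two checksums match only if both the algorithm and the raw bytes match.
  @Override
  public boolean equals(Object o) {
    if (!(o instanceof Checksum)) {
      return false;
    }
    Checksum other = (Checksum) o;
    return algorithm.equals(other.algorithm) && Arrays.equals(bytes, other.bytes);
  }

  @Override
  public int hashCode() {
    return 31 * algorithm.hashCode() + Arrays.hashCode(bytes);
  }
}
```

Because the comparison runs on decoded bytes, a hex-encoded hash and a Base64-encoded hash of the same content compare equal, while hashes produced by different algorithms never do.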