feat: Add image hashing functions with support for 5 algorithms
Changes Made
Adds image hashing functionality to Daft with support for 5 algorithms: average, perceptual, difference, wavelet, and crop_resistant.
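For readers unfamiliar with the algorithm names, here is a minimal pure-Python sketch of the idea behind average and difference hashing. It is an illustration only, not the Rust implementation in this PR: the helper names `average_hash` and `difference_hash` are hypothetical, and the sketch assumes the image has already been reduced to a small grayscale grid (real implementations typically resize and convert to grayscale first).

```python
def average_hash(pixels):
    """Hypothetical helper: one bit per pixel, set when the pixel is above the mean.

    `pixels` is a small grayscale grid given as a list of rows of 0-255 ints.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)


def difference_hash(pixels):
    """Hypothetical helper: one bit per horizontal neighbor pair.

    A bit is set when the left pixel is brighter than its right neighbor.
    """
    return "".join(
        "1" if left > right else "0"
        for row in pixels
        for left, right in zip(row, row[1:])
    )


# A tiny 4x4 grid standing in for an already-downscaled grayscale image.
grid = [
    [10, 200, 20, 210],
    [15, 205, 25, 215],
    [200, 10, 210, 20],
    [205, 15, 215, 25],
]

print(average_hash(grid))     # 0101010110101010
print(difference_hash(grid))  # 010010101101
```

Both hashes depend only on relative brightness, which is why small changes such as re-encoding or mild rescaling tend to leave most bits unchanged.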
API Usage
from daft.functions import image_hash
from daft import col
# Default algorithm (average)
df = df.with_column("hash", image_hash(col("image")))
# Specific algorithm
df = df.with_column("hash", image_hash(col("image"), "perceptual"))
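Perceptual hashes are usually compared by Hamming distance rather than equality. The sketch below assumes the hashes are available as equal-length bit strings; this description does not state what type `image_hash` returns, so treat the representation and the `hamming_distance` helper as assumptions for illustration.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Hypothetical helper: count positions where two equal-length bit strings differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Two made-up 16-bit hashes that differ in a single bit.
original = "0101010110101010"
near_duplicate = "0101010110101110"

print(hamming_distance(original, near_duplicate))  # 1
```

A small distance suggests visually similar images; the threshold that counts as "similar" depends on the algorithm and hash length and has to be chosen per use case.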
Implementation
- New daft.functions.image_hash() function
- Rust backend implementation in daft-image
- 12 comprehensive tests covering all algorithms
- Proper error handling and type validation
Related Issues
https://github.com/Eventual-Inc/Daft/issues/4889
Checklist
- [ ] Documented in API Docs (if applicable)
- [ ] Documented in User Guide (if applicable)
- [ ] If adding a new documentation page, doc is added to docs/mkdocs.yml navigation
- [ ] Documentation builds and is formatted properly (tag @/ccmao1130 for docs review)
Codecov Report
:x: Patch coverage is 80.89552% with 64 lines in your changes missing coverage. Please review.
:white_check_mark: Project coverage is 74.50%. Comparing base (70116a6) to head (3f993b0).
:warning: Report is 1 commits behind head on main.
Additional details and impacted files
@@ Coverage Diff @@
## main #5229 +/- ##
==========================================
+ Coverage 74.48% 74.50% +0.01%
==========================================
Files 969 970 +1
Lines 124225 124558 +333
==========================================
+ Hits 92535 92803 +268
- Misses 31690 31755 +65
| Files with missing lines | Coverage Δ | |
|---|---|---|
| daft/expressions/expressions.py | 97.05% <ø> (ø) | |
| daft/functions/__init__.py | 100.00% <100.00%> (ø) | |
| daft/functions/image.py | 93.10% <100.00%> (+0.51%) | :arrow_up: |
| daft/series.py | 92.77% <ø> (ø) | |
| src/daft-image/src/functions/mod.rs | 100.00% <100.00%> (ø) | |
| src/daft-image/src/series.rs | 75.90% <67.74%> (-1.88%) | :arrow_down: |
| src/daft-image/src/functions/hash.rs | 73.21% <73.21%> (ø) | |
| src/daft-image/src/ops.rs | 75.15% <83.95%> (+8.48%) | :arrow_up: |
hey @codekshitij i added comments on the other PR that got closed. Could you address these issues?
https://github.com/Eventual-Inc/Daft/pull/5227
Hey, I got all your suggestions and will work on them ASAP.
Thank you all for the review. And I'll fix it all ASAP. @rchowell @srilman @universalmind303
Hi @codekshitij 👋 Just checking in -- how is this one going?
Sorry for the delay, I'll commit the changes by the end of this week.
Hey @codekshitij - Checking in on this.
Hey @codekshitij - Checking to see if you are able to push this through? Thanks!