ValueError: Buffer dtype mismatch when running anonlink.candidate_generation.find_candidate_pairs on AWS Glue

Open · bllmo opened this issue 2 years ago · 1 comment

When running the following code snippet on AWS Glue:

import anonlink

results_candidate_pairs = anonlink.candidate_generation.find_candidate_pairs(
    [
        ...
    ],
    [
        ...
    ],  # comma was missing here in the original snippet
    anonlink.similarities.dice_coefficient_accelerated,
    0.9,
)

I encounter the following error:

ValueError: Buffer dtype mismatch, expected 'const char' but got 'signed char'

I tried using anonlink.similarities.dice_coefficient_accelerated_python as an alternative, and it did not produce the error. However, this alternative is significantly slower, making it impractical for large datasets.
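For anyone trying to reproduce this, the sketch below collects environment details that may be relevant to a dtype mismatch raised from a compiled extension. It is only a diagnostic aid, not a fix. The helper name `describe_environment` and the package list (`anonlink`, `bitarray`, `numpy`) are assumptions chosen for illustration, and the idea that CPU architecture plays a role is a guess that has not been confirmed in this thread. It assumes Python 3.8 or later, since it relies on `importlib.metadata`.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages=("anonlink", "bitarray", "numpy")):
    """Collect interpreter, platform, and package versions for a bug report.

    The default package list is an assumption; pass a different tuple to
    report on other distributions.
    """
    info = {
        "python": sys.version.split()[0],
        "machine": platform.machine(),  # e.g. x86_64 or aarch64
        "system": platform.system(),
    }
    for name in packages:
        try:
            info[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            info[name] = "not installed"
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this inside the Glue job and on a machine where the accelerated function works would show whether the two environments differ in architecture or package versions.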

bllmo · Sep 21 '23 15:09

I ran into a similar problem in https://github.com/data61/anonlink/issues/566 and put up a PR there that hopefully fixes it. However, I'm not clear on what the contribution guidelines are, so I'm not sure how to move it forward.

snazzer · May 21 '24 13:05