[Bug] text_encoder tests are not verifying the resulting embeddings to avoid floating point rounding errors

Open daw3rd opened this issue 1 year ago • 3 comments

Search before asking

  • [X] I searched the issues and found no similar issues.

Component

Transforms/Other

What happened + What you expected to happen

The current tests for the new text_encoder in PR #461 skip the comparison of the "embeddings" column so that they do not fail on floating point variability. This should be fixed by either rounding the embeddings before comparing them or adding support to the core framework for fuzzy matching on vectors of floats.
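
To illustrate the fuzzy-match option, here is a minimal sketch of a tolerance-based comparison of two embedding columns, using only the Python standard library. The function names (`vectors_close`, `embeddings_column_close`) and the default tolerances are hypothetical and chosen for illustration; they are not part of the data-prep-kit test framework, and suitable tolerances would depend on the model and hardware.

```python
import math
from typing import Sequence


def vectors_close(
    expected: Sequence[float],
    actual: Sequence[float],
    rel_tol: float = 1e-5,  # assumed tolerance, not taken from the project
    abs_tol: float = 1e-8,  # assumed tolerance, needed for values near zero
) -> bool:
    """Return True if both vectors have the same length and every pair of
    components agrees within the given tolerances."""
    if len(expected) != len(actual):
        return False
    return all(
        math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for e, a in zip(expected, actual)
    )


def embeddings_column_close(
    expected_col: Sequence[Sequence[float]],
    actual_col: Sequence[Sequence[float]],
    rel_tol: float = 1e-5,
    abs_tol: float = 1e-8,
) -> bool:
    """Compare two columns of embedding vectors row by row (hypothetical helper)."""
    if len(expected_col) != len(actual_col):
        return False
    return all(
        vectors_close(e, a, rel_tol, abs_tol)
        for e, a in zip(expected_col, actual_col)
    )


# Small differences in the last digits are accepted ...
expected = [[0.12345678, -0.5], [1.0, 2.0]]
actual = [[0.12345679, -0.5000001], [1.0, 2.0]]
assert embeddings_column_close(expected, actual)

# ... while a genuinely different embedding is rejected.
assert not embeddings_column_close(expected, [[0.2, -0.5], [1.0, 2.0]])
```

Rounding both sides before an exact comparison is the other option mentioned above. One caveat is that two nearly identical values can still round differently when they straddle a rounding boundary, which a tolerance check avoids.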

PR #461 is being approved with the caveat that this issue must be addressed after merging.

Reproduction script

none

Anything else

no

OS

Ubuntu

Python

3.11.x

Are you willing to submit a PR?

  • [ ] Yes I am willing to submit a PR!

daw3rd, Aug 07 '24 17:08

@dolfim-ibm Is this issue solved?

Bytes-Explorer, Sep 04 '24 11:09

No, I don't think anything was done here.

dolfim-ibm, Sep 04 '24 11:09

@daw3rd is this still an issue?

agoyal26, Mar 24 '25 08:03