
Fix/result embedding data encoding

Open swarnaprakash opened this issue 1 month ago • 4 comments

Pull Request check-list

  • [Y] Do tests and lints pass with this change? Ran the linter and corrected the changed lines.
  • [Y] Do the CI tests pass with this change (enable it first in your forked repo and wait for the GitHub action build to finish)? See https://github.com/swarnaprakash/valkey-py/actions/runs/19217879094 . The failed tests are unrelated to this change.
  • [Y] Is the new or changed code fully tested? Added tests for the new methods and enabled them (even though existing tests don't pass).
  • [Y] Is a documentation update included (if this change modifies existing APIs, or introduces new ones)? _The docstring is updated with the new parameters._
  • [Y] Is there an example added to the examples folder (if applicable)? NA

Description of change

See https://github.com/valkey-io/valkey-py/issues/242

The Result class in valkey/commands/search/result.py inappropriately applies UTF-8 decoding to all field values, including binary vector data. This corrupts VECTOR field embeddings and makes valkey-py unsuitable for vector search applications.

This change adds preserve_bytes and binary_fields parameters to the search methods so that field values holding binary data, such as FLOAT32 vector embeddings in VECTOR fields, are returned as raw bytes instead of being passed through UTF-8 decoding.
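To illustrate why decoding damages embeddings, here is a minimal standalone sketch that uses only the Python standard library and does not call valkey-py. It assumes the decode is lossy (`errors="ignore"`); the exact decode call in result.py is not quoted in this description, so treat that detail as an assumption. The sample vector is made up for illustration.

```python
import struct

# A hypothetical 4-dimensional FLOAT32 embedding, packed little-endian,
# which is how vector fields are commonly stored as raw bytes.
embedding = [0.1, -0.5, 3.25, 1e-3]
raw = struct.pack("<4f", *embedding)

# Assumed behaviour: a lossy UTF-8 decode that silently drops invalid bytes.
decoded = raw.decode("utf-8", "ignore")
round_tripped = decoded.encode("utf-8")

# Bytes that are not valid UTF-8 were discarded, so the payload no longer
# has the 16 bytes needed to unpack four float32 values.
print("original length:", len(raw))
print("after decode/encode:", len(round_tripped))
print("corrupted:", round_tripped != raw)

# Keeping the value as bytes lets the vector be recovered intact.
recovered = struct.unpack("<4f", raw)
print("recovered from raw bytes:", recovered)
```

Any float32 whose byte pattern is not valid UTF-8 is affected, which in practice covers most real embeddings, so returning binary fields undecoded is the safer default for vector data.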

swarnaprakash, Nov 05 '25 23:11