keyvi icon indicating copy to clipboard operation
keyvi copied to clipboard

Breaking change in v0.6.4

Open remusao opened this issue 7 months ago • 4 comments

Describe the bug

There is a change of behavior between v0.6.3 and v0.6.4 that should be considered a breaking change. Previously dict values would be returned from get(...).value, and v0.6.4 now returns str values instead.

To Reproduce

Here is a simple script to reproduce of the breaking change:

from keyvi.compiler import StringDictionaryCompiler
from keyvi.dictionary import Dictionary

# Build keyvi file
test = StringDictionaryCompiler()
test.Add("foo", "{}")
test.Compile()
test.WriteToFile("./test.kv")

# Read keyvi file
keyvi_file = Dictionary("./test.kv")
print(type(keyvi_file.get("foo").value))

Expected behavior In v0.6.3 the script would print <class 'dict'> (value would be parsed from JSON to a Python dict) While in v0.6.4 the same script now prints <class 'str'> (the value is now returned as a Python str)

Additional context It's unclear to me if the new behavior is the intended one or not but regardless I believe this is a breaking change that should probably not be shipped as patch release.

remusao avatar May 15 '25 14:05 remusao

Thanks for the report.

Definitely not intended, will look into it.

Gut-feeling: For a StringDictionary it is maybe better to return str, but as you pointed out, this should not be changed in a patch release.

Out of curiosity: Did you consider using a JsonDictionary? And if not, what is the reason? As far as I remember StringDictionary is one of the older dictionary types, implemented before JsonDictionary. JsonDictionary usually is the more modern implementation which e.g. supports compression. But maybe I miss something, that's why I am asking.

hendrikmuhs avatar May 16 '25 08:05 hendrikmuhs

Thanks for the quick response on this. It also seems to me like the new behavior is more correct but I suspect there will be existing code making use of StringDictionary which will break. I don't think there is a good reason not to use JsonDictionary (at least that I can remember); is the only difference the support for compression and the fact it auto-handles JSON values?

remusao avatar May 16 '25 08:05 remusao

Thanks for the quick response on this. It also seems to me like the new behavior is more correct but I suspect there will be existing code making use of StringDictionary which will break

Agree, I will revert to the old behavior and as mentioned in #104 rather create a new version of StringDictionary, which will return str and support compression.

is the only difference the support for compression and the fact it auto-handles JSON values?

The storage layer is also different. With StringDictionary values are stored as json strings, JsonDictionary encodes JSON as msgpack. That means keyvi files created using JsonDictionary should be smaller then keyvi files created using StringDictionary. There should also be a small performance difference on lookup: decoding a msgpack (binary) string should be faster than decoding a json string.

hendrikmuhs avatar May 16 '25 08:05 hendrikmuhs

Good to know, thanks for the clarifications.

remusao avatar May 16 '25 08:05 remusao

took some time until I finally had time to investigate.

@remusao how much trouble does this change cause? I guess it caused some undesired effect and a maybe a long debugging session till you find it.

However, what is the current state?

  • did you implemented a change on your side to adjust to the new behavior? or
  • is it (still) blocking going forward from 0.6.3 to 0.6.4?

Fixing this would require adding an ugly workaround. If possible I would like to avoid that.

hendrikmuhs avatar Jul 11 '25 07:07 hendrikmuhs

Hi @hendrikmuhs, and apologies for not answering this, the notification got lost for a while.

how much trouble does this change cause? I guess it caused some undesired effect and a maybe a long debugging session till you find it.

Correct, it did break at least one or two use-cases we have for keyvi. For safety we would probably need to rebuild/test each use-case with the latest keyvi release to make sure everything works as expected. Not a massive deal since we can stick to an older version but it will still take some time/work.

is it (still) blocking going forward from 0.6.3 to 0.6.4?

At the moment we are sticking to the last release before that change.

remusao avatar Oct 18 '25 09:10 remusao

Thanks for feedback, I think I somehow misunderstood: This happens for all json like strings, not only the empty ("{}") one.

Will revisit what I wrote in https://github.com/KeyviDev/keyvi/issues/333#issuecomment-2886036882

hendrikmuhs avatar Oct 18 '25 11:10 hendrikmuhs