zstd
zstd copied to clipboard
Discrepancy in manual for dictionary loading
In zstd_manual.html:872, describing ZSTD_DCtx_loadDictionary:
The dictionary remains valid for all future frames, until explicitly invalidated
However, in ZSTD_CCtx_loadDictionary_advanced (zstd_compress.c:1123 which ZSTD_DCtx_loadDictionary wraps around), this seems not to be true:
size_t ZSTD_CCtx_loadDictionary_advanced(
ZSTD_CCtx* cctx, const void* dict, size_t dictSize,
ZSTD_dictLoadMethod_e dictLoadMethod, ZSTD_dictContentType_e dictContentType)
{
RETURN_ERROR_IF(cctx->streamStage != zcss_init, stage_wrong,
"Can't load a dictionary when ctx is not in init stage.");
DEBUGLOG(4, "ZSTD_CCtx_loadDictionary_advanced (size: %u)", (U32)dictSize);
ZSTD_clearAllDicts(cctx); /* in case one already exists */
It seems that if adictionary is already used within a context, when loading a new one, the previous dictionary gets cleared from the context. I assumed that it would be retained in the context based on the manual, since by loading the new dictionary, I am not explicitly invalidating the old dictionary. Am I correct in thinking this?
The context can only manage one dictionary. So, when a new dictionary get inserted, the old one get evicted.
There is also a difference between compression and decompression sides, which are quoted in this issue.
On the decompression side, it's possible, and was considered, to manage multiple dictionaries simultaneously. In which case, it would be possible to stack multiple DDict references in the context.
The problem is that this capability has not been developed yet.
On the compression side though, this makes less sense, since one dictionary must be selected for a given compression job. Therefore it has always been expected that referencing a new dictionary would evict the previous one.
However, in
ZSTD_CCtx_loadDictionary_advanced(zstd_compress.c:1123whichZSTD_DCtx_loadDictionarywraps around),
This sentence puzzles me. Decompression code should never depend on compression code, and vice versa. Can it be double-checked ?
I'll clarify the docs slightly.
In which case, it would be possible to stack multiple DDict references in the context. The problem is that this capability has not been developed yet.
It is enabled if you set ZSTD_d_refMultipleDDicts on your `dctx.