zstd
[Not a bug] Dictionary building strategy
Describe the bug
Hi @Cyan4973, sorry for filing this as a bug; I don't have another way to reach you or the team to get this information.
I would like to digest a single dictionary that I can use for all users.
1. The blobs I want to compress vary by item type: documents, folders, and, within documents, pdf/docx/ppt/jpeg/video and so on. There are multiple types.
2. Each blob is a key-value object, but not a JSON object.
3. Within a given type, the content of the blob varies from item to item. For example, if I have two photos (photo1 and photo2), the content for photo1 differs from the content for photo2. The keys are very likely the same, though at times they may differ slightly.
With this setup:
- If I want to train a dictionary (using the API ZDICT_trainFromBuffer), is it enough to choose one document of each type, or do I have to train on multiple files of the same type?
- I tried training with about 700 samples whose total size came to ~12 MB (the sum of the sample sizes). Should I pass dictBufferCapacity as 12 MB, or use the default value of 110 KB?
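To make the two questions concrete, here is a minimal sketch of how I understand the inputs map onto the call. As far as I can tell from zdict.h, the samples are passed as one flat buffer with the samples stored back to back, plus an array of per-sample sizes, while dictBufferCapacity is the capacity of the output dictionary buffer rather than the size of the training data. The names SampleSet, pack_samples, free_samples, train_dictionary and the WITH_ZDICT macro are hypothetical helpers for illustration only; the actual training call is compiled only when WITH_ZDICT is defined and libzstd is linked.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* 110 KB, the default dictionary size mentioned in the question. */
#define DICT_CAPACITY ((size_t)110 * 1024)

/* Hypothetical container for the packed training input. */
typedef struct {
    void    *buffer;  /* all samples stored back to back */
    size_t  *sizes;   /* sizes[i] = length of sample i in bytes */
    unsigned count;   /* number of samples */
    size_t   total;   /* sum of all sample sizes */
} SampleSet;

/* Pack `count` blobs into one contiguous buffer plus a sizes array.
   Returns 0 on success, -1 on allocation failure. */
int pack_samples(const void *const *blobs, const size_t *lens,
                 unsigned count, SampleSet *out)
{
    assert(out != NULL);

    size_t total = 0;
    for (unsigned i = 0; i < count; i++) total += lens[i];

    out->buffer = malloc(total ? total : 1);
    out->sizes  = malloc((count ? count : 1) * sizeof(size_t));
    if (out->buffer == NULL || out->sizes == NULL) {
        free(out->buffer);
        free(out->sizes);
        return -1;
    }

    size_t offset = 0;
    for (unsigned i = 0; i < count; i++) {
        if (lens[i] > 0) memcpy((char *)out->buffer + offset, blobs[i], lens[i]);
        out->sizes[i] = lens[i];
        offset += lens[i];
    }
    out->count = count;
    out->total = total;
    return 0;
}

void free_samples(SampleSet *s)
{
    free(s->buffer);
    free(s->sizes);
    s->buffer = NULL;
    s->sizes = NULL;
}

#ifdef WITH_ZDICT
#include <zdict.h>

/* Train into a caller-provided buffer of dictBufferCapacity bytes
   (e.g. DICT_CAPACITY). Returns the dictionary size, or 0 on error. */
size_t train_dictionary(const SampleSet *s, void *dictBuffer,
                        size_t dictBufferCapacity)
{
    size_t n = ZDICT_trainFromBuffer(dictBuffer, dictBufferCapacity,
                                     s->buffer, s->sizes, s->count);
    if (ZDICT_isError(n)) {
        fprintf(stderr, "training failed: %s\n", ZDICT_getErrorName(n));
        return 0;
    }
    return n;
}
#endif
```

In this sketch, my ~700 samples (~12 MB) would go into the SampleSet, and the dictionary buffer would be sized separately; whether 110 KB is the right capacity for that buffer is exactly what I am unsure about.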
Thank you