zstd
Sample data merging
Can we merge the sample data in a buffer when training the dictionary? Many samples repeat a certain piece of data, but that data appears only once within each sample, and at the moment we can't find a good way to get it into the dictionary.
> Many samples repeat a certain piece of data, but that data appears only once within each sample
This is the intended use case for the dictionary trainer. The trainer ONLY looks at repetitions between different samples, and ignores repetition within a single sample.
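To make the distinction between cross-sample and within-sample repetition concrete, here is a toy sketch in Python. It is not zstd's training algorithm and does not call the zstd library; the function name `cross_sample_segments`, the segment length `k`, and the sample data are all made up for illustration. It only shows the idea that a segment counts once per sample, however often it repeats inside that sample.

```python
from collections import defaultdict


def cross_sample_segments(samples, k=8, min_samples=2):
    """Return the k-byte segments that occur in at least `min_samples`
    distinct samples. Repeats inside one sample count only once.

    Toy illustration only; this is NOT how zstd selects dictionary content.
    """
    seen_in = defaultdict(set)  # segment -> indices of samples containing it
    for idx, sample in enumerate(samples):
        for i in range(len(sample) - k + 1):
            seen_in[sample[i:i + k]].add(idx)
    return {seg for seg, owners in seen_in.items() if len(owners) >= min_samples}


# Hypothetical samples: two small records sharing a prefix, plus one sample
# that is highly repetitive internally but shares nothing with the others.
samples = [
    b'{"user":"alice","id":1}',
    b'{"user":"bob","id":2}',
    b"zzzzzzzzzzzzzzzzzzzzzzzz",
]
shared = cross_sample_segments(samples)
print(sorted(shared))
```

The shared `{"user":` prefix appears once in each of the first two samples and is picked up, while the run of `z` bytes is ignored because it only repeats inside a single sample.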
If you are finding that the dictionary isn't working well, please share more details:
- Include the training command, the number of samples, and the approximate size of the samples. With this, we may be able to help diagnose your problem.
- If possible, include the data. If you can provide the samples you are training on, we can certainly help.
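If it helps with gathering the details above, the number of samples and their approximate sizes could be collected with a small script along these lines. This is only a sketch: `summarize_samples` is a hypothetical helper, and it assumes each sample is stored as its own file directly inside one directory.

```python
import statistics
import sys
from pathlib import Path


def summarize_samples(sample_dir):
    """Return the sample count and basic size statistics (in bytes) for the
    regular files directly inside `sample_dir` (assumed: one file per sample)."""
    sizes = sorted(p.stat().st_size for p in Path(sample_dir).iterdir() if p.is_file())
    if not sizes:
        return {"count": 0}
    return {
        "count": len(sizes),
        "total_bytes": sum(sizes),
        "min_bytes": sizes[0],
        "median_bytes": statistics.median(sizes),
        "max_bytes": sizes[-1],
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python summarize_samples.py path/to/samples
    print(summarize_samples(sys.argv[1]))
```

Pasting the printed summary next to the exact training command should cover the first bullet.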
In fact, what I want is to merge all the sample data together, rather than trimming and aligning each sample and dividing all the data into small blocks.
Can you explain exactly what you want to do and why (an example may help)? I don't think that is what you want, but I can't be sure because I don't 100% understand the problem.
Closing due to lack of activity. Please open a new issue if you have further questions.