CTranslate2
nucleus sampler problem?
In the GitHub repository https://github.com/the-crypt-keeper/can-ai-code/tree/main by @the-crypt-keeper, WizardCoder was evaluated with CTranslate2. From the changelog:
8/08 Added ctranslate2 support and evaluated michaelfeil/ct2fast-WizardCoder-15B-V1.0. It seems this runtime may have a problem with its nucleus sampler; precise settings hurt the results far more than they should.
Can this be investigated in more detail?
The problem seems to come from repetition_penalty: compared to other implementations, the outcome differs quite a lot. The implementation itself, however, looks fine, at least as far as the penalty calculation goes.
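For context, the de-facto standard repetition penalty is the CTRL-paper transform that `transformers` implements in `RepetitionPenaltyLogitsProcessor`: the logit of every already-seen token is divided by the penalty if positive and multiplied by it if negative, so the score always moves toward "less likely". A minimal sketch of that transform, assuming CTranslate2 intends to match the same formula:

```python
import numpy as np

def apply_repetition_penalty(logits: np.ndarray, seen_token_ids: set, penalty: float) -> np.ndarray:
    """CTRL-style repetition penalty: shrink the score of every token
    that already occurred in the context. Positive logits are divided
    by the penalty, negative logits are multiplied by it."""
    out = logits.copy()
    for t in seen_token_ids:
        out[t] = out[t] / penalty if out[t] > 0 else out[t] * penalty
    return out
```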
Can you provide a way to reproduce the issue? It would be great if you could specify the exact model, input, and generation parameters to use.
Some more information is in the following GitHub issue: https://github.com/the-crypt-keeper/can-ai-code/issues/75
To reproduce the whole process:
- Clone the repository: `git clone https://github.com/the-crypt-keeper/can-ai-code`
- Create the prompts with `prepare.py --template prompts/Wizard-Coder.txt`
- Run the interview with `./interview_cuda.py --runtime ctranslate2 --model_name michaelfeil/ct2fast-WizardCoder-15B-V1.0 --params params/**.json --input results/prepare_junior-v2_python-javascript_Wizard-Coder.ndjson` (a standalone sketch that bypasses the harness follows below)
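For a quicker repro without the harness, here is a minimal sketch using the CTranslate2 generator API directly. The local model path, prompt, and sampling values are illustrative assumptions (the real evaluation uses the WizardCoder instruction template and the values from the params files), and it needs a CTranslate2 version that supports `sampling_topp`:

```python
import ctranslate2
import transformers

# Assumptions: the converted model has been downloaded locally and the
# original HF tokenizer is compatible with it.
model_dir = "ct2fast-WizardCoder-15B-V1.0"  # local copy of michaelfeil/ct2fast-WizardCoder-15B-V1.0
tokenizer = transformers.AutoTokenizer.from_pretrained("WizardLM/WizardCoder-15B-V1.0")
generator = ctranslate2.Generator(model_dir, device="cuda")

prompt = "Write a Python function that returns the sum of a list."
tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))

results = generator.generate_batch(
    [tokens],
    max_length=512,
    sampling_temperature=0.7,  # "precise"-style values, illustrative only
    sampling_topk=40,
    sampling_topp=0.1,
    repetition_penalty=1.18,   # set to 1.0 to disable and compare outputs
)
print(tokenizer.decode(results[0].sequences_ids[0]))
```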
For the params:
- Precise settings with the repetition penalty lead to a massive deterioration of prediction results: https://github.com/the-crypt-keeper/can-ai-code/blob/main/params/precise.json
- Precise settings with the repetition penalty turned off (set to 1) lead to better results
- WizardCoder settings are fine: https://github.com/the-crypt-keeper/can-ai-code/blob/main/params/precise.json
- Changing `--runtime` to transformers, vllm, autogptq, ... does not show the deterioration (a different model is necessary: WizardLM/WizardCoder-15B-V1.0); see the comparison sketch after this list
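To confirm the runtime dependence outside the harness, a hedged comparison sketch with plain `transformers` (the prompt and the exact sampling values are assumptions, but the generation arguments map onto the same sampler knobs used above):

```python
import torch
import transformers

model_id = "WizardLM/WizardCoder-15B-V1.0"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

prompt = "Write a Python function that returns the sum of a list."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

for penalty in (1.0, 1.18):  # off vs. a "precise"-style value (illustrative)
    out = model.generate(
        **inputs,
        do_sample=True,
        max_new_tokens=512,
        temperature=0.7,
        top_k=40,
        top_p=0.1,
        repetition_penalty=penalty,
    )
    print(f"--- repetition_penalty={penalty} ---")
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```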
The calculation of the penalty in the CTranslate2 library itself seems to be fine; I don't know where the error comes from.
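One possible reading of why "precise" settings amplify the damage, wherever the bug sits: with top_p as low as 0.1 the nucleus frequently contains a single token, so sampling is near-greedy and even a modest penalty can deterministically flip the top choice. A self-contained toy illustration (not CTranslate2's actual code path):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def nucleus_set(probs, top_p):
    """Indices of the smallest set of tokens whose cumulative mass reaches top_p."""
    order = np.argsort(probs)[::-1]
    kept, mass = [], 0.0
    for i in order:
        kept.append(int(i))
        mass += probs[i]
        if mass >= top_p:
            break
    return kept

logits = np.array([2.0, 1.9, 0.0])  # token 0: correct continuation, already seen
penalized = logits.copy()
penalized[0] /= 1.18                # CTRL-style penalty on the repeated (positive) logit

for name, l in (("no penalty  ", logits), ("penalty 1.18", penalized)):
    print(name, "-> nucleus at top_p=0.1:", nucleus_set(softmax(l), 0.1))
# no penalty   -> [0]  (the correct token wins every draw)
# penalty 1.18 -> [1]  (the runner-up now wins deterministically)
```

If CTranslate2 and the other runtimes disagree at all about which tokens the penalty covers (for example prompt tokens versus generated tokens only), such a tight nucleus would expose that disagreement on almost every step, which would be consistent with the observed deterioration.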
Thanks and let me know if you need anything else.