Zeguan Xiao
## Environment info - `adapter-transformers` version: 3.1.0 - Platform: Ubuntu 18.04 (Linux-5.4.0-87-generic-x86_64-with-glibc2.27) - Python version: Python 3.9.13 - PyTorch version (GPU?): 1.13.1 (GPU) - Tensorflow version (GPU?): False - Using...
~/multimodal_injection/llava_injection does not appear to be a Python project: neither 'setup.py' nor 'pyproject.toml' found.
Use a custom GPTs chatbot by setting the `model` argument.
When using the script in the README to fine-tune Llama 2, the training loss goes to 0 and the eval loss randomly becomes NaN.
I noticed there are several variants of datasets under the ft-training_set directory about math reasoning, such as math_7k.json, math_10k.json, and math_data.json. It seems that apart from math_10k, the other datasets...
### Bug description Unable to run the example code from https://github.com/Lightning-AI/litgpt/blob/main/extensions/thunder/README.md: `python extensions/thunder/pretrain.py --config config.yaml --executors '[sdpa, torchcompile_cat, nvfuser, torch]'` ### What operating system are you using? Linux...
### Bug description When I finetune a pre-trained (using litgpt) tinyllama model with multiple GPUs, there is an error with weight mismatch. But when I finetune with only 1 GPU,...
When `use_beam_search=True`, I noticed that the `best_of` parameter is not specified in the model configs. However, according to [the vLLM source code (0.5.0)](https://github.com/vllm-project/vllm/blob/8f89d72090da70895d77d32248ea8504f7daba50/vllm/sampling_params.py#L252), `best_of` should be set when using beam...
### Feature request I noticed that there is a [Llama format checkpoint](https://huggingface.co/openbmb/MiniCPM-2B-sft-bf16-llama-format) available, but the conversion script used to create this format is not publicly accessible. Would it...