starcoder
Home of StarCoder: fine-tuning & inference!
I'm seeing batching errors when updating to the latest `text-generation-inference` container. Latest container image: ``` ghcr.io/huggingface/text-generation-inference latest 7b12068effa3 2 hours ago 9.15GB ``` I cloned the model repo, which is...
Hi, if my training process is interrupted, can I continue fine-tuning this model from the last LoRA checkpoint?
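One way to approach the resume question above is to locate the most recent checkpoint directory first. The sketch below assumes the training script writes checkpoints as `checkpoint-<step>` subdirectories of its output directory, which is the Hugging Face `Trainer` naming convention. The helper name `find_last_checkpoint` is hypothetical and not part of the StarCoder repository.

```python
import re
from pathlib import Path
from typing import Optional

# Assumption: checkpoints are saved as "checkpoint-<global_step>" directories.
_CHECKPOINT_RE = re.compile(r"^checkpoint-(\d+)$")


def find_last_checkpoint(output_dir: str) -> Optional[str]:
    """Return the checkpoint directory with the highest step, or None."""
    root = Path(output_dir)
    if not root.is_dir():
        return None

    best_step = -1
    best_path = None
    for child in root.iterdir():
        match = _CHECKPOINT_RE.match(child.name)
        if child.is_dir() and match:
            # Compare numerically so that checkpoint-1000 beats checkpoint-999.
            step = int(match.group(1))
            if step > best_step:
                best_step, best_path = step, child
    return str(best_path) if best_path is not None else None
```

The returned path can then be passed as `trainer.train(resume_from_checkpoint=path)`. Whether the LoRA adapter weights are restored correctly depends on how the script saved them, so it is worth verifying on a short run.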
Hi, the model is too large to fit in my GPU memory, so I want to use 2 GPUs to dispatch the model, with a device_map like: ``` {'transformer.wte':...
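A manual device map of the kind described above can be generated rather than typed by hand. This is a minimal sketch, not a confirmed configuration. Only `transformer.wte` appears in the original post. The other module names (`transformer.wpe`, `transformer.h.<i>`, `transformer.ln_f`, `lm_head`) and the default of 40 layers are assumptions about the model layout, and the helper `build_two_gpu_device_map` is hypothetical.

```python
from typing import Dict, Optional


def build_two_gpu_device_map(
    num_layers: int = 40,  # assumption: number of transformer blocks
    split_at: Optional[int] = None,
) -> Dict[str, int]:
    """Place the first `split_at` blocks on GPU 0 and the rest on GPU 1."""
    if num_layers < 2:
        raise ValueError("need at least two layers to split across two GPUs")
    if split_at is None:
        split_at = num_layers // 2
    if not 0 < split_at < num_layers:
        raise ValueError("split_at must leave at least one layer on each GPU")

    # Embeddings go with the first half of the network.
    device_map = {"transformer.wte": 0, "transformer.wpe": 0}
    for i in range(num_layers):
        device_map[f"transformer.h.{i}"] = 0 if i < split_at else 1
    # The final layer norm follows the last block.
    device_map["transformer.ln_f"] = 1
    # Assumption: the output head shares weights with wte, so keep them together.
    device_map["lm_head"] = 0
    return device_map
```

The resulting dictionary could be passed as `device_map=` to `from_pretrained` when `accelerate` is installed. If the module names do not match the loaded model, the load should fail with an error naming the mismatch, and `device_map="auto"` is a simpler alternative.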
I tried, but it just repeats the prompts.
https://github.com/bigcode-project/starcoder/blob/6c746b437d1895fb47b809d81a6a0db6637f6eee/chat/train.py#L313 It generates an error: `if data_args.dataset_config_name is not None:` AttributeError: 'DataArguments' object has no attribute 'dataset_config_name'...
I would like to train the model specifically on Swift documents that are not in the classical chat format (instruction/input/output). Can I use the script in starcoder/finetune/finetune.py? In which...
1. Is there a saved checkpoint that I can use to load the chatting feature of the model? 2. Also, I tried fine tuning the model by following the [instructions](https://github.com/bigcode-project/starcoder/blob/main/chat/README.md)...
I need to know how to use ``, `` and other special tokens listed in tokenizer special_tokens_map when preparing the dataset. I've been successfully able to finetune Starcoder on my...
Is it possible to integrate [StarCoder](https://github.com/bigcode-project/starcoder) as an [LLM Model](https://python.langchain.com/en/latest/modules/models.html) or an [Agent](https://python.langchain.com/en/latest/modules/agents.html) with [LangChain](https://github.com/hwchase17/langchain), and [chain](https://python.langchain.com/en/latest/modules/chains.html) it in a complex use case? Any help / hints on the same would...