text-generation-inference
Any chance to support replit-code-v1-3b?
Model description
https://huggingface.co/replit/replit-code-v1-3b
replit-code-v1-3b is a 2.7B Causal Language Model focused on Code Completion. The model has been trained on a subset of the Stack Dedup v1.2 dataset.
Open source status
- [X] The model implementation is available
- [X] The model weights are available
Provide useful links for the implementation
https://huggingface.co/replit/replit-code-v1-3b
Right now it pops an error message: `Unsupported model type mpt`

It should work, but you would need the `--trust-remote-code` flag for it to work.
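For reference, a hypothetical launch command passing that flag (the image tag and port mapping here are assumptions, not taken from the issue):

```shell
# Sketch of launching TGI for this model with remote code enabled.
docker run --gpus all -p 8080:80 \
  ghcr.io/huggingface/text-generation-inference:latest \
  --model-id replit/replit-code-v1-3b \
  --trust-remote-code
```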
Can you provide a full stacktrace?
> It should work, but you would need the `--trust-remote-code` flag for it to work. Can you provide a full stacktrace?

Thank you! After I added `--trust-remote-code`, it really works!
It still / again does not work for me, because grouped-query attention is not implemented:
```
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 601, in run_forever
    self._run_once()
  File "/opt/conda/lib/python3.9/asyncio/base_events.py", line 1905, in _run_once
    handle._run()
  File "/opt/conda/lib/python3.9/asyncio/events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
> File "/opt/conda/lib/python3.9/site-packages/text_generation_server/server.py", line 159, in serve_inner
    model = get_model(
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/__init__.py", line 165, in get_model
    return MPTSharded(
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/mpt.py", line 87, in __init__
    model = MPTForCausalLM(config, weights)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/mpt_modeling.py", line 1033, in __init__
    self.transformer = MPTModel(config, weights)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/mpt_modeling.py", line 764, in __init__
    [
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/mpt_modeling.py", line 765, in <listcomp>
    MPTBlock(config, prefix=f"transformer.blocks.{i}", weights=weights)
  File "/opt/conda/lib/python3.9/site-packages/text_generation_server/models/custom_modeling/mpt_modeling.py", line 580, in __init__
    raise NotImplementedError(
NotImplementedError: Not implemented attn grouped_query_attention
```
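For context on what the missing feature is: in grouped-query attention (GQA), several query heads share a single key/value head, which is what the MPT config here requests and TGI's `MPTBlock` does not implement. A minimal numpy sketch of the idea (shapes and head counts are illustrative, not taken from the model config):

```python
import numpy as np

# Grouped-query attention sketch: n_q_heads query heads share
# n_kv_heads key/value heads (n_q_heads must be a multiple of n_kv_heads).
n_q_heads, n_kv_heads, seq, d = 8, 2, 4, 16
group = n_q_heads // n_kv_heads  # query heads per KV head

rng = np.random.default_rng(0)
q = rng.standard_normal((n_q_heads, seq, d))
k = rng.standard_normal((n_kv_heads, seq, d))
v = rng.standard_normal((n_kv_heads, seq, d))

# Broadcast each KV head across its group of query heads.
k_rep = np.repeat(k, group, axis=0)  # (n_q_heads, seq, d)
v_rep = np.repeat(v, group, axis=0)

# Standard scaled dot-product attention over the expanded KV heads.
scores = q @ k_rep.transpose(0, 2, 1) / np.sqrt(d)
weights = np.exp(scores - scores.max(-1, keepdims=True))
weights /= weights.sum(-1, keepdims=True)
out = weights @ v_rep  # (n_q_heads, seq, d)
```

The point is that the KV cache only stores `n_kv_heads` heads, so supporting it in a server requires different weight-loading and cache shapes than standard multi-head attention.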