EnergonAI
Large-scale model inference.
Hi, is there any generation example for OPT models?
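While an EnergonAI-specific example is missing, here is a minimal sketch of what OPT generation can look like through plain Hugging Face transformers; the model name and generation settings are illustrative only and this is not the EnergonAI serving path.

```python
# Minimal sketch of OPT text generation via Hugging Face transformers.
# facebook/opt-125m is used only because it is small enough to run anywhere.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("EnergonAI makes large-scale inference", return_tensors="pt")
# Greedy decoding for at most 32 new tokens.
output_ids = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```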
**Problem**
```
[root@2e71bfd17f96 inference]# export PYTHONPATH=/workspace/colossal/inference/examples/bert
[root@2e71bfd17f96 inference]# energonai service init --config_file=/workspace/colossal/inference/examples/bert/bert_config.py
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/energonai/kernel/cuda_native/linear_func.py", line 5, in <module>
    energonai_linear = importlib.import_module("energonai_linear_func")
  File "/opt/conda/lib/python3.9/importlib/__init__.py", line 127, in ...
```
**Problem** If we run EnergonAI in Docker like:
```
docker run -ti --gpus all --rm --ipc=host -p 8010:8010 ...
```
and then, inside the container, run:
```
export PYTHONPATH=/workspace/colossal/inference/examples/bert
energonai service init --config_file=/workspace/colossal/inference/examples/bert/bert_config.py
```
The access...
**Describe the feature:** We plan to introduce automated pipeline parallelism into EnergonAI, so that users only need to specify a few simple arguments to achieve pipeline...
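For concreteness, here is a purely hypothetical sketch of what such a minimal user-facing config could look like; none of these field names are confirmed EnergonAI options.

```python
# Hypothetical config sketch: the point of the feature is that the engine,
# not the user, decides where to cut the model into pipeline stages.
model_name = "opt-30b"   # which model to serve (hypothetical key)
pp_world_size = 4        # number of pipeline stages requested
tp_world_size = 1        # tensor parallelism left off in this sketch
# Layer partitioning and stage placement would be automated by the engine.
```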
I can't find server.sh. How can I run an example now?
I'm trying to use the [OPT 66B](https://huggingface.co/facebook/opt-30b/tree/main) pre-trained model for inference on EnergonAI. After preprocessing the weights with `preprocessing_ckpt_66b.py` and starting the OPT server, the service hangs when...
Does EnergonAI support GPT models with int8 quantization under model parallelism?
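As a point of comparison rather than an EnergonAI API, int8 weight quantization is available through the bitsandbytes integration in Hugging Face transformers; whether it composes with tensor/model parallelism depends on the serving stack, and the model name below is illustrative.

```python
# Sketch of int8 quantization via the bitsandbytes integration in
# transformers (requires the accelerate and bitsandbytes packages).
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_8bit=True)
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-1.3b",
    quantization_config=quant_config,
    device_map="auto",  # shards layers across the available GPUs
)
```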
Update: I think this is caused by running a VM on Unraid; the Ubuntu kernel in use is not quite standard. When attempting the OPT examples, via either Docker or...
Hi, I want to use num_beams with generate, but PipelineModel doesn't support it. Could you add support for num_beams? Best wishes.
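For reference, this is what beam search looks like through the standard transformers `generate()` API; the `PipelineModel` wrapper mentioned above presumably does not forward these arguments, which is the gap this request is about. Model and prompt are placeholders.

```python
# Beam search with the standard transformers generate() API.
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Hello, my name is", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    num_beams=4,          # keep 4 candidate sequences per decoding step
    max_new_tokens=32,
    early_stopping=True,  # stop once all beams have finished
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```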
Hi, I am very interested in the distributed inference of Colossal AI. Since we have pre-trained NLP models from PyTorch or JAX, I wonder whether it is possible, and what should be...
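One hedged path for the JAX side: transformers can convert Flax weights to PyTorch at load time, which yields a PyTorch checkpoint a PyTorch-based serving stack can consume; whether the result then plugs into Colossal AI's distributed inference is a separate question, and the paths below are placeholders.

```python
# Convert a Flax checkpoint to PyTorch weights via transformers.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "path/to/flax_checkpoint",  # local directory with flax_model.msgpack
    from_flax=True,             # convert Flax weights to PyTorch tensors
)
model.save_pretrained("path/to/pytorch_checkpoint")  # writes PyTorch weights
```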