EnergonAI

Large-scale model inference.

Results: 43 EnergonAI issues

Hi, is there any `generate` example for OPT models?

**Problem**

```
[root@2e71bfd17f96 inference]# export PYTHONPATH=/workspace/colossal/inference/examples/bert
[root@2e71bfd17f96 inference]# energonai service init --config_file=/workspace/colossal/inference/examples/bert/bert_config.py
Traceback (most recent call last):
  File "/opt/conda/lib/python3.9/site-packages/energonai/kernel/cuda_native/linear_func.py", line 5, in
    energonai_linear = importlib.import_module("energonai_linear_func")
  File "/opt/conda/lib/python3.9/importlib/__init__.py", line 127, in ...
```
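The traceback above shows the compiled CUDA extension `energonai_linear_func` failing to import, which usually means the extension was never built for this environment. A minimal sketch of a defensive import (the `load_linear_kernel` helper is hypothetical, not part of EnergonAI's API) that lets a caller fall back to a pure-PyTorch path instead of crashing:

```python
import importlib
import importlib.util


def load_linear_kernel(name="energonai_linear_func"):
    """Return the compiled extension module if it is installed, else None.

    The module name comes from the traceback above; checking with
    find_spec() first avoids raising ModuleNotFoundError at import time,
    so the caller can choose a fallback implementation.
    """
    if importlib.util.find_spec(name) is None:
        return None
    return importlib.import_module(name)


kernel = load_linear_kernel()
if kernel is None:
    # Fall back to a non-fused implementation, or re-run the
    # extension build for this CUDA/PyTorch combination.
    pass
```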

**Problem** If we run EnergonAI in Docker like:

```
docker run -ti --gpus all --rm --ipc=host -p 8010:8010 ...
```

and then inside the container run:

```
export PYTHONPATH=/workspace/colossal/inference/examples/bert
energonai service init --config_file=/workspace/colossal/inference/examples/bert/bert_config.py
```

The access...
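When a port published with `-p 8010:8010` is unreachable from the host, a common cause is the server inside the container binding to `127.0.0.1` rather than `0.0.0.0`. A small sketch for checking reachability (the `port_open` helper is hypothetical, shown only for debugging):

```python
import socket


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example: check the published port from the host machine.
# port_open("localhost", 8010) should be True once the server
# is listening on 0.0.0.0 inside the container.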

**Describe the feature:** We plan to introduce automated pipeline parallelism into EnergonAI, so that users only need to specify a few simple arguments to achieve pipeline...
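The core of such an automated scheme is deciding how to split a model's layers into contiguous pipeline stages. A minimal sketch of a balanced partition (this is an illustration of the idea, not EnergonAI's actual partitioning code):

```python
def partition_layers(num_layers, num_stages):
    """Split layer indices into contiguous, nearly equal pipeline stages.

    The first (num_layers % num_stages) stages get one extra layer,
    which keeps stage sizes within one layer of each other.
    """
    base, rem = divmod(num_layers, num_stages)
    stages, start = [], 0
    for s in range(num_stages):
        size = base + (1 if s < rem else 0)
        stages.append(list(range(start, start + size)))
        start += size
    return stages


# A 10-layer model on 3 pipeline stages:
# partition_layers(10, 3) -> [[0, 1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

A real implementation would weight layers by cost (parameter count or profiled latency) rather than treating them as equal, but the interface stays the same: a list of layer groups, one per stage.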

enhancement

I can't find server.sh; how can I run an example now?

I'm trying to use the [OPT 66B](https://huggingface.co/facebook/opt-30b/tree/main) pre-trained model for inference on EnergonAI. After preprocessing the weights with the `preprocessing_ckpt_66b.py` script and starting the OPT server, the service hangs there when...

Does EnergonAI support the GPT model with int8 quantization under model parallelism?
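For context, int8 quantization typically maps each weight tensor to 8-bit integers plus a floating-point scale. A toy sketch of symmetric per-tensor quantization (pure Python, for illustration only, not EnergonAI's kernels):

```python
def quantize_int8(values):
    """Symmetric per-tensor int8 quantization: x ~= q * scale.

    The scale maps the largest magnitude to 127, so every
    quantized value fits in a signed 8-bit range.
    """
    amax = max(abs(v) for v in values)
    scale = amax / 127.0 if amax else 1.0
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize_int8(q, scale):
    """Recover approximate float values from int8 codes."""
    return [v * scale for v in q]
```

Under model parallelism each rank would hold its own shard of the quantized weights and its own scales, so the scheme composes with tensor splitting; the open question in the issue is whether EnergonAI's fused kernels support this combination.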

Update: I think this is caused by running a VM on Unraid; the Ubuntu kernel being used is not quite standard. When attempting the OPT examples, via either Docker or...

Hi, I want to use num_beams with generate, but PipelineModel doesn't support it. Could you add support for num_beams? Best wishes.
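For reference, `num_beams` controls beam search: instead of greedily extending one sequence, the decoder keeps the top-k partial sequences by cumulative log-probability at each step. A toy sketch of the idea (not EnergonAI's or Hugging Face's implementation; `step_logprobs` is a hypothetical scoring callback):

```python
def beam_search(step_logprobs, num_beams=2, max_len=3):
    """Toy beam search over a token-scoring callback.

    step_logprobs(seq) returns {token: logprob} for the next token
    given the partial sequence seq. The beam keeps the num_beams
    highest-scoring partial sequences at every step.
    """
    beams = [([], 0.0)]  # (sequence, cumulative logprob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for tok, lp in step_logprobs(seq).items():
                candidates.append((seq + [tok], score + lp))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams[0][0]  # best full sequence


def toy_scorer(seq):
    # "a" is always more likely than "b" in this toy model.
    return {"a": -0.1, "b": -1.0}
```

Supporting this in a pipelined model is harder than in a single process because all beams must advance through every pipeline stage in lockstep, which is presumably why PipelineModel lacks it today.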

Hi, I am very interested in the distributed inference features of Colossal-AI. Since we have pre-trained NLP models from PyTorch or JAX, I wonder whether it is possible, or what should be...