
Parallelformers: An Efficient Model Parallelization Toolkit for Deployment

29 parallelformers issues

## Describe a requested feature Thanks for releasing this great library! I am currently working on deploying [facebook/xglm-7.5B](https://huggingface.co/facebook/xglm-7.5B), which is currently not supported by parallelformers. [POLICY.md](https://github.com/tunib-ai/parallelformers/blob/main/POLICY.md) provides a comprehensive guide...

enhancement

## Describe a requested feature Can you please add support for gpt_neox? Its official documentation is here: https://huggingface.co/docs/transformers/model_doc/gpt_neox

enhancement

## How to reproduce First of all, thanks for this great project! I'm facing an issue running the test code provided [here](https://github.com/tunib-ai/parallelformers/blob/main/tests/seq2seq_lm.py) on Kubernetes. This is what I'm running inside...

bug

I am trying to use RoBERTa NER and BERT NER uncased, but for both models I am getting the following issues. Is it something which is still under...

bug

Hi, I'm very interested in this work; it looks super interesting and useful. Unfortunately, one of my models is an EncoderDecoder model and I have no idea how to get it...

enhancement

Hi there! Thanks for the awesome work on this lib! Just wanted to ask what the recommended way is to clean up a loaded model that has been `parallelize`d using...

## How to reproduce
```python
tokenizer = AutoTokenizer.from_pretrained(
    model_name,
    bos_token='[BOS]', eos_token='[EOS]', unk_token='[UNK]',
    pad_token='[PAD]', mask_token='[MASK]',
)
model = AutoModelForCausalLM.from_pretrained(model_name)  # .to(device='cuda', non_blocking=True)
_ = model.eval()
parallelformers.parallelize(model, num_gpus=4, fp16=True, verbose='detail')
tok = tokenizer("My name is Kevin."*10, ...
```

bug

I am using a 3060 and a 3090 to split GPT models two ways, including GPT-J and GPT Neo 2.7B. When generating many tokens, say 500, the model hangs and...

bug

Is it possible to use this library for CNN networks implemented in PyTorch? Can you show me an example?

enhancement