# Description Currently we don’t support any runtime specific to transformer models. DeepSpeed has implemented a runtime we could use to accelerate transformer models at inference time. # Integration...
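As a rough illustration of the integration, the sketch below wraps a Hugging Face model with DeepSpeed's inference engine via `deepspeed.init_inference`; the model name and the `mp_size`/dtype settings are placeholders, not a tested configuration.

```python
# Sketch: wrapping a Hugging Face transformer with the DeepSpeed
# inference engine (illustrative values, not a tested configuration).
import torch
import deepspeed
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder model for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# init_inference swaps supported layers for fused DeepSpeed kernels.
ds_engine = deepspeed.init_inference(
    model,
    mp_size=1,                       # tensor-parallel degree
    dtype=torch.half,                # run the kernels in fp16
    replace_with_kernel_inject=True, # inject optimized transformer kernels
)

inputs = tokenizer("Hello, world", return_tensors="pt").to("cuda")
outputs = ds_engine.module.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```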
# Description FasterTransformer is a library developed by Nvidia specifically for accelerating transformer architectures on Nvidia devices. We should test its performance and implement a conversion framework for converting TF,...
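Since the first step is measuring performance, a minimal latency harness along these lines could be used to compare a stock model against its FasterTransformer counterpart; the helper below is our own sketch, not part of nebullvm or FasterTransformer.

```python
# Sketch: a simple GPU latency benchmark for comparing a baseline model
# against an accelerated variant on identical inputs.
import time
import torch

def benchmark(model, inputs, warmup: int = 5, iters: int = 50) -> float:
    """Return the mean forward-pass latency in milliseconds."""
    with torch.no_grad():
        for _ in range(warmup):      # warm up kernels and caches
            model(**inputs)
        torch.cuda.synchronize()
        start = time.perf_counter()
        for _ in range(iters):
            model(**inputs)
        torch.cuda.synchronize()     # wait for all queued GPU work
    return (time.perf_counter() - start) / iters * 1e3
```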
# Description Currently nebullvm does not support XLA, TensorFlow's built-in compiler, which also allows models to be compiled for Google TPUs. XLA is available for JAX, TF and PyTorch....
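For reference, enabling XLA on a TF model is a one-flag change through `tf.function(jit_compile=True)`; the model and input shapes below are placeholders.

```python
# Sketch: compiling a TensorFlow call with XLA via jit_compile.
import tensorflow as tf

model = tf.keras.applications.ResNet50(weights=None)  # placeholder model

@tf.function(jit_compile=True)  # ask TF to compile this call with XLA
def predict(x):
    return model(x, training=False)

x = tf.random.normal((1, 224, 224, 3))
print(predict(x).shape)  # the first call triggers XLA compilation
```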
# Description OpenAssistant has released on HF the reward models they trained on open-source datasets. Even if they are not tailored to the user's needs, we could leverage them...
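A minimal sketch of scoring a question/answer pair with one of those reward models through plain `transformers`; the model id is assumed from the Hub and worth verifying.

```python
# Sketch: scoring an answer with an OpenAssistant reward model
# (the Hub model id is an assumption, double-check it exists).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "OpenAssistant/reward-model-deberta-v3-large-v2"  # assumed id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

question = "Explain nuclear fusion in one sentence."
answer = "Fusion merges light nuclei into heavier ones, releasing energy."
inputs = tokenizer(question, answer, return_tensors="pt")
with torch.no_grad():
    score = model(**inputs).logits[0].item()  # higher = preferred answer
print(score)
```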
# Description DeepSpeed supports offloading during training using the ZeRO-Infinity technology. We should add examples of working configuration files for the models we support. # TODO - [ ] Add...
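As a starting point, a ZeRO-Infinity (stage 3 with NVMe offload) configuration could look like the sketch below; the batch size, precision and NVMe paths are placeholders that would need tuning per model and per machine.

```python
# Sketch: writing a minimal ZeRO-Infinity DeepSpeed config file.
import json

ds_config = {
    "train_batch_size": 8,            # placeholder, tune per model
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,  # ZeRO-3 partitions params, grads and optimizer state
        "offload_optimizer": {"device": "nvme", "nvme_path": "/local_nvme"},
        "offload_param": {"device": "nvme", "nvme_path": "/local_nvme"},
    },
}

with open("ds_config.json", "w") as f:
    json.dump(ds_config, f, indent=2)
```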
# Description Currently we are supporting the following datasets: - [Stanford Human Preferences Dataset (SHP)](https://huggingface.co/datasets/stanfordnlp/SHP) - [Anthropic RLHF](https://huggingface.co/datasets/Anthropic/hh-rlhf) But we are not using all the information contained in the datasets:...
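Both datasets load directly with the `datasets` library, as in the sketch below; fields beyond the ones noted in the comments are worth double-checking against the dataset cards.

```python
# Sketch: loading the two supported preference datasets.
from datasets import load_dataset

shp = load_dataset("stanfordnlp/SHP", split="train")
hh = load_dataset("Anthropic/hh-rlhf", split="train")

# SHP carries community vote scores per answer; hh-rlhf pairs a chosen
# and a rejected continuation for the same prompt.
print(shp[0].keys())
print(hh[0].keys())
```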
# Description ChatLLaMA currently has neither a playground nor scripts that allow the user to easily run the model for inference. It would be great to add...
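A minimal REPL-style inference script could look like the sketch below; it goes through plain Hugging Face APIs and a placeholder checkpoint path, since chatllama's own loading utilities may differ.

```python
# Sketch: a minimal interactive inference loop (placeholder checkpoint).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "path/to/trained_actor"  # placeholder checkpoint path
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.half)
model.eval()

while True:
    prompt = input("user> ")
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=128, do_sample=True)
    print("model>", tokenizer.decode(out[0], skip_special_tokens=True))
```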
# Description Currently the ChatLLaMA documentation consists of just the README on GitHub. We should align chatllama with the other modules in nebullvm and add the `mkdocs` documentation to `docs.nebuly.com`. All...
# Description Currently, chatllama supports synthetic data generation only from OpenAI’s `davinci-003`, both for conversations and for scores. In order to avoid huge costs while generating data we should...
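One way to open this up is to abstract the completion backend behind a callable, so a cheaper API model or a local open-source model can be swapped in; the helper names below are our own sketch, and the OpenAI call uses the legacy Completion API current at the time of writing.

```python
# Sketch: making the synthetic-data generator backend-agnostic.
import openai  # legacy Completion API, current at the time of writing

def openai_generate(prompt: str, model: str = "text-davinci-003") -> str:
    resp = openai.Completion.create(model=model, prompt=prompt, max_tokens=256)
    return resp["choices"][0]["text"]

def synthesize(prompts, generate_fn=openai_generate):
    """Generate synthetic examples with any backend that maps a prompt
    string to a completion string."""
    return [generate_fn(p) for p in prompts]

# e.g. pass a local Hugging Face text-generation pipeline as generate_fn
# to avoid per-token API costs entirely.
```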
# Description One of the biggest difficulties when selecting and cleaning data for training is estimating the correct amount of data needed to train the model. ChatLLaMA training...