shrinath-suresh
@rohan-varma Thank you very much. I applied the fix as given in the screenshot and compiled from source. The model is getting saved in FSDP mode. Attached image and...
@sgugger Thanks for the quick reply. I did try that too:

```
inputs = tokenizer("Bloomberg has decided to publish a new report on the global economy", return_tensors="pt")
inputs = inputs.to("cuda:0")
```
...
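For context, a minimal end-to-end sketch of the pattern being tried here, with an illustrative checkpoint name and a plain `generate` call (this does not reproduce the exact offloading setup from the issue):

```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "bigscience/bloom-7b1"  # illustrative checkpoint, not necessarily the one from the issue

tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16).to("cuda:0")

# Tokenize and move the whole batch (input_ids, attention_mask) to the GPU.
inputs = tokenizer(
    "Bloomberg has decided to publish a new report on the global economy",
    return_tensors="pt",
).to("cuda:0")

with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```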
@sgugger I converted the input to float16 because the model throws the following error when the input is int or long (the stack trace can be found in the issue...
In fact, it's not accepting any of the data types, including **Long**:

```
inputs = tokenizer("Bloomberg has decided to publish a new report on the global economy", return_tensors="pt")
inputs = inputs.to("cuda:0")
```
...
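As a side note on dtypes, here is a small standalone sketch (a toy `nn.Embedding`, not the Bloom model itself) of why tokenized ids are normally kept as Long: embedding layers only accept integer indices, so casting the ids to `float16` is expected to fail at the lookup regardless of the model:

```
import torch
import torch.nn as nn

emb = nn.Embedding(num_embeddings=10, embedding_dim=4)
ids = torch.tensor([[1, 2, 3]])  # dtype=torch.int64 (Long), as returned by a tokenizer

print(emb(ids).shape)  # works: embedding lookups expect integer indices

try:
    emb(ids.to(torch.float16))  # fails: float ids cannot index an embedding table
except RuntimeError as e:
    print("RuntimeError:", e)
```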
@sgugger To get more clarity, I tested the model on a [p3.8xlarge](https://aws.amazon.com/ec2/instance-types/p3/) instance. It has 4 GPUs and more RAM. The Bloom 7B model can be loaded without offloading. [bloom7b_p3_8xlarge.zip](https://github.com/huggingface/accelerate/files/9793671/bloom7b_p3_8xlarge.zip)...
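A sketch of how one might verify that the checkpoint fits across the 4 GPUs with no CPU/disk offload (the checkpoint name and checks are illustrative, assuming `accelerate` is installed so `device_map="auto"` is available):

```
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-7b1",   # illustrative checkpoint
    torch_dtype=torch.float16,
    device_map="auto",        # let accelerate spread layers across the available GPUs
)

print(torch.cuda.device_count())          # expect 4 on p3.8xlarge
# If no entry maps to "cpu" or "disk", nothing is being offloaded.
print(set(model.hf_device_map.values()))
```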
Thank you very much for looking into it @sgugger
@PierpaoloSorbellini The inference section is tagged as WIP. Do we have any basic inference code available in chatllama to load the actor_rl model and run a few queries?
@mreso Thanks for your review comments. I have already addressed a few of them based on your previous comments on the babyllama PR: implementing the destructor, batch processing, and removing `auto`. Will...
@saichandrapandraju Apologies for the delay. The above trace looks like a path issue: `torch-model-archiver` is unable to find the relevant files. The generic exception `Exception: Unable to create mar...
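A quick way to rule out the path issue is to check that every file passed to `torch-model-archiver` actually exists before archiving. The file names below are illustrative placeholders; substitute the paths from your own command:

```
from pathlib import Path

# Illustrative paths; replace with the values passed to torch-model-archiver.
files = {
    "--serialized-file": Path("model_store/model.pt"),
    "--handler": Path("handler.py"),
    "--extra-files": Path("index_to_name.json"),
}

for flag, path in files.items():
    print(flag, path, "exists" if path.exists() else "MISSING")
```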
@akasantony From the information you have shared, the model is saved using `mlflow.pytorch.log_model`, and while loading the model, the base handler is trying to load it as a state dict. `mlflow.pytorch.log_model` uses...
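To make the mismatch concrete, here is a hedged sketch of the two serialization styles (the model class and file names are illustrative): `mlflow.pytorch.log_model` pickles the whole model object, while the TorchServe base handler expects a plain state dict it can load into a model built from the `--model-file` definition:

```
import torch
import torch.nn as nn

class TinyNet(nn.Module):  # illustrative model, stands in for the real one
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)

model = TinyNet()

# What mlflow.pytorch.log_model effectively stores: the full pickled module.
torch.save(model, "full_model.pt")

# What the TorchServe base handler expects: a state dict plus a model
# definition file from which it can rebuild the module.
torch.save(model.state_dict(), "state_dict.pt")

rebuilt = TinyNet()
rebuilt.load_state_dict(torch.load("state_dict.pt"))
```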