maxjeblick
> target_kl is unused currently. No early stopping based on this parameter.

It is used in `AdaptiveKLController`? (as `kl_target`).
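
For context, here is a minimal sketch of how an adaptive KL controller typically consumes a target KL value. It follows the commonly used proportional update scheme and is not necessarily this project's exact code: the argument names `init_kl_coef` and `horizon` and the clipping bound of 0.2 are assumptions for illustration.

```python
class AdaptiveKLController:
    """Illustrative controller: scales the KL penalty coefficient so the
    observed KL divergence drifts towards `kl_target`."""

    def __init__(self, init_kl_coef: float, kl_target: float, horizon: int):
        self.value = init_kl_coef   # current KL penalty coefficient
        self.kl_target = kl_target  # desired KL between policy and reference
        self.horizon = horizon      # number of steps over which to adapt (assumed name)

    def update(self, current_kl: float, n_steps: int) -> float:
        # Relative error between observed and target KL, clipped for stability.
        # The bound 0.2 is an assumed value, not taken from the source.
        error = max(-0.2, min(0.2, current_kl / self.kl_target - 1.0))
        # Raise the coefficient when KL is above target, lower it when below.
        self.value *= 1.0 + error * n_steps / self.horizon
        return self.value


controller = AdaptiveKLController(init_kl_coef=0.2, kl_target=6.0, horizon=10000)
raised = controller.update(current_kl=12.0, n_steps=256)   # KL above target
lowered = controller.update(current_kl=3.0, n_steps=256)   # KL below target
```

In this scheme the target only steers the penalty coefficient; it does not stop training early, which is consistent with the observation quoted above.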
Thanks for opening the issue @binga, I'll have a look. Could you maybe share the `cfg.yaml` file of the failed experiment? I tried int8 on the latest main (while keeping default params...
We checked training using the parameters above (DDP, on fewer than 8 GPUs) on 2 different machines, but we could not reproduce the error. How did you prepare your Python environment?
This seems to be an open [issue](https://github.com/TimDettmers/bitsandbytes/issues/240) with the bitsandbytes library, occurring on Tesla V100 GPUs. I also found this [Stack Overflow post](https://stackoverflow.com/questions/75918140/getting-runtimeerror-expected-scalar-type-half-but-found-float-in-aws-p3-instan) that has a concise code example to reproduce the error....
> Could be interesting to add a system prompt before fine-tuning directly from an input box or something like that

This is probably not too helpful, as all finetuning samples...
I will check whether this issue occurs with CPU only and eventually push a fix to #564. What's puzzling is that the error in the subprocess freezes the parent process.
When uploading an image to the app, I get the following error (probably the table extraction failed on that particular image):
```
AttributeError: 'UploadedFile' object has no attribute 'split'
Traceback:
File "/home/user/.local/lib/python3.8/site-packages/streamlit/scriptrunner/script_runner.py", line...
```
It would be great to have a Gradio app and/or an end-to-end Jupyter notebook example that extends the examples in the README (e.g. similar to the https://github.com/NielsRogge/Transformers-Tutorials examples). `run...py` seem to be...
Hi @ErikKaum, is this model already implemented? I tried deploying it yesterday, and while the startup works fine, I encountered some inference issues (which are expected if the mode...
Thanks for the clarification, I created an issue [here](https://github.com/huggingface/text-generation-inference/issues/2781).