NanoCode012

Results: 180 comments by NanoCode012

You may try the datasets in the example configs for testing, though they're a bit small.

With the addition of Fuyu to transformers, axolotl should inherently support it. `sample_packing` and `flash_attention` would not work, however, as Fuyu's modeling code does not support them.
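As a sketch, a config for such a model might look like this (the model path is an assumption, not a tested recipe; the flags themselves are standard axolotl options):

```yaml
# Hypothetical config sketch — base_model is an assumption.
base_model: adept/fuyu-8b
sample_packing: false   # Fuyu's modeling code does not support packing
flash_attention: false  # likewise unsupported for this architecture
```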

Hm, that's weird. I have nvidia-smi showing `CUDA 12.0` on the host, and I can run `python -m bitsandbytes` successfully in docker. If you have the axolotl repo cloned, do you...

Sorry for the late reply.

> I can run python -m bitsandbytes successfully, though it says that it is targeting CUDA 11.8 (BNB_CUDA_VERSION=118)

Axolotl targets CUDA 11.8 in the default image. You...
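For reference, the override mentioned above is just an environment variable (assuming bitsandbytes is installed in the container; exact versions on your machine may differ):

```shell
# Ask bitsandbytes to load its CUDA 11.8 binaries even if the host
# driver reports a newer CUDA version (e.g. 12.0 from nvidia-smi).
export BNB_CUDA_VERSION=118
# Run bitsandbytes' built-in self-check to confirm it loads correctly.
python -m bitsandbytes
```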

Hi, sorry for the late reply. `eval_table_size` is unfortunately not working as well as expected. That's why it's disabled by default.
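For reference, the option is toggled in the config like this (a sketch; the row count shown is an arbitrary example):

```yaml
# eval_table_size controls how many eval predictions are logged as a
# table. It is disabled by default; setting a positive value enables
# the feature, with the caveats mentioned above.
eval_table_size: 5
```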

@ysawej, sorry for that. It seems that this feature is no longer properly maintained. If you would like, would you be able to take a look and perhaps submit...

Could you check that your prompt format is the same? Your loss also starts quite high.

I did not know that using JSON would cause such an issue. That sounds weird. I will close this issue; please re-open if the problem recurs.

Sorry about that; the issue has been reopened. Could you please provide an example config where json does not work? The dataset handler for json and jsonl is the same...
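A minimal sketch of a local-file datasets entry (the path and prompt type here are placeholders, not taken from the report):

```yaml
datasets:
  # Local file loading: .json and .jsonl go through the same handler,
  # so either extension should behave identically.
  - path: data/train.jsonl   # placeholder path
    type: alpaca             # assumed prompt format
```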

@l3utterfly , do you perhaps still have the offending dataset to share or some sample of it for reproducing this?