Einengutenmorgen
I have the same problem, but I'm using an ml.g4dn.12xlarge (4x Tesla T4). I'm running it on AWS SageMaker inside a Docker container (the recommended Docker image), and I'm using mpt-7b_dolly_sft.yaml.
I'm having the same issue. I'm using the Docker image [2.0.1_cu117-python3.10-ubuntu20.04-aws] and don't use a venv. I can run hf_generate.py, but I can't run train.py.
Thanks, my mistake.
What's your question?
Probably: BFloat16 is not supported on MPS
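If that is the cause, one workaround is to fall back to full precision on that device. A minimal sketch, assuming the precision strings "amp_bf16" and "fp32" as they appear in the training YAMLs; `pick_precision` is a hypothetical helper, not part of any library, and whether bf16 works on MPS may depend on your PyTorch and macOS versions:

```python
def pick_precision(device_type: str) -> str:
    """Hypothetical helper: choose a precision string for a device type."""
    # Assumption: only request bf16 autocast on CUDA devices.
    # MPS and CPU fall back to plain fp32 to avoid the BFloat16 error.
    if device_type == "cuda":
        return "amp_bf16"
    return "fp32"


print(pick_precision("mps"))  # prints "fp32"
```

You would then put the returned value into the precision field of your config instead of hard-coding amp_bf16.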
I have the same issue. The output looks exactly like mine, but I don't get an OOM like bandish-shah suggests. I set 'export PYTHONUNBUFFERED=True' and still don't get any additional output...