other-ones

Results: 8 comments of other-ones

Hi, thanks for the comment. It seems the model is not being trained. The images below were generated after 25k iterations using the inference.py script. ![results](https://github.com/microsoft/unilm/assets/25217277/a23a1fcf-7066-43a5-af3b-4458fd3e207a) Could you...

I upgraded xformers from 0.0.16 to 0.0.17 as suggested in the code, and it's now producing some plausible results. However, I'm facing an increase in GPU memory consumption...
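For what it's worth, a minimal sketch of the two memory-relevant switches in a diffusers-style setup (this is not the actual train.py code, and the checkpoint name is only illustrative): enabling xformers memory-efficient attention and gradient checkpointing.

```
import xformers
from diffusers import UNet2DConditionModel

# Illustrative checkpoint only; the real train.py does its own model loading.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)

print("xformers version:", xformers.__version__)  # 0.0.16 vs 0.0.17

# Memory-efficient attention via xformers; behaviour (and memory use) can
# differ between xformers releases.
unet.enable_xformers_memory_efficient_attention()

# Gradient checkpointing trades extra compute for lower activation memory,
# corresponding to the --gradient_checkpointing training flag.
unet.enable_gradient_checkpointing()
```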

This is the command I used:

```
export CUDA_VISIBLE_DEVICES=0,1,2,3
export PYTHONPATH=/home/twkim/project/textdiffuser
accelerate launch --main_process_port 2941 train.py \
  --train_batch_size=1 \
  --gradient_accumulation_steps=1 \
  --gradient_checkpointing \
  --mixed_precision="fp16" \
  --num_train_epochs=2000000 \
  --max_train_steps=2000000000 \
  --learning_rate=1e-5 \
  ...
```

Hi, thanks for the suggestion. I downgraded from 0.0.17 to xformers==0.0.16, the same as your config, and the memory consumption decreased. But the output became complete noise again...

Could you also let me know which GPU model you used for training? I see this message in train.py: > "xFormers 0.0.16 cannot be used for training in some GPUs....
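For context, this is roughly how that guard appears in diffusers-style training scripts (a sketch only; the exact code in train.py may differ):

```
import warnings
from packaging import version
from diffusers.utils import is_xformers_available

# Sketch of the xformers version guard as it typically appears in
# diffusers-style training scripts; the exact code in train.py may differ.
if is_xformers_available():
    import xformers

    if version.parse(xformers.__version__) == version.parse("0.0.16"):
        warnings.warn(
            "xFormers 0.0.16 cannot be used for training in some GPUs. "
            "If you observe problems during training, please update xFormers "
            "to at least 0.0.17."
        )
```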

Hi, I've tried that before with xformers==0.0.16 and torch==1.13.1, but then the model does not get trained (it produces complete noise). 1. With xformers==0.0.16 and torch==1.13.1 -> not getting trained. I think...
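To make our setups easier to compare, here is a quick environment dump that could be run on both sides (a minimal sketch; it only prints the installed versions and the GPU model):

```
import torch
import xformers

# Print the exact version/GPU combination so the working and failing
# setups can be compared directly.
print("torch:", torch.__version__)
print("cuda:", torch.version.cuda)
print("xformers:", xformers.__version__)
if torch.cuda.is_available():
    print("gpu:", torch.cuda.get_device_name(0))
```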

> Thanks for your feedback. It is a mistake and the command should be:
>
> ```
> img2dataset --url_list=url.txt --output_folder=laion_ocr --thread_count=64 --resize_mode=no
> ```
>
> We will fix...

Hi, have you solved this issue? I'm facing the same thing.