Thomas Capelle

169 comments by Thomas Capelle

I also came here to point to https://missing.csail.mit.edu/

Do we handle serialization of packed datasets? Are we running the packing only on rank 0? By the same logic, shouldn't we compute the block triu on the fly so we don't store...

There is a tradeoff: pre-computing them is faster, but if we serialize them it would be very memory-expensive. We can compute them from the position ids, maybe storing them as...
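
As a rough illustration of computing the mask on the fly from position ids, here is a minimal NumPy sketch. It assumes that position ids restart at 0 at the beginning of every packed sample and that the pack contains no padding; the function name `block_causal_mask` is hypothetical and not taken from any library.

```python
import numpy as np


def block_causal_mask(position_ids):
    """Build a boolean (seq_len, seq_len) mask for one packed sequence.

    Assumption: position ids restart at 0 at the start of each packed
    sample, e.g. [0, 1, 2, 0, 1] holds two samples of length 3 and 2.
    True means "query i may attend to key j".
    """
    position_ids = np.asarray(position_ids)
    # A new sample starts wherever the position id resets to 0.
    sample_ids = np.cumsum(position_ids == 0) - 1
    # Tokens may only attend to tokens of the same sample...
    same_sample = sample_ids[:, None] == sample_ids[None, :]
    # ...and only to themselves or earlier positions (causal).
    n = len(position_ids)
    causal = np.tril(np.ones((n, n), dtype=bool))
    return same_sample & causal


mask = block_causal_mask([0, 1, 2, 0, 1])
print(mask.astype(int))
```

With this approach only the position ids (one integer per token) need to be stored, and the quadratic mask is rebuilt per batch, which is the memory-versus-speed tradeoff described above.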

This is more about the naming of the padding functions:
> Collating happens at the sample level for packed datasets with _padded_collate_packed, so we do not use a collator in...

I re-ran both SFT and DPO here in case you want to check. I manually passed the total training steps and added the missing warmup. Also, if you look at...

https://wandb.ai/capecape/train_sd/reports/How-To-Train-a-Conditional-Diffusion-Model-From-Scratch--VmlldzoyNzIzNTQ1

Man, it is time to start annotating. I am currently using a Unet to segment clouds from our sky imager and using https://github.com/Britefury/django-labeller to make the segmentation mask. With only...

> @tcapelle that's really interesting to know that Unet is performing well for cloud masks. Just curious: Have you tried the [Axial DeepLab](https://arxiv.org/abs/2003.07853) approach for segmenting clouds? (I know I'm...

I am curious about this, please let us know once you have tried it. I have found that `CoorConvs` consistently help to get better results; they act as positional embeddings. ```python class...
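
The class in the comment above is cut off, so the following is not the author's code. It is only a minimal NumPy sketch of the idea usually called CoordConv: append normalized y and x coordinate channels to the input so that a following convolution can see where each pixel is. The function name `add_coord_channels` and the choice of the range [-1, 1] are assumptions made for illustration.

```python
import numpy as np


def add_coord_channels(x):
    """Append two coordinate channels to a batch of feature maps.

    x: float array of shape (batch, channels, height, width).
    Returns an array of shape (batch, channels + 2, height, width) whose
    last two channels hold the y and x coordinates scaled to [-1, 1].
    """
    b, _, h, w = x.shape
    ys = np.linspace(-1.0, 1.0, h, dtype=x.dtype)
    xs = np.linspace(-1.0, 1.0, w, dtype=x.dtype)
    # yy varies along the height axis, xx along the width axis.
    yy, xx = np.meshgrid(ys, xs, indexing="ij")
    coords = np.stack([yy, xx])[None]               # (1, 2, h, w)
    coords = np.broadcast_to(coords, (b, 2, h, w))  # repeat per batch item
    return np.concatenate([x, coords], axis=1)


features = np.zeros((2, 3, 4, 5), dtype=np.float32)
print(add_coord_channels(features).shape)
```

In a segmentation network, the output of such a function would be fed to an ordinary convolution that accepts two extra input channels.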