
PickScore: We used accelerate in our recent paper!

yuvalkirstain opened this issue 1 year ago • 3 comments

Hi,

In our recent paper, Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation, we created a repo that uses accelerate with DeepSpeed to train PickScore, a scoring function that achieves superhuman performance at predicting human preferences over text-to-image generations.
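For readers unfamiliar with this setup, an accelerate-with-DeepSpeed run is typically driven by an accelerate config file of roughly the following shape. This is a minimal sketch with made-up values (ZeRO stage, process counts, etc.); the repo's own configs are authoritative:

```yaml
# Hypothetical accelerate config enabling DeepSpeed ZeRO stage 2.
compute_environment: LOCAL_MACHINE
distributed_type: DEEPSPEED
deepspeed_config:
  zero_stage: 2
  gradient_accumulation_steps: 1
mixed_precision: fp16
num_machines: 1
num_processes: 8
```

A training script is then launched against it with something like `accelerate launch --config_file deepspeed_config.yaml train.py` (file and script names here are placeholders).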

This repo uses hydra, logs validation results (images and general metrics) to wandb, works in SLURM and multinode settings, saves checkpoints and clears old ones while keeping the best, and I think it is a nice example of using accelerate. It is also easy to extend to other text-image tasks.

We benefited a lot (!!!) from accelerate, so please check out our repo and perhaps consider adding it as a reference or example of a research paper that enjoyed accelerate (it also adds a lot of boilerplate that other related projects will need).

Thanks for being so great, absolutely love accelerate!

yuvalkirstain avatar May 05 '23 08:05 yuvalkirstain

Awesome! That's fantastic @yuvalkirstain! Would you be willing to put in a PR describing what's in there, so others can find it quickly in our example zoo documentation? (See https://github.com/huggingface/accelerate/blob/main/docs/source/usage_guides/training_zoo.mdx.) That's exactly the kind of material we want for that reference doc :)

(I may change/move this around to also showcase papers, as there are quite a few now :) )

muellerzr avatar May 05 '23 12:05 muellerzr

Made the PR in #1397 (feel free to change the description; I was not sure how to describe it). I think a lot of the examples in accelerate are good for playing around with fine-tuning models, but they require adding many features if someone wishes to conduct research. In this repo I tried to add the features that make it easier to experiment, produce results for a research paper, and train at scale (while still staying much, much leaner than PyTorch Lightning or other frameworks). Here is a list of some of them:

  1. Uses hydra for easy configuration.
  2. Can easily be extended to other tasks (language, diffusion, etc.); learning rate schedulers, optimizers, backbone models, etc. can be swapped easily.
  3. Logs many helpful things to wandb, e.g. images during validation, learning rate, total batch size, and more. This makes it easier to catch bugs and eyeball how your model is doing.
  4. Saves checkpoints during training; one can limit the number of checkpoints and select a metric by which to delete unnecessary ones (e.g. keep the 3 best-performing checkpoints according to an accuracy metric).
  5. Has instructions and the relevant adaptations for easily using accelerate with DeepSpeed plus multinode training.
  6. Supports SLURM (getting DeepSpeed + multinode + SLURM working together took a bit of time).
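The checkpoint-pruning idea in point 4 can be sketched in a few lines of plain Python. Note that `prune_checkpoints` and its signature are hypothetical illustrations of the technique, not the repo's actual API:

```python
import os
import shutil
import tempfile

def prune_checkpoints(ckpt_root, metrics, keep=3, higher_is_better=True):
    """Keep only the `keep` best checkpoint directories under `ckpt_root`.

    `metrics` maps a checkpoint directory name to its validation metric
    (e.g. accuracy). Worse checkpoints are deleted from disk; the names
    of the survivors are returned, best first.
    """
    # Rank checkpoint names by metric (descending when higher is better).
    ranked = sorted(metrics, key=metrics.get, reverse=higher_is_better)
    # Delete everything past the top `keep`.
    for name in ranked[keep:]:
        path = os.path.join(ckpt_root, name)
        if os.path.isdir(path):
            shutil.rmtree(path)
    return ranked[:keep]
```

In a real training loop this would be called after each validation pass, with `metrics` accumulated from wandb-logged (or locally tracked) validation scores.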

yuvalkirstain avatar May 07 '23 07:05 yuvalkirstain

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

github-actions[bot] avatar Jun 04 '23 15:06 github-actions[bot]