Romain Beaumont

Results 2297 comments of Romain Beaumont

yes I'm surprised how much this is confusing people

I joined it with the fasta file (on the uniprot_name field, which is the RepId in fasta). I used this code:

```py
import dask.dataframe as dd
from dask.distributed import Client
...
```
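To make the join key concrete, here is a minimal pure-Python sketch of the same idea. It is not the dask code from the comment. It assumes UniRef-style FASTA headers that carry a `RepID=` token and an annotation table with a `uniprot_name` column. The helper names (`parse_fasta`, `rep_id`, `join_annotations`) and the sample records are hypothetical and only for illustration.

```python
import csv
import io


def parse_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one first.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def rep_id(header):
    """Return the value of the RepID=... token in a header, or None."""
    for token in header.split():
        if token.startswith("RepID="):
            return token[len("RepID="):]
    return None


def join_annotations(fasta_handle, annotation_rows, key="uniprot_name"):
    """Inner join: keep sequences whose RepID matches an annotation row."""
    by_name = {row[key]: row for row in annotation_rows}
    joined = []
    for header, seq in parse_fasta(fasta_handle):
        row = by_name.get(rep_id(header))
        if row is not None:
            joined.append({**row, "sequence": seq})
    return joined


# Made-up sample data, only to exercise the join.
FASTA = """>UniRef90_X1 Example protein n=2 Tax=Example TaxID=1 RepID=X1_EXAMP
MKT
AYI
>UniRef90_X2 Other protein n=1 Tax=Example TaxID=1 RepID=X2_EXAMP
GGS
"""
ANNOTATIONS = "uniprot_name,annotation\nX1_EXAMP,kinase\nX3_EXAMP,unmatched\n"

rows = list(csv.DictReader(io.StringIO(ANNOTATIONS)))
joined = join_annotations(io.StringIO(FASTA), rows)
print(joined)
```

Only records present on both sides survive, so `X2_EXAMP` (no annotation) and `X3_EXAMP` (no sequence) are dropped. At the real dataset's size an in-memory dict would likely not fit, which is presumably why dask was used.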

That's a simple way to read parquet as a torch dataset:

```py
import pyarrow as pa
import pyarrow.parquet as pq
import pyarrow.dataset as ds
import pandas as pd
from...
```

The dataset can be downloaded (it's 50GB) by running `!wget --recursive --no-parent -nd -P uniref90_with_annotations http://3080.rom1504.fr/uniref90_with_annotations/`. Example on colab: https://colab.research.google.com/drive/1Zcns30b1H3IcxMJ-A-wQDF6pUcyNL5ei?usp=sharing

Accelerate merged the PR but we need to wait for a release of accelerate before merging here

> When set to True, DDP knows the trained graph is static. Static graph means 1) The set of used and unused parameters will not change during the whole training...

https://github.com/huggingface/accelerate/pull/637 here's the fix + adding an option here, will PR later

https://github.com/lucidrains/DALLE2-pytorch/pull/226

fyi 2x batch increase possible (ram saving) using grad checkpointing once this is fixed
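For context, a minimal sketch of what gradient checkpointing looks like in plain PyTorch, assuming a recent version where `torch.utils.checkpoint.checkpoint` accepts `use_reentrant=False`. The `Block` and `Net` modules are hypothetical toy models, not code from the repository. The sketch only checks that gradients match with and without checkpointing; the actual memory saving (and whether it allows a 2x batch) depends on the model.

```python
import torch
from torch import nn
from torch.utils.checkpoint import checkpoint


class Block(nn.Module):
    """Toy residual MLP block (hypothetical, for illustration only)."""

    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.net(x)


class Net(nn.Module):
    def __init__(self, dim=16, depth=4, use_checkpoint=False):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))
        self.use_checkpoint = use_checkpoint

    def forward(self, x):
        for block in self.blocks:
            if self.use_checkpoint and self.training:
                # Do not store this block's activations; recompute them
                # during backward, trading extra compute for memory.
                x = checkpoint(block, x, use_reentrant=False)
            else:
                x = block(x)
        return x


torch.manual_seed(0)
model = Net()
x = torch.randn(8, 16)

# Reference gradients without checkpointing.
model.use_checkpoint = False
model.zero_grad()
model(x).sum().backward()
ref = [p.grad.clone() for p in model.parameters()]

# Same forward/backward with checkpointing enabled.
model.use_checkpoint = True
model.zero_grad()
model(x).sum().backward()
ckpt = [p.grad.clone() for p in model.parameters()]

grads_match = all(torch.allclose(a, b) for a, b in zip(ref, ckpt))
print(grads_match)
```

Checkpointing each block means only the block inputs are kept during the forward pass, which is where the RAM saving comes from, at the cost of one extra forward per block during backward.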

It's a different model, yes. And the improvement is quite significant on many evaluation datasets: on classification tasks, but also particularly on retrieval. https://laion.ai/blog/large-openclip/