
Results 14 comments of yurakuratov

Yes, we could cast report outputs to default Python types in `nn_trainer`. @yoptar
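
For illustration, a minimal sketch of what such casting could look like (the `to_builtin` helper and the report structure are hypothetical, not actual `nn_trainer` code):

```python
import json
import numpy as np

def to_builtin(obj):
    # Hypothetical helper: recursively cast numpy scalars/arrays in a metrics
    # report to built-in Python types so the report can be JSON-serialized.
    if isinstance(obj, dict):
        return {k: to_builtin(v) for k, v in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [to_builtin(v) for v in obj]
    if isinstance(obj, np.generic):
        return obj.item()
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    return obj

report = {"accuracy": np.float64(0.913), "n_examples": np.int64(1200)}
print(json.dumps(to_builtin(report)))  # serializes cleanly after casting
```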

Yes, just set `verbosity=0` in the `amp.initialize` call.
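
A minimal sketch, assuming NVIDIA Apex mixed-precision training on a GPU (the model and optimizer here are placeholders):

```python
import torch
from apex import amp

model = torch.nn.Linear(10, 2).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

# verbosity=0 silences Apex's logging during initialization
model, optimizer = amp.initialize(model, optimizer, opt_level="O1", verbosity=0)
```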

We have replicated BigBird pre-training on the more recent T2T human genome assembly. The model is available via HuggingFace: https://huggingface.co/AIRI-Institute/gena-lm-bigbird-base-t2t. Any kind of feedback is welcome!
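
A minimal loading sketch with the `transformers` library (whether `trust_remote_code=True` is actually required depends on how the checkpoint is packaged, so treat that flag and the example input as assumptions):

```python
from transformers import AutoTokenizer, AutoModel

name = "AIRI-Institute/gena-lm-bigbird-base-t2t"
tokenizer = AutoTokenizer.from_pretrained(name)
# trust_remote_code may not be needed for this checkpoint (assumption)
model = AutoModel.from_pretrained(name, trust_remote_code=True)

inputs = tokenizer("ATCGGCTAACGTA", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)
```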

Hi! We also encountered an OOM issue while training the tokenizer. To overcome this problem, we sampled 10 x 10^6 random subsequences from the whole dataset and trained the tokenizer on them.
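
A rough sketch of this idea, assuming a plain-text file with one long sequence per line and the HuggingFace `tokenizers` library (the file name, subsequence length, and vocabulary size below are illustrative, not our exact settings):

```python
import random
from tokenizers import Tokenizer, models, trainers

N_SAMPLES = 10_000_000   # 10 x 10^6 random subsequences
SUBSEQ_LEN = 10_000      # illustrative window length

def sample_subsequences(path, n_samples, subseq_len):
    # Load sequences once, then yield random fixed-length windows from them,
    # so the trainer never sees the whole dataset at once.
    with open(path) as f:
        seqs = [line.strip() for line in f if line.strip()]
    for _ in range(n_samples):
        seq = random.choice(seqs)
        start = random.randrange(max(1, len(seq) - subseq_len))
        yield seq[start:start + subseq_len]

tokenizer = Tokenizer(models.BPE(unk_token="[UNK]"))
trainer = trainers.BpeTrainer(
    vocab_size=32_000,
    special_tokens=["[UNK]", "[PAD]", "[CLS]", "[SEP]", "[MASK]"],
)
tokenizer.train_from_iterator(
    sample_subsequences("genome_sequences.txt", N_SAMPLES, SUBSEQ_LEN),
    trainer=trainer,
)
tokenizer.save("tokenizer.json")
```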

Hi! It seems that triton 1.0.0 requires Python 3.9 or lower. I am successfully running our models with Python 3.8 and triton 1.0.0. Please check your Python version. Yes,...

You can also try using triton 1.1.1 as mentioned here: https://github.com/yurakuratov/t5-experiments#triron-111, but you will need to install the DeepSpeed fork from that instruction.

Could you try installing transformers==4.17.0 with `!pip install transformers==4.17.0`?

Hi, @aaronmaiww! I have just updated the README section on requirements for sparse models: https://github.com/AIRI-Institute/GENA_LM#deepspeed-for-sparse-ops. Hope you find it useful.

Hi! Great question! Theoretically, the number of operations for full attention will always be higher than for sparse attention, because sparse attention removes entire blocks of the attention matrix from...
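
As a back-of-the-envelope illustration (the numbers are purely for example and not GENA-LM's exact configuration), here is a count of attention-score entries for full vs. block-sparse attention:

```python
# Rough count of attention-score entries per head for a single sequence.
# Illustrative numbers only: a 4096-token sequence split into 64-token blocks.
seq_len = 4096
block_size = 64
n_blocks = seq_len // block_size

# Full attention touches every query-key pair.
full_ops = seq_len * seq_len

# Block-sparse (BigBird-style) attention keeps only a few key blocks per
# query block: e.g. 3 sliding-window blocks, 2 global blocks, 3 random blocks.
kept_blocks_per_row = 3 + 2 + 3
sparse_ops = n_blocks * kept_blocks_per_row * block_size * block_size

print(full_ops, sparse_ops, full_ops / sparse_ops)  # ~8x fewer entries here
```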

Hi! We have not done a comparison with DNABERT-2. Could you share more details on how you run GENA-LM on their benchmarks? This will help us identify the issue.