Tatiana Likhomanenko

Results 242 comments of Tatiana Likhomanenko

We are planning to remove the arch file entirely and switch to plugins. What do you think about it?

The problem now is that, for example, the transformer forward needs an extra parameter such as a mask, and if you do crazy stuff with a resnet block with a transformer...

Are you using old docker images? The current docker images are for CUDA 11.1. TextDatasetTest is probably again related to the docker memory. Can you run ModuleTest.ConvolutionFwd separately (my guess...

Can you try changing 1E-5 to 1E-3 or 1E-4 for the failed tests? If 1E-3/1E-4 passes, you are good. This probably depends on many factors. At least I saw...
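To illustrate why loosening the threshold can turn a failing numerical test into a passing one, here is a minimal Python sketch. It assumes the 1E-5 value is an absolute tolerance on element-wise differences; the helper `all_close` and the sample values are hypothetical and are not taken from the flashlight test suite.

```python
import math

def all_close(actual, expected, abs_tol):
    """Return True if every pair differs by at most abs_tol (absolute tolerance only)."""
    return all(
        math.isclose(a, e, rel_tol=0.0, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )

# Hypothetical reference values and a result that is off by a few 1e-5,
# as can happen across different hardware or library versions.
expected = [0.5, 1.25, -2.0]
actual = [0.50004, 1.24997, -2.00002]

strict = all_close(actual, expected, 1e-5)  # tight tolerance
loose = all_close(actual, expected, 1e-4)   # relaxed tolerance

print("passes at 1e-5:", strict)
print("passes at 1e-4:", loose)
```

If the relaxed tolerance passes, the mismatch is small numerical noise rather than a logic error.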

This model was trained with the old codebase, which is why it cannot currently be reused with the new codebase. Solutions: - use the particular branch/commit with which the model was trained...

Model conversion will be available here: https://github.com/facebookresearch/flashlight/pull/524

cc @xuqiantong In the meantime, a good reference is https://github.com/flashlight/wav2letter/tree/main/recipes/streaming_convnets/inference where online decoding is done (and this API is used).

What versions of w2l and fl are you using? When we moved the w2l codebase into fl, the namespacing of some blocks was changed, so you need to convert models into...

Could you try another optimizer, e.g. adagrad instead of sgd? (sgd can be much trickier for transformer models.)
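One commonly cited reason for this difference is that adagrad rescales each step by the accumulated gradient magnitude, while plain sgd applies the raw gradient. The sketch below uses the textbook update rules on a single scalar weight; it is not flashlight's implementation, and the function names, learning rate and gradient value are hypothetical.

```python
import math

def sgd_step(weight, grad, lr):
    """Plain SGD: the step size scales directly with the gradient."""
    return weight - lr * grad

def adagrad_step(weight, grad, accum, lr, eps=1e-10):
    """Textbook Adagrad: divide the step by the root of the summed squared gradients."""
    accum = accum + grad * grad
    weight = weight - lr * grad / (math.sqrt(accum) + eps)
    return weight, accum

lr = 0.1
big_grad = 100.0  # hypothetical large gradient early in training

w_sgd = sgd_step(0.0, big_grad, lr)                # moves by lr * grad
w_ada, acc = adagrad_step(0.0, big_grad, 0.0, lr)  # first step is about lr in size

print("sgd weight after one step:    ", w_sgd)
print("adagrad weight after one step:", w_ada)
```

With the same learning rate, a single large gradient moves the sgd weight far more than the adagrad weight, which is why sgd usually needs more careful learning-rate tuning.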