Ahmed Elnaggar

Results 41 comments of Ahmed Elnaggar

Thanks a lot @manzilz for your reply. So, we need to disable "preprocessed_data" and perform masking on the fly. Could you please give us a concrete example for using on...

Thanks again for your explanation. Could you please provide the code for generating the tfrecords? Unfortunately, if we use the BERT tfrecords generator, it will combine two sequences per sample....

@LifeIsStrange Thanks for the links. I already know both of them, but as you know, they support only BERT and GPT, not XLNet. For my use-case, I am...

@huseinzol05 This is multi-GPU training on a single node. I am asking about distributed GPU training across multiple nodes.

Both your code and the official code use "MirroredStrategy", which works for single-node multi-GPU training. To make it work across multiple nodes, a "MultiWorkerMirroredStrategy" should be used....
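
To illustrate the multi-node point above, here is a minimal sketch of the per-node `TF_CONFIG` environment variable that TensorFlow's multi-worker strategies read to discover the cluster. The helper name `build_tf_config` and the host addresses are hypothetical and only for illustration; this is not taken from either code base discussed here.

```python
import json
import os


def build_tf_config(worker_hosts, task_index):
    """Return the TF_CONFIG JSON string for one worker in a multi-node job.

    worker_hosts: list of "host:port" strings, identical on every node.
    task_index:   position of the current node in worker_hosts.
    (Hypothetical helper, written for illustration only.)
    """
    if not 0 <= task_index < len(worker_hosts):
        raise ValueError("task_index must point at an entry of worker_hosts")
    config = {
        "cluster": {"worker": list(worker_hosts)},
        "task": {"type": "worker", "index": task_index},
    }
    return json.dumps(config)


if __name__ == "__main__":
    # Placeholder addresses; replace them with the real nodes of the cluster.
    hosts = ["node0.example.com:12345", "node1.example.com:12345"]
    # On the second node the index would be 1 instead of 0.
    os.environ["TF_CONFIG"] = build_tf_config(hosts, task_index=0)
    # With TF_CONFIG set on every node, the training script would then create
    # tf.distribute.MultiWorkerMirroredStrategy() in place of MirroredStrategy
    # and build the model inside strategy.scope().
    print(os.environ["TF_CONFIG"])
```

Each node runs the same script with the same host list and its own index, so only `task_index` differs between machines.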

Thanks for the information, but I am looking for more advanced, large-scale distributed training, using Horovod for example.

Here is a small example that reproduces the problem of my big model: ``` m = torch.nn.Bilinear(20, 30, 40).to("cuda") input1 = torch.randn(128, 20).to("cuda") input2 = torch.randn(128, 30).to("cuda") output = m(input1,...

JIT tracing works fine; the error comes from the TensorRT compile step. Any ideas how to make it work?

@emeryberger Any idea why Scalene is not detecting the correct memory usage?

More information: Our application is mainly based on the following library: https://github.com/msg-systems/holmes-extractor/tree/master. It uses "multiprocessing" and "threading", but as far as I understand, Scalene can detect and profile them.