Vincent Nguyen comments

Results: 123 comments


Vincent Nguyen

Training speed of Transformer-Big

In fact, I am at 21k on a DE to EN big on 6 GPUs. I don't see any issue with 6. Scaling seems almost linear vs. 4. I didn't...

Training speed of Transformer-Big

In case you're interested, I was mistaken before because I used ParaCrawl, which is in fact largely not clean at all (German segments on the English side and vice versa),...
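A rough sketch of how such wrong-language segments could be filtered out, assuming the corpus is held as (German, English) sentence pairs. The stopword lists and the names `guess_lang` and `filter_pairs` are hypothetical and only for illustration; a real language-identification tool would be far more reliable than this heuristic.

```python
# Tiny stopword lists used as language hints (illustrative, not exhaustive).
DE_HINTS = {"der", "das", "und", "ist", "nicht", "ich", "ein", "eine", "mit", "auf"}
EN_HINTS = {"the", "and", "is", "not", "of", "to", "a", "that", "with", "this"}


def guess_lang(sentence):
    """Return 'de', 'en', or None when the hints are tied or absent."""
    tokens = [t.strip(".,;:!?") for t in sentence.lower().split()]
    de = sum(t in DE_HINTS for t in tokens)
    en = sum(t in EN_HINTS for t in tokens)
    if de > en:
        return "de"
    if en > de:
        return "en"
    return None


def filter_pairs(pairs):
    """Drop (de, en) pairs where either side looks like the other language."""
    kept = []
    for de_sent, en_sent in pairs:
        if guess_lang(de_sent) == "en" or guess_lang(en_sent) == "de":
            continue  # segment appears to sit on the wrong side
        kept.append((de_sent, en_sent))
    return kept


pairs = [
    ("das ist ein Haus", "this is a house"),
    ("this is the house", "das ist das Haus"),  # sides swapped
    ("Hallo", "Hello"),                         # no hints, kept
]
clean = filter_pairs(pairs)
```

Pairs with no hint words on either side are kept on purpose, since the heuristic has no evidence against them.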

Training speed of Transformer-Big

I am using the nccl2 backend. I'll check this NVLink thing; I'm not sure if PyTorch already takes advantage of it. Is NVLink just pairing 2 GPUs or can it link as...

[won't merge - v1 codebase] Bert

@Zenglinxiao don't you think it would be possible to embed "bert_build_model" within "build_model"?

Export model to ONNX

See #1023. ONNX export compatibility requires some preliminary work:
- remove "object" like DecoderState, which is done by this PR
- change the way attentions flow, currently a dictionary but...

Export model to ONNX

As of now, just to export the transformer encoder, we need PyTorch to support the following operators: expand_as, masked_fill. Then we need to change the way attention is returned from the...

Export model to ONNX

With the current code it won't work. Also, if you want to try some preliminary steps, you need to export the encoder, the decoder and the generator separately. But again...

Export model to ONNX

I mixed up my comment, sorry.

Export model to ONNX

It is not as simple as this. There are several levels of complexity: 1) we need to use only operations that are ONNX compatible. Until recently some operations like expand_as,...

predict result repeat the same result

If accuracy does not go beyond 50, the likelihood is that something is wrong with your data (e.g. a misaligned corpus). Do some checks at various points in your dataset.
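A minimal sketch of what such checks might look like, assuming the source and target sides are already loaded as two lists of sentences. The function names and the `max_ratio` threshold of 3.0 are illustrative assumptions, not part of any toolkit.

```python
def sample_pairs(src_lines, tgt_lines, n_points=5):
    """Return evenly spaced (index, source, target) triples for eyeballing."""
    n = min(len(src_lines), len(tgt_lines))
    step = max(1, n // n_points)
    indices = list(range(0, n, step))[:n_points]
    return [(i, src_lines[i], tgt_lines[i]) for i in indices]


def check_alignment(src_lines, tgt_lines, max_ratio=3.0):
    """Return indices of pairs that look misaligned.

    A pair is flagged when one side is empty or when the token-count
    ratio between the two sides exceeds max_ratio (an assumed threshold).
    """
    if len(src_lines) != len(tgt_lines):
        raise ValueError(
            f"line count mismatch: {len(src_lines)} vs {len(tgt_lines)}"
        )
    suspicious = []
    for i, (src, tgt) in enumerate(zip(src_lines, tgt_lines)):
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if n_src == 0 or n_tgt == 0:
            suspicious.append(i)
        elif max(n_src, n_tgt) / min(n_src, n_tgt) > max_ratio:
            suspicious.append(i)
    return suspicious


src = ["das ist gut", "hallo", ""]
tgt = ["that is good", "a b c d e f g", "something"]
flagged = check_alignment(src, tgt)
samples = sample_pairs(src, tgt, n_points=3)
```

A differing line count between the two files is the strongest sign of misalignment, so the sketch raises on it rather than trying to continue. A large share of flagged pairs, or sampled pairs that do not translate each other, points to the same problem.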