Marcin Junczys-Dowmunt comments

Results: 258 comments by Marcin Junczys-Dowmunt

Float16 does not work

Hi, it might also not be worth it. If I am not wrong, float16 is artificially capped on gamer hardware such as the GTX 1080, to laughable performance, about 30x slower...

Float16 does not work

Oh. In that case carry on :)

Float16 does not work

Interesting. The thing is, it should not be faster. F16 arithmetic is severely capped. We once benchmarked cuBLAS hgemm vs. sgemm on a GTX 1080, and it was slower by a factor of...

Float16 does not work

Yeah, maybe on the CPU as well? Are float16 operations faster on our CPUs?

Suggestions for GPUs

I would say two GPUs are preferable to one. With synchronous SGD the RAM in the two cards basically adds up in terms of batch size (not model size though)...
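To make the batch-size point concrete, here is a minimal toy sketch in plain Python. It is not Marian code: the 1-D linear model, the data, and the function names `grad_mse` and `sync_sgd_grad` are all assumptions made up for illustration. It shows that averaging the gradients from two equal-size shards gives the same update as one device processing the whole batch.

```python
# Toy illustration (not Marian code): synchronous data-parallel SGD
# on a 1-D linear model y ~ w * x. All names and data are hypothetical.

def grad_mse(w, xs, ys):
    """Gradient of the mean squared error with respect to w."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def sync_sgd_grad(w, shards):
    """Average the per-device gradients, as a synchronous update would."""
    grads = [grad_mse(w, xs, ys) for xs, ys in shards]
    return sum(grads) / len(grads)


# A global batch of 8 examples, split evenly over 2 simulated devices.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1]
w = 0.5  # every device holds its own full copy of the model parameter

shards = [(xs[:4], ys[:4]), (xs[4:], ys[4:])]
full_batch_grad = grad_mse(w, xs, ys)
two_device_grad = sync_sgd_grad(w, shards)

# With equal-size shards the averaged gradient matches the full-batch one,
# so two devices holding 4 examples each act like one device holding 8.
print(abs(full_batch_grad - two_device_grad) < 1e-9)
```

Note that `w` is replicated on every simulated device in this sketch, which is why the memory of the cards adds up for batch size but not for model size.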

Multi-Task Training with BERT in Marian?

In my research branch, that is not properly merged yet. I can point you to the very experimental code. In hindsight and after more experiments I cannot currently confirm that...

Using 16-bit floating point

I just got a couple of Voltas to play around with. So this is the next big work item.

Using 16-bit floating point

This is quite exploratory. I plan to have fp16 after Christmas, but don't take my word for it. I first need to learn how that works :)

How to guide translation from the context of previous sentences.

Hi, the problem here depends on how you have implemented your document-level system. With current Marian I would say there are two ways to achieve that out-of-the-box with no to...

all shards must have the same size -- problem with 6GPUs but not with 5

I also do not recommend using NCCL with a number of GPUs that is not a power of 2. While you may get a small performance improvement going from, say,...