datavizweb

Results 5 comments of datavizweb

@dingevin did it help to run on P100s/V100s? I am getting around ~6-7 seconds per step (Steps/second: 0.114680) on V100s (single host with 8 GPUs) in async mode. I...

@dingevin we can generate random features (batches in memory) and feed them to the model from input_generator. If it still takes 6-7 seconds per step, I/O is not the issue. We...
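The isolation test described above can be sketched as follows. This is a minimal, framework-agnostic illustration, not the recipe's actual input_generator: the generator name, batch shapes, and dimensions are placeholders you would replace with your model's real input spec.

```python
import random

def random_batch_generator(batch_size=8, seq_len=1000, feat_dim=80):
    """Yield batches of random features, bypassing the real data pipeline.

    If per-step time stays at ~6-7 s when training is fed from this
    generator, the data pipeline (I/O) is not the bottleneck and the
    slowdown is in compute or communication.
    """
    while True:
        # Nested lists stand in for a [batch, time, feature] tensor.
        yield [[[random.random() for _ in range(feat_dim)]
                for _ in range(seq_len)]
               for _ in range(batch_size)]

# Usage: swap this in for the real input pipeline and time a few steps.
gen = random_batch_generator(batch_size=2, seq_len=4, feat_dim=3)
batch = next(gen)
print(len(batch), len(batch[0]), len(batch[0][0]))  # → 2 4 3
```

In a real run you would build the random tensors once and reuse them, so that even random-number generation cost is excluded from the step time.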

I am seeing the same issue with the libri recipe. While running the libri grapheme recipe (with default params, no changes to the recipe) I see that the loss starts decreasing over steps. But...

In async mode I am seeing the same issue. It stops after, say, 14k steps and GPU utilization drops to zero. Memory usage remains the same. Unlike sync mode (previous...

It is difficult to tell why no improvement is seen without looking into the implementation. I would, however, make sure that the trained LM is good enough. You could do n-best...
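The n-best rescoring check suggested above can be sketched like this. It is a generic illustration, not code from the thread: `rescore_nbest`, the toy LM, and the interpolation weight are all hypothetical, and a real check would plug in the actual trained LM's log-probabilities.

```python
def rescore_nbest(nbest, lm_score, lm_weight=0.5):
    """Rerank n-best hypotheses by interpolating AM and LM scores.

    nbest:     list of (hypothesis, am_score) pairs, higher score = better
    lm_score:  callable returning an LM log-probability for a hypothesis
    lm_weight: interpolation weight for the LM score
    """
    rescored = [(hyp, am + lm_weight * lm_score(hyp)) for hyp, am in nbest]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)

# Toy stand-in LM that simply prefers shorter hypotheses; a real check
# would use log-probs from the trained LM here.
def toy_lm(hyp):
    return -float(len(hyp.split()))

nbest = [("the cat sat", -10.0), ("the cat sat down", -9.5)]
best_hyp, best_score = rescore_nbest(nbest, toy_lm, lm_weight=1.0)[0]
print(best_hyp)  # → the cat sat
```

If rescoring with the trained LM fails to improve WER on such lists while a known-good LM does, that points to LM quality rather than the decoder integration.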