Results 3 issues of Dabulv

When predicting answers, I sometimes get an empty answer even though the score is high enough. I finally located the problem: a function called `_get2` at lines 284 and 382 in file...

In `model.py/_build_loss()`, when computing the average start-position loss `ce_loss`, `loss_mask` is used to exclude samples where `0 == len(question)`. However, when computing the average loss...
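To make the masking described above concrete, here is a minimal sketch of a masked average over per-sample losses. It is plain Python rather than the repository's code: the function name `masked_mean_loss` and the sample values are hypothetical, and it assumes the mask holds 1.0 for samples to keep and 0.0 for samples to skip (for example, those with an empty question).

```python
def masked_mean_loss(per_sample_loss, loss_mask):
    """Average the losses of the samples whose mask entry is 1.0.

    Hypothetical helper for illustration; not taken from model.py.
    """
    # Masked samples are multiplied by 0.0, so they add nothing to the sum.
    masked_sum = sum(loss * keep for loss, keep in zip(per_sample_loss, loss_mask))
    # Divide by the number of kept samples rather than the batch size,
    # and guard against a batch in which every sample is masked out.
    kept = sum(loss_mask)
    return masked_sum / max(kept, 1.0)


losses = [2.0, 4.0, 9.0]   # made-up per-sample losses
mask = [1.0, 1.0, 0.0]     # third sample has an empty question

print(masked_mean_loss(losses, mask))   # 3.0
print(sum(losses) / len(losses))        # 5.0 (unmasked mean, for comparison)
```

The point of the comparison is that an unmasked mean lets the skipped sample shift the average, so any loss term averaged without the mask would be weighted differently from one averaged with it.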

I0427 14:37:27.545102 21654 parallel_executor.cc:440] The Program will be executed on CUDA using ParallelExecutor, 1 cards are used, so 1 programs are executed in parallel.
I0427 14:37:27.635213 21654 build_strategy.cc:365] SeqOnlyAllReduceOps:0, num_trainers:1...