Guolin Ke
yeah, it is better to use the same num_recycles for training and inference.
Even when I set `do_lowercase=False`, there are still some undesired tokens.
```
[S ##EP [SEP
```
@lhatsk we are waiting for the fix from modelscope team. will post the updates here.
> > @DimaMolod The data generation code is almost the same as the one used in inference, except for the label extraction from mmcif. @ZiyaoLi maybe we can add a...
@henrywotton you can report the issue to https://github.com/modelscope/modelscope
It is usually caused by the wrong `--user-dir`. You can try the absolute path.
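A minimal sketch of the absolute-path suggestion, assuming the command is launched from Python. The helper name `absolute_user_dir` and the directory `my_user_dir` are hypothetical and only illustrate turning a relative path into an absolute one before passing it to `--user-dir`.

```python
from pathlib import Path


def absolute_user_dir(user_dir: str) -> str:
    """Expand ~ and resolve a possibly relative directory to an absolute path."""
    return str(Path(user_dir).expanduser().resolve())


# "my_user_dir" is a placeholder; substitute the real plugin directory.
user_dir_arg = ["--user-dir", absolute_user_dir("my_user_dir")]
```

The resolved value no longer depends on the working directory the command happens to be started from.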
@octatour no, all documents are used; `truncation_level` is for the loss calculation. It is used to ensure at least one document in each pair (in the pair-wise loss accumulation) is above...
Voting-based parallel is better used with data that has a large #feature and #row. When #row is small, the accuracy may not be good. When #feature is small, its speed-up is...
Did you fully shuffle the rows before partitioning them? How many nodes did you use?
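For reference, one way to shuffle rows before partitioning might look like the sketch below. It is only an illustration with a hypothetical helper, `shuffled_partitions`, which works on row indices. It is not the partitioning code of any particular library.

```python
import random


def shuffled_partitions(num_rows: int, num_nodes: int, seed: int = 0) -> list[list[int]]:
    """Shuffle row indices, then deal them into num_nodes near-equal partitions."""
    indices = list(range(num_rows))
    random.Random(seed).shuffle(indices)
    # Striding over the shuffled order gives each node a random sample of rows.
    return [indices[node::num_nodes] for node in range(num_nodes)]


parts = shuffled_partitions(num_rows=10, num_nodes=3)
```

Each node then reads only the rows whose indices fall in its own partition.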
Voting parallel leverages the local data partition to find split candidates, and then a vote is conducted on the local top splits. Therefore, if the distribution of local data is...
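A toy sketch of the two steps described above, under simplifying assumptions: each worker is reduced to a dictionary of per-feature gains computed on its own partition, and the vote simply counts how often a feature appears in a local top list. The function `vote_on_features`, the feature names, and the gain values are all made up for illustration; the real implementation may differ in how candidates are scored and how many are kept.

```python
from collections import Counter


def vote_on_features(local_gains: list[dict[str, float]], top_k: int) -> list[str]:
    """Each worker nominates its top_k features by local gain; return the most-voted ones."""
    votes = Counter()
    for gains in local_gains:
        # Local step: rank features using only this worker's data partition.
        local_top = sorted(gains, key=gains.get, reverse=True)[:top_k]
        votes.update(local_top)
    # Global step: keep the most-voted features (ties broken by name for determinism).
    ranked = sorted(votes, key=lambda f: (-votes[f], f))
    return ranked[:top_k]


# Made-up gains for three workers.
workers = [
    {"f0": 0.9, "f1": 0.5, "f2": 0.1},
    {"f0": 0.8, "f1": 0.2, "f2": 0.6},
    {"f0": 0.4, "f1": 0.7, "f2": 0.3},
]
winners = vote_on_features(workers, top_k=2)
```

Each worker's nominations depend only on its own partition, so the outcome of the vote reflects what the local partitions look like.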