Guolin Ke comments

Results: 163 comments of Guolin Ke

How `num_recycles` work in UnimolConfGModel?

Yes, it is better to use the same `num_recycles` for training and inference.

bug when using special token with uppercase

Even when I set `do_lowercase=False`, there are still some undesired tokens:
```
[S ##EP [SEP
```

Training Dataset

@lhatsk We are waiting for the fix from the modelscope team and will post updates here.

Training Dataset

> > @DimaMolod The data generation code is almost the same as the one used in inference, except for the label extraction from mmcif. @ZiyaoLi maybe we can add a...

Training Dataset

@henrywotton You can report the issue at https://github.com/modelscope/modelscope

A question about using unimol-plus with unicore

This is usually caused by a wrong `--user-dir`. Try using the absolute path.
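
One way to rule out a relative-path problem is to resolve the directory to an absolute path before passing it to `--user-dir`. The sketch below only shows the path resolution; the directory name `./unimol_plus` is a hypothetical placeholder, not a path taken from the issue.

```python
import os

# Hypothetical relative path to the plugin directory; replace with your own.
relative_user_dir = "./unimol_plus"

# Resolve it against the current working directory so the value no longer
# depends on where the training command is launched from.
absolute_user_dir = os.path.abspath(relative_user_dir)

# Print the flag in the form it would be passed on the command line.
print(f"--user-dir {absolute_user_dir}")
```

If the printed path does not point at the directory you expect, the working directory at launch time is the likely culprit.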

feature request: Add parameter to control maximum group size for Lambdarank

@octatour No, all documents are used; truncation_level only affects the loss calculation. It is used to ensure that at least one document in each pair (in the pair-wise loss accumulation) is above...
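
A minimal sketch of the pair selection described here, under one assumption: a pair contributes to the loss only if at least one of its documents sits within the top `truncation_level` positions of the current ranking. The function name `candidate_pairs` is hypothetical and this is an illustration, not the library's actual code.

```python
def candidate_pairs(num_docs, truncation_level):
    """Return index pairs (i, j) with i < j over documents sorted by score.

    Assumption: index 0 is the highest-scored document, and a pair is kept
    only when its better-ranked member i lies inside the truncation window.
    """
    pairs = []
    # i is restricted to the top `truncation_level` ranks ...
    for i in range(min(num_docs - 1, truncation_level)):
        # ... while j still ranges over every lower-ranked document.
        for j in range(i + 1, num_docs):
            pairs.append((i, j))
    return pairs


pairs = candidate_pairs(num_docs=4, truncation_level=1)
docs_in_pairs = sorted({d for pair in pairs for d in pair})
```

With four documents and a truncation level of 1, every document still appears in some pair, which matches the point that truncation limits the pairs in the loss rather than dropping documents.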

Model performance degradation when switch from `data` to `voting` in distributed learning algorithm

Voting-based parallel learning is best used with data that has a large number of features and rows. When the number of rows is small, the accuracy may not be good. When the number of features is small, its speed-up is...
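
For reference, switching between the two modes is a one-parameter change in the training configuration. The sketch below builds such a parameter dictionary; `tree_learner`, `num_machines`, and `top_k` are LightGBM parameter names, while the helper `make_params` and all values shown are illustrative assumptions rather than settings from the issue.

```python
def make_params(tree_learner, num_machines):
    """Build an illustrative LightGBM parameter dict for distributed training."""
    params = {
        "objective": "binary",         # assumed task, for illustration only
        "tree_learner": tree_learner,  # "data" or "voting"
        "num_machines": num_machines,
    }
    if tree_learner == "voting":
        # top_k is only used by the voting learner; 20 is an example value.
        params["top_k"] = 20
    return params


data_params = make_params("data", num_machines=4)
voting_params = make_params("voting", num_machines=4)
```

Keeping everything else identical between the two dictionaries makes it easier to attribute an accuracy difference to the learner itself.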

Model performance degradation when switch from `data` to `voting` in distributed learning algorithm

Did you fully shuffle the rows before partitioning them? How many nodes did you use?
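
A minimal sketch of what "fully shuffle, then partition" could look like, using only the standard library. The function name `shuffled_partitions` and the round-robin split are assumptions for illustration; a real pipeline would apply the same idea to the data files given to each node.

```python
import random


def shuffled_partitions(num_rows, num_nodes, seed=0):
    """Shuffle all row indices globally, then split them across nodes."""
    indices = list(range(num_rows))
    # Shuffle before splitting so that no node receives a contiguous,
    # possibly ordered, slice of the original data.
    random.Random(seed).shuffle(indices)
    # Round-robin slicing keeps the partition sizes within one row of each other.
    return [indices[node::num_nodes] for node in range(num_nodes)]


parts = shuffled_partitions(num_rows=10, num_nodes=3)
```

Each inner list holds the row indices assigned to one node; together they cover every row exactly once.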

Model performance degradation when switch from `data` to `voting` in distributed learning algorithm

Voting parallel uses each local data partition to find split candidates, and then a vote is conducted on the local top splits. Therefore, if the distribution of local data is...
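
The two steps described here (local candidates, then a vote) can be sketched as follows. This is a simplified illustration, not LightGBM's implementation: the function names, the per-node gain dictionaries, and the `num_selected` parameter are all hypothetical.

```python
def local_top_k(gains, k):
    """Return the k features with the highest local split gain on one node."""
    # Ties are broken by feature name to keep the result deterministic.
    return sorted(gains, key=lambda f: (-gains[f], f))[:k]


def vote(local_gains_per_node, k, num_selected):
    """Count how many nodes nominate each feature and keep the most-voted ones."""
    votes = {}
    for gains in local_gains_per_node:
        for feature in local_top_k(gains, k):
            votes[feature] = votes.get(feature, 0) + 1
    ranked = sorted(votes, key=lambda f: (-votes[f], f))
    return ranked[:num_selected]


# Made-up local gains for three nodes, each seeing a different data partition.
node_a = {"f0": 0.9, "f1": 0.5, "f2": 0.1}
node_b = {"f0": 0.8, "f1": 0.2, "f2": 0.7}
node_c = {"f0": 0.6, "f1": 0.2, "f2": 0.9}

selected = vote([node_a, node_b, node_c], k=2, num_selected=2)
```

Because each node ranks features using only its own partition, the nominations, and therefore the vote, depend on how the rows were distributed across nodes.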