bytes-lost
bytes-lost
env ``` gpu: 4*A100 80G pytorch: 1.13.1 cuda version: 11.7 deepspeed: 0.9.0 transformers: 4.28.0.dev ``` run script ``` OUTPUT=$1 ZERO_STAGE=3 if [ "$OUTPUT" == "" ]; then OUTPUT=./output fi if...
请问一下,在rerank modelling代码中看到的target label是用torch.zeros初始化的,然后loss计算是使用cross_entropy(scores, target_label),构建的批数据首个正样本对的target label应该是1?其余负样本对设置成0?
论文3.1节提到 ``` To improve the decoding efficiency of Chinese sentences, Cui et al. (2023) expand the vocabulary by adding common Chinese characters and re-training these newly added word embeddings along...