DSXiangLi

18 issues by DSXiangLi

After reading several papers about causal trees and uplift models, I found these two models quite similar. The difference seems to be merely that causal trees use a loss function for continuous treatment...
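For comparison, here is a hedged sketch of the two splitting criteria as I read them (following Athey & Imbens 2016 for causal trees and Rzepakowski & Jaroszewicz for uplift trees); the notation is mine, not from the issue:

```latex
% Causal tree: estimate the effect in each leaf \ell and split so the
% leaf-level effects are as heterogeneous as possible (an MSE-style criterion
% that also fits continuous outcomes):
\hat{\tau}(\ell) = \bar{Y}^{T=1}_{\ell} - \bar{Y}^{T=0}_{\ell},
\qquad \max_{\text{split}} \ \sum_{\ell} n_{\ell}\, \hat{\tau}(\ell)^{2}

% Uplift tree: for a categorical outcome, split to maximize the divergence
% between treatment and control outcome distributions:
\max_{\text{split}} \; D\!\big(P^{T=1}(Y) \,\|\, P^{T=0}(Y)\big),
\quad D \in \{\mathrm{KL},\ \text{Euclidean},\ \chi^{2}\}
```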


Hello, I trained on the MSRA Chinese dataset and found that the start/end loss converges very quickly, but the recall of the span part rises very slowly. Looking at the metrics, this seems to be because the positive and negative samples in the span part are extremely imbalanced. Is there any recommended way to deal with this?
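One direction I tried to sketch (not from this repo, just an assumption about the span-matching head being a [seq_len, seq_len] binary matrix) is to re-weight the rare positive pairs when computing the span loss:

```python
# Hedged sketch: counter the span-level imbalance by up-weighting positives
# with BCEWithLogitsLoss(pos_weight=...). Tensor shapes are assumptions.
import torch
import torch.nn as nn

def span_loss(span_logits, span_labels, span_mask, pos_weight=10.0):
    """span_logits / span_labels / span_mask: [batch, seq_len, seq_len]."""
    loss_fn = nn.BCEWithLogitsLoss(
        pos_weight=torch.tensor(pos_weight, device=span_logits.device),
        reduction="none",
    )
    loss = loss_fn(span_logits, span_labels.float())
    # average only over valid (non-padding, i <= j) positions
    mask = span_mask.float()
    return (loss * mask).sum() / mask.sum().clamp(min=1.0)
```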

Shouldn't the final output of FMLayer be a scalar? Why, then, does compute_output_shape return the latent-vector length here, i.e. (int(input_shape[0]), output_dim)?
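My own guess (a hedged sketch, not the repo's actual layer): the FM second-order term only becomes a scalar after summing over the embedding dimension; some implementations return the pre-summed k-dimensional interaction vector so deeper layers can consume it, which is why the output shape can be (batch, output_dim):

```python
# Illustrative FM second-order layer; class/argument names are mine.
import tensorflow as tf

class FMLayer(tf.keras.layers.Layer):
    def __init__(self, keep_dim=True, **kwargs):
        super().__init__(**kwargs)
        self.keep_dim = keep_dim  # True -> k-dim vector, False -> scalar

    def call(self, embeddings):
        # embeddings: [batch, num_fields, k]
        sum_square = tf.square(tf.reduce_sum(embeddings, axis=1))  # [batch, k]
        square_sum = tf.reduce_sum(tf.square(embeddings), axis=1)  # [batch, k]
        interaction = 0.5 * (sum_square - square_sum)              # [batch, k]
        if self.keep_dim:
            return interaction                                     # (batch, k)
        return tf.reduce_sum(interaction, axis=1, keepdims=True)   # (batch, 1)
```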

Hello, I have another question. The main reason I want to use lexicon enhancement is to approach BERT's performance while staying faster than BERT. Maybe I haven't fully understood how soft lexicon extracts the B/M/E/S word lists for each character, but at first glance this extraction step looks close to O(N^2) in complexity (N is the sentence length), because every substring sentence[i:j] has to be checked against the dictionary. So even though the model itself is lighter at online inference time, wouldn't building the model input become the more expensive step?
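For reference, a hedged sketch of the mitigation I had in mind (not this repo's code): since dictionary words have a bounded maximum length, the scan can be capped at max_word_len, making matching roughly O(N * max_word_len) rather than O(N^2); a trie or Aho-Corasick automaton tightens this further.

```python
# Illustrative soft-lexicon extraction; `lexicon` is assumed to be a set of
# words and `max_word_len` the longest word in it.
from collections import defaultdict

def soft_lexicon_features(sentence, lexicon, max_word_len):
    feats = defaultdict(lambda: {"B": [], "M": [], "E": [], "S": []})
    n = len(sentence)
    for i in range(n):
        # only try substrings no longer than the longest dictionary word
        for l in range(1, min(max_word_len, n - i) + 1):
            word = sentence[i:i + l]
            if word not in lexicon:
                continue
            if l == 1:
                feats[i]["S"].append(word)
            else:
                feats[i]["B"].append(word)
                feats[i + l - 1]["E"].append(word)
                for m in range(i + 1, i + l - 1):
                    feats[m]["M"].append(word)
    return feats
```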

Hello, I see that the BERT tokenizer only tokenizes the text. If the tokenizer splits, say, 1994 into 19 and ##94, while the gaz recognizes BMES words for each character 1/9/9/4 separately, wouldn't this cause a mismatch between the two inputs?
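To make the mismatch concrete, here is a small hedged illustration (whether "1994" actually splits this way depends on the vocabulary; the alignment shown is just one way to inspect it, not necessarily what the repo does):

```python
# Show which characters each wordpiece covers, using offset mappings; any
# char-level BMES features would have to be pooled over [start, end).
from transformers import BertTokenizerFast

tokenizer = BertTokenizerFast.from_pretrained("bert-base-chinese")
text = "他出生于1994年"
enc = tokenizer(text, return_offsets_mapping=True, add_special_tokens=False)
for tok, (start, end) in zip(enc.tokens(), enc["offset_mapping"]):
    print(tok, "->", text[start:end])
```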

In the hyperparameters, I saw that bf16 and tf32 are both set to true. However, I don't think they can be used together?
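As far as I understand (a hedged sketch, not a statement about this repo's training code), the two flags act on different things in PyTorch, so setting both is generally valid: tf32 changes how remaining float32 matmuls execute on Ampere+ GPUs, while bf16 sets the autocast dtype for mixed precision.

```python
# What the two flags roughly correspond to at the PyTorch level.
import torch

torch.backends.cuda.matmul.allow_tf32 = True   # tf32=True: TF32 for fp32 matmuls
torch.backends.cudnn.allow_tf32 = True

model = torch.nn.Linear(128, 128).cuda()
x = torch.randn(4, 128, device="cuda")
with torch.autocast(device_type="cuda", dtype=torch.bfloat16):  # bf16=True
    y = model(x)            # forward pass runs in bfloat16
print(y.dtype)              # torch.bfloat16
```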