DeepCT
Question regarding quantization
Hi @AdeDZY,
To get the new term frequencies, you used TF_{DeepCT}(t, d) = round(y_{t,d} * 100); I was wondering if you tried values other than 100?
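For anyone following along, that step is just the following (a minimal Python sketch; the function name and the use of Python's built-in round are my own reading of the formula):

```python
def quantize_tf(y, factor=100):
    # y: predicted DeepCT weight y_{t,d} for term t in document d
    # factor: the quantization constant under discussion (100 in the paper)
    return round(y * factor)  # integer tf to feed into the inverted index
```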
I ran similar experiments on related approaches (roughly, a model learning term weights), and while experimenting with Anserini I surprisingly noticed that increasing the quantization factor (100 in your case) degraded performance; it was actually better to use a low value (like 5)! I agree the models are not the same, but I also start with weights in a small range like yours (~[0, 3]), so I am curious whether you already tried other values (I don't think it's mentioned in the paper?), or whether you could actually observe some gains by tuning it!
Thanks,
Thibault
This is super interesting! I tried [1, 10, 100, 1000] and found that 100 generally worked best for DeepCT. When using small values (e.g., 1 and 10), a lot of words ended up with weight = 0 and were deleted in my setting.
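To make the zero-weight effect concrete, here is a quick sketch with synthetic weights (not real DeepCT outputs; the gamma distribution is just an assumption meant to mimic a skewed weight distribution in ~[0, 3]):

```python
import numpy as np

# Hypothetical skewed term weights in roughly [0, 3]
rng = np.random.default_rng(0)
weights = rng.gamma(shape=0.5, scale=0.5, size=10_000).clip(0, 3)

for factor in [1, 5, 10, 100, 1000]:
    tfs = np.round(weights * factor).astype(int)
    zeroed = (tfs == 0).mean()
    print(f"factor={factor:>4}: {zeroed:.1%} of terms quantized to tf=0")
```

With small factors, a large fraction of terms rounds down to tf = 0 and effectively disappears from the index, which matches what I observed with 1 and 10.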
I am wondering what your weight distribution looks like?