Travis Chen
Results: 2 comments by Travis Chen
It seems that you clip w only once, after all the training steps. Shouldn't the clip be applied after each gradient descent step? The clip operates on w itself, to bound its norm, not on the gradient...
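To make the suggestion concrete, here is a minimal sketch of clipping inside the training loop. It is not the original poster's code: the names `clip`, `train`, and `grad_fn`, the toy objective, and the choice of element-wise clamping to [-c, c] are all assumptions made for illustration.

```python
def clip(w, c):
    """Clamp every component of w into [-c, c] (assumed clipping rule)."""
    return [max(-c, min(c, wi)) for wi in w]


def train(w, grad_fn, lr, steps, c):
    """Plain gradient descent that clips the weights after every update."""
    for _ in range(steps):
        g = grad_fn(w)                               # gradient is left untouched
        w = [wi - lr * gi for wi, gi in zip(w, g)]   # ordinary descent step
        w = clip(w, c)                               # clip w itself, every step
    return w


# Toy objective (hypothetical): f(w) = sum((w_i - 3)^2).
# Its minimum at w_i = 3 lies outside the clipping box [-1, 1].
def grad(w):
    return [2.0 * (wi - 3.0) for wi in w]


w_final = train([0.0, 0.0], grad, lr=0.1, steps=50, c=1.0)
```

With the clip inside the loop, w never leaves the allowed box during training. Clipping only once at the end would let the intermediate updates run on unconstrained weights.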
Actually, CJK characters are encoded together (Unicode unifies Han characters shared by Chinese, Japanese, and Korean), so there's no critical *range* that covers only Chinese characters. A punctuation dict could be used to do the filtering instead.
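A rough sketch of the punctuation-dict idea, assuming the goal is to drop punctuation and keep everything else. The `PUNCTUATION` set and the `strip_punctuation` helper are hypothetical, and the set is deliberately incomplete.

```python
# Hypothetical punctuation set: a few full-width CJK marks plus ASCII ones.
# Extend it with whatever marks appear in the actual data.
PUNCTUATION = set("，。！？；：、（）《》「」" + ",.!?;:()\"'")


def strip_punctuation(text):
    """Return text with every character found in PUNCTUATION removed."""
    return "".join(ch for ch in text if ch not in PUNCTUATION)


cleaned = strip_punctuation("你好，世界！Hello, world.")
```

Filtering by membership in an explicit set avoids relying on any code point range, which is the point of the comment above.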