g2pW
The input length cannot be more than 16?
I'm testing the ONNX version by @BarryKCL and found that once the input length is more than 16, the onnxruntime session returns no output and raises no error. I don't know if this is a bug in onnxruntime or a property of the g2pW model.
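Something like the following sketch shows the kind of call involved; the model path and the input tensor names are assumptions for illustration, they may not match the actual export:

```python
# Minimal sketch (assumed model path and input names, not taken from the repo).
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("g2pw.onnx", providers=["CPUExecutionProvider"])

def run(token_ids):
    ids = np.array([token_ids], dtype=np.int64)  # batch of one sentence
    feeds = {
        "input_ids": ids,
        "attention_mask": np.ones_like(ids),
        "token_type_ids": np.zeros_like(ids),
    }
    return sess.run(None, feeds)

print(run(list(range(101, 111))))  # 10 tokens
print(run(list(range(101, 121))))  # 20 tokens: the length range where no output is reported
```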
It's a bug in his preprocessing.
Is window_size necessary for inference? window_size = 32 in _truncate_texts(window_size, texts, query_ids):

```python
start = max(0, query_id - window_size // 2)
end = min(len(text), query_id + window_size // 2)
truncated_text = text[start:end]
```
so the input "這場抗議活動究竟是如何發展演變的。" becomes:
truncated_texts: ['這場抗議活動究竟是如何發展演變的', '這場抗議活動究竟是如何發展演變的。', '這場抗議活動究竟是如何發展演變的。', '這場抗議活動究竟是如何發展演變的。', '這場抗議活動究竟是如何發展演變的。', '這場抗議活動究竟是如何發展演變的。', '這場抗議活動究竟是如何發展演變的。']
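For context, the windowing does roughly the following; the query positions below are made-up examples, and the real function also returns the shifted query ids:

```python
# Rough sketch of the truncation in _truncate_texts (simplified).
def truncate_texts(window_size, texts, query_ids):
    truncated_texts, truncated_query_ids = [], []
    for text, query_id in zip(texts, query_ids):
        start = max(0, query_id - window_size // 2)
        end = min(len(text), query_id + window_size // 2)
        truncated_texts.append(text[start:end])
        truncated_query_ids.append(query_id - start)
    return truncated_texts, truncated_query_ids

sent = "這場抗議活動究竟是如何發展演變的。"   # 17 characters
texts = [sent] * 7                        # one copy per queried character
query_ids = [0, 2, 4, 6, 8, 10, 12]       # made-up query positions
windows, _ = truncate_texts(32, texts, query_ids)
print([len(w) for w in windows])          # e.g. [16, 17, 17, ...] -> unequal lengths
```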
In PyTorch the tensor alignment can be handled, but my code converts the lists straight to numpy, so input_ids of different lengths do not line up.
So I set window_size=None to avoid the input_ids alignment problem.
@BarryKCL
Our model was trained with the hyper-parameter window_size = 32, so changing it might slightly affect performance.
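If window_size = 32 should be kept, an alternative to setting it to None is to pad the unequal-length token id lists to a common length before building the numpy batch, which is presumably how the alignment gets handled on the PyTorch side. A rough sketch, with an illustrative pad_batch helper and pad id:

```python
# Sketch: pad variable-length input_ids so window_size = 32 can be kept.
# pad_batch and pad_id are illustrative names, not from the g2pW code.
import numpy as np

def pad_batch(list_of_ids, pad_id=0):
    max_len = max(len(ids) for ids in list_of_ids)
    input_ids = np.full((len(list_of_ids), max_len), pad_id, dtype=np.int64)
    attention_mask = np.zeros_like(input_ids)
    for i, ids in enumerate(list_of_ids):
        input_ids[i, :len(ids)] = ids
        attention_mask[i, :len(ids)] = 1
    return input_ids, attention_mask

# Two windows of 16 and 17 tokens stack into one (2, 17) batch.
ids_a = list(range(100, 116))   # 16 token ids
ids_b = list(range(100, 117))   # 17 token ids
input_ids, attention_mask = pad_batch([ids_a, ids_b])
print(input_ids.shape)          # (2, 17)
```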