wangcanlong

Results 6 comments of wangcanlong

Is window_size necessary for inference? window_size = 32 in _truncate_texts(window_size, texts, query_ids):

start = max(0, query_id - window_size // 2)
end = min(len(text), query_id + window_size // 2)
truncated_text =...
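To make the question concrete, here is a minimal sketch of the windowing arithmetic quoted above. The function name `truncate_around_query` is hypothetical. The slice `text[start:end]` and the re-based index are assumptions about the elided step, not the library's confirmed code.

```python
def truncate_around_query(text, query_id, window_size=32):
    """Hypothetical helper: keep about window_size characters around query_id."""
    half = window_size // 2
    start = max(0, query_id - half)          # clamp at the start of the text
    end = min(len(text), query_id + half)    # clamp at the end of the text
    # Assumption: the elided step slices the text and shifts the query index
    # so that it still points at the same character inside the window.
    return text[start:end], query_id - start


text = "".join(chr(ord("a") + i % 26) for i in range(100))
window, new_id = truncate_around_query(text, query_id=50)
edge_window, edge_id = truncate_around_query(text, query_id=3)
```

Under these assumptions the window is shorter near either end of the text, so a model that was trained on truncated inputs would see the same kind of context at inference only if the same window_size is applied.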

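(Digits + English letters) Converting the test image from BGR to RGB fixes the problem where test accuracy during training is very high but cpp_recognition outputs wrong results!!!

#~~~~~~~~~~~~~~~~~~ The reason is as follows ~~~~~~~~~~~~~~~~~~~~#

When we build the data: img = caffe.io.load_image(os.path.join(img_path, image))

Inside caffe.io: img = skimage.img_as_float(skimage.io.imread(filename, as_grey=not color)).astype(np.float32)

The problem: cv2 stores images in BGR order, while skimage stores them in RGB order. recognition.cpp reads the image with OpenCV, so use cv::cvtColor(resizeimg, resizeimg, cv::COLOR_BGR2RGB);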
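The channel-order mismatch can be illustrated with a small sketch in plain Python. The helper `bgr_to_rgb` and the nested-list image layout are assumptions made for illustration only. With a NumPy array of shape (height, width, 3), the equivalent operation would be `img[..., ::-1]`.

```python
def bgr_to_rgb(image):
    """Hypothetical helper: reverse the channel order of every pixel.

    `image` is assumed to be a list of rows, each row a list of
    3-element (B, G, R) tuples.
    """
    return [[pixel[::-1] for pixel in row] for row in image]


# A 2x2 image in BGR order: pure blue, pure green, pure red, and a mixed pixel.
bgr = [[(255, 0, 0), (0, 255, 0)],
       [(0, 0, 255), (10, 20, 30)]]
rgb = bgr_to_rgb(bgr)
```

A model trained on RGB data sees the blue pixel above as red if the swap is skipped, which would explain good accuracy in Python-side testing and wrong results from the C++ side.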

You can try g2pw.

睡得着觉?
G2pM: ['shui4', 'de2', 'zhe5', 'jue2', '?']
lazy_pinyin: ['shui4', 'de2', 'zhe', 'jue2', '?']
G2pW: [['shui4', 'de5', 'zhao2', 'jiao4', None]]

小数点
G2pM: ['xiao3', 'shu4', 'dian3']...

It works as long as you don't need ComputeKaldiPitch.

Any progress? I call the Python source code directly instead of the command line to do forced alignment, but alignment is still slow because of the Kaldi-style features.

I found an implementation at: https://github.com/open-speech/speech-aligner