
Segmentation fault when initializing kenlm from C++

Open tongchangD opened this issue 4 years ago • 11 comments

C++ code:

```cpp
lm::ngram::Config config;
model = new lm::ngram::Model(language_model_path);
```

When initializing the C++ version of the kenlm model, the same code occasionally succeeds, but most of the time it fails with Segmentation fault (core dumped).

The segfault is probably a pointer problem somewhere. Could someone please help?

tongchangD avatar Nov 06 '20 09:11 tongchangD
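For reference, a minimal, hedged sketch of loading a model defensively (assuming the standard kenlm headers; `language_model_path` and the default `model.binary` filename here are illustrative). kenlm raises `util::Exception` on most load failures (missing or truncated files), so wrapping the constructor in a try/catch turns many of them into a readable error instead of a later crash; holding the model in a `std::unique_ptr` also avoids the lifetime bugs a bare `new` can mask:

```cpp
#include <iostream>
#include <memory>

#include "lm/model.hh"        // lm::ngram::Model, lm::ngram::Config
#include "util/exception.hh"  // util::Exception

int main(int argc, char *argv[]) {
  const char *language_model_path = argc > 1 ? argv[1] : "model.binary";
  try {
    lm::ngram::Config config;
    // Own the model through a smart pointer instead of a raw `new`
    // that is never deleted.
    std::unique_ptr<lm::ngram::Model> model(
        new lm::ngram::Model(language_model_path, config));
    std::cout << "Loaded model, order " << int(model->Order()) << std::endl;
  } catch (const util::Exception &e) {
    // Missing, truncated, or mismatched model files surface here
    // rather than as a segfault during later queries.
    std::cerr << "Failed to load model: " << e.what() << std::endl;
    return 1;
  }
  return 0;
}
```

If the crash happens only sometimes with the same file, it is also worth checking that the binary model was built by the same kenlm version and that the file is not still being written when it is opened.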

Sorry, I don't speak Chinese, but I can use machine translation. Need more context.

kpu avatar Nov 08 '20 12:11 kpu

This happens because Boost was not installed properly. Also, add the option -S 10%.

nocoolsandwich avatar Sep 01 '21 07:09 nocoolsandwich
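To make the -S 10% suggestion concrete: it is an option of kenlm's lmplz estimation tool that caps the memory used for sorting. A sketch of the usual pipeline, assuming kenlm's tools were built into bin/ and corpus.txt is already tokenized (one sentence per line, tokens separated by spaces):

```shell
# Estimate a trigram ARPA model, capping sorting memory at 10% of RAM.
bin/lmplz -o 3 -S 10% < corpus.txt > model.arpa

# Convert the ARPA file to the binary format that lm::ngram::Model
# loads fastest.
bin/build_binary model.arpa model.binary
```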

OK, thanks. Opportunities in NLP feel slim; after three years doing NLP, I've switched to CV, haha.

tongchangD avatar Sep 01 '21 07:09 tongchangD

@tongchangD Hello bro,

OK, thanks. Opportunities in NLP feel slim; after three years doing NLP, I've switched to CV, haha.

Hello, is your vocab separated by spaces or not? Sorry for asking: I want to try building a Chinese language model, but I'm not actually Chinese; this is just for research purposes.

MaarufB avatar Jan 25 '22 09:01 MaarufB

@tongchangD Hello bro,

OK, thanks. Opportunities in NLP feel slim; after three years doing NLP, I've switched to CV, haha.

Hello, is your vocab separated by spaces or not? Sorry for asking: I want to try building a Chinese language model, but I'm not actually Chinese; this is just for research purposes.

Yes, you must separate the sentence by spaces.

tongchangD avatar Jan 26 '22 10:01 tongchangD

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

kpu avatar Jan 26 '22 10:01 kpu

@tongchangD Hello bro,

OK, thanks. Opportunities in NLP feel slim; after three years doing NLP, I've switched to CV, haha.

Hello, is your vocab separated by spaces or not? Sorry for asking: I want to try building a Chinese language model, but I'm not actually Chinese; this is just for research purposes.

Yes, you must separate the sentence by spaces.

Thank you for your reply. Is "这 是 一 个 中 国 样 本" okay?

MaarufB avatar Jan 26 '22 10:01 MaarufB

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

Hi sir, where can I find that third-party tokenizer? Thanks for your work, by the way. I'm new to NLP and doing this for research purposes.

MaarufB avatar Jan 26 '22 10:01 MaarufB

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

Hi sir, where can I find that third-party tokenizer? Thanks for your work, by the way. I'm new to NLP and doing this for research purposes.

For example: jieba.

tongchangD avatar Jan 26 '22 13:01 tongchangD
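To make the jieba suggestion concrete, here is a small sketch (the `jieba` package is a third-party assumption, installable with `pip install jieba`; the fallback simply splits one token per character, matching the sample sentence quoted above):

```python
# Produce a space-separated line for kenlm: word-level tokens via jieba
# when it is available, character-level tokens otherwise.
sentence = "这是一个中国样本"

try:
    import jieba  # third-party segmenter: pip install jieba
    tokens = jieba.lcut(sentence)
except ImportError:
    tokens = list(sentence)  # fallback: one token per character

# kenlm's training tools expect one sentence per line,
# tokens separated by single spaces.
print(" ".join(tokens))
```

Character-level splitting is a workable baseline, but a word-level segmenter like jieba generally gives a language model more meaningful units to work with.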

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

Hi sir, where can I find that third-party tokenizer? Thanks for your work, by the way. I'm new to NLP and doing this for research purposes.

Third-party tokenizers: you can see one in this picture, and you can find the code on GitHub. [image]

tongchangD avatar Jan 27 '22 02:01 tongchangD

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

Hi sir, where can I find that third-party tokenizer? Thanks for your work, by the way. I'm new to NLP and doing this for research purposes.

Third-party tokenizers: you can see one in this picture, and you can find the code on GitHub. [image]

Thank you so much for your reply, sir. I'll try that and come back here to share my progress.

MaarufB avatar Jan 27 '22 04:01 MaarufB