
Segmentation fault when initializing kenlm from C++

Open tongchangD opened this issue 4 years ago • 11 comments

C++ code:

```cpp
lm::ngram::Config config;
model = new lm::ngram::Model(language_model_path);
```

When initializing the C++ version of the kenlm model, the same code occasionally succeeds, but most of the time it fails with Segmentation fault (core dumped).

The segfault is probably a pointer problem somewhere. Could someone please help?

tongchangD avatar Nov 06 '20 09:11 tongchangD
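For reference, a minimal, hedged sketch of loading a model defensively (assuming the standard kenlm headers; `language_model_path` and the default `model.binary` filename here are illustrative). kenlm raises `util::Exception` on most load failures (missing or truncated files), so wrapping the constructor in a try/catch turns many of them into a readable error instead of a later crash; holding the model in a `std::unique_ptr` also avoids the lifetime bugs a bare `new` can mask:

```cpp
#include <iostream>
#include <memory>

#include "lm/model.hh"        // lm::ngram::Model, lm::ngram::Config
#include "util/exception.hh"  // util::Exception

int main(int argc, char *argv[]) {
  const char *language_model_path = argc > 1 ? argv[1] : "model.binary";
  try {
    lm::ngram::Config config;
    // Own the model through a smart pointer instead of a raw `new`
    // that is never deleted.
    std::unique_ptr<lm::ngram::Model> model(
        new lm::ngram::Model(language_model_path, config));
    std::cout << "Loaded model, order " << int(model->Order()) << std::endl;
  } catch (const util::Exception &e) {
    // Missing, truncated, or mismatched model files surface here
    // rather than as a segfault during later queries.
    std::cerr << "Failed to load model: " << e.what() << std::endl;
    return 1;
  }
  return 0;
}
```

If the crash happens only sometimes with the same file, it is also worth checking that the binary model was built by the same kenlm version and that the file is not still being written when it is opened.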

Sorry, I don't speak Chinese, but I can use machine translation. Need more context.

kpu avatar Nov 08 '20 12:11 kpu

This happens because Boost was not installed properly. Also, add the option -S 10%.

nocoolsandwich avatar Sep 01 '21 07:09 nocoolsandwich
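To make the -S 10% suggestion concrete: it is an option of kenlm's lmplz estimation tool that caps the memory used for sorting. A sketch of the usual pipeline, assuming kenlm's tools were built into bin/ and corpus.txt is already tokenized (one sentence per line, tokens separated by spaces):

```shell
# Estimate a trigram ARPA model, capping sorting memory at 10% of RAM.
bin/lmplz -o 3 -S 10% < corpus.txt > model.arpa

# Convert the ARPA file to the binary format that lm::ngram::Model
# loads fastest.
bin/build_binary model.arpa model.binary
```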

OK, thanks. Opportunities in NLP feel slim; after three years doing NLP, I've switched to CV, haha.

tongchangD avatar Sep 01 '21 07:09 tongchangD

@tongchangD Hello bro,

OK, thanks. Opportunities in NLP feel slim; after three years doing NLP, I've switched to CV, haha.

Hello, is your vocab separated by spaces or not? Sorry for asking: I want to try building a Chinese language model, but I'm not actually Chinese; this is just for research purposes.

MaarufB avatar Jan 25 '22 09:01 MaarufB

@tongchangD Hello bro,

OK, thanks. Opportunities in NLP feel slim; after three years doing NLP, I've switched to CV, haha.

Hello, is your vocab separated by spaces or not? Sorry for asking: I want to try building a Chinese language model, but I'm not actually Chinese; this is just for research purposes.

Yes, you must separate the sentence by spaces.

tongchangD avatar Jan 26 '22 10:01 tongchangD

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

kpu avatar Jan 26 '22 10:01 kpu

@tongchangD Hello bro,

OK, thanks. Opportunities in NLP feel slim; after three years doing NLP, I've switched to CV, haha.

Hello, is your vocab separated by spaces or not? Sorry for asking: I want to try building a Chinese language model, but I'm not actually Chinese; this is just for research purposes.

Yes, you must separate the sentence by spaces.

Thank you for your reply. Is "这 是 一 个 中 国 样 本" okay?

MaarufB avatar Jan 26 '22 10:01 MaarufB

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

Hi sir, where can I find that third-party tokenizer? Thanks for your work, by the way. I'm new to NLP and doing this for research purposes.

MaarufB avatar Jan 26 '22 10:01 MaarufB

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

Hi sir, where can I find that third-party tokenizer? Thanks for your work, by the way. I'm new to NLP and doing this for research purposes.

For example: jieba.

tongchangD avatar Jan 26 '22 13:01 tongchangD
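To make the jieba suggestion concrete, here is a small sketch (the `jieba` package is a third-party assumption, installable with `pip install jieba`; the fallback simply splits one token per character, matching the sample sentence quoted above):

```python
# Produce a space-separated line for kenlm: word-level tokens via jieba
# when it is available, character-level tokens otherwise.
sentence = "这是一个中国样本"

try:
    import jieba  # third-party segmenter: pip install jieba
    tokens = jieba.lcut(sentence)
except ImportError:
    tokens = list(sentence)  # fallback: one token per character

# kenlm's training tools expect one sentence per line,
# tokens separated by single spaces.
print(" ".join(tokens))
```

Character-level splitting is a workable baseline, but a word-level segmenter like jieba generally gives a language model more meaningful units to work with.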

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

Hi sir, where can I find that third-party tokenizer? Thanks for your work, by the way. I'm new to NLP and doing this for research purposes.

Third-party tokenizers: you can see one in this picture, and you can find the code on GitHub. [image]

tongchangD avatar Jan 27 '22 02:01 tongchangD

For all languages, the intended use is that you first run a third-party tokenizer. For Chinese it so happens that the tokenizer performs a segmentation task.

Hi sir, where can I find that third-party tokenizer? Thanks for your work, by the way. I'm new to NLP and doing this for research purposes.

Third-party tokenizers: you can see one in this picture, and you can find the code on GitHub. [image]

Thank you so much for your reply, sir. I'll try that and come back here to share my progress.

MaarufB avatar Jan 27 '22 04:01 MaarufB