bert-extractive-summarizer
How can I apply your code to Chinese?
Excuse me, can I use your code for Chinese...
The only limitation right now for Chinese is that you would need a BERT model and tokenizer trained on Chinese. If you have both the tokenizer and model, you can easily pass them in for summarization.
OK, thanks
Hello, has the project of applying 'bert-extractive-summarizer' to Chinese summarization been successful? I don't know how to modify it and would like to ask.
Have you ever tested this model on a Chinese dataset? It didn't work on my dataset and output nothing.
It would need a Chinese-based BERT model. I am not sure whether the bert-multilingual model supports Chinese or not. It would need to be in the form of a Hugging Face transformer.
I have tried using the bert-base-chinese model, but it outputs nothing.
Here is my code:
from transformers import AutoConfig, AutoModel, AutoTokenizer
from summarizer import Summarizer

# Load the model config, tokenizer and model via Transformers,
# with hidden states enabled so the summarizer can use the embeddings
custom_config = AutoConfig.from_pretrained('bert-base-chinese')
custom_config.output_hidden_states = True
custom_tokenizer = AutoTokenizer.from_pretrained('bert-base-chinese')
custom_model = AutoModel.from_pretrained('bert-base-chinese', config=custom_config)

body = '这是一个测试句子'
model = Summarizer(custom_model=custom_model, custom_tokenizer=custom_tokenizer)
model(body)
I have solved the problem. The default spaCy pipeline uses English for sentence segmentation; just change it to Chinese and it works well. Thanks @dmmiller612
Excuse me, where did you change it to use jieba rather than spaCy? I can't find it. Thank you.
Sorry to reply so late.
Just change two lines of code in sentence_handler.py:
https://github.com/dmmiller612/bert-extractive-summarizer/blob/f94c0243954171b2e5233d2624a8d2fcad1ea9ba/summarizer/sentence_handler.py#L3
change to
from spacy.lang.zh import Chinese
and https://github.com/dmmiller612/bert-extractive-summarizer/blob/f94c0243954171b2e5233d2624a8d2fcad1ea9ba/summarizer/sentence_handler.py#L8
change to
def __init__(self, language=Chinese):
and this code https://github.com/dmmiller612/bert-extractive-summarizer/issues/45#issuecomment-650879240 works well.
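For anyone who cannot install the Chinese spaCy pipeline, the effect of those two changes can be illustrated with a minimal stand-in: an English-style sentencizer splits on '.', '!' and '?', so Chinese text punctuated with '。' never yields a single sentence, which is why the summarizer returned nothing. This is a rough regex-based sketch, not the library's actual SentenceHandler:

```python
import re

def split_sentences_zh(body: str, min_length: int = 1, max_length: int = 600):
    """Split Chinese text on sentence-final punctuation (。！？),
    keeping the delimiter, and filter by length like SentenceHandler does."""
    parts = re.findall(r'[^。！？]+[。！？]?', body)
    return [p.strip() for p in parts if max_length > len(p.strip()) >= min_length]

def split_sentences_en(body: str):
    """English-style splitting on whitespace after '.', '!' or '?' --
    finds no boundaries when the text only uses Chinese full stops."""
    return [p.strip() for p in re.split(r'(?<=[.!?])\s+', body) if p.strip()]

text = '这是第一句。这是第二句。这是第三句。'
print(split_sentences_zh(text))  # three sentences
print(split_sentences_en(text))  # the whole text as one "sentence"
```

This is why swapping the spaCy language class from English to Chinese is enough: the sentencizer starts recognizing 。 as a sentence boundary.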
@ttxs69 Why is the final output just the original Chinese text after I switched to the Chinese model following your steps? I urgently want to know; I hope you can reply!
I just tried it, and it works after following the steps to change the two lines of code. You can step into model(body) to debug.
@ttxs69 OK, thanks, I will try. If it is convenient, could you please send me a copy of the code you run? My email address is [email protected].
@jnkr36 I'm sorry that I read the wrong name this morning. First of all, thank you very much for replying to me. I'm in a bit of a hurry now, but I can't find the mistake, so I will try the method you described. At the same time, if it is convenient, could you please send me a copy of the code you run? My email address is [email protected]! Thank you very much again.
@jnkr36 Here I am again! I just have one question: had you downloaded zh_core_web_sm beforehand?
Sorry for the late response. I have sent you my project; please check your email. If you have any other questions, we can talk again.
Just for convenience, I forked the repo and modified it as suggested above; it works nicely.
pip install git+https://github.com/FrontMage/bert-extractive-summarizer.git
@FrontMage Hello! I've installed your modified fork, transformers, and spaCy 3.0.0, downloaded zh_core_web_sm, then tried to run the model as in ttxs69's snippet, but the model generates empty output on Chinese sentences. Could you please provide more details on your setup?
If it is convenient, could you please send me a copy of the code you run? My email address is [email protected]. Thanks.
About the output being the original text: I just found out that you need to make sure every sentence in your long text ends with a Chinese full stop (。).
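A hedged sketch of that preprocessing step: normalizing Western full stops to the Chinese full stop 。 before handing the text to the summarizer, so the Chinese sentencizer can find sentence boundaries. The function name is mine, not part of the library:

```python
def normalize_periods(body: str) -> str:
    """Replace ASCII full stops with the Chinese full stop 。 so that a
    Chinese sentence splitter can detect sentence boundaries.
    Naive: does not protect decimals, URLs, or abbreviations."""
    return body.replace('.', '。')

mixed = '第一句. 第二句。第三句.'
print(normalize_periods(mixed))
```

With every sentence ending in 。, the Chinese SentenceHandler splits the text properly and the summarizer returns a real subset of sentences instead of echoing the input.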