Senta
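Can it output a sentiment score?
Hello. For the English sentence-level sentiment classification task, is it possible to output a sentiment score in addition to the positive/negative label, so that predictions can be compared?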
import numpy as np

from senta import Senta
from senta.common.rule import InstanceName
from senta.data.util_helper import convert_texts_to_ids, structure_fields_dict
from senta.utils.util_helper import array2tensor, text_type


class SentimentChineseClassifier(Senta):
    def __init__(self):
        super(SentimentChineseClassifier, self).__init__()
        self.init_model()

    def predict(self, texts_, aspects=None):
        if isinstance(texts_, text_type):
            texts_ = [texts_]

        if isinstance(aspects, text_type):
            aspects = [aspects]

        return_list = convert_texts_to_ids(
            texts_, self.tokenizer, self.max_seq_len, self.truncation_type, self.padding_id)
        record_dict = structure_fields_dict(return_list, 0, need_emb=False)

        input_list = []
        for item in self.input_keys:
            kv = item.split("#")
            name = kv[0]
            key = kv[1]
            input_item = record_dict[InstanceName.RECORD_ID][key]
            input_list.append(input_item)

        inputs = [array2tensor(ndarray) for ndarray in input_list]
        result = self.inference.run(inputs)
        batch_result = self.model_class.parse_predict_result(result)

        results = []
        if self.inference_type == 'seq_lab':
            for text, probs in zip(texts_, batch_result):
                label = [self.label_map[l] for l in probs]
                results.append((text, label, probs.tolist()))
        else:
            for text, probs in zip(texts_, batch_result):
                label = self.label_map[np.argmax(probs)]
                results.append((text, label, probs.tolist()))
        return results


def main():
    classifier = SentimentChineseClassifier()
    texts = ["PHP 是世界上最好的语言"]
    result = classifier.predict(texts)
    print(result)


if __name__ == '__main__':
    main()
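You can refer to the sample code above. Each result it returns is a (text, label, probs) tuple, so the per-class values are available alongside the predicted label.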
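If a single number per sentence is easier to compare, one option is to pick the positive-class entry out of each probs list. The snippet below is only a sketch: the helper name sentiment_scores and the positive_index argument are hypothetical, and it assumes that probs holds two class probabilities and that index 1 is the positive class. Please check self.label_map in your setup before relying on that ordering.

```python
def sentiment_scores(results, positive_index=1):
    """Convert (text, label, probs) tuples into (text, label, score) tuples.

    The score is the value stored at positive_index in probs.
    Assumption: probs contains class probabilities and positive_index
    points at the positive class; verify this against label_map.
    """
    scored = []
    for text, label, probs in results:
        if not 0 <= positive_index < len(probs):
            raise ValueError("positive_index is out of range for probs")
        scored.append((text, label, probs[positive_index]))
    return scored


# Hand-written example input in the same shape as the sample's output.
example = [("great movie", "positive", [0.1, 0.9])]
print(sentiment_scores(example))
```

With the values from classifier.predict(texts) passed in place of the hand-written example, sentences can then be sorted or thresholded by the third element.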
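Thank you very much! I will study this example carefully. One more question: to process English text, where should I make changes so that the corresponding English pretrained model is used?
Change the self.init_model() line so that it initializes with an English model.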
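As a rough illustration of that change, the helper below passes an English model name to init_model. This is a sketch under assumptions, not confirmed in this thread: the keyword arguments model_class, task and use_cuda and the model name "ernie_2.0_skep_large_en" are what I recall from the Senta README, and the helper name init_english_model is hypothetical. It may be worth checking the list of supported model names in your installed version first.

```python
def init_english_model(classifier, model_class="ernie_2.0_skep_large_en", use_cuda=False):
    """Initialize a Senta-style classifier with an English pretrained model.

    Assumption: classifier.init_model accepts model_class, task and use_cuda
    keyword arguments, and the default model name is available in your
    installed version.
    """
    classifier.init_model(
        model_class=model_class,
        task="sentiment_classify",
        use_cuda=use_cuda,
    )
    return classifier
```

Inside the sample class, the equivalent edit would be to replace self.init_model() in __init__ with the same call on self, and then pass English sentences to predict.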