
Question about max_ngram

Open SCismycat opened this issue 5 years ago • 1 comment

# Compute transition probabilities
        Trans_dict = self.load_model(word_trans_path)
        for pre_word, post_info in Trans_dict.items():
            for post_word, count in post_info.items():
                word_pair = pre_word + ' ' + post_word
                self.trans_dict_count[word_pair] = float(count)
                if pre_word in self.word_dict_count.keys():
                    print(key)
                    self.trans_dict[key] = math.log(count / self.word_dict_count[pre_word])  # take the natural log, normalize
                else:
                    self.trans_dict[key] = self.word_dict[post_word]

Question: in self.trans_dict[key] = math.log(count / self.word_dict_count[pre_word])  # take the natural log, normalize, why is key used as the key into trans_dict?

SCismycat, Jan 14 '20 03:01
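
In the snippet as quoted, key is never assigned inside the loop, so the code would either fail or reuse whatever value key held from earlier code that is not shown here. One plausible reading is that word_pair was the intended dictionary key. That is an assumption, not something the repository author has confirmed. Under that assumption, a minimal standalone sketch of the loop could look like this (build_trans_dict and its parameter names are hypothetical, introduced only for illustration):

```python
import math


def build_trans_dict(trans_counts, word_counts, word_logprob):
    """Sketch of the transition-probability loop.

    ASSUMPTION: the undefined `key` in the quoted code was meant to be
    `word_pair`. All names here are hypothetical stand-ins for
    Trans_dict, self.word_dict_count and self.word_dict.
    """
    trans_dict = {}
    for pre_word, post_info in trans_counts.items():
        for post_word, count in post_info.items():
            word_pair = pre_word + ' ' + post_word
            if pre_word in word_counts:
                # log P(post_word | pre_word) = log(count(pre, post) / count(pre))
                trans_dict[word_pair] = math.log(count / word_counts[pre_word])
            else:
                # Fall back to the unigram log-probability of post_word.
                trans_dict[word_pair] = word_logprob[post_word]
    return trans_dict
```

Keying by word_pair would keep trans_dict aligned with trans_dict_count, which the quoted code already indexes by word_pair.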

Piggybacking on this issue: in get_unknow_word_prob, return math.log(1.0 / (self.all_freq ** len(word))) raises OverflowError: int too large to convert to float. How can this overflow be fixed?

HongyanJiao, Apr 21 '20 09:04
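
The overflow arises because self.all_freq ** len(word) is computed as an arbitrarily large Python integer, and dividing 1.0 by it forces a conversion to float that fails once the value exceeds the float range. Since log(1 / a^n) = -n * log(a), the same quantity can be computed without ever forming the large power. A minimal sketch, assuming all_freq is a positive number (unknown_word_log_prob is a hypothetical standalone name, not the repository's method):

```python
import math


def unknown_word_log_prob(all_freq, word):
    """Log-probability of an unknown word, computed in log space.

    Mathematically equal to math.log(1.0 / (all_freq ** len(word))),
    but avoids building the huge integer all_freq ** len(word).
    ASSUMPTION: all_freq > 0.
    """
    return -len(word) * math.log(all_freq)
```

For short words and small counts this agrees with the original expression up to floating-point rounding; for long words it keeps returning a finite value where the original raises OverflowError.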