BERT-Embedding-Frequently-Asked-Question

FAQ-based Question Answering System using BERT

Results 14 BERT-Embedding-Frequently-Asked-Question issues

Bumps [sanic](https://github.com/sanic-org/sanic) from 20.6.3 to 20.12.7. Release notes Sourced from sanic's releases. Version 20.12.7 Resolves #2477 and #2478 See also #2495 and https://github.com/sanic-org/sanic/security/advisories/GHSA-8cw9-5hmv-77w6 Full Changelog: https://github.com/sanic-org/sanic/compare/v20.12.6...v20.12.7 Version 20.12.6 What's Changed...

dependencies

Bumps [numpy](https://github.com/numpy/numpy) from 1.18.2 to 1.22.0. Release notes Sourced from numpy's releases. v1.22.0 NumPy 1.22.0 Release Notes NumPy 1.22.0 is a big release featuring the work of 153 contributors spread...

dependencies

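If the Q&A data contains several hundred thousand entries, every training run takes a very long time. Is it possible to update incrementally, i.e. to train incrementally?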
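One possible direction for the question above, as a minimal sketch only: cache the sentence vectors keyed by question text so that a rerun encodes just the questions it has not seen before. How this repository's training script is actually organised is not shown here, so `update_vector_cache`, `cache`, and `encode_fn` are hypothetical names, and `encode_fn` stands in for whatever batch encoder the project uses. Note that an Annoy index cannot take new items once it has been built, so the index file would still need to be rebuilt from the cached vectors; the saving is in the encoding step.

```python
from typing import Callable, Dict, List, Sequence

Vector = List[float]


def update_vector_cache(
    questions: Sequence[str],
    cache: Dict[str, Vector],
    encode_fn: Callable[[List[str]], List[Vector]],
) -> List[str]:
    """Encode only the questions missing from `cache`.

    `encode_fn` is a hypothetical batch encoder (list of texts in,
    list of vectors out). Returns the questions that were newly encoded.
    """
    # dict.fromkeys removes duplicates while keeping first-seen order.
    new_questions = [q for q in dict.fromkeys(questions) if q not in cache]
    if new_questions:
        # One batched call for everything that is new.
        for question, vector in zip(new_questions, encode_fn(new_questions)):
            cache[question] = vector
    return new_questions
```

After a call, `cache` holds a vector for every question passed in, and the ANN index can be rebuilt from `cache.values()` without re-encoding the old entries.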

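I don't think SentenceTransformer is a good choice for sentence vectorization: the dimension is as high as 512, and storing the sentence vectors in annoy takes up a very large amount of space.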
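To put a rough number on the storage concern above, the raw vector payload can be estimated as count × dimension × bytes per value. This is a back-of-the-envelope sketch, assuming 4-byte floats; the question counts and the 128-dimension alternative are hypothetical, and Annoy's on-disk index also carries tree-node overhead that is not modelled here.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed for the vector values alone (no index overhead)."""
    return num_vectors * dim * bytes_per_value


def to_mib(num_bytes: int) -> float:
    """Convert a byte count to mebibytes."""
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Hypothetical corpus size; 512 is the dimension mentioned in the issue.
    for dim in (512, 128):
        size = raw_vector_bytes(300_000, dim)
        print(f"dim={dim}: {size} bytes ({to_mib(size):.1f} MiB)")
```

Under these assumptions the payload scales linearly with the dimension, so a 128-dimensional encoder would need a quarter of the space of a 512-dimensional one for the same number of questions.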

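Hello, I have a question. At search time, jieba is used to segment the query and remove stop words, and the result is then matched against the preprocessed process_question field. However, ES uses the IK tokenizer. Isn't this mismatch a problem?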

After entering the Docker container and running cd into the src folder, `nohup python -u main_faq.py > "logs/log$(date +"%Y-%m-%d-%H").txt" 2>&1 &` shows the following error: ![image](https://user-images.githubusercontent.com/48014740/124603645-b81b6680-de9c-11eb-88a0-219f3c848035.png) However, the associative_questions_server.py service starts normally, and calling its API also works: ![image](https://user-images.githubusercontent.com/48014740/124603827-ea2cc880-de9c-11eb-993a-a9efcb752e34.png) What could be preventing main_faq from starting?

    def search_annoy(self, owner_name, question, num=5):
        '''
        Author: xiaoyichao
        param {type}
        Description: recall candidates using Annoy
        '''
        sentences = read_vec2bin.read_bert_sents(owner_name=owner_name)
        annoy_index_path = os.path.join(
            dir_name, '../es/search_model/%s_annoy.index' % owner_name)
        **encodearrary = self.sentenceBERT.get_bert([question])**
        tc_index = AnnoyIndex(f=512,...

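May I ask how you use sentence-bert to generate the word vectors? Did you fine-tune it on your own dataset of Q&A pairs (positive and negative examples), save the model, and then use the resulting model to generate the vectors?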