Support for GPU usage (w/ batching) when loading a KB using ElasticsearchQA
Loading a large KB with the Elasticsearch question answerer using a query_type in {'embedder', 'embedder_text', 'embedder_keyword'} can be time-consuming if the process of obtaining embeddings is not batched or not configured to use a GPU when available.
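For context, a minimal sketch of the difference batching makes, using sentence-transformers as a stand-in for the embedder model (the model name and corpus are placeholders, not the actual MindMeld setup):

```python
# Illustrative only: per-document encoding vs. one batched, GPU-backed call.
import torch
from sentence_transformers import SentenceTransformer

device = "cuda" if torch.cuda.is_available() else "cpu"
model = SentenceTransformer("all-MiniLM-L6-v2", device=device)

docs = [f"KB document {i}" for i in range(100_000)]  # placeholder corpus

# Slow: one forward pass per document.
# embeddings = [model.encode(d) for d in docs]

# Fast: a single call that batches internally and runs on the GPU if available.
embeddings = model.encode(docs, batch_size=256, show_progress_bar=True)
```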
What can be modified in the codebase:
The `_doc_generator(data_file, embedder_model=None, embedding_fields=None)` method in the question_answerer.py file could first obtain the embeddings of all docs in one batched pass, dump the embeddings cache, and then use the transform method on each doc while creating the docs for Elasticsearch index creation; see the sketch after this paragraph.
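A minimal sketch of this two-pass approach. Assumptions, not the actual MindMeld API: the KB file is newline-delimited JSON, `embedder_model.encode()` accepts a list of strings and batches internally (filling a cache), `embedder_model.dump()` persists that cache, and `embedder_model.transform()` on a single string is served from the warm cache:

```python
import json

def _doc_generator(data_file, embedder_model=None, embedding_fields=None):
    with open(data_file) as fp:
        docs = [json.loads(line) for line in fp]

    if embedder_model and embedding_fields:
        # Pass 1: batch-encode every embedding field up front, in large
        # GPU-friendly batches instead of one model call per document.
        for field in embedding_fields:
            embedder_model.encode([doc.get(field, "") for doc in docs])
        embedder_model.dump()  # persist the embeddings cache to disk

    # Pass 2: build the Elasticsearch docs; per-doc lookups now hit the
    # warm cache rather than triggering a fresh forward pass each time.
    for doc in docs:
        if embedder_model and embedding_fields:
            for field in embedding_fields:
                doc[field + "_embedding"] = embedder_model.transform(doc.get(field, ""))
        yield doc
```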
Optional comments on memory optimization: neither the solution suggested above nor the current implementation is optimized for the memory footprint of the embeddings. All embeddings are kept in RAM so that Elasticsearch can query the embedding of each KB doc. This may be worth looking into if we want to load large KBs smoothly (say on the order of >50K documents).
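One possible direction, sketched below under illustrative assumptions (names, shapes, and paths are placeholders, not the actual MindMeld code): stream batched results into a disk-backed memory-mapped array so that only one batch of freshly computed vectors needs to sit in process memory at a time.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedder
n_docs, dim, batch = 50_000, 384, 512  # all-MiniLM-L6-v2 produces 384-dim vectors

# Disk-backed array: index creation can read slices from here instead of
# requiring every vector to live in RAM at once.
emb = np.memmap("kb_embeddings.dat", dtype="float32", mode="w+", shape=(n_docs, dim))

def load_batch(start, size):
    # Placeholder for streaming doc text from the KB file.
    return [f"KB document {i}" for i in range(start, min(start + size, n_docs))]

for start in range(0, n_docs, batch):
    emb[start:start + batch] = model.encode(load_batch(start, batch))

emb.flush()  # write remaining pages to disk
```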