akshara-project

Improve and document ingestion workflow

Open anupdhml opened this issue 6 years ago • 1 comment

It should be easy for anyone in the team to add new documents to Elasticsearch once we have the raw docs. The documented workflow should cover:

  • schema/format to follow (for all our sources: crawled docs, OCR)
  • where to store the raw docs
  • how to start indexing the docs into Elasticsearch
  • how to verify results of indexing
  • how to roll back to the previous state in case of issues
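
To make the points above concrete, here is a rough sketch of how the pieces could fit together. It is not what the project currently does. The required fields (`id`, `source`, `text`), the index names and the alias name are all hypothetical placeholders. The only Elasticsearch specifics it relies on are the newline-delimited body format of the `_bulk` API and the `actions` body of the `_aliases` API; it builds the request bodies only and does not talk to a cluster. Older Elasticsearch versions may also expect a `_type` in the bulk action line, which this sketch leaves out.

```python
import json

# Hypothetical minimal schema shared by crawled and OCR docs (an assumption,
# not the project's actual schema).
REQUIRED_FIELDS = ("id", "source", "text")


def missing_fields(doc):
    """Return the required fields that are absent or empty in a raw doc."""
    return [field for field in REQUIRED_FIELDS if not doc.get(field)]


def bulk_payload(docs, index):
    """Build a newline-delimited body for the Elasticsearch _bulk API.

    Each doc becomes two lines: an action line naming the target index and
    document id, then the document source itself.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["id"]}}))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


def alias_swap_body(alias, old_index, new_index):
    """Build an _aliases request body that repoints `alias` in one call.

    Calling it again with the two index names reversed gives the rollback.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


if __name__ == "__main__":
    raw_docs = [
        {"id": "doc-1", "source": "ocr", "text": "sample text"},
        {"id": "doc-2", "source": "crawl", "text": ""},  # fails validation
    ]
    valid = [d for d in raw_docs if not missing_fields(d)]
    print(bulk_payload(valid, "akshara-v2"), end="")
    print(json.dumps(alias_swap_body("akshara", "akshara-v1", "akshara-v2")))
```

The idea is to index each batch into a fresh index, verify it (for example by comparing the index's `_count` result with the number of valid raw docs), and only then move the alias that the application queries. If something looks wrong, the alias can be pointed back at the previous index, which stays untouched.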

anupdhml, Sep 03 '18 17:09

Partially done in https://github.com/Code4Nepal/akshara-project/pull/80

anupdhml, May 14 '19 19:05