
Run OCR on a large volume of PDFs, then upload and parse them into Elasticsearch to provide a full-text search service

Open 3xxx opened this issue 2 years ago • 0 comments

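psc: Use Adobe Acrobat Pro to run batch OCR on the PDFs. Then upload them to engineercms, which calls Tika to parse each PDF, stores the extracted text in Elasticsearch, and uses the IK plugin for Chinese word segmentation.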
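A minimal sketch of the Tika-to-Elasticsearch step, for illustration only. It is not engineercms's own code. It assumes a Tika server at `localhost:9998`, Elasticsearch 7 or later at `localhost:9200` with the IK plugin installed, and a hypothetical index named `pdf_docs` with hypothetical fields `filename` and `content`. All function names are made up for this example.

```python
import json
import urllib.request

TIKA_URL = "http://localhost:9998/tika"  # assumed Tika server address
ES_URL = "http://localhost:9200"         # assumed Elasticsearch address
INDEX = "pdf_docs"                       # hypothetical index name


def build_index_mapping() -> dict:
    """Index mapping whose text field is segmented by the IK plugin."""
    return {
        "mappings": {
            "properties": {
                "filename": {"type": "keyword"},
                "content": {
                    "type": "text",
                    # Fine-grained segmentation at index time,
                    # coarser segmentation at query time.
                    "analyzer": "ik_max_word",
                    "search_analyzer": "ik_smart",
                },
            }
        }
    }


def build_document(filename: str, text: str) -> dict:
    """Document body to index for one OCR'd PDF."""
    return {"filename": filename, "content": text.strip()}


def build_search_query(keywords: str) -> dict:
    """Full-text match query with highlighted hits."""
    return {
        "query": {"match": {"content": keywords}},
        "highlight": {"fields": {"content": {}}},
    }


def extract_text(pdf_bytes: bytes) -> str:
    """Send a PDF to the Tika server and return its plain text."""
    req = urllib.request.Request(
        TIKA_URL,
        data=pdf_bytes,
        method="PUT",
        headers={"Content-Type": "application/pdf", "Accept": "text/plain"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


def es_request(method: str, path: str, body: dict) -> dict:
    """Send a JSON request to Elasticsearch and decode the JSON reply."""
    req = urllib.request.Request(
        f"{ES_URL}{path}",
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With both services running, the flow would be roughly: `es_request("PUT", f"/{INDEX}", build_index_mapping())` once to create the index, then for each PDF `es_request("POST", f"/{INDEX}/_doc", build_document(name, extract_text(data)))`, and finally `es_request("POST", f"/{INDEX}/_search", build_search_query("关键词"))` to search. Tika only reads a text layer that already exists, which is why the batch OCR in Acrobat comes first.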

3xxx, Nov 06 '21 14:11