[Feature Request] text search

Open koaning opened this issue 8 years ago • 7 comments

Currently it is not possible to search on the text inside .md, .Rmd, or .ipynb documents. To support this, you may want a backend built on Elasticsearch. I am wondering whether this will ever become a feature and whether it is feasible.

Auto-reviewers: @NiharikaRay @matthewwardrop @earthmancash @danfrankj

koaning avatar Mar 14 '17 10:03 koaning

Hi @koaning! We've so far been focussing on other things, but more complete search functionality is definitely on the roadmap. We are concerned about adding too many infrastructure dependencies, so we may not adopt Elasticsearch per se, but we are definitely interested in making post metadata and content more searchable. Watch this space!

matthewwardrop avatar Mar 14 '17 16:03 matthewwardrop

Out of curiosity, if not Elasticsearch, what other tool might help facilitate this? Keeping a separate TF-IDF table in something SQL-like?
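
To make the TF-IDF table idea concrete, here is a minimal sketch using only Python's standard library and SQLite. The table name `term_freq`, the helper names, and the smoothed IDF formula are all assumptions for illustration; nothing here reflects how knowledge-repo stores posts.

```python
import math
import re
import sqlite3


def tokenize(text):
    # Lowercase and split on anything that is not a letter or digit.
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(conn, posts):
    """Store per-post term frequencies. `posts` maps a post path to its text."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS term_freq ("
        "post TEXT, term TEXT, tf REAL, PRIMARY KEY (post, term))"
    )
    for path, text in posts.items():
        tokens = tokenize(text)
        counts = {}
        for token in tokens:
            counts[token] = counts.get(token, 0) + 1
        conn.executemany(
            "INSERT OR REPLACE INTO term_freq VALUES (?, ?, ?)",
            [(path, term, n / len(tokens)) for term, n in counts.items()],
        )
    conn.commit()


def search(conn, query, limit=10):
    """Rank posts by the summed TF-IDF score of the query terms."""
    n_posts = conn.execute(
        "SELECT COUNT(DISTINCT post) FROM term_freq"
    ).fetchone()[0]
    scores = {}
    for term in set(tokenize(query)):
        rows = conn.execute(
            "SELECT post, tf FROM term_freq WHERE term = ?", (term,)
        ).fetchall()
        if not rows:
            continue
        # Smoothed inverse document frequency (an assumed variant).
        idf = math.log((1 + n_posts) / (1 + len(rows))) + 1.0
        for post, tf in rows:
            scores[post] = scores.get(post, 0.0) + tf * idf
    return sorted(scores, key=scores.get, reverse=True)[:limit]
```

Usage would look roughly like `conn = sqlite3.connect(":memory:")`, then `build_index(conn, {"a.md": "..."})` followed by `search(conn, "some words")`. The trade-off is that ranking happens in Python rather than in SQL, which keeps the schema trivial but would not scale to a very large corpus.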

koaning avatar Mar 14 '17 18:03 koaning

I've done some fiddling with a library called whoosh, which does local full-text search pretty efficiently. If you have any ideas, let us know too :).

matthewwardrop avatar Mar 15 '17 21:03 matthewwardrop

Hi @matthewwardrop and team, is this still a part of the roadmap? It would be an extremely useful feature, and if there's any way that I can help, that would be great! On a side note, at @socialcopsdev, we are trying to make knowledge-repo our primary knowledge management tool. To that end, we are experimenting with it. The experiments, around Quip integration and an RStudio addin, are documented in this blog post: https://blog.socialcops.com/engineering/airbnb-knowledge-repository-scale-knowledge/. Thanks again for open sourcing this amazing project!

analyticalmonk avatar Feb 24 '18 20:02 analyticalmonk

@analyticalmonk Hi Akash, I'm a data scientist at Airbnb. Thanks for sharing! This looks awesome! We actually have an internal RStudio addin that we use to publish posts from RStudio, and we're hoping to open source it in the near future. We also use the Linked Posts (Web Proxy) feature to add Google Docs and other linked documents to the knowledge repo.

Regarding search, I'm exploring building this functionality. I looked into whoosh, but it looks like it hasn't been maintained much recently, so I'm thinking of implementing an Elasticsearch integration as you suggested. I saw a number of examples online that integrate Elasticsearch into a Flask app with a SQL database, which is what the knowledge repo uses. Would you be interested in collaborating on implementing this? :)

bulam avatar May 27 '20 04:05 bulam

@bulam I would've loved to have the R addin back when I'd worked with the knowledge repo initially. It's good to know that your team is considering open-sourcing it. :)

Collaborating on search implementation does sound interesting. It would be a really useful feature! Do you have any initial thoughts on how to go about the implementation?

analyticalmonk avatar May 27 '20 14:05 analyticalmonk

@analyticalmonk I'm at the exploratory stage, but I've seen guides like this one for integrating Elasticsearch with a Flask app that uses MySQL: https://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-xvi-full-text-search
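
The general shape of an integration like this is a search index kept alongside the SQL database and updated whenever a post is saved or deleted. Below is a rough sketch of that idea, not a working Elasticsearch integration: `InMemoryIndex` is a hypothetical stand-in for the search backend, and `PostStore` with its `posts` table is an assumed, simplified schema rather than knowledge-repo's actual one.

```python
import re
import sqlite3
from collections import defaultdict


def _terms(text):
    return re.findall(r"[a-z0-9]+", text.lower())


class InMemoryIndex:
    """Hypothetical stand-in for a search backend such as Elasticsearch."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> set of post ids

    def add(self, post_id, text):
        self.remove(post_id)  # re-indexing replaces any old entry
        for term in _terms(text):
            self._postings[term].add(post_id)

    def remove(self, post_id):
        for ids in self._postings.values():
            ids.discard(post_id)

    def query(self, text):
        terms = _terms(text)
        if not terms:
            return set()
        # A post matches only if it contains every query term.
        return set.intersection(
            *(self._postings.get(term, set()) for term in terms)
        )


class PostStore:
    """SQL rows are the source of truth; the index is updated on each write."""

    def __init__(self, index):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
        )
        self.index = index

    def save(self, title, body):
        cur = self.db.execute(
            "INSERT INTO posts (title, body) VALUES (?, ?)", (title, body)
        )
        self.db.commit()
        self.index.add(cur.lastrowid, f"{title} {body}")
        return cur.lastrowid

    def delete(self, post_id):
        self.db.execute("DELETE FROM posts WHERE id = ?", (post_id,))
        self.db.commit()
        self.index.remove(post_id)

    def search(self, text):
        ids = sorted(self.index.query(text))
        if not ids:
            return []
        placeholders = ",".join("?" * len(ids))
        rows = self.db.execute(
            f"SELECT title FROM posts WHERE id IN ({placeholders}) ORDER BY id",
            ids,
        )
        return [row[0] for row in rows]
```

Because the store only talks to the index through `add`, `remove`, and `query`, the in-memory class could in principle be swapped for a wrapper around a real search client without touching the database code.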

bulam avatar May 29 '20 02:05 bulam