Modernize Galaxy Tool Search function

Open uwwint opened this issue 1 month ago • 1 comments

Galaxy's search function is based on Whoosh, which is unmaintained. BioBlend currently doesn't use any intelligent search feature (probably for performance reasons). There are performance and usability issues in the frontend as well as in the backend implementation. This PR seeks to modernise Galaxy's tool search infrastructure in preparation for its use inside the MCP.

Current list of identified issues and their implementation status:

  • [x] replace Whoosh with pytantivy, which is maintained and, unlike Lucene, doesn't pull the JVM into the dependencies
  • [ ] re-jig the frontend to generate pytantivy search queries
  • [ ] the advanced search frontend currently places a lot of load on the server -> performance test
  • [ ] current index generation creates one document per tool_id; move towards aggregating by tool with changes (yields a 300% decrease in search index)
  • [ ] cache text search indices for consumption by the frontend
  • [ ] re-write the fast frontend search to use a full text library: stream in the cached search index, then switch over to complete online full text search (if possible)
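To illustrate the "one document per tool_id" versus "aggregate by tool" item, here is a minimal sketch in plain Python. It is not the implementation in this PR. It assumes that a Tool Shed tool ID ends in a version segment after the last `/`, and the helper names `split_tool_id` and `aggregate_by_tool` as well as the sample IDs are made up for illustration.

```python
from collections import defaultdict


def split_tool_id(tool_id: str) -> tuple[str, str]:
    """Split a versioned tool ID into (base ID, version).

    Assumption: the version is the last "/"-separated segment.
    IDs without a "/" (e.g. built-in tools) are treated as unversioned.
    """
    if "/" not in tool_id:
        return tool_id, ""
    base, version = tool_id.rsplit("/", 1)
    return base, version


def aggregate_by_tool(tool_ids: list[str]) -> dict[str, list[str]]:
    """Group versioned tool IDs so that each tool yields one index document."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for tool_id in tool_ids:
        base, version = split_tool_id(tool_id)
        grouped[base].append(version)
    return dict(grouped)


# Illustrative IDs only: three versions of one tool plus one built-in tool.
tool_ids = [
    "toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.73+galaxy0",
    "toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.74+galaxy0",
    "toolshed.g2.bx.psu.edu/repos/devteam/fastqc/fastqc/0.74+galaxy1",
    "upload1",
]

documents = aggregate_by_tool(tool_ids)
print(len(tool_ids), "tool IDs ->", len(documents), "index documents")
```

With one document per tool_id the index would hold four entries here; grouping by tool leaves two, with the versions kept as a field on each document.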

How to test the changes?

(Select all options that apply)

  • [ ] I've included appropriate automated tests.
  • [ ] This is a refactoring of components with existing test coverage.
  • [ ] Instructions for manual testing are as follows:
    1. [add testing steps and prerequisites here if you didn't write automated tests covering all your changes]

License

  • [x] I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

uwwint • Nov 20 '25 10:11

Thank you for working on this! Should be very cool!

Some future plans in the comment here: https://github.com/galaxyproject/galaxy/pull/20747#issuecomment-3214837773 that could be relevant?

ahmedhamidawan • Nov 20 '25 22:11