How to Use Catalyst for Lemmatizing in a Multi-Language App with On-Demand Model Downloads

Open ramjke opened this issue 1 year ago • 0 comments

I'm working on an application that supports multiple languages chosen by the user. I want to integrate Catalyst for its lemmatizing capabilities. However, I've encountered a couple of challenges and I would appreciate your guidance on how to address them:

Challenges

  1. Preinstallation of NuGet Packages for Each Language: According to the documentation, I need to preinstall a separate NuGet package for each language I intend to support. Because I plan to support many languages, this would significantly increase the bundle size of my application.

  2. Using Only the Lemmatizing Feature: My primary need from Catalyst is the lemmatizing feature. I want to minimize the resources and dependencies required by my application by using only this specific functionality.

Questions

  1. On-Demand Model Downloads: Is there a way to set up Catalyst so that language models are downloaded on demand, based on the user's selected language? This would keep the initial bundle size small and load models only when they are needed.

  2. Minimal Usage for Lemmatizing: How can I configure Catalyst to load only what the lemmatizing feature requires? Are there any specific configurations or optimizations I should be aware of?

Use Case

Here is a brief outline of what I am trying to achieve:

  • The user selects a language.
  • The application downloads the necessary model for that language.
  • The application uses Catalyst's lemmatizing feature for processing text in the chosen language.
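To make the flow above concrete, here is a rough, language-agnostic sketch in Python of the behavior I have in mind. It does not use Catalyst's API: `OnDemandLemmatizer`, the `fetch` callback, and the tab-separated "model" format are all hypothetical stand-ins for whatever Catalyst actually provides.

```python
from pathlib import Path
from typing import Callable, Dict, List

# Hypothetical: a "model" here is just a word -> lemma table.
# A real Catalyst model is a different (binary) format.
Model = Dict[str, str]


class OnDemandLemmatizer:
    """Fetches a language model the first time that language is requested."""

    def __init__(self, cache_dir: Path, fetch: Callable[[str], bytes]):
        self.cache_dir = cache_dir
        self.fetch = fetch  # hypothetical downloader: language code -> raw bytes
        self.loaded: Dict[str, Model] = {}

    def _load(self, lang: str) -> Model:
        # 1. Already in memory: reuse it.
        if lang in self.loaded:
            return self.loaded[lang]
        # 2. Not on disk yet: download once and cache the file.
        path = self.cache_dir / f"{lang}.tsv"
        if not path.exists():
            self.cache_dir.mkdir(parents=True, exist_ok=True)
            path.write_bytes(self.fetch(lang))
        # 3. Parse the cached file into a lookup table.
        model: Model = {}
        for line in path.read_text(encoding="utf-8").splitlines():
            if line.strip():
                word, lemma = line.split("\t")
                model[word] = lemma
        self.loaded[lang] = model
        return model

    def lemmatize(self, lang: str, text: str) -> List[str]:
        model = self._load(lang)
        # Unknown words fall back to their lowercased form.
        return [model.get(tok.lower(), tok.lower()) for tok in text.split()]
```

The point of the sketch is that nothing language-specific ships with the app: the model for a language is fetched on first use, cached on disk, and reused afterwards.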

Any advice, code snippets, or references to relevant parts of the documentation would be highly appreciated.

ramjke, May 23 '24 11:05