classifai
Add Multilingual I/O
Is your enhancement related to a problem? Please describe. Improve input/output language support in ClassifAI to match, or work around, the language restrictions in IBM Watson’s Natural Language Understanding API.
Describe the solution you'd like
- [ ] improve language support (post tagging, excerpt generation) so the plugin can be used by publishers that work in any supported language
- [ ] improve image support (alt text, image tagging) so the plugin can be used by publishers that work in any supported language
Designs n/a
Describe alternatives you've considered none
Additional context none
@jeffpaul I am bumping this issue as I was about to raise a similar one myself. I see it was added to the 1.6.0 milestone, is there a status update or forecast for possible implementation?
We're facing a scenario where we need to support IBM Watson with multiple languages. My thought was to have some internal APIs that filter the language identifier sent to Watson. I am unsure whether there is a lot more involved here, though.
In our specific case, we need to implement ClassifAI in a project that uses en-US and Arabic.
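A minimal sketch of the language-identifier filtering idea described above, written in Python for illustration only (ClassifAI itself is a WordPress plugin). The function names and the `SUPPORTED_NLU_LANGUAGES` set are hypothetical, and the set is an illustrative subset rather than Watson's actual list, which should be taken from IBM's language support documentation. The `language` request key is assumed to be the parameter NLU uses to override language detection.

```python
# Hypothetical sketch: every name below is an assumption, not ClassifAI's API.
# Illustrative subset only; check IBM's language support page for the real list.
SUPPORTED_NLU_LANGUAGES = {"ar", "de", "en", "es", "fr"}
DEFAULT_LANGUAGE = "en"


def nlu_language_for_locale(locale, supported=SUPPORTED_NLU_LANGUAGES,
                            default=DEFAULT_LANGUAGE):
    """Reduce a site locale such as "en_US" to a two-letter language code.

    Falls back to the default when the locale is empty or the language
    is not in the supported set.
    """
    if not locale:
        return default
    # Accept both "en_US" and "en-US" style identifiers.
    code = locale.replace("-", "_").split("_")[0].lower()
    return code if code in supported else default


def build_analyze_params(text, locale):
    """Build a request payload with the filtered language identifier."""
    return {"text": text, "language": nlu_language_for_locale(locale)}
```

For the en-US plus Arabic case, `nlu_language_for_locale("en-US")` yields `"en"` and `nlu_language_for_locale("ar")` yields `"ar"`, while an unlisted language falls back to English.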
@simondowdles I know that @moraleida is working on this as part of an internal project, so any further input you have on the ideal workflow for this feature will help inform his planning and implementation. Also, is there an ideal timeframe for when you need this available within ClassifAI? We're likely to get the v1.5 release out this month, so v1.6, while currently unplanned, should ideally happen this half of the year.
Given no updates on work on this issue, I'm going to punt this to Future Release, and we'll look to milestone this when we have a better sense that a branch/PR is in flight and more ready for testing/merging.
And if it is not truly multilingual, it would be nice to have a setting for languages other than English. The results with German are not very good.
Adding a link to the languages supported by NLU: https://cloud.ibm.com/docs/natural-language-understanding?topic=natural-language-understanding-language-support
I've moved this item into the 2.3.0 milestone so we can consider which features include text input/output and identify whether the respective service providers can (with an update) or do (with the current integration) support more than English. We would then look to expose that in the setup flow and/or in the settings for each feature/service, while keeping English as the default and perhaps only surfacing it during setup if the default Site Language is not set to "English (United States)".
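A rough sketch of that setup-flow rule, in Python for illustration only. The function name, the returned keys, and the `provider_languages` argument are hypothetical. The only outside fact assumed is that WordPress reports "English (United States)" as the locale string `en_US`.

```python
# Hypothetical sketch of the proposed setup behaviour, not plugin code.
ENGLISH_US_LOCALE = "en_US"  # WordPress locale for "English (United States)"
DEFAULT_LANGUAGE = "en"


def language_setup_state(site_locale, provider_languages):
    """Decide whether the setup flow should surface a language choice.

    English stays the selected default in every case. The prompt is shown
    only when the site is not en_US and the provider offers an alternative.
    """
    choices = sorted(set(provider_languages) | {DEFAULT_LANGUAGE})
    show_prompt = site_locale != ENGLISH_US_LOCALE and len(choices) > 1
    return {
        "selected": DEFAULT_LANGUAGE,
        "show_prompt": show_prompt,
        "choices": choices if show_prompt else [DEFAULT_LANGUAGE],
    }
```

With this shape, a site set to `de_DE` on a provider that supports German would see the prompt with English preselected, while an `en_US` site would never see it.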
Noting here that https://gist.github.com/felipeelia/9570cac412f42b6768b8d8ecb650c3c6 from @felipeelia could be a starting point or at least a point of reference for what we might do on this larger topic of multilingual support for ClassifAI.
Related request would be to be able to select a text-to-speech voice on a per post level in case it's written in another language. As a first step a filter could be added somewhere here I guess.
@ocean90 note that a filter like you mentioned is in #537 at c40ed57.
Note that the prompt-based features from OpenAI will now likely get the ability to edit the prompt via settings in #587 / #594, so the linked PR here, #537, is on hold and will likely be closed once #587 is done/merged. We'll likely still want to consider a separate PR for this issue of multilingual support, as the Watson post tagging and Azure image tagging do not have prompts to edit and could still benefit from a language setting to change from the default English setting.
Hello, I would like to work on this issue.
In assessing what options could benefit from multilingual support, I think the following currently fit that need:
- Language Processing > Classification > IBM Watson NLU
- Image Processing > Descriptive Text Generator > Microsoft Azure AI Vision
- Image Processing > Image Tags Generator > Microsoft Azure AI Vision
Each of these could have a setting akin to "Generate <something> in: <language>", where <something> is "Classifications", "Descriptive Text", and "Image Tags" respectively, and <language> is the list of output languages supported by the respective AI service provider.
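As a sketch of how that per-feature setting label could be assembled (Python for illustration only; the feature keys and function name are hypothetical, and the language passed in would come from whatever list the provider actually supports):

```python
# Hypothetical sketch: feature keys are made up for illustration.
# The nouns come from the three options listed above.
FEATURE_NOUNS = {
    "watson_nlu_classification": "Classifications",
    "azure_descriptive_text": "Descriptive Text",
    "azure_image_tags": "Image Tags",
}


def language_setting_label(feature_key, language_name):
    """Return the 'Generate <something> in: <language>' label for a feature."""
    noun = FEATURE_NOUNS[feature_key]  # raises KeyError for unknown features
    return f"Generate {noun} in: {language_name}"
```

For example, `language_setting_label("azure_image_tags", "German")` produces `Generate Image Tags in: German`.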