pymusas

Python Multilingual Ucrel Semantic Analysis System

Results 17 pymusas issues

To incorporate Multi Word Expressions (MWE) into the rule based tagger. The MWEs will come from MWE lexicons, examples of which can be found in the [Multilingual USAS...

documentation
enhancement

[pydoc-markdown](https://github.com/NiklasRosenstein/pydoc-markdown) has changed since version `4.6.0` to use [Novella](https://niklasrosenstein.github.io/novella/) as the renderer. This means the current way we create our Python API documentation needs to change if we...

documentation
low priority
Potential Future Enhancement

To support wildcard (`*`) syntax for single word lexicon files. This would also be useful for rules such as one covering all punctuation tokens, which should be labelled as the semantic category `PUNCT`,...

enhancement
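The wildcard idea in the issue above could be sketched roughly as follows. This is only an illustration, not pymusas code: the `LEXICON` entries, the tag names `TAG_A` and `TAG_B`, and the `lookup` function are hypothetical, and the matching policy (exact entries win over wildcard entries, then the first matching wildcard in insertion order) is an assumption rather than anything the issue specifies.

```python
import re
from typing import Dict, List, Optional

# Hypothetical single word lexicon: entry -> semantic tags.
# Tag names are placeholders, not real USAS tags.
LEXICON: Dict[str, List[str]] = {
    "bank": ["TAG_A"],
    "walk*": ["TAG_B"],
}


def compile_wildcard(entry: str) -> "re.Pattern[str]":
    """Turn a lexicon entry containing `*` into a regular expression.

    Every character other than `*` is escaped, so only `*` is special.
    """
    parts = [re.escape(part) for part in entry.split("*")]
    return re.compile(".*".join(parts))


def lookup(token: str, lexicon: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return the tags for `token`, or None when nothing matches."""
    # Assumed policy: an exact entry takes priority over any wildcard entry.
    if token in lexicon:
        return lexicon[token]
    # Otherwise return the first wildcard entry that matches the whole token.
    for entry, tags in lexicon.items():
        if "*" in entry and compile_wildcard(entry).fullmatch(token):
            return tags
    return None
```

Compiling the patterns once, when the lexicon is loaded, would avoid repeating that work for every token; the sketch compiles on each call only to stay short.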

To incorporate auxiliary verb rules into the [USAS Rule Based Tagger](https://ucrel.github.io/pymusas/api/taggers/rule_based#usasrulebasedtagger).

# Definition of auxiliary verb rules

All POS tags used here are from the [CLAWS C7 tagset](https://ucrel.lancs.ac.uk/claws7tags.html). In English...

low priority
Potential Future Enhancement

To incorporate `Df` tags from MWE templates to enhance the [USAS Rule Based Tagger](https://ucrel.github.io/pymusas/api/taggers/rule_based#usasrulebasedtagger).

# Definition of Df tags

A small number (currently 93) of English MWE templates have the...

low priority
Potential Future Enhancement

For the [lexicon lookup](https://github.com/UCREL/pymusas/blob/main/pymusas/basic_tagger.py#L9) it might be worth looking into a [trie data structure](https://en.m.wikipedia.org/wiki/Trie). A trie was used for multi word lexicon lookup in the [skweak project](https://github.com/NorskRegnesentral/skweak/blob/551e91edf05e48764e7228b4c6e8abb7d950e256/skweak/gazetteers.py#L126).

enhancement
low priority
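To make the trie suggestion in the issue above more concrete, here is a minimal sketch of a token-level trie with longest-match lookup. It is not taken from pymusas or skweak: the `TrieNode` and `LexiconTrie` classes, the `longest_match` method, and the tag names are all hypothetical, and keying the trie on whole tokens (rather than characters) is an assumption that suits multi word entries.

```python
from typing import Dict, List, Optional, Tuple


class TrieNode:
    def __init__(self) -> None:
        # Child nodes keyed by the next token in a lexicon entry.
        self.children: Dict[str, "TrieNode"] = {}
        # Tags are set only on nodes that end a complete lexicon entry.
        self.tags: Optional[List[str]] = None


class LexiconTrie:
    def __init__(self) -> None:
        self.root = TrieNode()

    def add(self, tokens: List[str], tags: List[str]) -> None:
        """Insert a (possibly multi word) lexicon entry."""
        node = self.root
        for token in tokens:
            node = node.children.setdefault(token, TrieNode())
        node.tags = tags

    def longest_match(
        self, tokens: List[str], start: int = 0
    ) -> Tuple[int, Optional[List[str]]]:
        """Find the longest entry beginning at `tokens[start]`.

        Returns (end_index, tags), where end_index is exclusive.
        When nothing matches, returns (start, None).
        """
        node: Optional[TrieNode] = self.root
        best_end, best_tags = start, None
        for i in range(start, len(tokens)):
            node = node.children.get(tokens[i])
            if node is None:
                break
            if node.tags is not None:
                best_end, best_tags = i + 1, node.tags
        return best_end, best_tags


# Placeholder entries and tags, for illustration only.
trie = LexiconTrie()
trie.add(["new"], ["TAG_ADJ"])
trie.add(["new", "york"], ["TAG_PLACE"])
```

Each lookup walks at most as many nodes as the longest entry has tokens, so the cost per start position does not grow with the size of the lexicon.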

# Problem

At the moment the documentation site is updated whenever the main GitHub branch is updated. This will cause problems when we add functionality to the latest...

documentation

Add a `CITATION.cff` file so that users can cite the software. The Turing Way book has a great [guide on how to do this](https://the-turing-way.netlify.app/communication/citable/citable-cff.html#). As we may create a paper...

documentation
low priority

The contributing guidelines are currently under development. Some resources on how to create guidelines, along with example guidelines, can be found below:

1. https://the-turing-way.netlify.app/project-design/project-repo/project-repo-participation.html
2. https://github.com/PurpleBooth/a-good-readme-template/blob/main/CONTRIBUTING.md
3. https://github.com/explosion/spaCy/blob/master/CONTRIBUTING.md
4. https://github.com/allenai/allennlp/blob/main/CONTRIBUTING.md

documentation