ordia
Wikidata lexemes presentations
Japanese example attached — the sentence ```下記方法で体内への侵入を防止すること``` ("prevent entry into the body by the methods below") from [here](https://ja.wikipedia.org/w/index.php?title=2019%E6%96%B0%E5%9E%8B%E3%82%B3%E3%83%AD%E3%83%8A%E3%82%A6%E3%82%A4%E3%83%AB%E3%82%B9%E3%81%AB%E3%82%88%E3%82%8B%E6%80%A5%E6%80%A7%E5%91%BC%E5%90%B8%E5%99%A8%E7%96%BE%E6%82%A3&oldid=76973353#%E5%80%8B%E4%BA%BA%E3%81%A7%E3%81%A7%E3%81%8D%E3%82%8B%E4%BA%88%E9%98%B2%E5%AF%BE%E7%AD%96) should be tokenized somewhat like the following, with a single pipe character standing for a word boundary and two pipes for a lexeme boundary...
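The proposed notation can be sketched as follows; the boundary placements below are illustrative only (a real tool would obtain them from a morphological analyzer), and the helper names are my own:

```python
# Hypothetical segmentation of the example sentence in the proposed
# notation: "|" marks a word boundary, "||" a lexeme boundary.
# The exact boundary placements here are illustrative, not authoritative.
segmented = "下記||方法|で|体内|へ|の|侵入|を|防止||する|こと"

# Recover the individual tokens by treating both boundary types alike.
tokens = segmented.replace("||", "|").split("|")
print(tokens)
```

Each recovered token could then be matched against existing lexeme forms.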
Ordia currently does not support Chinese at all. Proper support will need #95, of course...
This makes it harder to enter affixes, e.g. the Swedish prefixes listed at https://sv.wikipedia.org/wiki/Lista_%C3%B6ver_prefix_i_svenskan
https://www.wikidata.org/wiki/User:Mahir256/syndepgraph.js
Add a possible link to Bodh, a Tabernacle-like tool for lexemes, available at https://bodh.toolforge.org/ and documented at https://www.wikidata.org/wiki/Wikidata:Bodh
This would make it possible to better capture more complex constructs such as [matrix-assisted laser desorption/ionization time-of-flight mass spectrometry](https://www.wikidata.org/w/index.php?sort=relevance&search=matrix-assisted+laser+desorption%2Fionization+time-of-flight+mass+spectrometry&title=Special%3ASearch&profile=advanced&fulltext=1&advancedSearch-current=%7B%7D&ns0=1&ns120=1) ([Q1792222](https://www.wikidata.org/wiki/Q1792222)). Ideally, the user could set lower and upper bounds for N.
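A minimal sketch of N-gram extraction with user-set bounds; the function and parameter names (`ngrams`, `n_min`, `n_max`) are hypothetical, not Ordia's actual API:

```python
def ngrams(tokens, n_min, n_max):
    """Yield every N-gram with n_min <= N <= n_max.

    n_min and n_max are the user-settable lower and upper bounds
    for N mentioned above (hypothetical parameter names).
    """
    for n in range(n_min, n_max + 1):
        for start in range(len(tokens) - n + 1):
            yield " ".join(tokens[start:start + n])

phrase = "time of flight mass spectrometry".split()
print(list(ngrams(phrase, 4, 5)))
```

Longer N-grams could then be checked against multi-word lexemes before falling back to shorter ones.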
For a larger text, it would be useful to know how many times each word occurs, so work could focus on the more common words. If the extracted list is...
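Counting occurrences could be sketched with the standard library; the naive `\w+` tokenization below is an assumption for illustration (a real implementation would need language-aware tokenization):

```python
import re
from collections import Counter

def word_frequencies(text):
    # Naive tokenization on word characters; real support would need
    # language-aware tokenization (cf. the Japanese example above).
    words = re.findall(r"\w+", text.lower())
    return Counter(words)

freqs = word_frequencies("To be, or not to be, that is the question.")
print(freqs.most_common(2))  # the most frequent words come first
```

Sorting the extracted list by frequency would let editors create the most common missing lexemes first.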
Add a link to search in Smurf, the Danish newspaper search facility: http://labs.statsbiblioteket.dk/smurf/
This could evolve into a considerably better UI than the current Vue UI.
Provide an option to translate the tool, with a separate URL for each language code.
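One possible URL scheme puts the language code in the first path segment; everything below (the domain, the set of codes, the function name) is a hypothetical sketch, not Ordia's actual routing:

```python
from urllib.parse import urlsplit

# Illustrative set of supported interface language codes.
SUPPORTED = {"en", "da", "sv", "ja"}

def language_from_url(url, default="en"):
    """Pick the interface language from the URL's first path segment,
    falling back to a default when no supported code is present."""
    segments = [s for s in urlsplit(url).path.split("/") if s]
    if segments and segments[0] in SUPPORTED:
        return segments[0]
    return default

print(language_from_url("https://ordia.toolforge.org/da/L2"))  # da
print(language_from_url("https://ordia.toolforge.org/L2"))     # en (fallback)
```

Keeping the language in the URL makes each translation directly linkable and cacheable, at the cost of slightly longer URLs.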