wikiparsec issues

updates text-icu dependency, https://github.com/plfa/plfa.github.io/p…

I was unable to compile your project until I updated the `text-icu` dependency. Amazing work, thanks!

Handle a new class of labels called 'warnings'

This change to wikiparsec helps us avoid output that detracts from ConceptNet, perhaps because it involves obsolete forms of words, or because it outputs offensive definitions that are removed from...

rspeer

Added MultiTemplate

I wanted different handling for some templates when parsing the English wiki, so I created MultiTemplate. It currently supports only English, but hopefully it will be useful to someone else too....

chessgecko

Feature request: add pronunciation and homonyms for FR and EN Wiktionary parsers

mhham

FYI: Necessary trick to build on macOS

The path to the ICU library has to be specified manually in the `stack build` command: https://github.com/haskell/haskell-ide-engine/issues/275
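A minimal sketch of what such a command might look like, assuming ICU was installed with Homebrew. The prefix `/usr/local/opt/icu4c` is an assumption, not something stated in this issue (`brew --prefix icu4c` should report the actual location), and `ICU_PREFIX` and `STACK_ICU_FLAGS` are illustrative variable names. The `--extra-lib-dirs` and `--extra-include-dirs` options are standard `stack` flags.

```shell
# Assumed Homebrew location of ICU; override by exporting ICU_PREFIX first.
# On Apple Silicon the prefix is usually /opt/homebrew/opt/icu4c instead.
ICU_PREFIX="${ICU_PREFIX:-/usr/local/opt/icu4c}"

# Flags that tell stack where to find the ICU libraries and headers.
STACK_ICU_FLAGS="--extra-lib-dirs=$ICU_PREFIX/lib --extra-include-dirs=$ICU_PREFIX/include"

# Print the full command; copy it into the project directory and run it there.
echo "stack build $STACK_ICU_FLAGS"
```

If the build still cannot find ICU, checking that `$ICU_PREFIX/lib` actually exists is a reasonable first step.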

mhham

[Do not merge] Implement a poor man's solution for extracting

gender information from German Wiktionary. Not very smart but I do not know any Haskell. For my purposes, it works and may serve as a starting point for fixing https://github.com/LuminosoInsight/wikiparsec/issues/4

wrznr

Feature request: Add gender to the information extracted from the German Wiktionary dump

The article title of each noun includes the noun's gender. It would be very helpful to have that information extracted as well.

wrznr

Feature request: Add article borders to the output of wiki2text

Hello, I'm trying to use your library to extract wiki text from dumps. The text looks good, but I cannot see how to clearly identify article borders. Could...

yassin-taskin

Feature request: Add IPA and hyphenation to the information extracted from the German Wiktionary dump

Many thanks for your wonderful tool! It would be a great addition to have the hyphenation patterns and the IPA representation in the set of extracted information.

wrznr
