wikiparsec

An LL parser for extracting information from Wiki text, particularly Wiktionary.

Results 9 wikiparsec issues

I was unable to compile your project until I updated the `text-icu` dependency. Amazing work, thanks!

This change to wikiparsec helps us avoid output that detracts from ConceptNet, perhaps because it involves obsolete forms of words, or because it outputs offensive definitions that are removed from...

I wanted different handling on some templates for en wiki parsing so I created multtTemplate. It currently only supports English but hopefully it will be useful to someone else too....

The path to the icu lib should be manually specified in the `stack build` command: https://github.com/haskell/haskell-ide-engine/issues/275

gender information from German Wiktionary. Not very smart but I do not know any Haskell. For my purposes, it works and may serve as a starting point for fixing https://github.com/LuminosoInsight/wikiparsec/issues/4

Each article title for nouns has information on the gender of the corresponding noun. It would be very helpful to have that information extracted as well.

Hello, I'm trying to use your library to extract wiki text from dumps. The text looks good, but I cannot see how you would clearly identify article borders. Could...

Many thanks for your wonderful tool! It would be a great addition to have the hyphenation patterns and the IPA representation in the set of extracted information.