extraction-framework

The software used to extract structured data from Wikipedia

Results 150 extraction-framework issues

Should we drop support for Table mappings in the framework? They have not worked as expected so far: http://mappings.dbpedia.org/index.php?title=Special%3APrefixIndex&prefix=Table&namespace=204 http://mappings.dbpedia.org/index.php/How_to_edit_DBpedia_Mappings#How_to_map_a_Wikipedia_Table

enhancement
type: data
status: fix-required
status: minidump-test-required

We need to create a test that validates contributions to the server ignore list, and enable tests on the server module (only that module, for now). Related PRs that caused problems: https://github.com/dbpedia/extraction-framework/pull/331...

GSoC Warmup task
type: data
status: triage-discussion-needed

In the mapping server we already provide statistics about the template & template property mapping coverage. A great addition would be an estimated class & property instance count based on...

GSoC Warmup task
type: data
status: triage-discussion-needed

I've just written unit tests for the FlagTemplateParser and many of them fail miserably. Here are the things not working as advertised: - {{flag|...}} template with a country code as 1st...

type: software-bug
type: data
status: fix-provided
status: verification-discussion-needed

This would facilitate working with, for example, the lookup code.

enhancement
status: verification-discussion-needed

Each time someone changes an ignore list through the browser, it is [saved on the server](/dbpedia/extraction-framework/blob/master/server/src/main/scala/org/dbpedia/extraction/server/stats/IgnoreList.scala#L101) in the location where it was checked out from the repo. Then it should...

type: software-bug
enhancement
status: triage-discussion-needed

Stub categories seem to be implemented most of the time through transclusion of templates. E.g. the article https://en.wikipedia.org/w/index.php?title=Şıra&action=edit has:
```
{{Turkey-cuisine-stub}}
{{nonalcoholic-drink-stub}}
```
The fact that something is a stub...

type: data
status: triage-discussion-needed
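
To make the stub-template observation above concrete, here is a minimal sketch of how such transclusions could be spotted in raw wikitext. It assumes stub templates follow the `{{<name>-stub}}` naming pattern seen in the example; the function name `find_stub_templates` is hypothetical and not part of the extraction framework.

```python
import re
from typing import List

# Assumption: stub templates are transcluded as {{<something>-stub}},
# optionally followed by |parameters. A bare {{stub}} is not covered.
STUB_TEMPLATE = re.compile(
    r"\{\{\s*([^{}|]*?-stub)\s*(?:\|[^{}]*)?\}\}",
    re.IGNORECASE,
)

def find_stub_templates(wikitext: str) -> List[str]:
    """Return the names of stub templates transcluded in the wikitext."""
    return [m.group(1).strip() for m in STUB_TEMPLATE.finditer(wikitext)]

sample = (
    "{{Turkey-cuisine-stub}}\n"
    "{{nonalcoholic-drink-stub}}\n"
    "{{Infobox drink|name=Şıra}}"
)
print(find_stub_templates(sample))
```

A real extractor would work on the parsed template nodes rather than a regular expression; the regex only illustrates which templates the issue is about.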

http://mappings.dbpedia.org/server/extraction/en/extract?title=Great_Britain_men%27s_national_basketball_team&format=turtle-triples&extractors=custom produces triples with en.dbpedia.org (which does not resolve) instead of dbpedia.org, e.g. http://en.dbpedia.org/resource/Great_Britain_men's_national_basketball_team (as subject) and http://en.dbpedia.org/resource/British_Basketball (as object). So at least the extraction sampler is broken in this...

GSoC Warmup task
type: data
status: triage-discussion-needed
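
The expected output for the en.dbpedia.org report above can be illustrated with a small post-processing sketch. It is not a fix inside the framework; it only shows the rewrite the reporter expects. The function name `canonicalize` and the `http://example.org/p` predicate are made up for illustration, and the set of path segments (`resource`, `ontology`, `property`) is an assumption.

```python
import re

# Matches IRIs in angle brackets whose host is en.dbpedia.org.
# Assumption: only these three path segments need rewriting.
EN_HOST_IRI = re.compile(
    r"<http://en\.dbpedia\.org/(resource|ontology|property)/([^>]*)>"
)

def canonicalize(line: str) -> str:
    """Rewrite en.dbpedia.org IRIs in one triple line to dbpedia.org."""
    return EN_HOST_IRI.sub(r"<http://dbpedia.org/\1/\2>", line)

# Subject and object are taken from the report; the predicate is a placeholder.
triple = (
    "<http://en.dbpedia.org/resource/Great_Britain_men's_national_basketball_team> "
    "<http://example.org/p> "
    "<http://en.dbpedia.org/resource/British_Basketball> ."
)
print(canonicalize(triple))
```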

(A simple warm-up task) See e.g. http://mappings.dbpedia.org/server/extraction/en/extract?title=Great_Britain_men%27s_national_basketball_team&format=turtle-triples&extractors=custom. The output is very hard to read because it doesn't use prefixes. Add as many common prefixes as possible, so the listing is more...

type: data
status: triage-discussion-needed
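
As a rough sketch of what the prefix request above asks for, the snippet below shortens full IRIs to prefixed names and emits the matching `@prefix` header. The prefix table and the helper names `shorten` and `prefix_header` are assumptions for illustration, not the framework's API. Local names containing characters that would need escaping in Turtle (such as the apostrophe in the example title) are deliberately left as full IRIs.

```python
import re

# Assumed prefix table using commonly used namespace IRIs.
PREFIXES = {
    "dbr": "http://dbpedia.org/resource/",
    "dbo": "http://dbpedia.org/ontology/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Conservative check: only shorten local names that need no escaping.
SAFE_LOCAL = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")

def shorten(iri: str) -> str:
    """Return a prefixed name if one applies safely, else the IRI in <>."""
    # Try longer namespaces first so the most specific prefix wins.
    for prefix, ns in sorted(PREFIXES.items(), key=lambda kv: -len(kv[1])):
        if iri.startswith(ns) and SAFE_LOCAL.match(iri[len(ns):]):
            return f"{prefix}:{iri[len(ns):]}"
    return f"<{iri}>"

def prefix_header() -> str:
    """Build the @prefix declarations for the top of a Turtle listing."""
    return "\n".join(f"@prefix {p}: <{ns}> ." for p, ns in PREFIXES.items())

print(prefix_header())
print(shorten("http://dbpedia.org/resource/British_Basketball"))
```

An actual implementation would more likely pass the prefix map to the Turtle serializer than rewrite strings by hand.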

As noted on the mailing list by Andy Mabbett, the English Wikipedia community has decided to deprecate Persondata: https://en.wikipedia.org/wiki/Wikipedia:Village_pump_%28proposals%29#RfC:_Should_Persondata_template_be_deprecated_and_methodically_removed_from_articles.3F aka https://goo.gl/ie8yed (the page section will be archived shortly). In future, such data...

type: data
status: triage-discussion-needed