extraction-framework

The software used to extract structured data from Wikipedia

Results: 150 extraction-framework issues

What is skos:related between categories? I don't think it's explained anywhere. There are 40286 on dbpedia.org:
```
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
select count(*) {?x skos:related ?y}
```
Also, as #385 shows, some...

type: data
status: triage-discussion-needed

# Details I’ve just realized that there were some mistakes in the resource `dbr:France` [1]. The comments for `dbr:France` include information from `dbr:Franca` [2]. (As of 2021-10-26) [1] https://dbpedia.org/resource/France [2]...

type: software-bug
status: fix-required
status: test-method-required

template: http://mappings.dbpedia.org/index.php/Template:PropertyMapping says: - language: if the datatype is of rdf:langString we can define the language of the language tag using the wikipedia language code (e.g. language = de) "datatype"...

type: data
status: fix-required
status: minidump-test-required

Some categories on dbpedia.org don't have rdf:type skos:Concept. I discovered this while investigating skos:related. E.g., try this query ``` PREFIX skos: <http://www.w3.org/2004/02/skos/core#> select * {?x skos:related ?y. #filter not exists{?x a...

Needs More Examples
type: data
status: triage-discussion-needed

Filter out maintenance (hidden) categories and don't emit them in the dataset. These categories are useful only to Wikipedia maintainers and are not useful for content consumers. - https://en.wikipedia.org/wiki/Category:Hidden_categories has...

enhancement
type: data
status: triage-discussion-needed

As explained in #387 and acknowledged in [ArticleCategoriesExtractor.scala](https://github.com/dbpedia/extraction-framework/blob/f7fa4ab3564a36c7d7b26b15f133b7212011d651/core/src/main/scala/org/dbpedia/extraction/mappings/ArticleCategoriesExtractor.scala#L33), links starting with ":" mean a mere article->cat link, as opposed to categorization. ArticleCategoriesExtractor currently skips such links. I think it should...

type: data
status: fix-required
status: minidump-test-required

When comparing SMILES in DBpedia with the SMILES in Wikipedia, I noticed that all of the ones I looked at seem to be truncated. It looks as if this happens...

type: data
status: fix-required
status: minidump-test-required

https://bg.wikipedia.org/wiki/Шаблон:Геообект defines a group of "other info" props. E.g. (translation is also provided) ``` |друго-тип = [[БВП]] # other-type = GDP |друго-инфо = $10 000 # other-info = $10 000...

enhancement
type: data
status: triage-discussion-needed

`{{Infobox beverage}}` is applied in [page Dalton_Winery](https://en.wikipedia.org/w/index.php?title=Dalton_Winery&action=edit) several times to make the separate wines of [Dalton Winery](http://live.dbpedia.org/page/Dalton_Winery):
- [Dalton_Winery__Red_canaan__1](http://live.dbpedia.org/page/Dalton_Winery__Red_canaan__1)
- Dalton_Winery__White_canaan__1
- Dalton_Winery__Cabernet_Sauvignon__1
- Dalton_Winery__Shiraz__1
- Dalton_Winery__Merlot__1
The wine nodes...

type: data
status: fix-required
status: minidump-test-required
status: triage-discussion-needed

I've found many errors in the lat/long reported in both `geo_coordinates_en.ttl` and `geo_coordinates_mappingbased_en.ttl` (2016-04). For instance: - For `Western_Australia`, the [Wikipedia page](https://en.wikipedia.org/wiki/Western_Australia) reports `26°S 121°E` while the [DBpedia resource](http://dbpedia.org/page/Western_Australia) points...

status: duplicate
type: data
status: fix-required
status: minidump-test-required