extraction-framework

The software used to extract structured data from Wikipedia

Results 150 extraction-framework issues

Hi, I've encountered a bug in the DBpedia service: dates between 0 and 99 A.D. are mapped to 19(0-99). Example: http://dbpedia.org/page/Nero I haven't dug into the DBpedia code, but I'm assuming that this...

type: data
status: fix-required
status: minidump-test-required
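
The symptom reported above (a first-century year showing up as 19xx) can be reproduced with a small sketch. This is not taken from the extraction framework, whose actual cause the report does not identify; `naive_expand_year` is a hypothetical helper that only illustrates how two-digit-year handling could produce such output, and `to_xsd_date` shows one way to keep the year intact by zero-padding it to four digits, as the `xsd:date` lexical form requires.

```python
from datetime import date


def naive_expand_year(year: int) -> int:
    """Hypothetical buggy helper: treats any year below 100 as 20th-century shorthand."""
    return 1900 + year if 0 <= year < 100 else year


def to_xsd_date(year: int, month: int, day: int) -> str:
    """Format a date with a zero-padded four-digit year (xsd:date lexical form)."""
    return date(year, month, day).isoformat()


# A year in the 0-99 A.D. range, as in the report.
print(naive_expand_year(37))    # the year is silently moved into the 20th century
print(to_xsd_date(37, 12, 15))  # the year is preserved, padded to four digits
```

The design point is that the year is never reinterpreted: padding changes only the textual form, not the value.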

I have a few concerns about [citationIri](https://github.com/dbpedia/extraction-framework/blob/807d7bc8fd825da8e404e4d8050d9c6ae3207b0d/core/src/main/scala/org/dbpedia/extraction/mappings/CitationExtractor.scala#L106). It's trying to make a URL for the citation from its properties: 1. @jimkont please confirm that even though it's a `for` loop,...

type: data
status: fix-required
status: minidump-test-required

See http://mappings.dbpedia.org/server/extraction/sr/extract?revid=19189789&format=trix&extractors=custom The extraction framework outputs the following IRI for this resource: http://sr.dbpedia.org/resource/Project_talk:Администраторска_табла However, the actual namespace (and Wikipedia article) name is https://sr.wikipedia.org/wiki/Разговор_о_Википедији:Администраторска_табла Through http://sr.wikipedia.org/wiki/Project_talk:Администраторска_табла you still get redirected to...

type: data
status: fix-required
status: minidump-test-required

- Feeder reads 5000 records and puts them in queue
- these get extracted, but seems like they don't get updated in cache db
- next request it gets the...

type: software-bug
_DBpedia Live
status: triage-discussion-needed

My issue is pretty much the same as the first problem described in https://github.com/dbpedia/extraction-framework/issues/556 - during extraction of the (German) wikipedia dump a lot of `Tried to convert inconvertible unit`...

question
status: triage-discussion-needed

Running rapper over the changesets at http://downloads.dbpedia.org/live/changesets/2019/ Rapper log: http://95.217.42.166/rapper-changesets-2019.bz2 `find changesets/2019 | grep 'nt.gz$' | xargs zcat | rapper -i ntriples -c - http://base.org 2>&1 | lbzip2 -zc >...

type: software-bug
_DBpedia Live
type: data
status: triage-discussion-needed

Performing the following query against DBpedia: ```sparql PREFIX dbo: PREFIX dbr: PREFIX foaf: SELECT ?country ?label ?longName ?name WHERE { ?country a dbo:Country. ?country dbo:capital ?capital. ?country rdfs:label ?label ....

type: data
status: fix-provided
status: minidump-test-required

While converting `nif-text-links_lang=en.ttl` from RDF to HDT using https://github.com/rdfhdt/hdt-cpp/tree/develop/libhdt I get the following error: > error: /data/milan/nif-text-links_lang=en.ttl:7388119:282: invalid IRI escape `nif-text-links_lang=en.ttl` comes from https://databus.dbpedia.org/marvin/text/nif-text-links/ version `2020.02.01` The problem is...

type: data
status: fix-required
status: minidump-test-provided

https://en.wikipedia.org/wiki/The_Ren_%26_Stimpy_Show is encoded as: https://dbpedia.org/resource/The_Ren_&_Stimpy_Show check: `curl http://dbpedia-mappings.tib.eu/release/mappings/mappingbased-literals/2019.06.01/mappingbased-literals_lang=en.ttl.bz2 | bzcat | cut -f1 -d '>' | grep '&'` on https://databus.dbpedia.org/marvin/mappings/mappingbased-literals/2019.06.01

type: data
status: fix-required
status: minidump-test-provided
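
The shell check in the report above (`cut -f1 -d '>' | grep '&'`) can be mirrored in a few lines of Python, which may be handy inside a test. This is a minimal sketch: the sample triple, its `http://example.org/p` predicate, and the function names are illustrative assumptions, not data or code from the release.

```python
from urllib.parse import unquote


def subject_of(ntriples_line: str) -> str:
    """Return the text before the first '>' (the equivalent of `cut -f1 -d '>'`)."""
    return ntriples_line.split(">", 1)[0]


def subjects_with_ampersand(lines):
    """Yield subject IRIs that contain a raw '&' (the equivalent of `grep '&'`)."""
    for line in lines:
        subject = subject_of(line)
        if "&" in subject:
            # Drop the leading '<' that `cut` would leave in place.
            yield subject.lstrip("<")


# Illustrative input only; the predicate and object are made up.
sample = [
    '<http://dbpedia.org/resource/The_Ren_&_Stimpy_Show> <http://example.org/p> "x" .',
    '<http://dbpedia.org/resource/Nero> <http://example.org/p> "y" .',
]
print(list(subjects_with_ampersand(sample)))

# Decoding the Wikipedia title shows where the raw '&' comes from.
print(unquote("The_Ren_%26_Stimpy_Show"))
```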

https://github.com/dbpedia/extraction-framework/blob/live-deployed/live/src/main/java/org/dbpedia/extraction/live/feeder/EventStreamsFeeder.java If the Akka stream fails, the initial time is used instead of latestProcessDate, which cascades into a "maxLine exceeded" illegal state exception. There was an attempt to fix this, but it is unclear...

type: software-bug
_DBpedia Live
status: triage-discussion-needed
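
The behaviour the last report asks for (resume from the latest processed date after a stream failure, rather than from the initial time) can be sketched independently of Akka. Everything below is a hypothetical illustration in Python, not the logic of `EventStreamsFeeder.java`: `consume_with_resume`, `open_stream`, the integer timestamps, and the simulated failure are all assumptions.

```python
def consume_with_resume(open_stream, initial_time, max_retries=3):
    """Consume (timestamp, payload) events, reopening the stream after a failure
    at the last processed timestamp instead of the initial time."""
    latest_processed = initial_time
    processed = []
    for _ in range(max_retries + 1):
        try:
            for timestamp, payload in open_stream(latest_processed):
                processed.append(payload)
                latest_processed = timestamp  # remember progress for a restart
            return processed, latest_processed
        except ConnectionError:
            continue  # reopen from latest_processed, not from initial_time
    raise RuntimeError("stream kept failing")


EVENTS = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]


def make_flaky_stream():
    """Build a fake stream that fails once, after delivering two events."""
    state = {"calls": 0}

    def open_stream(since):
        state["calls"] += 1
        first_call = state["calls"] == 1
        pending = [event for event in EVENTS if event[0] > since]
        for n, (timestamp, payload) in enumerate(pending):
            if first_call and n == 2:
                raise ConnectionError("simulated stream failure")
            yield timestamp, payload

    return open_stream


result, last_seen = consume_with_resume(make_flaky_stream(), initial_time=0)
print(result, last_seen)
```

Because the restart point advances with every processed event, no event is replayed after the failure, which is what keeps the backlog from growing on each retry.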