extraction-framework

The software used to extract structured data from Wikipedia

Results 150 extraction-framework issues

# Issue still valid? > DBpedia updates frequently in this order: 1. DIEF software, 2. monthly dumps, 3. online services loaded from dumps. > We update http://dief.tools.dbpedia.org/server/extraction/ on a daily...

type: hosting
de.dbpedia.org

Moved here from: #670 ``` define sql:signal-unconnected-variables 1 define sql:signal-void-variables 1 define input:default-graph-uri SELECT DISTINCT ?lc ?subj WHERE { { { { + ?subj . } UNION { /* ?subj...

type: hosting
dbpedia.org/.*
status: fix-required

Raised by: https://dbpedia.slack.com/archives/C0HN7KP9R/p1616432752012600 # Ontology review https://www.dbpedia.org/resources/ontology/ if all info from https://dbpedia.slack.com/archives/C0HN7KP9R/p1616433777019600 is there and upfront. Also, this paper should be linked: http://www.semantic-web-journal.net/content/dbpedia-large-scale-multilingual-knowledge-base-extracted-wikipedia-0, as it explains the ontology #...

priority
type: documentation

## Source ### Release Dumps DBpedia 2016-10 version, Chinese dump: https://downloads.dbpedia.org/2016-10/core-i18n/zh/ ## Error Description Empty `mappingbased_literals_zh.ttl` and no `mappingbased_objects_zh.ttl` ## Error specification - Affected extraction artifacts: - https://downloads.dbpedia.org/2016-10/core-i18n/zh/mappingbased_literals_zh.ttl.bz2 - https://downloads.dbpedia.org/2016-10/core-i18n/zh/mappingbased_objects_zh.ttl.bz2...

type: data
status: fix-required
priority
status: minidump-test-required

> Hamid Ghofrani > For Elvis_Presley, the DBpedia types are just > http://dbpedia.org/ontology/Agent > http://dbpedia.org/ontology/MilitaryPerson > http://dbpedia.org/ontology/Person Wikipedia has this: https://en.wikipedia.org/w/index.php?title=Elvis_Presley&action=edit ``` {{Infobox person | occupation = Singer, actor...

type: data
status: fix-required
status: minidump-test-required

# Where did the problem occur (e.g. dbpedia.org/sparql, lookup, spotlight)? > Please give the full URL, if possible http://dief.tools.dbpedia.org/server/extraction/en/extract?title=United+States&revid=&format=turtle-triples&extractors=custom # Problem description > Please state the nature of your technical...

type: hosting
status: fix-provided
priority
status: accepted
status: test-method-required
status: verification-discussion-needed

A minor problem with text extraction: the space after a link is eaten up. E.g., https://bg.wikipedia.org/w/index.php?title=Джон_Кенеди&action=edit includes: ``` | description = [[Ich bin ein Berliner|Речта]] от Ратхаус Шьонеберг на Джон Кенеди,...

GSoC Warmup task
type: data
status: fix-required
status: minidump-test-required

most of the things seem to be newly introduced ``` grep -R spark * main/scala/org/dbpedia/databus/mod/EvalMod.scala:import org.apache.spark.sql.{SQLContext, SparkSession} main/scala/org/dbpedia/databus/mod/EvalMod.scala: val sparkSession = SparkSession.builder() main/scala/org/dbpedia/databus/mod/EvalMod.scala: .config("spark.local.dir", "./.spark") main/scala/org/dbpedia/databus/mod/EvalMod.scala: sparkSession.sparkContext.setLogLevel("WARN") main/scala/org/dbpedia/databus/mod/EvalMod.scala: val sqlContext:...

question
type: sofware-build

Hello, I want to create a new extractor, but I am unable to understand the following: 1: I want to create a new output dataset file, just creating a new dataset...

type: data
status: triage-discussion-needed

For now, this is what we support for conditional mappings: [1]. We would like to extend the syntax to allow more complex conditions (like &&, ||, !). A discussion on...

GSoC Warmup task
type: data
status: fix-required
status: minidump-test-required