extraction-framework

The software used to extract structured data from Wikipedia

Results 150 extraction-framework issues

This PR (auto) updates all Wikipedia settings from the Wikipedia API and adds newly added Wikipedia languages. ## Summary by CodeRabbit - New Features - Broadened disambiguation detection and redirect...

This is a temporary pull request to check how well older commits from the dev branch can be merged into the current master. ## Summary by CodeRabbit - New...

This PR refines Amharic mappings and updates local statistics and ignore list as part of GSoC 2025 contributions. ## Summary by CodeRabbit - Chores - Expanded the Amharic statistics ignore...

Changes required for Hindi Chapter.

New datasets for InfoboxReferencesExtractor

New version of the InfoboxReferencesExtractor. Also added integration with CitationExtractor.

New datasets for InfoboxReferencesExtractor

Haven't tested this on sample data yet, so it mustn't be merged. @chile12 could you check this?

It seems that many places in South Africa store incorrect coordinates. Take for instance (but it seems to apply to most cities in South Africa): http://dbpedia.org/page/Port_Elizabeth http://dbpedia.org/page/Johannesburg http://dbpedia.org/page/Cape_Town http://dbpedia.org/page/Centurion,_Gauteng On...

GSoC Warmup task
feature-fix-required-by-community
type: data
status: fix-required
status: minidump-test-required