data-extraction topic

List data-extraction repositories

sayn

117 stars, 14 forks

Data processing and modelling framework for automating tasks (incl. Python & SQL transformations).

flash

90 stars, 6 forks

Go keyword extraction and replacement data structure that uses tries instead of regexes.

cyac

94 stars, 15 forks

High-performance trie and Aho-Corasick automaton (AC automaton) keyword match and replace tool for Python.
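
To illustrate the idea behind trie-based keyword match and replace, here is a minimal sketch in plain Python. It is not cyac's API: the `KeywordTrie` class and its `add` and `replace` methods are hypothetical names chosen for this example. It assumes longest-match-wins semantics and does no word-boundary checking.

```python
# Illustrative sketch only; KeywordTrie is a hypothetical helper, not part of cyac.
_END = object()  # sentinel key marking "a keyword ends at this node"


class KeywordTrie:
    """Character trie that maps keywords to replacement strings."""

    def __init__(self):
        self.root = {}

    def add(self, keyword, replacement):
        """Insert a keyword, one character per trie level."""
        node = self.root
        for ch in keyword:
            node = node.setdefault(ch, {})
        node[_END] = replacement

    def replace(self, text):
        """Replace every keyword occurrence, preferring the longest match."""
        out = []
        i, n = 0, len(text)
        while i < n:
            node = self.root
            j = i
            match_end, match_value = -1, None
            # Walk the trie as far as the text allows, remembering the
            # last position at which a complete keyword was seen.
            while j < n and text[j] in node:
                node = node[text[j]]
                j += 1
                if _END in node:
                    match_end, match_value = j, node[_END]
            if match_end != -1:
                out.append(match_value)
                i = match_end
            else:
                out.append(text[i])
                i += 1
        return "".join(out)


trie = KeywordTrie()
trie.add("New York", "NYC")
trie.add("New", "Old")
print(trie.replace("New York and New Delhi"))  # prints "NYC and Old Delhi"
```

This sketch restarts the trie walk at every text position, so its cost grows with the text length times the longest keyword length. An Aho-Corasick automaton adds failure links to the same trie so that the text is scanned in a single pass without restarts.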

ScrapeMate

96 stars, 13 forks

Scraping assistant tool for editing and maintaining CSS/XPath selectors across webpages.

format_parser

62 stars, 18 forks

File metadata parsing, done cheap.

newspaper3_usage_overview

132 stars, 17 forks

This repository provides usage examples for the Python module Newspaper3k.

PlotDigitizer

112 stars, 21 forks

A Python utility to digitize plots.