datapusher-plus Roadmap Tracking Issue

Roadmap Tracking Issue - EPIC

Open jqnatividad opened this issue 2 years ago • 0 comments

- by enriching resources, so that right after a file is pushed, DP+ performs many data-wrangling tasks that are typically done manually:
  - much of the metadata is inferred, so the Data Publisher does not have to enter it by hand
  - descriptive statistics are computed, allowing the Data Publisher and the end-user to better understand the resource
  - location information is automatically normalized and geocoded
  - related datasets/resources are automatically inferred
  - auto-tagging
- by taking advantage of PostgreSQL native features
  - also use it as a document database, leveraging JSONB?
  - partitioning/sharding?
- by tapping into the rich PostgreSQL extension ecosystem (in particular PostGIS, Timescale, Citus, CartoDB, Apache AGE and ZomboDB)
- give it "Data Lake"-like capabilities
- enable Datastore API users to issue performant, reliable SQL queries

[ ] #98
[ ] #18
[x] #11
[ ] Auto-tagging
[ ] Automatic spatial extent calculation
[ ] Automatic processing/recognition of whitelisted common column names (e.g. latitude, longitude, status, open date, closed date, etc.)
[x] #53
[x] #47
[x] #27
[x] #9
[ ] Auto partitioning
[ ] #60
[ ] Deferred datapush on initial package creation, to allow per-package Datapusher+ configuration
[ ] #87
[x] #17
[ ] Enabling record-level search
[x] #8
[ ] #13
[ ] #54
[x] #10
[x] #19
[x] #30
[ ] Native PostGIS support
[ ] Native time-series support with Timescale
[ ] #34
[ ] #35
[x] #46
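
To make two of the unchecked items concrete ("Automatic processing/recognition of whitelisted common column names" and "Automatic spatial extent calculation"), here is a minimal sketch of how they might fit together. Everything in it is an assumption for illustration: the whitelist contents, the function names `find_lat_lon_columns` and `spatial_extent`, and the row format (a list of dicts) are hypothetical and are not taken from the datapusher-plus codebase.

```python
# Hypothetical sketch only: these names and whitelists are illustrative
# assumptions, not part of datapusher-plus.
LAT_NAMES = {"latitude", "lat"}
LON_NAMES = {"longitude", "lon", "long", "lng"}


def normalize(name):
    """Lower-case a column header and strip surrounding whitespace."""
    return name.strip().lower()


def find_lat_lon_columns(headers):
    """Return (lat_header, lon_header); either is None if not recognized."""
    lat = next((h for h in headers if normalize(h) in LAT_NAMES), None)
    lon = next((h for h in headers if normalize(h) in LON_NAMES), None)
    return lat, lon


def spatial_extent(rows, lat_col, lon_col):
    """Return (min_lon, min_lat, max_lon, max_lat), or None if no valid rows.

    Rows with missing, non-numeric, or out-of-range coordinates are skipped.
    """
    lats, lons = [], []
    for row in rows:
        try:
            lat = float(row[lat_col])
            lon = float(row[lon_col])
        except (KeyError, TypeError, ValueError):
            continue
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            lats.append(lat)
            lons.append(lon)
    if not lats:
        return None
    return (min(lons), min(lats), max(lons), max(lats))
```

The bounding box uses the common (west, south, east, north) ordering. A real implementation would probably compute the extent inside PostgreSQL/PostGIS rather than in Python, and would need to handle datasets that cross the antimeridian, which this sketch does not.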

Apr 27 '22 04:04 jqnatividad