Dataset Finder (Data Discovery Tool)

Open victordibia opened this issue 2 years ago • 3 comments

What

Data analysis and exploration typically begin with the assumption that the right dataset exists. For many scenarios, this assumption holds (e.g., organizational data already exists as a tidy CSV or JSON file). However, for other use cases, the right dataset may not be at hand and needs to be found.

The high-level goal of this functionality is to provide a set of approaches for finding data, given some query or representation of the user's intent.

How

Supported approaches may include the following:

  • Heuristic strategy: define a workflow for identifying datasets that may be relevant. For example, support fixed providers like
    • data.gov
    • GHO https://www.who.int/data/gho/info/gho-odata-api
    • GitHub, to find CSV or JSON files relevant to queries.
  • Live agent strategy: define some mechanism that leverages web search to identify relevant datasets.

Possibly start off with a base DataFinder class (with a find method), a HeuristicsDataFinder subclass, and an AgentDataFinder subclass.
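To make the proposed class layout concrete, here is a minimal sketch of how it could look. Only the names DataFinder, HeuristicsDataFinder, AgentDataFinder, and find come from the description above; the DatasetResult type, the provider callables, the max_results parameter, and the search-string format are all assumptions for illustration and are not part of LIDA. No real provider (data.gov, GHO, GitHub) is called; a toy in-memory catalog stands in for one.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class DatasetResult:
    """A candidate dataset (hypothetical result type, not part of LIDA)."""
    title: str
    url: str
    source: str


# A provider is any callable that maps a query to candidate datasets,
# e.g. a wrapper around data.gov, the GHO OData API, or GitHub search.
Provider = Callable[[str], Iterable[DatasetResult]]


class DataFinder(ABC):
    """Base class: every strategy exposes the same find() method."""

    @abstractmethod
    def find(self, query: str, max_results: int = 5) -> List[DatasetResult]:
        ...


class HeuristicsDataFinder(DataFinder):
    """Queries a fixed list of providers in order and merges the results."""

    def __init__(self, providers: List[Provider]):
        self.providers = providers

    def find(self, query: str, max_results: int = 5) -> List[DatasetResult]:
        results: List[DatasetResult] = []
        seen = set()
        for provider in self.providers:
            for item in provider(query):
                if item.url not in seen:  # de-duplicate by URL
                    seen.add(item.url)
                    results.append(item)
        return results[:max_results]


class AgentDataFinder(DataFinder):
    """Delegates to a web-search function supplied by the caller."""

    def __init__(self, search: Provider):
        self.search = search

    def find(self, query: str, max_results: int = 5) -> List[DatasetResult]:
        # Assumed query format: bias the search towards downloadable files.
        return list(self.search(f"{query} dataset csv OR json"))[:max_results]


# Toy in-memory provider standing in for a real catalog (illustrative data).
CATALOG = [
    DatasetResult("Global life expectancy", "https://example.org/life.csv", "toy"),
    DatasetResult("City bike trips", "https://example.org/bikes.json", "toy"),
]


def toy_provider(query: str) -> List[DatasetResult]:
    return [d for d in CATALOG if query.lower() in d.title.lower()]


finder = HeuristicsDataFinder([toy_provider])
for hit in finder.find("life"):
    print(hit.title, hit.url)
```

Treating providers as plain callables keeps the heuristic finder independent of any single API, so a data.gov or GHO wrapper could be added later without changing the class itself.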

P.S. If you are interested in working on this, please share your thoughts on a general approach for discussion and comment.

victordibia avatar Aug 21 '23 01:08 victordibia

Is the goal here to allow users to upload their own datasets or to offer a platform for data analysis from a bank of "pre"-provided ready datasets?

0xaaiden avatar Aug 23 '23 07:08 0xaaiden

Thanks, Aiden. I am leaning more towards supporting discovery of data as opposed to hosting data (we can probably assume the user is able to do this already).

I updated the initial description to add more information.

victordibia avatar Aug 26 '23 14:08 victordibia

Cool! This is what we’re doing at wobby.ai. We ingest tons of public data, enrich it with AI, and let you analyze it.

Right now we’re working with journalists, making it easy for them to find data stories in public data.

Would be cool to see this in LIDA. Love this project :)

Check us out at:

https://wobby.ai/

nathantetro avatar Sep 08 '23 19:09 nathantetro