Proposal: Create a database backend with an associated API

Open jeevers opened this issue 6 years ago • 8 comments

It could be useful to have a database backend so that data can be more easily organized and queried. I think SQLite would be a good fit (at least at first) due to its ease of setup and management via the sqlite3 module in the standard library. Eventually we can add support for other databases.
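As a rough illustration of what "more easily queried" could look like with the standard-library `sqlite3` module, here is a minimal sketch. The `packages` table, its columns, and the `layers_with_package` helper are hypothetical names chosen for the example, not an existing Tern schema.

```python
import sqlite3

# Hypothetical schema: the table and column names are assumptions made
# for this sketch, not something Tern defines.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE packages ("
    "layer_id TEXT, name TEXT, version TEXT, "
    "PRIMARY KEY (layer_id, name))"
)
rows = [
    ("layer-a", "bash", "5.0"),
    ("layer-a", "openssl", "1.1.1"),
    ("layer-b", "openssl", "1.1.1"),
]
conn.executemany("INSERT INTO packages VALUES (?, ?, ?)", rows)
conn.commit()


def layers_with_package(conn, name):
    """Return the ids of all layers that contain the named package."""
    cur = conn.execute(
        "SELECT layer_id FROM packages WHERE name = ? ORDER BY layer_id",
        (name,),
    )
    return [row[0] for row in cur.fetchall()]


print(layers_with_package(conn, "openssl"))
```

A lookup like this would require loading and walking the entire cache if the data lived in a flat file.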

jeevers avatar May 15 '18 20:05 jeevers

@PrajwalM2212 recommended sqlite as well: I think we can just choose sqlite3 because 1. It is faster. 2. It is good for applications where the code that executes SQL statements and the application reside on the same machine. 3. It supports huge amounts of data, up to 140 TB, with good performance. 4. It is provided as part of the Python standard library. https://www.sqlite.org/whentouse.html

nishakm avatar Feb 26 '20 21:02 nishakm

The main requirement is that the storage be self-contained, right? Is that why Redis is not an option? @nishakm

zoek1 avatar Mar 25 '20 11:03 zoek1

@zoek1 That was one of the reasons why I suggested sqlite. Since we are only using the cache for analysis purposes (our internal use), sqlite gives the best value.

PrajwalM2212 avatar Mar 26 '20 09:03 PrajwalM2212

At this time, my main concern is to move away from storing data in a YAML file and into something that is queryable. The discussion I would really like to have is whether we should be using a key-value store (like Redis) or a relational database (like sqlite). One thing about choosing a relational database is that you will need to put time into designing the database. Once done, it is difficult to undo. Key-value stores are easier to change, but suffer from the same problems as the flat YAML file which is that as more data gets added, it becomes less queryable. I am personally leaning towards implementing this in sqlite because we already have a data model and making an API for queries means the database can be switched with something else.
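To make the "API for queries means the database can be switched" point concrete, here is a minimal sketch. The `CacheBackend` interface, its method names, and both implementations are hypothetical and only illustrate the idea; they are not Tern's actual design.

```python
import json
import sqlite3
from abc import ABC, abstractmethod


class CacheBackend(ABC):
    """Hypothetical query API; callers never see the storage engine."""

    @abstractmethod
    def put_layer(self, layer_id, packages):
        """Store the package list for a layer, replacing any old entry."""

    @abstractmethod
    def get_layer(self, layer_id):
        """Return the stored package list, or None if the layer is unknown."""


class DictBackend(CacheBackend):
    """Key-value style backend kept in memory."""

    def __init__(self):
        self._data = {}

    def put_layer(self, layer_id, packages):
        self._data[layer_id] = list(packages)

    def get_layer(self, layer_id):
        return self._data.get(layer_id)


class SqliteBackend(CacheBackend):
    """Backend built on the standard-library sqlite3 module."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS layers ("
            "layer_id TEXT PRIMARY KEY, packages TEXT NOT NULL)"
        )

    def put_layer(self, layer_id, packages):
        # The connection context manager commits on success.
        with self._conn:
            self._conn.execute(
                "INSERT OR REPLACE INTO layers VALUES (?, ?)",
                (layer_id, json.dumps(list(packages))),
            )

    def get_layer(self, layer_id):
        row = self._conn.execute(
            "SELECT packages FROM layers WHERE layer_id = ?", (layer_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None
```

Code written against `CacheBackend` works unchanged with either class, so a later move to another database would only require a new subclass.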

nishakm avatar Mar 26 '20 14:03 nishakm

My research shows that using a json file as a backend greatly improves performance:

yaml backend: 76 seconds
json backend: 0.47 seconds

We would still like a database backend so folks can set up a centralized, queryable repository, but for now, switching the caching format from YAML to JSON is an easy improvement.
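A JSON-backed cache needs only the standard library. The sketch below is illustrative: `save_cache`, `load_cache`, the file name, and the shape of the cache dictionary are assumptions for the example, not Tern's actual code.

```python
import json
import os
import tempfile


def save_cache(cache, path):
    """Write the whole cache dictionary to a JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(cache, f)


def load_cache(path):
    """Read the cache back; a missing file yields an empty cache."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Round trip through a temporary directory (hypothetical cache layout).
with tempfile.TemporaryDirectory() as tmp:
    cache_path = os.path.join(tmp, "cache.json")
    original = {"layer-a": [{"name": "bash", "version": "5.0"}]}
    save_cache(original, cache_path)
    restored = load_cache(cache_path)
```

The whole file is still read and written in one go, so this only speeds up loading; it does not make the data queryable.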

nishakm avatar Jun 10 '20 23:06 nishakm

  1. Design CRUD API for different items in the database #792
  2. Implement the database #863
  3. Implement the sync mechanism #862

nishakm avatar Jan 19 '21 20:01 nishakm

What's the status of this proposal and can I work on it?

ashok-arora avatar Nov 08 '21 17:11 ashok-arora

I don't know if it is possible, but since we are aiming to store the container image in a database, can't we convert the Docker image to JSON format and then store the JSON data in a Redis database? JSON greatly increases performance, and accessing data through Redis is also faster.

urmilkalaria avatar Feb 16 '22 14:02 urmilkalaria