job-archive: store jobspec/R as JSON type in sqlite when available
Recent versions of sqlite have embedded support for JSON via the JSON column type. Using this type allows queries to reference arbitrary keys in the stored JSON, which sounds pretty appealing if we can one day open up access to the job-archive database for more advanced queries.
As a test, here's a Python script which will ingest all the current jobs in an instance and store them in a local test.db using JSON columns for annotations, eventlog, jobspec, and R.
#!/usr/bin/python3
import sqlite3
import json

import flux
from flux.job import JobList

handle = flux.Flux()

# Fetch all jobs with every attribute job-list supports
alljobs = JobList(handle, attrs=flux.job.list.VALID_ATTRS, max_entries=0)

conn = sqlite3.connect("test.db")
c = conn.cursor()
c.execute(
    "CREATE TABLE if not exists jobs("
    " id CHAR(16) PRIMARY KEY,"
    " userid INT,"
    " ranks TEXT,"
    " t_submit REAL,"
    " t_run REAL,"
    " t_cleanup REAL,"
    " t_inactive REAL,"
    " name TEXT,"
    " nnodes INT,"
    " ntasks INT,"
    " waitstatus INT,"
    " success BOOLEAN,"
    " result TEXT,"
    " annotations JSON,"
    " eventlog JSON,"
    " jobspec JSON,"
    " R JSON"
    ")"
)

insert_sql = """
    INSERT INTO
      jobs(
        id,
        userid,
        ranks,
        t_submit,
        t_run,
        t_cleanup,
        t_inactive,
        name,
        nnodes,
        ntasks,
        waitstatus,
        success,
        result,
        annotations,
        eventlog,
        R,
        jobspec)
    VALUES(?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)
"""


def lookup_cb(future, job):
    # Attach jobspec, R, and eventlog from the job-info.lookup response
    # to the job; jobs missing any requested key are skipped at insert time
    try:
        result = future.get()
        job.jobspec = result["jobspec"]
        job.R = result["R"]
        job.eventlog = result["eventlog"]
    except Exception as exc:
        print(f"{job.id}: {exc} (ignoring)")


jobs = []
for job in alljobs.jobs():
    jobs.append(job)
    # R only exists for jobs that were actually allocated resources
    keys = ["jobspec", "eventlog"]
    if job.nnodes:
        keys.append("R")
    handle.rpc(
        "job-info.lookup", {"id": job.id, "keys": keys, "flags": 0}
    ).then(lookup_cb, job)

handle.reactor_run()

for job in jobs:
    if hasattr(job, "R"):
        c.execute(
            insert_sql,
            (
                job.id,
                job.userid,
                job.ranks,
                job.t_submit,
                job.t_run,
                job.t_cleanup,
                job.t_inactive,
                job.name,
                job.nnodes,
                job.ntasks,
                job.waitstatus,
                job.success,
                job.result,
                json.dumps(job.annotations.annotationsDict),
                job.eventlog,
                job.R,
                job.jobspec,
            ),
        )
conn.commit()
conn.close()
Once the database is created you can experiment with various queries. In my case the queries are pretty slow, but I'm not a sql/sqlite expert; perhaps there are ways to speed this kind of thing up. The main benefit, I guess, is that queries are not limited to the fields normally broken out by job-list/job-archive.
sqlite> select count(*) from jobs;
8973
sqlite> select count(*) from jobs where json_extract(jobspec, "$.tasks[0].command[0]") == "flux";
116
sqlite> select count(*) from jobs where json_extract(jobspec, "$.attributes.system.duration") > 0;
3
I'm not exactly sure where to go next with this, but thought I'd share in case it is useful.
This seems pretty interesting and useful.
On the speed, well, this has got to be faster than grabbing everything and filtering on the client end, plus if we could identify common queries based on what actually gets run, then we could add columns/indexes to speed things up.
There is also "generated columns" in the pipeline: https://dgl.cx/2020/06/sqlite-json-support
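For example, against the jobs table above, a virtual generated column over a json_extract path plus an index on it might look something like this sketch (the command column and index name are my own invention; generated columns require sqlite >= 3.31, and ALTER TABLE can only add VIRTUAL, not STORED, ones):

import sqlite3

conn = sqlite3.connect("test.db")
# Materialize a json path as a queryable column...
conn.execute(
    "ALTER TABLE jobs ADD COLUMN command TEXT GENERATED ALWAYS AS"
    " (json_extract(jobspec, '$.tasks[0].command[0]'))"
)
# ...and index it so queries on it no longer re-parse every row's jobspec
conn.execute("CREATE INDEX idx_jobs_command ON jobs(command)")
print(
    conn.execute(
        "SELECT count(*) FROM jobs WHERE command = 'flux'"
    ).fetchone()[0]
)
conn.close()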
I don't know where we want to go with this either, but I do believe we are going to need to start purging the KVS of inactive jobs older than some configured threshold. Presumably this puts a burden on job-archive to provide this data when queried, although I don't really know what that looks like. (Will job-list need to answer queries about jobs that ran 5 days ago?)
> I don't really know what that looks like. (Will job-list need to answer queries about jobs that ran 5 days ago?)
I think this will be pretty common. In fact, one of our open issues is about filtering jobs by node over a week or two to look for patterns and find bad nodes. I'd hate to offer two separate interfaces, one for "recent" jobs vs. "jobs N days old" or whatever, but I'm not sure I have a great solution.
In a way it would be nice (and I think we've discussed this before) if job-list and job-archive were combined. All jobs could be stashed in the sqlite database, perhaps with an in-memory cache of recent jobs. For queries that can't be answered by the cache, the database could be used, which would have access to all jobs. On restart, job-list would only have to process currently active jobs, not read the whole kvs history.
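Purely as an illustration of that cache-plus-database split (the JobStore name and structure here are invented, not an actual flux-core design), the lookup path might resemble:

import sqlite3

class JobStore:
    # Hypothetical combined store: recently inactive jobs in a dict,
    # with sqlite as the backing store for everything older
    def __init__(self, path):
        self.cache = {}  # jobid -> row, for recently inactive jobs
        self.conn = sqlite3.connect(path)

    def lookup(self, jobid):
        if jobid in self.cache:
            return self.cache[jobid]
        cur = self.conn.execute("SELECT * FROM jobs WHERE id = ?", (jobid,))
        return cur.fetchone()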
BTW, do we know roughly how many jobs our largest clusters run per week?
While experimenting with some things for #4273, one idea that came to me would be to add a new "attribute" to the job-list module that returns all job-list data in a single json object. Like hypothetically a "dump" attribute (sketched after the list below). This could be useful in several ways.
- it's easier to do this than requesting like 25+ different attributes from job-list
- we could archive just this one object and re-load it semi-quickly (related to #4273)
- and this is easier than having a sql table with like 25+ columns and inserts with 25+ column names
- flexible for future queryable things in job-list that don't exist today
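As a rough sketch only (the "dump" attribute is hypothetical and does not exist today, so job-list would currently reject this request), usage from Python might look like:

import flux
from flux.job import JobList

handle = flux.Flux()
# Hypothetical: one "dump" attribute instead of naming 25+ attributes
jobs = JobList(handle, attrs=["dump"], max_entries=0)
for job in jobs.jobs():
    print(job.dump)  # hypothetical: a single json object of all job-list data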
> returns all job-list data in a single json object
This would also be useful with some of the Python API and some frontend commands which currently grab all attributes by default. We have to keep the VALID_ATTRS constant in Python in sync with job-list and use it as the default value for attrs. (I had tried to fix this before in job-list but gave up before I finished.)
> This would also be useful with some of the Python API and some frontend commands which currently grab all attributes by default. We have to keep the VALID_ATTRS constant in Python in sync with job-list and use it as the default value for attrs. (I had tried to fix this before in job-list but gave up before I finished.)
Then maybe this is a good idea in general; I'll write up an issue for it. It may be something necessary as a prereq for this issue and #4273.
I'm hardly an expert on the JSON support within sqlite, but it appears the primary value add is the ability to filter / parse the JSON values within sql queries. With job-archive, I imagine the majority of the time we'd prefer to use jansson's API. This is especially true given so much other code within flux-core is jansson specific.
I'm curious if @cmoussa1 sees a value add here? Not sure what kind of queries are currently done with the job-archive.
The key statement from the initial proposal is this:
> if we can one day open up access to the job-archive database for more advanced queries.
i.e. we're hoping the single-consumer design of job-archive can be replaced with something like the job-db experiment. In that case, the flexibility of the JSON type might open up more flexible queries (e.g. see the first comment). Just as a random example, jobs could be queried by the contents of their environment, any attribute or shell option, etc.
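As one concrete illustration against the test.db built above (a sketch: the json path assumes jobspec stores the environment under attributes.system.environment, and HOME is just an arbitrary variable to probe for):

import sqlite3

conn = sqlite3.connect("test.db")
# Count jobs that were submitted with HOME set in their environment
count = conn.execute(
    "SELECT count(*) FROM jobs WHERE"
    " json_extract(jobspec, '$.attributes.system.environment.HOME')"
    " IS NOT NULL"
).fetchone()[0]
print(count)
conn.close()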
> I'm curious if @cmoussa1 sees a value add here? Not sure what kind of queries are currently done with the job-archive.
I could definitely see this being useful for the jobspec and R columns (which, it looks like, already use the JSON type in the example above), as that is how I extract the optional bank attribute from a user's job and the number of nodes used by their job. For extracting the nnodes count, I switched to using the ResourceSet Python class a month or two ago, and that has been really useful and easy to use. Not sure if that behavior will need to change as a result of changing the column type in this DB, but I figured I would throw it out there.
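For reference, deriving the node count from a stored R that way might look like this sketch (reading R back out of the test.db built above):

import sqlite3
from flux.resource import ResourceSet

conn = sqlite3.connect("test.db")
for (R,) in conn.execute("SELECT R FROM jobs LIMIT 5"):
    rset = ResourceSet(R)
    # nnodes and nodelist come from R itself, no json_extract required
    print(rset.nnodes, rset.nodelist)
conn.close()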
FWIW, from the flux-accounting side, I think it is fine either way. I agree that using the JSON type might allow more flexible queries, but as long as I can continue to access the basic information (i.e., the attributes section of jobspec and the number of nodes), I think either approach sounds fine. :)
It appears that from sqlite 3.9 -> 3.37.2, json support is optionally compiled in, and in 3.38 it's optionally compiled out. Further complicating this, it's not clear if there's an easy way to determine whether the version of sqlite you are running supports json unless you just try it.
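About the best available option seems to be a runtime probe, e.g. a sketch like this, which attempts a trivial json() call and treats failure as "not supported":

import sqlite3

def sqlite_has_json():
    # Probe for JSON support by attempting a trivial json() call;
    # without JSON support this raises "no such function: json"
    try:
        sqlite3.connect(":memory:").execute("SELECT json('{}')")
        return True
    except sqlite3.OperationalError:
        return False

print(sqlite_has_json())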
The 3.26 version on rhel8 supports it. But we may want any sqlite-json work we do to be backwards compatible with non-json-supporting sqlites for some time (e.g. in job-archive we might wish to avoid hard coding use of json in sql queries).