datasette icon indicating copy to clipboard operation
datasette copied to clipboard

Implement ?_extra and new API design for TableView

Open simonw opened this issue 3 years ago • 12 comments
trafficstars

Part of:

  • #262
  • #1518

simonw avatar Apr 28 '22 22:04 simonw

I'm going to change the default API response to look like this:

{
  "rows": [
    {
      "pk": 1,
      "created": "2019-01-14 08:00:00",
      "planet_int": 1,
      "on_earth": 1,
      "state": "CA",
      "_city_id": 1,
      "_neighborhood": "Mission",
      "tags": "[\"tag1\", \"tag2\"]",
      "complex_array": "[{\"foo\": \"bar\"}]",
      "distinct_some_null": "one",
      "n": "n1"
    },
    {
      "pk": 2,
      "created": "2019-01-14 08:00:00",
      "planet_int": 1,
      "on_earth": 1,
      "state": "CA",
      "_city_id": 1,
      "_neighborhood": "Dogpatch",
      "tags": "[\"tag1\", \"tag3\"]",
      "complex_array": "[]",
      "distinct_some_null": "two",
      "n": "n2"
    }
  ],
  "next": null,
  "next_url": null
}

Basically https://latest.datasette.io/fixtures/facetable.json?_shape=objects but with just the rows, next and next_url fields returned by default.

simonw avatar Apr 28 '22 22:04 simonw

Then I'm going to implement the following ?_extra= options:

  • ?_extra=facet_results - to see facet results
  • ?_extra=suggested_facets - for suggested facets
  • ?_extra=count - for the count of total rows
  • ?_extra=columns - for a list of column names
  • ?_extra=primary_keys - for a list of primary keys
  • ?_extra=query - a {"sql" "select ...", "params": {}} object

I thought about having ?_extra=facet_results returned automatically if the user specifies at least one ?_facet - but that doesn't work for default facets configured in metadata.json - how can the user opt out of those being returned? So I'm going to say you don't see facets at all if you don't include ?_extra=facet_results.

I'm tempted to add ?_extra=_all to return everything, but I can decide if that's a good idea later.

simonw avatar Apr 28 '22 22:04 simonw

This means filtered_table_rows_count is going to become count. I had originally picked that terrible name to avoid confusion between the count of all rows in the table and the count of rows that were filtered.

I'll add ?_extra=table_count for getting back the full table count instead. I think count is clear enough!

simonw avatar Apr 28 '22 22:04 simonw

(I remain keen on the idea of shipping a plugin that restores the old default API shape to people who have written pre-Datasette-1.0 code against it, but I'll tackle that much later. I really like how jQuery has a culture of doing this.)

simonw avatar Apr 28 '22 22:04 simonw

I may be able to implement this mostly in the json_renderer() function: https://github.com/simonw/datasette/blob/94a3171b01fde5c52697aeeff052e3ad4bab5391/datasette/renderer.py#L29-L34

simonw avatar Apr 28 '22 22:04 simonw

I'm not sure what to do about the "truncated": true/false key.

It's not really relevant to table results, since they are paginated whether or not you ask for them to be.

It plays a role in query results, where you might run select * from table and get back 1000 results because Datasette truncates at that point rather than returning everything.

Adding it to every table result and always setting it to "truncated": false feels confusing.

I think I'm going to keep it exclusively in the default representation for the /db?sql=... query endpoint, and not return it at all for tables.

simonw avatar Apr 28 '22 23:04 simonw

OK, the prototype of this is looking really good - it's very pleasant to use.

http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_sort=id&_size=5&_extra=query_ms&_extra=count&_col=body returns this:

{
  "rows": [
    {
      "id": 338854988,
      "body": "    /database-name/table-name?name__contains=simon&sort=id+desc\r\n\r\nNote that if there's a column called \"sort\" you can still do sort__exact=blah\r\n\r\n"
    },
    {
      "id": 346427794,
      "body": "Thanks. There is a way to use pip to grab apsw, which also let's you configure it (flags to build extensions, use an internal sqlite, etc). Don't know how that works as a dependency for another package, though.\n\nOn November 22, 2017 11:38:06 AM EST, Simon Willison <[email protected]> wrote:\n>I have a solution for FTS already, but I'm interested in apsw as a\n>mechanism for allowing custom virtual tables to be written in Python\n>(pysqlite only lets you write custom functions)\n>\n>Not having PyPI support is pretty tough though. I'm planning a\n>plugin/extension system which would be ideal for things like an\n>optional apsw mode, but that's a lot harder if apsw isn't in PyPI.\n>\n>-- \n>You are receiving this because you authored the thread.\n>Reply to this email directly or view it on GitHub:\n>https://github.com/simonw/datasette/issues/144#issuecomment-346405660\n"
    },
    {
      "id": 348252037,
      "body": "WOW!\n\n--\nPaul Ford // (646) 369-7128 // @ftrain\n\nOn Thu, Nov 30, 2017 at 11:47 AM, Simon Willison <[email protected]>\nwrote:\n\n> Remaining work on this now lives in a milestone:\n> https://github.com/simonw/datasette/milestone/6\n>\n> —\n> You are receiving this because you were mentioned.\n> Reply to this email directly, view it on GitHub\n> <https://github.com/simonw/datasette/issues/153#issuecomment-348248406>,\n> or mute the thread\n> <https://github.com/notifications/unsubscribe-auth/AABPKHzaVPKwTOoHouK2aMUnM-mPnPk6ks5s7twzgaJpZM4Qq2zW>\n> .\n>\n"
    },
    {
      "id": 391141391,
      "body": "I'm going to clean this up for consistency tomorrow morning so hold off\nmerging until then please\n\nOn Tue, May 22, 2018 at 6:34 PM, Simon Willison <[email protected]>\nwrote:\n\n> Yeah let's try this without pysqlite3 and see if we still get the correct\n> version.\n>\n> —\n> You are receiving this because you authored the thread.\n> Reply to this email directly, view it on GitHub\n> <https://github.com/simonw/datasette/pull/280#issuecomment-391076458>, or mute\n> the thread\n> <https://github.com/notifications/unsubscribe-auth/AAihfMI-H6CBt-Py0xdBbH2xDK0KsjT2ks5t1EwYgaJpZM4UI_2m>\n> .\n>\n"
    },
    {
      "id": 391355030,
      "body": "No objections;\r\nIt's good to go @simonw\r\n\r\nOn Wed, 23 May 2018, 14:51 Simon Willison, <[email protected]> wrote:\r\n\r\n> @r4vi <https://github.com/r4vi> any objections to me merging this?\r\n>\r\n> —\r\n> You are receiving this because you were mentioned.\r\n> Reply to this email directly, view it on GitHub\r\n> <https://github.com/simonw/datasette/pull/280#issuecomment-391354237>, or mute\r\n> the thread\r\n> <https://github.com/notifications/unsubscribe-auth/AAihfM_2DN5WR2mkO-VK6ozDmkUQ4IMjks5t1WlcgaJpZM4UI_2m>\r\n> .\r\n>\r\n"
    }
  ],
  "next": "391355030,391355030",
  "next_url": "http://127.0.0.1:8001/github_memory/issue_comments.json?_search=simon&_size=5&_extra=query_ms&_extra=count&_col=body&_next=391355030%2C391355030&_sort=id",
  "count": 57,
  "query_ms": 21.780223003588617
}

simonw avatar Apr 28 '22 23:04 simonw

That prototype is a very small amount of code so far:

diff --git a/datasette/renderer.py b/datasette/renderer.py
index 4508949..b600e1b 100644
--- a/datasette/renderer.py
+++ b/datasette/renderer.py
@@ -28,6 +28,10 @@ def convert_specific_columns_to_json(rows, columns, json_cols):
 
 def json_renderer(args, data, view_name):
     """Render a response as JSON"""
+    from pprint import pprint
+
+    pprint(data)
+
     status_code = 200
 
     # Handle the _json= parameter which may modify data["rows"]
@@ -43,6 +47,41 @@ def json_renderer(args, data, view_name):
     if "rows" in data and not value_as_boolean(args.get("_json_infinity", "0")):
         data["rows"] = [remove_infinites(row) for row in data["rows"]]
 
+    # Start building the default JSON here
+    columns = data["columns"]
+    next_url = data.get("next_url")
+    output = {
+        "rows": [dict(zip(columns, row)) for row in data["rows"]],
+        "next": data["next"],
+        "next_url": next_url,
+    }
+
+    extras = set(args.getlist("_extra"))
+
+    extras_map = {
+        # _extra=   :    data[field]
+        "count": "filtered_table_rows_count",
+        "facet_results": "facet_results",
+        "suggested_facets": "suggested_facets",
+        "columns": "columns",
+        "primary_keys": "primary_keys",
+        "query_ms": "query_ms",
+        "query": "query",
+    }
+    for extra_key, data_key in extras_map.items():
+        if extra_key in extras:
+            output[extra_key] = data[data_key]
+
+    body = json.dumps(output, cls=CustomJSONEncoder)
+    content_type = "application/json; charset=utf-8"
+    headers = {}
+    if next_url:
+        headers["link"] = f'<{next_url}>; rel="next"'
+    return Response(
+        body, status=status_code, headers=headers, content_type=content_type
+    )
+
+
     # Deal with the _shape option
     shape = args.get("_shape", "arrays")
     # if there's an error, ignore the shape entirely

simonw avatar Apr 28 '22 23:04 simonw

Deployed a preview of this here: https://latest-1-0-alpha.datasette.io/

Examples:

  • https://latest-1-0-alpha.datasette.io/fixtures/facetable.json
  • https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count

Second example produces:

{
  "rows": [],
  "next": null,
  "next_url": null,
  "count": 15,
  "facet_results": {
    "state": {
      "name": "state",
      "type": "column",
      "hideable": true,
      "toggle_url": "/fixtures/facetable.json?_size=0&_extra=facet_results&_extra=count",
      "results": [
        {
          "value": "CA",
          "label": "CA",
          "count": 10,
          "toggle_url": "https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=CA",
          "selected": false
        },
        {
          "value": "MI",
          "label": "MI",
          "count": 4,
          "toggle_url": "https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MI",
          "selected": false
        },
        {
          "value": "MC",
          "label": "MC",
          "count": 1,
          "toggle_url": "https://latest-1-0-alpha.datasette.io/fixtures/facetable.json?_facet=state&_size=0&_extra=facet_results&_extra=count&state=MC",
          "selected": false
        }
      ],
      "truncated": false
    }
  }
}

simonw avatar Apr 30 '22 16:04 simonw

Related:

  • #1558

Which talks about how there was confusion in this example: https://latest.datasette.io/fixtures/facetable.json?_facet=created&_facet_date=created&_facet=tags&_facet_array=tags&_nosuggest=1&_size=0

Which I fixed in #625 by introducing tags and tags_2 keys, but actually the facet results would be better if they were a list rather than a dictionary.

simonw avatar Apr 30 '22 18:04 simonw

I had to look up what hideable means - it means that you can't hide the current facet because it was defined in metadata, not as a ?_facet= parameter:

https://github.com/simonw/datasette/blob/4e47a2d894b96854348343374c8e97c9d7055cf6/datasette/facets.py#L228

That's a bit of a weird thing to expose in the API. Maybe change that to source so it can be metadata or request? That's very slightly less coupled to how the UI works.

simonw avatar Apr 30 '22 19:04 simonw

but actually the facet results would be better if they were a list rather than a dictionary

I think facet_results in the JSON should match this (used by the HTML) instead:

https://github.com/simonw/datasette/blob/942411ef946e9a34a2094944d3423cddad27efd3/datasette/views/table.py#L737-L741

simonw avatar Apr 30 '22 19:04 simonw