
Use YAML examples in documentation by default, not JSON

Open simonw opened this issue 4 years ago • 10 comments

YAML configuration is much better for multi-line strings, and I'm increasingly adding configuration options to Datasette that benefit from that - fragments of HTML in description_html or SQL queries used to configure things like https://github.com/simonw/datasette-atom for example.

Rather than confusing things by showing both in the documentation, I should switch all of the default examples to use YAML instead.

simonw avatar Dec 18 '20 22:12 simonw

Could there be a little widget that offers conversion from one to the other?

mroswell avatar Mar 23 '21 05:03 mroswell

That's a good idea. I could do that with JavaScript - loading YAML and converting it to JSON in JavaScript shouldn't be hard, and it's better than JSON-to-YAML because there's only one correct JSON representation of a YAML file whereas you can represent a JSON document in YAML in a bunch of different ways.

simonw avatar Mar 23 '21 16:03 simonw

... actually I think I would do that conversion in Python. The client-side YAML parsers all look a little bit heavy to me in terms of additional page weight.

simonw avatar Mar 23 '21 16:03 simonw

https://cdnjs.cloudflare.com/ajax/libs/js-yaml/4.0.0/js-yaml.min.js is only 12.5KB zipped, 38KB total - so that's not a bad option.

https://github.com/nodeca/js-yaml

simonw avatar Mar 23 '21 16:03 simonw

https://docs.datasette.io/en/stable/metadata.html has this example:

title: Demonstrating Metadata from YAML
description_html: |-
  <p>This description includes a long HTML string</p>
  <ul>
    <li>YAML is better for embedding HTML strings than JSON!</li>
  </ul>
license: ODbL
license_url: https://opendatacommons.org/licenses/odbl/
databases:
  fixtures:
    tables:
      no_primary_key:
        hidden: true
    queries:
      neighborhood_search:
        sql: |-
          select neighborhood, facet_cities.name, state
          from facetable join facet_cities on facetable.city_id = facet_cities.id
          where neighborhood like '%' || :text || '%' order by neighborhood;
        title: Search neighborhoods
        description_html: |-
          <p>This demonstrates <em>basic</em> LIKE search

I ran this in the browser dev tools:

var s = document.createElement('script')
s.src = 'https://cdnjs.cloudflare.com/ajax/libs/js-yaml/4.0.0/js-yaml.min.js'
document.head.appendChild(s)
var yamlExample = document.querySelector('.highlight-yaml').textContent;
console.log(JSON.stringify(window.jsyaml.load(yamlExample), null, 4))

And got:

{
    "title": "Demonstrating Metadata from YAML",
    "description_html": "<p>This description includes a long HTML string</p>\n<ul>\n  <li>YAML is better for embedding HTML strings than JSON!</li>\n</ul>",
    "license": "ODbL",
    "license_url": "https://opendatacommons.org/licenses/odbl/",
    "databases": {
        "fixtures": {
            "tables": {
                "no_primary_key": {
                    "hidden": true
                }
            },
            "queries": {
                "neighborhood_search": {
                    "sql": "select neighborhood, facet_cities.name, state\nfrom facetable join facet_cities on facetable.city_id = facet_cities.id\nwhere neighborhood like '%' || :text || '%' order by neighborhood;",
                    "title": "Search neighborhoods",
                    "description_html": "<p>This demonstrates <em>basic</em> LIKE search"
                }
            }
        }
    }
}

simonw avatar Mar 23 '21 16:03 simonw

One downside of doing this conversion in JavaScript: it's much harder to get the same JSON syntax highlighting as that provided by Sphinx:

[Screenshot of the Metadata page in the Datasette documentation]

simonw avatar Mar 23 '21 16:03 simonw

I used this code to get that:

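// Note: `div` is not defined in this snippet; it is assumed to refer to the
// container element holding the code block that gets replaced.
var jsonVersion = JSON.stringify(window.jsyaml.load(document.querySelector('.highlight-yaml').textContent), null, 4);
div.querySelector('.highlight pre').innerText = jsonVersion;
div.querySelector('.highlight pre').style.whiteSpace = 'pre-wrap';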

simonw avatar Mar 23 '21 16:03 simonw

Beginnings of a UI element for switching between them:

<div style="border: 1px solid rgb(225, 228, 229);
  background-color: rgb(238, 255, 204);
  padding: 0.3em;
  position: relative;
  top: 3px;
  font-family: courier;">
<a href="#" style="display: inline-block; padding-left: 0px; padding-right: 2em;">JSON</a>
<a href="#" style="display: inline-block;">YAML</a>
</div>

[Screenshot of the Metadata page in the Datasette documentation]

That <pre> has a padding of 12px, so using 12px padding on the tab links should get them to line up better.

simonw avatar Mar 23 '21 17:03 simonw

I could use https://github.com/pradyunsg/sphinx-inline-tabs for this - recommended by https://pradyunsg.me/furo/recommendations/

simonw avatar May 20 '22 19:05 simonw

Undocumented Sphinx feature: you can add extra classes to a code example like this:

.. code-block:: json
   :class: metadata-json

    {
        "databases": {
            "russian-ads": {
                "tables": {
                    "display_ads": {
                        "fts_table": "ads_fts",
                        "fts_pk": "id",
                        "searchmode": "raw"
                    }
                }
            }
        }
    }

https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-code-block doesn't mention this.

Filed an issue about the lack of documentation here:

  • https://github.com/sphinx-doc/sphinx/issues/10461

simonw avatar May 20 '22 19:05 simonw

I was inspired to finally address this after seeing sphinx-inline-tabs at work in https://webcolors.readthedocs.io/en/latest/install.html

simonw avatar Jul 08 '23 16:07 simonw

I'm using cog and this utility function to generate the YAML/JSON tabs:

https://github.com/simonw/datasette/blob/3b336d8071fb5707bd006de1d614f701d20246a3/docs/metadata_doc.py#L1-L13

Example usage:

https://github.com/simonw/datasette/blob/3b336d8071fb5707bd006de1d614f701d20246a3/docs/metadata.rst?plain=1#L17-L53

simonw avatar Jul 08 '23 16:07 simonw

https://docs.datasette.io/en/latest/metadata.html

[Screenshot: inline-tabs]

simonw avatar Jul 08 '23 16:07 simonw

Hit a problem:

Exception occurred:
  File "/opt/hostedtoolcache/Python/3.9.17/x64/lib/python3.9/site-packages/docutils/nodes.py", line 2028, in unknown_visit
    raise NotImplementedError(
NotImplementedError: <class 'docutils.writers.docutils_xml.XMLTranslator'> visiting unknown node type: TabContainer
The full traceback has been saved in /tmp/sphinx-err-tfujyw1h.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!

That's happening here: https://github.com/simonw/datasette/blob/0183e1a72d4d93b1d9a9363f4d47fcc0b5d5849c/.github/workflows/deploy-latest.yml#L42-L48

My https://github.com/simonw/sphinx-to-sqlite tool can't handle the new TabContainer elements introduced by sphinx-inline-tabs.

simonw avatar Jul 08 '23 17:07 simonw

Actually no, the error comes from sphinx-build itself:

% sphinx-build -b xml . _build
Running Sphinx v6.1.3
building [mo]: targets for 0 po files that are out of date
writing output... 
building [xml]: targets for 28 source files that are out of date
updating environment: [new config] 28 added, 0 changed, 0 removed
reading sources... [100%] writing_plugins                                                                                                                          
looking for now-outdated files... none found
pickling environment... done
checking consistency... done
preparing documents... done
writing output... [  3%] authentication                                                                                                                            
Exception occurred:
  File "/Users/simon/.local/share/virtualenvs/datasette-AWNrQs95/lib/python3.10/site-packages/docutils/nodes.py", line 2028, in unknown_visit
    raise NotImplementedError(
NotImplementedError: <class 'docutils.writers.docutils_xml.XMLTranslator'> visiting unknown node type: TabContainer
The full traceback has been saved in /var/folders/x6/31xf1vxj0nn9mxqq8z0mmcfw0000gn/T/sphinx-err-1wkxmkji.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!

simonw avatar Jul 08 '23 18:07 simonw

Relevant code: https://github.com/docutils/docutils/blob/3b53ded52bc439d8068b6ecb20ea0a761247e479/docutils/docutils/nodes.py#L2021-L2031

    def unknown_visit(self, node):
        """
        Called when entering unknown `Node` types.

        Raise an exception unless overridden.
        """
        if (self.document.settings.strict_visitor
            or node.__class__.__name__ not in self.optional):
            raise NotImplementedError(
                '%s visiting unknown node type: %s'
                % (self.__class__, node.__class__.__name__))

simonw avatar Jul 08 '23 18:07 simonw

Running with -P opens a debugger when it hits the error:

sphinx-build -P -b xml . _build
(Pdb) list
2023 	
2024 	        Raise an exception unless overridden.
2025 	        """
2026 	        if (self.document.settings.strict_visitor
2027 	            or node.__class__.__name__ not in self.optional):
2028 ->	            raise NotImplementedError(
2029 	                '%s visiting unknown node type: %s'
2030 	                % (self.__class__, node.__class__.__name__))
2031 	
2032 	    def unknown_departure(self, node):
2033 	        """
(Pdb) self.optional
('meta',)
(Pdb) node.__class__.__name__
'TabContainer'
(Pdb) self.document.settings.strict_visitor
(Pdb) type(self.document.settings.strict_visitor)
<class 'NoneType'>

So if I can get TabContainer into that self.optional list I'll have fixed this problem.
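
The condition can be reproduced without docutils. This is a minimal stand-in sketch, not docutils code: `SketchVisitor`, `TolerantVisitor` and `visit_ok` are hypothetical names, and the `TabContainer` class here is an empty placeholder for the node added by sphinx-inline-tabs. It only mirrors the `if` test from `unknown_visit` to show why adding the node name to `optional` would stop the exception.

```python
class TabContainer:
    """Empty placeholder for the node class added by sphinx-inline-tabs."""


class SketchVisitor:
    """Hypothetical visitor that mirrors the check in docutils' unknown_visit."""

    optional = ("meta",)    # value printed in the debugger session
    strict_visitor = None   # value printed in the debugger session

    def unknown_visit(self, node):
        name = node.__class__.__name__
        # Same condition as the docutils source quoted above.
        if self.strict_visitor or name not in self.optional:
            raise NotImplementedError(
                "%s visiting unknown node type: %s" % (self.__class__, name)
            )


class TolerantVisitor(SketchVisitor):
    # With the node name listed as optional, the condition is false.
    optional = SketchVisitor.optional + ("TabContainer",)


def visit_ok(visitor):
    """Return True if visiting a TabContainer does not raise."""
    try:
        visitor.unknown_visit(TabContainer())
    except NotImplementedError:
        return False
    return True
```

Here `visit_ok(SketchVisitor())` is false and `visit_ok(TolerantVisitor())` is true. How to actually change `optional` on the XML translator that Sphinx uses is not covered by this sketch.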

simonw avatar Jul 08 '23 18:07 simonw

I figured out a workaround:

extensions = [
    "sphinx.ext.extlinks",
    "sphinx.ext.autodoc",
    "sphinx_copybutton",
]
if not os.environ.get("DISABLE_SPHINX_INLINE_TABS"):
    extensions += ["sphinx_inline_tabs"]

That way I can run sphinx-build -b xml . _build successfully if I set that environment variable.
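
The same toggle can be written as a small function so its behaviour is easy to check. This is only a sketch: `build_extensions` and `BASE_EXTENSIONS` are hypothetical names, not part of the Datasette `conf.py`.

```python
import os

BASE_EXTENSIONS = [
    "sphinx.ext.extlinks",
    "sphinx.ext.autodoc",
    "sphinx_copybutton",
]


def build_extensions(environ=None):
    """Return the extension list, leaving out sphinx_inline_tabs when disabled."""
    environ = os.environ if environ is None else environ
    extensions = list(BASE_EXTENSIONS)
    # Any non-empty value disables the extension, including "0".
    # An unset variable or an empty string leaves it enabled.
    if not environ.get("DISABLE_SPHINX_INLINE_TABS"):
        extensions.append("sphinx_inline_tabs")
    return extensions
```

Because the check is a plain truthiness test, `DISABLE_SPHINX_INLINE_TABS=0` still disables the extension, which is worth keeping in mind when setting it in a CI workflow.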

I get some noisy warnings, but it runs OK. And the resulting docs.db file has rows like this, which I think are fine:

[Screenshot of rows in docs.db]

simonw avatar Jul 08 '23 18:07 simonw

This one was tricky:

[Screenshot]

I wanted complete control over the YAML example here, so I could ensure it used multi-line strings correctly.

I ended up changing my cog helper function to this:

import json
import textwrap
from yaml import safe_dump
from ruamel.yaml import round_trip_load


def metadata_example(cog, data=None, yaml=None):
    assert data or yaml, "Must provide data= or yaml="
    assert not (data and yaml), "Cannot use data= and yaml="
    output_yaml = None
    if yaml:
        # dedent it first
        yaml = textwrap.dedent(yaml).strip()
        # round_trip_load to preserve key order:
        data = round_trip_load(yaml)
        output_yaml = yaml
    else:
        output_yaml = safe_dump(data, sort_keys=False)
    cog.out("\n.. tab:: YAML\n\n")
    cog.out("    .. code-block:: yaml\n\n")
    cog.out(textwrap.indent(output_yaml, "        "))
    cog.out("\n\n.. tab:: JSON\n\n")
    cog.out("    .. code-block:: json\n\n")
    cog.out(textwrap.indent(json.dumps(data, indent=2), "        "))
    cog.out("\n")

This allows me to call it with YAML in some places:

.. [[[cog
    metadata_example(cog, yaml="""
    databases:
      fixtures:
        queries:
          neighborhood_search:
            fragment: fragment-goes-here
            hide_sql: true
            sql: |-
              select neighborhood, facet_cities.name, state
              from facetable join facet_cities on facetable.city_id = facet_cities.id
              where neighborhood like '%' || :text || '%' order by neighborhood;
    """)
.. ]]]

I had to introduce https://pypi.org/project/ruamel.yaml/ as a dependency here in order to load YAML from disk while maintaining key order.

I'm still using safe_dump(data, sort_keys=False) from PyYAML as I couldn't get the result I wanted for outputting YAML from an input of JSON using ruamel.yaml.
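
To inspect what the helper writes without running cog, the JSON half can be exercised on its own using only the standard library. This is a sketch under assumptions: `RecordingCog` is a hypothetical stand-in that only provides an `out()` method, and `json_tab` repeats just the JSON-tab calls from the helper above rather than the full function.

```python
import json
import textwrap


class RecordingCog:
    """Hypothetical stand-in for cog: collects everything passed to out()."""

    def __init__(self):
        self.chunks = []

    def out(self, text):
        self.chunks.append(text)

    def text(self):
        return "".join(self.chunks)


def json_tab(cog, data):
    """Emit only the JSON tab, using the same calls as the helper above."""
    cog.out("\n\n.. tab:: JSON\n\n")
    cog.out("    .. code-block:: json\n\n")
    # Eight spaces: four for the tab body, four for the code-block body.
    cog.out(textwrap.indent(json.dumps(data, indent=2), "        "))
    cog.out("\n")


recorder = RecordingCog()
json_tab(recorder, {"title": "Example", "databases": {"fixtures": {}}})
rendered = recorder.text()
```

`json.dumps` keeps the dictionary's insertion order unless `sort_keys=True` is passed, so the JSON tab lists keys in the same order as the loaded data.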

simonw avatar Jul 08 '23 18:07 simonw

ERROR: Could not find a version that satisfies the requirement Sphinx==6.1.3; extra == "docs" (from datasette[docs,test]) (from versions: 0.1.61611, 0.1.61798, 0.1.61843, 0.1.61945, 0.1.61950, 0.2, 0.3, 0.4, 0.4.1, 0.4.2, 0.4.3, 0.5, 0.5.1, 0.5.2b1, 0.5.2, 0.6b1, 0.6, 0.6.1, 0.6.2, 0.6.3, 0.6.4, 0.6.5, 0.6.6, 0.6.7, 1.0b1, 1.0b2, 1.0, 1.0.1, 1.0.2, 1.0.3, 1.0.4, 1.0.5, 1.0.6, 1.0.7, 1.0.8, 1.1, 1.1.1, 1.1.2, 1.1.3, 1.2b1, 1.2b2, 1.2b3, 1.2, 1.2.1, 1.2.2, 1.2.3, 1.3b1, 1.3b2, 1.3b3, 1.3, 1.3.1, 1.3.2, 1.3.3, 1.3.4, 1.3.5, 1.3.6, 1.4a1, 1.4b1, 1.4, 1.4.1, 1.4.2, 1.4.3, 1.4.4, 1.4.5, 1.4.6, 1.4.7, 1.4.8, 1.4.9, 1.5a1, 1.5a2, 1.5b1, 1.5, 1.5.1, 1.5.2, 1.5.3, 1.5.4, 1.5.5, 1.5.6, 1.6b1, 1.6b2, 1.6b3, 1.6.1, 1.6.2, 1.6.3, 1.6.4, 1.6.5, 1.6.6, 1.6.7, 1.7.0b1, 1.7.0b2, 1.7.0, 1.7.1, 1.7.2, 1.7.3, 1.7.4, 1.7.5, 1.7.6, 1.7.7, 1.7.8, 1.7.9, 1.8.0b1, 1.8.0, 1.8.1, 1.8.2, 1.8.3, 1.8.4, 1.8.5, 1.8.6, 2.0.0b1, 2.0.0b2, 2.0.0, 2.0.1, 2.1.0, 2.1.1, 2.1.2, 2.2.0, 2.2.1, 2.2.2, 2.3.0, 2.3.1, 2.4.0, 2.4.1, 2.4.2, 2.4.3, 2.4.4, 2.4.5, 3.0.0b1, 3.0.0, 3.0.1, 3.0.2, 3.0.3, 3.0.4, 3.1.0, 3.1.1, 3.1.2, 3.2.0, 3.2.1, 3.3.0, 3.3.1, 3.4.0, 3.4.1, 3.4.2, 3.4.3, 3.5.0, 3.5.1, 3.5.2, 3.5.3, 3.5.4, 4.0.0b1, 4.0.0b2, 4.0.0, 4.0.1, 4.0.2, 4.0.3, 4.1.0, 4.1.1, 4.1.2, 4.2.0, 4.3.0, 4.3.1, 4.3.2, 4.4.0, 4.5.0, 5.0.0b1, 5.0.0, 5.0.1, 5.0.2, 5.1.0, 5.1.1, 5.2.0, 5.2.0.post0, 5.2.1, 5.2.2, 5.2.3, 5.3.0)
ERROR: No matching distribution found for Sphinx==6.1.3; extra == "docs"

Sphinx 6.1.3 isn't available for Python 3.7 (the newest version pip can find there is 5.3.0), so I'm going to drop Python 3.7.

simonw avatar Jul 08 '23 18:07 simonw

Some examples:

  • https://docs.datasette.io/en/latest/sql_queries.html#canned-queries
  • https://docs.datasette.io/en/latest/sql_queries.html#canned-query-parameters
  • https://docs.datasette.io/en/latest/authentication.html#access-to-an-instance
  • https://docs.datasette.io/en/latest/facets.html#facets-in-metadata
  • https://docs.datasette.io/en/latest/full_text_search.html#configuring-full-text-search-for-a-table-or-view
  • https://docs.datasette.io/en/latest/metadata.html
  • https://docs.datasette.io/en/latest/custom_templates.html#custom-css-and-javascript
  • https://docs.datasette.io/en/latest/plugins.html#plugin-configuration

I need to fix this section: https://docs.datasette.io/en/latest/writing_plugins.html#writing-plugins-that-accept-configuration

simonw avatar Jul 08 '23 20:07 simonw

https://docs.datasette.io/en/latest/writing_plugins.html#writing-plugins-that-accept-configuration is fixed now.

simonw avatar Jul 08 '23 20:07 simonw