Use YAML files to store content

Open hdodov opened this issue 5 years ago • 8 comments

In Kirby, we are very familiar with the YAML format thanks to blueprints. It's a nice, human-readable format. Couldn't it be used to store the actual content as well?

Benefits:

  1. Widely supported format, so it can be consumed by other software more easily
  2. Because it's a .yml file, you'd have syntax highlighting when editing content directly
  3. YAML supports comments, and Kirby content would too. I can't think of many use-cases for this, but it might be useful to be able to leave comments for yourself in the content?
  4. It can still be intelligently parsed

I think that last part is the most important. My guess is that the current TXT format is used because it could be quickly split by the four ---- dashes to get the values of each field, i.e. it's fast to parse.
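To make that guess concrete, here is a minimal Python sketch of separator-based splitting. It is an illustration under stated assumptions, not Kirby's actual parser: the helper name `split_kirby_fields`, the separator regex, and the lower-casing of keys are all assumptions made for this example.

```python
import re


def split_kirby_fields(text: str) -> dict:
    """Split a Kirby-style content file into raw field strings (hypothetical helper)."""
    fields = {}
    # Assumption: fields are separated by a line of four or more dashes.
    for chunk in re.split(r"\n-{4,}[ \t]*\n", text.replace("\r\n", "\n")):
        # Everything before the first colon is the field name.
        key, sep, value = chunk.strip().partition(":")
        if sep:  # ignore chunks that have no "Key:" prefix
            fields[key.strip().lower()] = value.strip()
    return fields


sample = "Title: My Site\n\n----\n\nLogo:\n\n- logo.svg\n\n----\n\nText: Hello World\n"
print(split_kirby_fields(sample))
```

Note that the value of `logo` stays a raw string (`- logo.svg`); nothing inside a field is decoded until something asks for it.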

Example. Suppose Kirby uses YAML files and this is a content file:

title: My Site
date: 25.12.2020
struct:
  - hello: world
    number: 42
  - hello: friend
    number: 64

When parsed, it would decode all fields, including the structure field, which is suboptimal and not how Kirby currently works. What if that structure is huge and never actually needed? It was parsed for nothing.

However, as I said, it can be intelligently parsed because YAML is predictable due to its dependence on whitespace. Fields are stored at the top level of the data structure, meaning that the line they begin on does not start with whitespace. Take this example:

title:
  foo: bar
data:
  bar: baz
struct:
  - hello: world
  - hello: friend

The only lines beginning without whitespace are:

title:
data:
struct:

...meaning Kirby could avoid parsing the whole file (and structures inside of it). If a line begins with no whitespace and matches [a-z]+:, that's a field. Everything after that is part of that field's content until you reach another line without whitespace that matches [a-z]+: - that's a new field.

When parsed by this methodology, the parsed result would be:

[
  'title' => "\n  foo: bar",
  'data' => "\n  bar: baz",
  'struct' => "\n  - hello: world\n  - hello: friend"
]

If I need to further parse the YAML in struct, for example, I'd call $page->struct()->toStructure(), as I would now.
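The splitting rule described above can be sketched in a few lines of Python. This is only a rough illustration of the idea, assuming the literal `[a-z]+:` rule from this post; the function name `split_top_level_fields` is made up for the example and nothing here is existing Kirby code.

```python
import re

# The [a-z]+: rule from the post: a field starts at column 0.
FIELD_RE = re.compile(r"^([a-z]+):(.*)$")


def split_top_level_fields(text: str) -> dict:
    """Return each top-level field's raw, still-unparsed YAML as a string."""
    fields = {}
    current = None
    for line in text.splitlines():
        match = FIELD_RE.match(line)
        if match:
            # A new field begins; keep whatever follows the colon on this line.
            current = match.group(1)
            fields[current] = [match.group(2)]
        elif current is not None:
            # Any other line belongs to the field that was opened last.
            fields[current].append(line)
    return {key: "\n".join(lines).rstrip() for key, lines in fields.items()}


content = (
    "title:\n"
    "  foo: bar\n"
    "data:\n"
    "  bar: baz\n"
    "struct:\n"
    "  - hello: world\n"
    "  - hello: friend\n"
)
print(split_top_level_fields(content))
```

Each value is left as a raw string, so a large structure would only be handed to a real YAML parser when it is actually requested.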


Obviously, this doesn't matter much if you use the panel, but if you're editing files, the syntax highlighting might be helpful? It's also a more concise format. Compare this:

Title: My Site

----

Logo: 

- logo.svg

----

Text: Hello World

...to the syntax-highlighted YAML:

title: My Site
logo: 
  - logo.svg
text: Hello World

It would be a pretty different experience to edit YAML files, but I'm wondering if it would be a better one. What do you think?

hdodov, Mar 28 '20 06:03

Personally, I like YAML for short definitions (a key followed by a value of a few words). I wouldn't like it for longer text content: the indentation there just gets messy, and the ---- feels like a much better way to visually separate fields. But that's all a matter of taste.

My biggest issue right from the start would be that this would be a complete breaking change from Kirby 3 to Kirby 4. While breaking changes are of course part of such major version changes, something like this would make the transition very difficult.

distantnative, Mar 28 '20 10:03

Considering that many people already have problems with indentation in blueprints, I personally wouldn't like that either. We write all the documentation for the website by manually editing the content files, and the dashes between fields, together with the space surrounding them, really help with keeping an overview. A concise format doesn't really help human readability, and keeping indentation intact when using long text is a pain. Just my 2ct.

texnixe, Mar 28 '20 10:03

And one more thing: Currently, you can write in a text editor like iA Writer etc. and get a Markdown preview of your stuff. You would completely lose that when writing YAML syntax. Of course, all this is not relevant if you use the Panel, but I think there is still a certain number of people who use Kirby without the Panel, simply editing the text files.

texnixe, Mar 28 '20 12:03

I was asked many times why we don't use frontmatter or other formats for our content and the answer is always a mixture of readability and performance. None of the YAML libraries can ever be as fast as exploding the file by the separators.

bastianallgeier, Mar 30 '20 15:03

@bastianallgeier yep, I thought that performance played a major role in your decision. But do you think that splitting by lines starting with no whitespace could be comparable? In theory, it should be pretty quick. I guess there could be more edge cases to handle due to this whole whitespace thing, though.

hdodov, Mar 30 '20 17:03

@hdodov Splitting by lines that don't start with whitespace is a pretty clever solution, but I expect multiple edge-cases. YAML is actually a very complex language if you read the spec and all that complexity would need to be respected in a custom YAML parser.
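As one hedged illustration of such edge cases, the Python sketch below applies the literal `[a-z]+:` rule to a file that contains a top-level comment, a key with an underscore, and a quoted key. All three are valid YAML, yet the naive rule recognizes none of them as a field boundary. The function `naive_split` is a hypothetical stand-in written for this example, not an existing parser.

```python
import re


def naive_split(text):
    """Naive splitter using the literal [a-z]+: rule (illustration only)."""
    fields, current = {}, None
    for line in text.splitlines():
        match = re.match(r"([a-z]+):(.*)", line)
        if match:
            current = match.group(1)
            fields[current] = match.group(2)
        elif current is not None:
            # Unrecognized lines are appended to the previous field.
            fields[current] += "\n" + line
    return fields


doc = (
    "title: My Site\n"
    "# a top-level comment\n"
    "cover_image: logo.svg\n"
    '"quoted key": still valid YAML\n'
)
result = naive_split(doc)
print(list(result))     # only one field name is recognized
print(result["title"])  # the other lines were swallowed into it
```

Widening the regex fixes the underscore case, but each such fix moves the splitter closer to being a partial YAML parser of its own.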

I also see two additional issues with the suggestion:

  • When you start to parse only parts of the YAML, it's not really YAML anymore. The user will expect that they can use any kind of YAML syntax for their content, but actually it would just be a field syntax that just looks like YAML. That reduces the advantage of switching to it by a lot – Kirbydata is a format that is easy to understand and where it's very clear what each part of formatting does. A custom YAML-like syntax won't have many advantages over that.
  • I agree with what @texnixe and @distantnative wrote: I personally think that Kirbydata is much easier to read and to write by hand than YAML. If we lost that feature, we wouldn't be true to our roots anymore. I myself came to Kirby because I liked how simple it was and still is and I think many users think the same.

What we could do however is to allow content storage plugins that could implement whatever content format the user needs and wants.

lukasbestle, Mar 31 '20 16:03

@lukasbestle yep, valid points.

"What we could do however is to allow content storage plugins that could implement whatever content format the user needs and wants."

Yes! When it comes to Kirby, flexibility is always the correct answer, it seems. A custom content storage plugin could also provide a solution to #524. Kirby says "give me the content for page X" and then it's up to the plugin to read and parse the files needed to return the content and/or translations for it. When there are multiple use-cases, don't shoehorn one solution onto all of them, but provide the option to handle them differently. 👍

hdodov, Mar 31 '20 17:03

Yep. It would be a bit more complex though: The component would also need to handle template detection (right now the template to use is based on the filename of the content file) and Kirby would need to know which file(s) inside the page directory is/are the content file(s) to ensure that those files are not included in the $page->files() collection.
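To make those extra responsibilities tangible, here is a purely hypothetical sketch in Python. None of these names (`ContentStorage`, `SuffixStorage`, `page_files`) exist in Kirby; the sketch only shows that a storage component would have to answer two questions besides reading content: which template a page uses, and which files are content files and so must be hidden from the files collection.

```python
from typing import List, Optional, Protocol


class ContentStorage(Protocol):
    """Hypothetical plugin interface; not an actual Kirby API."""

    def content_files(self, filenames: List[str]) -> List[str]: ...

    def template(self, filenames: List[str]) -> Optional[str]: ...


class SuffixStorage:
    """Treats every file with the given extension as a content file."""

    def __init__(self, extension: str) -> None:
        self.extension = extension

    def content_files(self, filenames: List[str]) -> List[str]:
        return [name for name in filenames if name.endswith(self.extension)]

    def template(self, filenames: List[str]) -> Optional[str]:
        found = self.content_files(filenames)
        # The template name is derived from the content file's name.
        return found[0][: -len(self.extension)] if found else None


def page_files(filenames: List[str], storage: ContentStorage) -> List[str]:
    """Files that stay visible once the content files are excluded."""
    hidden = set(storage.content_files(filenames))
    return [name for name in filenames if name not in hidden]


files = ["article.yml", "cover.jpg", "notes.pdf"]
storage = SuffixStorage(".yml")
print(storage.template(files))
print(page_files(files, storage))
```

A real component would also need to deal with translations and with actually reading and writing the content, which this sketch leaves out.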

So it's not something that can easily be added, but we can keep this in mind for later – which is what idea issues are for after all.

lukasbestle, Mar 31 '20 18:03