pardall_markdown
Add a YAML FrontMatter metadata parser
What does this PR do?
- adds `yaml.ex` for users to use as a metadata parser when they prefer to use YAML syntax for their front matter.
Closes #41
@alfredbaudisch so far things are going well, but I am running into some problems I don't fully understand:
1. I assume I should not have to prefix the `contents` arg in `do_parse.("\n---\n#{contents}")`. I get a nastygram from `ElixirMap` when it tries to parse the data after parsing the YAML. It makes sense because I have already called `:binary.split/2` once with a similar pattern.
2. I am unsure how the lifecycle is working with the `case do_parse.("\n---\n#{contents}")` block. I understand that this call is using the `ElixirMap` parser, but I am lost as to how the YAML that has been parsed to a map is being passed through. Ultimately I am ending up in the `other` case and the value is being seen as `nil`.
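To make the first point concrete, here is a minimal sketch of what the second split sees. It assumes `contents` is whatever is left after the YAML parser has already split the front matter off, and it assumes the `ElixirMap` parser evaluates the attrs part as Elixir code; neither assumption has been checked against the actual parser here.

```elixir
# Assumption: the front matter has already been split off, so `contents`
# no longer contains a "---" separator of its own.
contents = "Post content..."

# Re-prefixing the separator and splitting again with the same patterns
# leaves an empty attrs part in front of the body.
[attrs, body] = :binary.split("\n---\n#{contents}", ["\n---\n", "\r\n---\r\n"])

IO.inspect(attrs) # ""
IO.inspect(body)  # "Post content..."

# Assumption: the attrs part is evaluated as Elixir code.
# Evaluating an empty string yields nil rather than a map.
{value, _bindings} = Code.eval_string(attrs)
IO.inspect(value) # nil
```

If the parser works roughly like this, an empty attrs part would explain both the `nil` and the fall-through to the `other` case.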
@rockchalkwushock nice, thanks for starting the pull request!
1. You are right, I don't think that's needed and it can be problematic for bigger files or for a big volume of files (the `do_parse.("\n---\n#{contents}")`).
2. I think this will be solved after (1) is investigated as well.
By the weekend I'll check it out.
@alfredbaudisch I will play with it more tomorrow morning and see if I can get that figured out. I am close; there is just something I am missing.
@alfredbaudisch I am curious what you think about this idea. It would require some refactoring of `ElixirMap`:
- We update the pattern to match against in `ElixirMap`:
- :binary.split(contents, ["\n---\n", "\r\n---\r\n"])
+ :binary.split(contents, ["<!-- -->"])
`<!-- -->` is the comment syntax in `.md` files.
By making this change we sidestep the issue of overlapping patterns between `ElixirMap` and the YAML parser (or any future parser).
From what I am seeing in the Erlang docs for `:binary.split/2`, and from my own hacking around with it, the default is to split only on the first instance of the pattern. So we would not get into any trouble if the user adds comments throughout the markdown file. We would only need to reach for `:binary.split/3` and the `[:global]` option if we wanted to match on all instances of the given pattern.
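A quick sketch of that difference, using a made-up document that also contains `<!-- -->` in the body:

```elixir
# Hypothetical input: one separator after the attrs, one more inside the body.
doc = "title: Hi\n<!-- -->\nBody\n<!-- -->\nMore body"

# :binary.split/2 splits on the first match only, so the second
# "<!-- -->" stays inside the body.
[attrs, body] = :binary.split(doc, "<!-- -->")
IO.inspect(attrs) # "title: Hi\n"
IO.inspect(body)  # "\nBody\n<!-- -->\nMore body"

# :binary.split/3 with [:global] splits on every match.
parts = :binary.split(doc, "<!-- -->", [:global])
IO.inspect(length(parts)) # 3
```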
Elixir Map Format (default)
%{
  author: "Turd Ferguson",
  date: "2021-11-20",
  title: "That's a funny name"
}
---
<!-- -->
Post content...
YAML Format
---
author: Turd Ferguson
date: 2021-11-20
title: That's a funny name
---
<!-- -->
Post content...
Joplin Format
I am not familiar with Joplin so this could be incorrect.
That's a funny name
<!-- -->
Post content...
`<!-- -->` can (hopefully?) act as the separator between attrs and body in every case. The metadata parsers can then be applied solely to attrs, while we continue to just pass the body through until we are ready to parse that data.
Crude POC
# base_parser.ex
def parse(path, contents, opts) do
  ...
  # default metadata parser is ElixirMap
  # (`parser` is assumed to be bound in the elided code above)
  do_parse = fn split_contents ->
    apply(parser, :parse, [path, split_contents, opts])
  end

  case :binary.split(contents, ["<!-- -->"]) do
    # no separator found: pass the contents through as-is
    [_] ->
      do_parse.(contents)

    # nothing before the separator, so there are no attrs to process
    ["", body] ->
      do_parse.(body)

    [attrs, body] ->
      # process `attrs` with corresponding parser; `body` is passed through untouched
      case do_parse.(attrs) do
        {:ok, frontmatter, _rest} ->
          {:ok, frontmatter, body}

        other ->
          other
      end
  end
end
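For anyone who wants to try the idea in isolation, here is a self-contained sketch. The module names `SketchKeyValueParser` and `SeparatorSketch` are hypothetical and not part of pardall_markdown, and the key/value parser is only a stand-in to make the example runnable; it is not a YAML parser.

```elixir
# Hypothetical stand-in for a metadata parser. It reads "key: value" lines
# and ignores "---" fence lines. Assumes every other line contains a colon.
defmodule SketchKeyValueParser do
  def parse(attrs) do
    frontmatter =
      attrs
      |> String.split("\n", trim: true)
      |> Enum.reject(&(&1 == "---"))
      |> Map.new(fn line ->
        [key, value] = String.split(line, ":", parts: 2)
        {String.trim(key), String.trim(value)}
      end)

    {:ok, frontmatter}
  end
end

defmodule SeparatorSketch do
  @separator "<!-- -->"

  # Splits on the first separator only, hands attrs to the given parser
  # module and passes the body through untouched (apart from leading whitespace).
  def parse(contents, parser \\ SketchKeyValueParser) do
    case :binary.split(contents, @separator) do
      [_no_separator] ->
        {:error, "could not find separator #{@separator}"}

      [attrs, body] ->
        with {:ok, frontmatter} <- parser.parse(attrs) do
          {:ok, frontmatter, String.trim_leading(body)}
        end
    end
  end
end

doc = "---\nauthor: Turd Ferguson\ntitle: That's a funny name\n---\n<!-- -->\nPost content..."
{:ok, frontmatter, body} = SeparatorSketch.parse(doc)
IO.inspect(frontmatter["author"]) # "Turd Ferguson"
IO.inspect(body)                  # "Post content..."
```

Because the split never looks at the attrs format, swapping the parser module is the only change needed to support another front matter syntax in this sketch.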
@rockchalkwushock I'm really bad at keeping track of my 10,000 personal projects. I'm very sorry for keeping you waiting on this one. I'll try to come back to it soon.
@alfredbaudisch no worries, I have been pretty busy as well. I will circle back this evening and give this another look. Perhaps I can get it over the hump.