denote
denote copied to clipboard
Refining the renaming mechanism
I am posting this here since it is where @jeanphilippegg is active with changes to the code.
cc @hpgisler
-
@jeanphilippegg consolidated the renaming commands to just
denote-rename-file
anddenote-dired-rename-marked-files
. I just tested everything and updated the relevant docs. This addresses the issue we had before with ergonomics, namely, that we had distinct commands for rewriting and for adding front matter in addition to the renaming operation. -
We now are ready for the function that completes the front matter, as we discussed on the mailing list. In short: add missing entries to the front matter block.
-
The "reverse rename" operation, which would read the front matter and rename the file accordingly.
I feel that 2 is the trickiest one because we do not define such entries one-by-one but only as a block (e.g. denote-org-front-matter
). I would thus leave this for last.
Unless @jeanphilippegg already has some code in the works, I will now start working on 3.
Mailing list thread for the sake of completeness (I wish they were not split like that, but here we are):
- https://lists.sr.ht/~protesilaos/denote/%3C87k081l6vw.fsf%40silverstone.mail-host-address-is-not-set%3E
- https://lists.sr.ht/~protesilaos/denote/%3C878rogw5kk.fsf%40protesilaos.com%3E
- https://lists.sr.ht/~protesilaos/denote/%3C87fsiljv1s.fsf%40hu.mail-host-address-is-not-set%3E
- https://lists.sr.ht/~protesilaos/denote/%3C87r122afe3.fsf%40hu.mail-host-address-is-not-set%3E
Unless @jeanphilippegg already has some code in the works, I will now start working on 3.
Actually, I do! I had made functions denote-rename-file-using-front-matter
as well as denote-rename-marked-files-using-front-matter
. I had kept them for myself until things would be more stable and to see how we wanted to do it exactly.
Do you want me to share them as-is? Should they be their own functions?
Please share, though note I just pushed the ones I had. We should consolidate everything.
The code for the "reverse rename" is completed with PR #75. I will include them in the manual a bit later.
2. We now are ready for the function that completes the front matter, as we discussed on the mailing list. In short: add missing entries to the front matter block.
I think this should be done automatically by denote-rename-file
. Right now, in denote-rename-file
, we check if a front matter is present using denote-edit-front-matter-p
which validates that the title and the keywords are both present. We would just need to consider more cases.
This is still not simple to implement, for the reasons you mentioned, but I would aim for that end result.
[...] but I would aim for that end result.
Agreed!
We need to see how best to treat the front matter entries individually. I will give it a try now and post any updates.
Also, we have to think about how we want to add the missing information.
I suggest this:
- If both title and keywords are present, just rewrite the front matter, as we do now.
- If both title and keywords are missing, I think it is safe to add a new "clean" front matter, as we do now.
- If title is missing, add it above the keywords line.
- If keywords is missing, add is below the title line.
I suggest this:
Yes, this is where it gets tricky.
If both title and keywords are present, just rewrite the front matter, as we do now.
But that will not complete the set if, say, the identifier
is missing from the front matter.
If both title and keywords are missing, I think it is safe to add a new "clean" front matter, as we do now.
Makes sense. There could be a duplicate date
as a result but we can add that check as well.
- If title is missing, add it above the keywords line.
- If keywords is missing, add is below the title line.
I say we do it like this, but if we can think of some smart way to read the order in denote-org-front-matter
(and related) that would be even better.
But that will not complete the set if, say, the
identifier
is missing from the front matter.
I think it is not an issue to not have an identifier. In fact, as discussed in the mailing list, I would rather have it be the default!
Makes sense. There could be a duplicate
date
as a result but we can add that check as well.
Yes, we can also try to handle the date
.
The identifier
and date
cases could be handled independently from the title
and keywords
cases, as a last step in the denote-rename-file
function.
my 2 cents:
Current property ordering is (if I am not mistaken):
#+title:
#+date:
#+filetags:
#+identifier:
I like Jean-Philippe's suggestion:
#+title:
#+filetags:
#+date:
#+identifier:
Assuming this suggestion, what about following algorithm:
Basic concept: Walk through the frontmatter ((partially) existing or not) property by property - in the well kown order:
title -> filetags -> date -> identifier
i.e.
-
Loop-Init: property-cursor = title property-location-cursor = a) if #+filetags: exist: line above filetags b) if not exist and no buffer-local-variables defined on first line: 1. line c) if not exist but buffer-local-variables defined on first line: 2. line
-
search for property-cursor property in file a) if found: a) reset property-location-cursor to that location b) replace property b) if not found: a) insert property-cursor value according at property-location-cursor position
-
increment: a) property-cursor to next property b) property-location-cursor to next line
-
if not finished, goto 2.
(please let me know, if my interfering is not helpful)
Best Hanspeter
Jean-Philippe Gagné Guay @.***> writes:
Also, we have to think about how we want to add the missing information.
I suggest this:
• If both title and keywords are present, just rewrite the front matter, as we do now. • If both title and keywords are missing, I think it is safe to add a new "clean" front matter, as we do now. • If title is missing, add it above the keywords line. • If keywords is missing, add is below the title line.
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.**>
But that will not complete the set if, say, the identifier is > missing from the front matter.
jeanphilippegg: I think it is not an issue to not have an identifier. In fact, as discussed in the mailing list, I would rather have it be the default!
Ah yes, I fogrot about this point. It is a breaking change which changes what denote--format-front-matter
passes to variables such as denote-org-front-matter
. For denote-text-front-matter
in particular, it would make the 5th argument become the 4th.
hpgisler: Current property ordering is (if I am not mistaken):
#+title: #+date: #+filetags: #+identifier:
I like Jean-Philippe's suggestion:
#+title: #+filetags: #+date: #+identifier:
Continuing from the point above, I would suggest not to touch the default on the sole premise that we have variables like denote-org-front-matter
which expect arguments in a given sequence. Any change we make will affect users who already configure those variables. Configuration can be done directly or with let
bindings in a user's own functions. So we would need to provide a migration path. Ultimately, I think it is too much work for a cosmetic change.
If I am missing something and we can change any of this in a backward-compatible way, then of course it's all good from my side.
As for how the entries should be ordered post-rename, I am fine with the order you two prefer. Also, I am fine with making any changes to this because we do not affect any existing code, as I explained above.
hpgisler: Assuming this suggestion, what about following algorithm: [...]
This looks right. It just needs to fit in the existing mechanisms we have. I'm working on it and will have something to share a bit later.
I am still refining the code we have as I noticed that tasks that needed to be finalised. @jeanphilippegg I observed that the helper I had earlier to split the string works even with faulty input in the front matter. Maybe this helps us avoid errors with the file name:
(defun the-helper-i-pushed-earlier (keywords)
(split-string keywords "[:,\s]+" t "[][ \"']+"))
(the-helper-i-pushed-earlier "one two three")
;; => ("one" "two" "three")
(the-helper-i-pushed-earlier "one two three four")
;; => ("one" "two" "three" "four")
(the-helper-i-pushed-earlier "[\"one\", \"two\", \"three\"]")
;; => ("one" "two" "three")
(the-helper-i-pushed-earlier "[\"one\", \"two\", wrong nodelimiter]")
;; => ("one" "two" "wrong" "nodelimiter")
(the-helper-i-pushed-earlier ":one:two:three:four")
;; => ("one" "two" "three" "four")
(the-helper-i-pushed-earlier ":one::two:::three four")
;; => ("one" "two" "three" "four")
This always returns a list as we expect it to be and then produces the correct file name. We thus guard against typos or common mistakes.
Whereas with faulty data in something like:
---
title: "Hewdy folks"
date: 2022-08-05T07:36:24+03:00
tags: ["one", "two" three]
identifier: "20220805T073624"
---
The return value of denote--extract-keywords-from-front-matter
is ("one" "two\" three")
and this inserts quotes into the file name. While we can better protect what goes into the file name, I feel the other helper is simpler. The key is whether it does the same job or there is something it does wrongly.
What do you think?
EDIT: To be clear, I suggest this change if it causes no regressions:
diff --git a/denote.el b/denote.el
index b9a606b..64b1ce4 100644
--- a/denote.el
+++ b/denote.el
@@ -671,19 +671,11 @@ (defun denote--format-front-matter-keywords (keywords &optional type)
(t
(denote--format-org-keywords kw)))))
-(defun denote--extract-keywords-from-front-matter (file &optional type)
- "Extract keywords from front matter of FILE with TYPE.
+(defun denote--front-matter-keywords-to-list (file)
+ "Return keywords from front matter of FILE as list of strings.
This is the reverse operation of `denote--format-front-matter-keywords'."
- (let ((fm-keywords (denote--retrieve-value-keywords file)))
- (cond
- ((or (eq type 'markdown-toml) (eq type 'markdown-yaml) (eq type 'md))
- (split-string
- (string-trim-right (string-trim-left fm-keywords "\\[") "\\]")
- ", " t "\s*\"\s*"))
- ((eq type 'text)
- (split-string fm-keywords " " t " "))
- (t
- (split-string fm-keywords "\\([:]\\|\s\s\\)" t "\\([:]\\|\s\\)")))))
+ (let ((keywords (denote--retrieve-value-keywords file)))
+ (split-string keywords "[:,\s]+" t "[][ \"']+")))
(defvar denote-toml-front-matter
"+++
We shall update the relevant calls like this: (denote--front-matter-keywords-to-list file)
Yes, denote--extract-keywords-from-front-matter
was more "robust" in the sense that it handled keywords in the expected way given an Org file. For example, from key1:key2
, we would get key1
and key2
. However, it would not work for faulty input such as ["key1", "key2"]
in an Org file.
denote--front-matter-keywords-to-list
handles the faulty cases.
I am okay with either approach!
I have a slight preference toward denote--extract-keywords-from-front-matter
because the handling is more precise and it can be split into individual functions for each file type at some point, if necessary. I am also okay if we choose not to handle faulty user inputs in the front matter.
This also raises a thought I had the other day: currently, the title before sluggification is recorded as-is in the front matter, which is good. However, the keywords are sluggified, even in the front matter. Is this right? For example, in an md file, would it be wrong to have these keywords: some:key other_key myKey
. denote--extract-keywords-from-front-matter
, even though it does not handle faulty keywords, looks more future-proof. It will be easy to support new file types with possibly conflicting syntaxes with other file types. (I also don't know if there are limitations imposed by Org.)
(To answer my own question: I think the reason the sluggification also happens in the front matter with the keywords might be because when we infer keywords, we only check the file names. However this can be a documented limitation of keywords inference.)
Sorry about this discussion on this "off-topic" matter. This is for your consideration! For now, either approach is good with me.
I am okay with either approach!
I have a slight preference toward
denote--extract-keywords-from-front-matter
because the handling is more precise and it can be split into individual functions for each file type at some point, if necessary.
The reason I bring this up is to avoid potential problems on our end. With the original renaming operation, we were basically telling the user "Give us your input, we will then process it and do the right thing". Whereas the front-matter-based renaming says "Write your input and we will apply it with the hope it was written correctly---any malformed input puts us in uncharted territory".
If a user has faulty front matter that is their problem because they created it. But if we take that faulty input in one of our commands, we are assuming that problem as our own.
It is with this idea that I was thinking about handling the wrong input. I don't mind how exactly it is done. I just thought it would basically keep Denote safe from creating more trouble in a user's setup.
This also raises a thought I had the other day: currently, the title before sluggification is recorded as-is in the front matter, which is good. However, the keywords are sluggified, even in the front matter. Is this right?
Good point! I think we should consider it in another thread so that we don't go off topic here. There is no good reason for the current behaviour. Just a stylistic one. I am fine with something like myKey
in the front matter.
Again, this comes back to what the front matter means to Denote: we provide it for convenience, so there is no strong reason to be opinionated about it.
Yes, we can use denote--front-matter-keywords-to-lists
. We can always revisit this later if there is a need!
Any thoughts on this diff?
denote.el | 22 ++++++----------------
1 file changed, 6 insertions(+), 16 deletions(-)
diff --git a/denote.el b/denote.el
index b53692e..c8a9796 100644
--- a/denote.el
+++ b/denote.el
@@ -671,19 +671,11 @@ (defun denote--format-front-matter-keywords (keywords &optional type)
(t
(denote--format-org-keywords kw)))))
-(defun denote--extract-keywords-from-front-matter (file &optional type)
- "Extract keywords from front matter of FILE with TYPE.
+(defun denote--front-matter-keywords-to-list (file)
+ "Return keywords from front matter of FILE as list of strings.
This is the reverse operation of `denote--format-front-matter-keywords'."
- (let ((fm-keywords (denote--retrieve-value-keywords file)))
- (cond
- ((or (eq type 'markdown-toml) (eq type 'markdown-yaml) (eq type 'md))
- (split-string
- (string-trim-right (string-trim-left fm-keywords "\\[") "\\]")
- ", " t "\s*\"\s*"))
- ((eq type 'text)
- (split-string fm-keywords " " t " "))
- (t
- (split-string fm-keywords "\\([:]\\|\s\s\\)" t "\\([:]\\|\s\\)")))))
+ (let ((keywords (denote--retrieve-value-keywords file)))
+ (split-string keywords "[:,\s]+" t "[][ \"']+")))
(defvar denote-toml-front-matter
"+++
@@ -1390,8 +1382,7 @@ (defun denote-rename-file-using-front-matter (file)
(when (buffer-modified-p)
(user-error "Save buffer before proceeding"))
(if-let* ((title (denote--retrieve-value-title file))
- (keywords (denote--extract-keywords-from-front-matter
- file (denote--filetype-heuristics file)))
+ (keywords (denote--front-matter-keywords-to-list file))
(extension (file-name-extension file t))
(id (denote--file-name-id file))
(dir (file-name-directory file))
@@ -1441,8 +1432,7 @@ (defun denote-dired-rename-marked-files-using-front-matter ()
(let* ((dir (file-name-directory file))
(id (denote--file-name-id file))
(title (denote--retrieve-value-title file))
- (keywords (denote--extract-keywords-from-front-matter
- file (denote--filetype-heuristics file)))
+ (keywords (denote--front-matter-keywords-to-list file))
(extension (file-name-extension file t))
(new-name (denote--format-file
dir id keywords (denote--sluggify title) extension)))
If you think it is okay, I will commit it now. I'm about to go to bed, so I may have missed something.
Seems good!
Okay. Worst case scenario, we edit it afterwards.
Yes! I don't have a strong opinion on the matter. Both approaches are good. And we are working on rare use-cases anyway.
Yes! I don't have a strong opinion on the matter. Both approaches are good. And we are working on rare use-cases anyway.
Indeed! These are all edge cases.
Anyhow, I'm going to sleep now. Goodnight! (timezones notwithstanding)
Indeed! These are all edge cases.
Anyhow, I'm going to sleep now. Goodnight! (timezones notwithstanding)
Thanks! Good night!
I am now working on changing how we check for the front matter. Returning a list with the entries seems like the most flexible approach. We will then check for those in the relevant commands/functions.
WORK-IN-PROGRESS:
(defun denote--edit-front-matter-p (file)
"Test if FILE should be subject to front matter edits.
This is relevant for operations that insert, complete, or rewrite
the front matter in a Denote note.
For the purposes of this test, FILE is a Denote note when it (i)
is a regular file, (ii) is writable, (iii) has a supported file
type extension per `denote-file-type', and (iv) is stored in the
variable `denote-directory'.
The return value is a list of strings which contains the front
matter entries that were found in FILE."
(when-let* ((ext (file-name-extension file))
((and (file-regular-p file)
(file-writable-p file)
(not (denote--file-empty-p file))
(string-match-p "\\(md\\|org\\|txt\\)\\'" ext)
;; Heuristic to check if this is one of our notes
(string-prefix-p (denote-directory) (expand-file-name default-directory)))))
(let (entries)
(dolist (rx (list denote--retrieve-title-front-matter-key-regexp
denote--retrieve-date-front-matter-key-regexp
denote--retrieve-keywords-front-matter-key-regexp
denote--retrieve-id-front-matter-key-regexp))
(when-let ((key (denote--retrieve-search file rx :key)))
(push key entries)))
entries)))
Hello again folks,
I feel we will have to put a hold on the plan to implement an insert/rewrite/complete mechanism for the front matter. I have been working on it for a few hours now and get the impression that it gets quite complex and prone to errors.
Some of the obvious problems:
-
The front matter is treated as a whole block. The format is stored in variables like
denote-org-front-matter
. While these are not user options per se, they are designed with the idea that an advanced or DIY user can rely on them to affect how the front matter looks (for example, I got an email earlier asking me how to remove thedate
). -
Our supported file types all have different syntax for the entries in the front matter. Org is
#+title
, plain text as well as Markdown+YAML istitle:
, Markdown+TOML istitle=
. Repeat this for all four entries oftitle
,tags
,date
,identifier
. -
The code we have right now is fairly straightforward because we read one variable and know how the whole block should look. If we break those into individual entries, we need four formatting variables per each of the variants. And we need the relevant code that fetches the right format to do the processing.
-
The insert operation is easy now because it is an all-or-nothing approach. We just go to the top of the buffer and add a block of text. Whereas individuated action requires that we check for context. This gets messy once we have to complete the set, because we need to add conditionality which figures out which entry is missing from the set (or entries).
-
Whether we read the file or the buffer may affect the value we retrieve. If you have a file with
title
, delete that but do not save the buffer, our functions will still find thetitle
because they read the file on disk. If, in turn, we read from the buffer, we get the opposite result.
All these add considerable complexity for the purposes of giving the user a complete block of front matter.
I then ask, why not provide a crude-yet-servicable command which is like denote
in terms of its prompts that simply inserts the formatted front matter at the top of the buffer?
The user can then edit it however they want. It's four extra lines, after all.
What do you think?
A follow-up on my previous comment. This is the sort of command I have in mind:
(defun denote-add-front-matter (file title keywords)
"PROOF-OF-CONCEPT.
Insert front matter at the top of FILE.
When called interactively, FILE is the return value of the
function `buffer-file-name'. It is checked to determine whether
it is a note for Denote's purposes.
TITLE is a string. Interactively, it is the user input at the
minibuffer prompt.
KEYWORDS is a list of strings. Interactively, it is the user
input at the minibuffer prompt. This supports completion for
multiple entries, each separated by the `crm-separator' (normally
a comma)."
(interactive
(list
(buffer-file-name)
(denote--title-prompt)
(denote--keywords-prompt)))
(denote--add-front-matter file title keywords (denote--file-name-id file)))
Hello Prot,
I understand your points and - as I wrote at the beginning of the whole discussion - I am not sure, whether my described use case really justifies the now visible, not insignificant effort the cover this edge case.
After all, I stumbled over it only when I converted my notes to denote format. Sacrificing code stability and maintainablity would not be good.
The only thing which seems important from my point of view is this: Adding forntmatter to the beginning of a file should respect an existing first line possibly already containing the setting of buffer local variables - in which case, the frontmatter should be placed from 2nd line on forward. Ohterwise, the setting will break. No?
Protesilaos Stavrou @.***> writes:
Hello again folks,
I feel we will have to put a hold on the plan to implement an insert/rewrite/complete mechanism for the front matter. I have been working on it for a few hours now and get the impression that it gets quite complex and prone to errors.
Some of the obvious problems:
The front matter is treated as a whole block. The format is stored in variables like denote-org-front-matter. While these are not user options per se, they are designed with the idea that an advanced or DIY user can rely on them to affect how the front matter looks (for example, I got an email earlier asking me how to remove the date).
Our supported file types all have different syntax for the entries in the front matter. Org is #+title, plain text as well as Markdown+YAML is title:, Markdown+TOML is title=. Repeat this for all four entries of title, tags, date, identifier.
The code we have right now is fairly straightforward because we read one variable and know how the whole block should look. If we break those into individual entries, we need four formatting variables per each of the variants. And we need the relevant code that fetches the right format to do the processing.
The insert operation is easy now because it is an all-or-nothing approach. We just go to the top of the buffer and add a block of text. Whereas individuated action requires that we check for context. This gets messy once we have to complete the set, because we need to add conditionality which figures out which entry is missing from the set (or entries).
Whether we read the file or the buffer may affect the value we retrieve. If you have a file with title, delete that but do not save the buffer, our functions will still find the title because they read the file on disk. If, in turn, we read from the buffer, we get the opposite result.
All these add considerable complexity for the purposes of giving the user a complete block of front matter.
I then ask, why not provide a crude-yet-servicable command which is like denote in terms of its prompts that simply inserts the formatted front matter at the top of the buffer?
The user can then edit it however they want. It's four extra lines, after all.
What do you think?
— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.**>
I agree with your analysis, @protesilaos. We can put this change on hold and take our time to make it right (or not make it). I might give it a try, but I will probably face the same issues.
Your denote-add-front-matter
seems good!
I understand your points and - as I wrote at the beginning of the whole discussion - I am not sure, whether my described use case really justifies the now visible, not insignificant effort the cover this edge case.
After all, I stumbled over it only when I converted my notes to denote format. Sacrificing code stability and maintainablity would not be good.
Maybe I have missed something though. Perhaps Jean-Philippe has another idea, in which case I am happy to act on it.
The only thing which seems important from my point of view is this: Adding forntmatter to the beginning of a file should respect an existing first line possibly already containing the setting of buffer local variables - in which case, the frontmatter should be placed from 2nd line on forward. Ohterwise, the setting will break. No?
You mean the first line can be something like this?
-*- some-variable: t -*-
I believe this is not allowed in Markdown with either YAML or TOML, if those are to be parsed by other programs (e.g. YAML with the Jekyll static site generator). So it would affect Org and plain text, right?
We already have code which inserts front matter to the top of the file, but does not save the buffer (the revised commands denote-rename-file
and denote-dired-rename-marked-files
). It is expected that the user inspects the file for changes, optionally using diff-buffer-with-file
. Perhaps we should make an exception if the first line starts and ends with -*-
to use the second line instead? I see no obvious problem with this, though I need to be clear we are talking about the same thing.
As for the tentative command which inserts front matter, we would adapt it accordingly.
Okay, I just saw the latest post by @jeanphilippegg: https://github.com/protesilaos/denote/issues/74#issuecomment-1207244562. In that case, we put the insert/rewrite/complete on hold for the time being.
Your
denote-add-front-matter
seems good!
Thanks! I will give it another look, document it in the manual, and push the changes.
Hello again folks! I am finalising the release of version 0.5.0
which includes the changes discussed herein. I am thus closing this issue, though feel welcome to start new threads for relevant discussions.
You mean the first line can be something like this?
-- some-variable: t --
I believe this is not allowed in Markdown with either YAML or TOML, if those are to be parsed by other programs (e.g. YAML with the Jekyll static site generator). So it would affect Org and plain text, right?
Yes, in org-mode, I might write something like this on the 1st line:
-- mode: org; org-attach-id-dir: "~/org/customer/data"; --
As for markdown, I believe you are right: There is no actual 'comment' marker supported; however, there seem to exist various workarounds, e.g. [//]: # (Comment) etc. (see e.g. https://stackoverflow.com/questions/4823468/comments-in-markdown)
So looking for: ... '--' ... '--' ... on the first line line might be a sufficient indicator?