zk
zk copied to clipboard
zk graph --format json is not handling note titles with inner double quotes
Check if applicable
- [X] I have searched the existing issues (required)
- [X] I'm willing to help fix the problem and contribute a pull request
Describe the bug
I have four notes with inner double quotes in their titles, e.g. This is a "note" title
.
Running zk graph --json > notes.json
dumps json fine except for these notes. In their place, is a singular comma:
3 ▎ {filename:h5u1.md,filenameStem:h5u1,path:h.....<valid json continues>
E 2 ▎ , ■ Value expected
1 ▎ {filename:jbig.md,filen......<valid json continues>
The weird thing is that there's only three instances of this, despite there being four notes with this quote pattern. Additionally, all four notes are not in the json file. In other words, three have been replaced with a singular column, and one has vanished altogether.
Additionally, it seems like notes with double quotes in their title create some strange behaviour. it is exclusively those notes that are printed to stdout, and after the printing of each note's body, my zsh environment variables are printed as well. Which is super weird, and would freak me out if I didn't know that this project is a legitimate project run by honest people haha.
I'm sure that will be fixed, once the strings get handled, but just wanted to put it out there.
How to reproduce?
- run
zk graph --json > notes.json
and check if it's already bug free - create a note
This is a "test" note
- run
zk index
- run
zk graph --json > notes2.json
Theoretically, there should be a singular line with a comma
zk configuration
# zk configuration file
#
# Uncomment the properties you want to customize.
# NOTE SETTINGS
#
# Defines the default options used when generating new notes.
[note]
language = "en"
default-title = "untitled"
filename = "{{id}}"
extension = "md"
template = "default.md"
# Path globs ignored while indexing existing notes.
ignore = [
"d/**",
]
# Configure random ID generation.
id-charset = "alphanum"
id-length = 4
id-case = "lower"
# EXTRA VARIABLES
#
# A dictionary of variables you can use for any custom values when generating
# new notes. They are accessible in templates with {{extra.<key>}}
[extra]
#key = "value"
# GROUP OVERRIDES
# Omitting `paths` is equivalent to providing a single path equal to the name of
# the group. This can be useful to quickly declare a group by the name of the
# directory it applies to.
[group.daily]
paths = ["d/2024"]
[group.daily.note]
filename = "{{format-date now}}"
extension = "md"
template = "daily.md"
# MARKDOWN SETTINGS
[format.markdown]
link-format = "markdown"
# in case for some reason, a white space file makes it into my vault
link-encode-path = true
link-drop-extension = true
# use YAML tags only
hashtags = false
colon-tags = false
multiword-tags = false
# EXTERNAL TOOLS
[tool]
# Command used to preview a note during interactive fzf mode.
# Set it to an empty string "" to disable preview.
# bat is a great tool to render Markdown document with syntax highlighting.
#https://github.com/sharkdp/bat
#fzf-preview = "bat -p --color always {-1}"
# LSP
[lsp]
[lsp.diagnostics]
# Each diagnostic can have for value: none, hint, info, warning, error
# wiki-title = "hint"
dead-link = "error"
[lsp.completion]
# Show the note title in the completion pop-up, or fallback on its path if empty.
note-label = "{{title-or-path}}"
# Filter out the completion pop-up using the note title or its path.
#note-filter-text = "{{title}} {{path}}"
# Show the note filename without extension as detail.
#note-detail = ""
# NAMED FILTERS
[filter]
# Matches the notes created the last two weeks. For example:
# $ zk list recents --limit 15
recents = "--sort created- --created-after 'last two weeks'"
# COMMAND ALIASES
[alias]
# Show a random note.
lucky = '$EDITOR $(zk list --format path --sort random --limit 1)'
daily = 'zk new --no-input "$ZK_NOTEBOOK_DIR/d/`date +%Y`"'
s = 'zk edit $ZK_NOTEBOOK_DIR/i/149x.md'
random-orphan = '$EDITOR $(zk list --orphan --format path --sort random -n1 -x d -x g)'
rg = 'rg $1 $(zk list --format path --delimiter " " $2 $3 $4 $5)'
conf = '$EDITOR "$ZK_NOTEBOOK_DIR/.zk/config.toml"'
# Edit the last modified note.
last = "zk edit --limit 1 --sort modified- $@"
# Edit the notes selected interactively, "recents" is a filter from above
recent = "zk edit recents --interactive"
# Print paths separated with colons for the notes found with the given
# arguments. This can be useful to expand a complex search query into a flag
# taking only paths. For example:
# zk list --link-to "`zk path -m potatoe`"
#path = "zk list --quiet --format {{path}} --delimiter , $@"
# Returns the Git history for the notes found with the given arguments.
# Note the use of a pipe and the location of $@.
#hist = "zk list --format path --delimiter0 --quiet $@ | xargs -t -0 git log --patch --"
Environment
zk 0.14.0
system: Darwin 22.6.0 arm64
@mcDevnagh @julio-lopez @jurica
Would it be a case of manually sanitizing the note titles pre-json output? e.g, in the internal/core/note_format.go file?
I had a look at the go json.Marshal
documentation.
It states something about a syntax like below being possible?
Filename string `json:"filename,string"`
Which should format it appropriately for wrapping in double quotes, but it didn't work on first try.
We shouldn't have to sanitize. encoding/json does the sanitation for us. For example,
package main
import "encoding/json"
type Test struct {
Title string
}
func main() {
json, err := json.Marshal(Test{
Title: `"`,
})
if err != nil {
panic(err)
}
print(string(json))
}
outputs
{"Title":"\""}
Is json.Marshal
outputting everything in zk graph --json
? This would be an upstream issue, but I'd find that very surprising. More likely we're calling json.Marshal
for each note, and ignoring errors (or simply logging them). If that is the case, the best place to start fixing this would be to look at that error.
Ok great, thanks for this. As a heads up, I won't be able to look at this (or anything else) until March 17. I'll be on holiday 🏝️
note: narrowed this down a little bit and it actually looks like the culprit is osEnv()
call in newNoteFormatter
.
When it's getting called in graph.go, line 70, something about the double quoted title is breaking it. This is why it spits out the entire shell env variables to stdout
. The double quotes in the file name is not the culprit as I'm using ids as file names.
Changing the file title from: what does "this" mean?
to what does 'this' mean?
executes without error.
It's actually due to the quotes not being escaped in link_format.go.
These functions are used to render the links within documents as well. So escaping the double quotes here will render links within notes with the backslashes: [[a linked \"note\"]]
. This of course breaks the links.
Does that mean that the true culprit is links to titles with quotes and not titles with quotes themselves?
In other words, quotes are being escaped properly everywhere in json output except the string for the filename's own link.
If I hardcode the problem field output to foo
, the call to zk graph --format json
returns json without error.
{
"filename": "another \"test\" note.md",
"filenameStem": "another \"test\" note",
"path": "another \"test\" note.md",
"absPath": "/Users/tjex/.local/src/zk-org/workbench/test-vault/another \"test\" note.md",
"title": "another \"test\" note",
"link": "foo",
"lead": "",
"body": "",
"snippets": [],
"rawContent": "# another \"test\" note\n\n\n",
"wordCount": 4,
"tags": [],
"metadata": {},
"created": "2024-04-01T03:11:03.540446352Z",
"modified": "2024-04-01T03:11:05.217630323Z",
"checksum": "a121f8d06da1bff78c5e5eb4816d0417ab2ec22317bad43baedb51a20c60df68"
}
Otherwise it would in effect return this invalid json:
{
"filename": "another \"test\" note.md",
"filenameStem": "another \"test\" note",
"path": "another \"test\" note.md",
"absPath": "/Users/tjex/.local/src/zk-org/workbench/test-vault/another \"test\" note.md",
"title": "another \"test\" note",
+ "link": "[[another "test" note]]", <- the quotes around test are not escaped
"lead": "",
"body": "",
"snippets": [],
"rawContent": "# another \"test\" note\n\n\n",
"wordCount": 4,
"tags": [],
"metadata": {},
"created": "2024-04-01T03:11:03.540446352Z",
"modified": "2024-04-01T03:11:05.217630323Z",
"checksum": "a121f8d06da1bff78c5e5eb4816d0417ab2ec22317bad43baedb51a20c60df68"
}
With the initial idea for the fix in #400 , the output looks like this:
{
"filename": "another \"test\" note.md",
"filenameStem": "another \"test\" note",
"path": "another \"test\" note.md",
"absPath": "/Users/tjex/.local/src/zk-org/workbench/test-vault/another \"test\" note.md",
"title": "another \"test\" note",
"link": "[[another \"test\" note]]",
"lead": "",
"body": "",
"snippets": [],
"rawContent": "# another \"test\" note\n\n\n",
"wordCount": 4,
"tags": [],
"metadata": {},
"created": "2024-04-01T03:11:03.540446352Z",
"modified": "2024-04-01T03:11:05.217630323Z",
"checksum": "a121f8d06da1bff78c5e5eb4816d0417ab2ec22317bad43baedb51a20c60df68"
}
Links are still rendered correctly within documents:
# note title
[another "test" note](another%20%22test%22%20note)
[[another "test" note]]
I have a lot of notes with titles that contain quotes and I have recently noticed this issue when I was trying to export notes in JSON format
zk list -Pqf json | jq
I believe I have identified the source of this problem: #440.
fixed with #440