obsidian-dataview icon indicating copy to clipboard operation
obsidian-dataview copied to clipboard

Bug report: task implicit `tags` field detects erroneous items that follow `#` symbol but are not tags, `file.tags` correctly follows Obsidian behavior on same items

Open AnnaKornfeldSimpson opened this issue 1 year ago • 9 comments

What happened?

The implicit tags field on Tasks incorrectly parses some items that follow a # symbol but are not tags, such as the color in <mark style="background: #FFF3A3A6;">, as tags, in operations such as GROUP BY tags. Obsidian does not recognize these items as tags, so dataview should not either. In fact, if using GROUP BY file.tags instead, the behavior is correct and no erroneous tags are invented, so the parsing error is specific to the implicit tags field on tasks. Does file.tags use some Obsidian API for getting tags that a task's tags field does not?

Found thanks to @claremacrae investigation of a similar issue in the Tasks plugin: https://github.com/obsidian-tasks-group/obsidian-tasks/issues/929 Related: #1268 seems to find the same issue with dataviewjs.

DQL

Test case (found by @claremacrae on obsidian-tasks, wrapped in codeblock to display correctly on Github, remove the codeblock when copying)

- [ ] <mark style="background: #FFF3A3A6;">Make phone call 10:50 today</mark>
- [ ] [Enable preview feature #678](https://github.com/obsidian-tasks-group/obsidian-tasks)
- [ ] [Internal: Better abstraction for grouping of tasks · Discussion #920 · obsidian-tasks-group/obsidian-tasks · GitHub](https://github.com/obsidian-tasks-group/obsidian-tasks/discussions/920)

Incorrect results (test case above is grouped by erroneous items following # that are not tags)

TASK
FROM "path/To/Test/File"
GROUP BY tags

Correct results (all in same group, no tags)

TASK
FROM "path/To/Test/File"
GROUP BY file.tags

JS

/* https://github.com/blacksmithgu/obsidian-dataview/issues/1268 Issue # 1268 seems to have repro this in JS also. I did not test JS. */

Dataview Version

0.5.41

Obsidian Version

0.15.8

OS

Windows

AnnaKornfeldSimpson avatar Jul 25 '22 19:07 AnnaKornfeldSimpson

Possible fix: change data-import/common.ts#L5 regex to require either start of line or at least one white-space in front of the tag? EDIT: I do not have time to exhaustively test this and submit a PR, but hopefully this will help someone who does! EDIT 2: Is it not possible to just get the tags from Obsidian?

AnnaKornfeldSimpson avatar Jul 30 '22 22:07 AnnaKornfeldSimpson

It is possible to get tags from Obsidian but not the way it can be extracted for each task.

AB1908 avatar Jul 30 '22 23:07 AB1908

@AB1908 said:

It is possible to get tags from Obsidian but not the way it can be extracted for each task.

Could you say a little more please about why extracting for each task does not work? I thought there was a way to convert from an Obsidian "Pos" to a line number? (though I cannot find in Obsidian API right now) CacheItems have a field with "Pos".

AnnaKornfeldSimpson avatar Jul 31 '22 00:07 AnnaKornfeldSimpson

My assumption here is that blacksmithgu is using some regex to parse the task text and also extract tags. The Obsidian metadataCache does give us pos attributes (didn't check for tags though) but then it would require correlating with the task metadata. Now that I think through it, it does seem like it should work but I imagine it was somewhat easier to throw a tag matcher in an existing task text parser.

AB1908 avatar Jul 31 '22 00:07 AB1908

My assumption here is that blacksmithgu is using some regex to parse the task text and also extract tags. The Obsidian metadataCache does give us pos attributes (didn't check for tags though) but then it would require correlating with the task metadata. Now that I think through it, it does seem like it should work but I imagine it was somewhat easier to throw a tag matcher in an existing task text parser.

Yes it is a regex currently at data-import/common.ts#L5 to extract tags. Your logic makes sense, or at least it is easier until you hit test cases like the HTML highlight one above.

AnnaKornfeldSimpson avatar Jul 31 '22 01:07 AnnaKornfeldSimpson

It was indeed easier to do per-task matching then to traverse the tags array for positions. Will see if I can swap to using the Obsidian cache instead.

blacksmithgu avatar Aug 06 '22 23:08 blacksmithgu

It was indeed easier to do per-task matching then to traverse the tags array for positions. Will see if I can swap to using the Obsidian cache instead.

Hi @blacksmithgu - just to suggest a possibly easier alternative that I think would give a pretty good result.

If you can pass file.tags in to the code that parses the task line, then that code could intersect the tags it found in the line with the tags in the file, and only keep in the Task the tags that are in both lists.

There could be some very rare cases where a tag elsewhere in the file matches a not-tag in the line, but speaking personally, in the Tasks plugin, I would take a simple fix like this, that's easy to write and review and release, over a rewrite at this point. So I wanted to suggest it to you too.

claremacrae avatar Aug 07 '22 08:08 claremacrae

Was anyone able to write a quick fix that we can deploy manually?

P.S. I can confirm that the bug is also seen when using "Link to headings", such as:

- [ ] Rewrite this section: [[Article#Section]]

alicangok avatar Sep 25 '22 16:09 alicangok

Funnily enough I am working on exactly this right now in Tasks.

When I'm happy with it, I will share the information back here.

claremacrae avatar Sep 25 '22 21:09 claremacrae