obsidian-dataview

Generic library & CLI

Open davecan opened this issue 4 years ago • 10 comments

Idea: convert the core query engine into a library that is then used by the plugin and can also be used by a CLI.

The ability to query structured data in Markdown files seems useful beyond just Obsidian since there are a growing set of tools that work on local .md files.

Ex:

$ mdq . --list --from '#foo' --where bar=baz --sort file.name
File A
File B
...
File N

Or of course just:

$ mdq . [query]
$ mdq . 'list from #foo where bar=baz sort file.name'

I have some ideas on use cases where this could be very powerful. Being able to run a local script that can query data from a set of files would reduce the need for writing a lot of custom error-prone regexes. And it could be leveraged by other scripts etc.

Of course this would be something further on the roadmap, but I wanted to put it out there so if you agree with the idea you could keep it in mind as you build out the capabilities.

davecan avatar Apr 10 '21 13:04 davecan

I like it; the direct way would be to just implement it as a TS/JS node package, and then build the CLI on top of that and the plugin would just keep it as a dependency.
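To make the lib/CLI/plugin split concrete, here is a minimal sketch of what the shared core could look like. It assumes the files have already been read into plain records; `Page`, `ListQuery`, `parseQuery`, and `runQuery` are hypothetical names for illustration, not Dataview's actual API, and the grammar covers only the `list from ... where ... sort ...` shape from the `mdq` example above.

```typescript
// Hypothetical core library: none of these names come from Dataview itself.
interface Page {
  name: string;                   // file name without extension, e.g. "File A"
  tags: string[];                 // e.g. ["#foo"]
  fields: Record<string, string>; // metadata key/value pairs
}

interface ListQuery {
  from?: string;                             // tag to select, e.g. "#foo"
  where?: { field: string; value: string };  // single equality test
  sort?: string;                             // "file.name" or a metadata field
}

// Parse a tiny subset of the syntax: list [from #tag] [where key=value] [sort key]
function parseQuery(text: string): ListQuery {
  const tokens = text.trim().split(/\s+/);
  if (tokens[0].toLowerCase() !== "list") {
    throw new Error("only 'list' queries are supported in this sketch");
  }
  const query: ListQuery = {};
  for (let i = 1; i < tokens.length; i += 2) {
    const keyword = tokens[i].toLowerCase();
    const arg = tokens[i + 1];
    if (arg === undefined) throw new Error(`missing argument after '${keyword}'`);
    if (keyword === "from") {
      query.from = arg;
    } else if (keyword === "where") {
      const eq = arg.indexOf("=");
      if (eq < 0) throw new Error("expected key=value after 'where'");
      query.where = { field: arg.slice(0, eq), value: arg.slice(eq + 1) };
    } else if (keyword === "sort") {
      query.sort = arg;
    } else {
      throw new Error(`unknown keyword '${keyword}'`);
    }
  }
  return query;
}

// Resolve the value a page is sorted by.
function sortKey(page: Page, key: string): string {
  return key === "file.name" ? page.name : page.fields[key] ?? "";
}

// Pure function over in-memory pages, so a CLI and a plugin could both call it.
function runQuery(pages: Page[], query: ListQuery): string[] {
  let result = pages;
  const tag = query.from;
  if (tag !== undefined) {
    result = result.filter((p) => p.tags.includes(tag));
  }
  const where = query.where;
  if (where !== undefined) {
    result = result.filter((p) => p.fields[where.field] === where.value);
  }
  const key = query.sort;
  if (key !== undefined) {
    result = [...result].sort((a, b) => sortKey(a, key).localeCompare(sortKey(b, key)));
  }
  return result.map((p) => p.name);
}

const pages: Page[] = [
  { name: "File B", tags: ["#foo"], fields: { bar: "baz" } },
  { name: "File A", tags: ["#foo"], fields: { bar: "baz" } },
  { name: "File C", tags: ["#foo"], fields: { bar: "other" } },
];
// Logs File A and File B, in that order.
console.log(runQuery(pages, parseQuery("list from #foo where bar=baz sort file.name")));
```

Because `runQuery` touches neither the filesystem nor Obsidian, the CLI would only need to add file reading and argument handling, and the plugin would only need to add rendering.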

For performance, it would be interesting to build it in another language (C, Rust, whatever), though it would probably be difficult to interface with in Dataview.

blacksmithgu avatar Apr 11 '21 06:04 blacksmithgu

Yep pretty much what I was thinking, end up with three projects: lib, CLI, plugin.

It runs plenty fast enough for me right now, especially for a 0.2-level release, but maybe it has issues with larger vaults?

The things that stand out to me more are that it doesn't auto-refresh when queried notes are updated (which may be a limitation of the platform and is easily worked around by re-opening the note, going back and forward, etc.) and that it occasionally seems to "remember" a renamed note as two separate notes with the same name. If the original note is not in the default location, clicking one of the links has a 50/50 chance of opening either the correct note or a new blank note with the same name in the default location. This may also be a limitation imposed by Obsidian; I'm not sure how you handle indexing/caching, since I haven't dug that far into the code yet. It only seems to happen occasionally and is worked around by closing and re-opening the vault (presumably forcing a cache flush of some kind?), so it's not a major problem either.

Are there scenarios where it slows down considerably? File count thresholds, nesting thresholds, ... ?

davecan avatar Apr 11 '21 13:04 davecan

Even manual queries over the whole vault that don't use the indices (something like LIST WHERE <thing> without a FROM statement) are pretty fast; they're just doing some simple operations over text. I would imagine performance would be fine as-is, with only basic optimizations, for tens of thousands of notes, maybe ~100,000 before it starts degrading. If you use FROM statements appropriately then it will work fine for much larger vaults. The big cost is the initial indexing time when you first open a vault, which could be fixed by caching index state.
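As a rough illustration of why a FROM clause helps and what "caching index state" could mean, here is a minimal sketch of a tag index that can be saved and reloaded. This is an assumption about the general technique, not a description of Dataview's internals; `TagIndex`, `buildTagIndex`, `candidates`, `serializeIndex`, and `deserializeIndex` are hypothetical names.

```typescript
// Hypothetical index: maps each tag to the set of file paths carrying it.
type TagIndex = Map<string, Set<string>>;

// Building the index is the expensive step: it visits every file once.
function buildTagIndex(files: { path: string; tags: string[] }[]): TagIndex {
  const index: TagIndex = new Map();
  for (const file of files) {
    for (const tag of file.tags) {
      let paths = index.get(tag);
      if (paths === undefined) {
        paths = new Set<string>();
        index.set(tag, paths);
      }
      paths.add(file.path);
    }
  }
  return index;
}

// FROM #tag becomes a single map lookup instead of a scan over every file.
function candidates(index: TagIndex, tag: string): string[] {
  const paths = index.get(tag);
  return paths === undefined ? [] : Array.from(paths).sort();
}

// Serialize to JSON so the index could be written to disk between sessions.
function serializeIndex(index: TagIndex): string {
  const plain: Record<string, string[]> = {};
  index.forEach((paths, tag) => {
    plain[tag] = Array.from(paths);
  });
  return JSON.stringify(plain);
}

// Rebuild the in-memory index from the cached JSON without re-reading files.
function deserializeIndex(json: string): TagIndex {
  const plain = JSON.parse(json) as Record<string, string[]>;
  const index: TagIndex = new Map();
  for (const tag of Object.keys(plain)) {
    index.set(tag, new Set(plain[tag]));
  }
  return index;
}
```

A real cache would also need some way to notice files that changed while the vault was closed (for example by comparing modification times), which this sketch leaves out.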

The weirdness around renames is probably because I haven't properly handled renames (I handle create/delete, but rename is separate from those).
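For illustration only, here is a sketch of what handling a rename as its own event could look like for a path-keyed index. The `TagIndex` shape and the `renamePath` helper are hypothetical and not taken from the Dataview code; the point is just that a rename has to move every entry from the old path to the new one, otherwise the old path lingers until the index is rebuilt.

```typescript
// Hypothetical index shape: tag -> set of file paths carrying that tag.
type TagIndex = Map<string, Set<string>>;

// Move every index entry from oldPath to newPath.
// Returns how many tag sets were updated, which is handy for logging.
function renamePath(index: TagIndex, oldPath: string, newPath: string): number {
  let updated = 0;
  index.forEach((paths) => {
    // Set.delete returns true only if oldPath was present in this set.
    if (paths.delete(oldPath)) {
      paths.add(newPath);
      updated += 1;
    }
  });
  return updated;
}

const index: TagIndex = new Map([
  ["#foo", new Set(["inbox/Note.md", "Other.md"])],
  ["#bar", new Set(["inbox/Note.md"])],
]);

// After this call no tag set refers to "inbox/Note.md" any more.
renamePath(index, "inbox/Note.md", "projects/Note.md");
```

If only a create handler ran for the new path, the index would hold both `inbox/Note.md` and `projects/Note.md`, which would look a lot like the duplicate entries described above.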

blacksmithgu avatar Apr 12 '21 04:04 blacksmithgu

Coincidentally just a couple minutes ago I deleted a file that was listed in a dataview and it wasn't updated in the view even when I went to another file then went back to the view in the same pane. It still listed the file with its original name but the link color was changed by Obsidian to reflect it was a missing file. Had to close the vault and reopen for it to be removed.

It's not a major problem, just something I literally just ran into right before coming back to this thread.

davecan avatar Apr 12 '21 14:04 davecan

Oddly, I just deleted several files and when I navigated back & forward in the view with the query it was updated correctly. So it is intermittent at best I suppose.

Edit: And I just deleted a few more from another view in the same note, but when I navigated away & back to it they were still listed. ¯\_(́ ◡◝ )_/¯

davecan avatar Apr 12 '21 20:04 davecan

I've actually wanted to do this for portability reasons in something like Rust or Python. Anyone game for that?

AB1908 avatar Oct 19 '22 09:10 AB1908

I think the core of dataview (the query part) wouldn't be that bad to implement in a Rust daemon. The main downside is we need to get a lot more rigorous with the data model (we can't change fields easily once they are in a CLI).

blacksmithgu avatar Oct 19 '22 21:10 blacksmithgu

Seems like a fun side project. I wish there was a way to rewrite the parser purely in Rust and then somehow embed it into Obsidian.

blacksmithgu avatar Oct 19 '22 21:10 blacksmithgu

I think there's a plugin that does something similar. I'll try and give it a look and present my findings here.

AB1908 avatar Oct 19 '22 21:10 AB1908

Bump because this would be really useful <3

Tejeev avatar Feb 20 '25 04:02 Tejeev