obsidian-dataview
Add implicit attribute: file.wordcount
Is your feature request related to a problem? Please describe.
The currently available wordcount plugins only count words in individual notes. For many use cases, e.g. longform writing where you split your project across multiple notes, counting the words of many notes together would be an essential feature.
Describe the solution you'd like
An implicit attribute file.wordcount to easily display the word count. Basically what file.size is already doing, but in words as opposed to bytes/characters.
Describe alternatives you've considered
The longform plugin does not have such a feature yet, and neither do the wordcount plugins. I also considered writing a small shell script that runs wc and prepends the result to every file as a YAML header. However, such a solution would be impractical, as it creates YAML headers where you often do not want them (for example, longform's compile does not handle YAML headers yet).
It's my first time doing this but how hard is it to do word counts? I could probably try tackling this since it seems like a minor indexing enhancement.
Well, a proper word count isn't as trivial as I thought in the beginning. Things like non-ASCII characters (special characters in many languages), markdown markup, and questions like "should footnotes count or not, should comments count or not" mean that a bit of configuration is needed. You can take a look at my WordCount Dashboard: https://gist.github.com/chrisgrieser/ac16a80cdd9e8e0e84606cc24e35ad99
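To make those configuration questions concrete, here is a minimal sketch of a configurable counter. It is not the code from the dashboard and not anything Dataview ships; the names `countWords`, `WordCountOptions`, `includeComments` and `includeFootnotes` are made up for illustration, and the regular expressions only cover the common cases (a YAML header, `%%...%%` and HTML comments, footnotes, links, emphasis markers).

```typescript
// Hypothetical options, named for illustration only.
interface WordCountOptions {
  includeComments: boolean;  // %%...%% and <!-- ... --> comments
  includeFootnotes: boolean; // footnote definitions such as "[^1]: text"
}

function countWords(markdown: string, options: WordCountOptions): number {
  let text = markdown;

  // Drop a YAML header at the very top of the note.
  text = text.replace(/^---\r?\n[\s\S]*?\r?\n---\r?\n?/, "");

  if (options.includeComments) {
    // Keep the comment text, drop only the delimiters.
    text = text.replace(/%%|<!--|-->/g, " ");
  } else {
    text = text.replace(/%%[\s\S]*?%%/g, " ").replace(/<!--[\s\S]*?-->/g, " ");
  }

  if (!options.includeFootnotes) {
    // Remove whole footnote definition lines.
    text = text.replace(/^\[\^[^\]]+\]:.*$/gm, " ");
  }
  // Remove footnote markers such as "[^1]" (and the "[^1]:" label).
  text = text.replace(/\[\^[^\]]+\]:?/g, " ");

  // Links: keep only the visible text.
  text = text.replace(/!?\[([^\]]*)\]\([^)]*\)/g, "$1");         // [text](url)
  text = text.replace(/\[\[(?:[^\]|]*\|)?([^\]]*)\]\]/g, "$1"); // [[note|alias]]

  // Delete emphasis, code and highlight markers without splitting words.
  text = text.replace(/[*_`~=]/g, "");

  // Split on whitespace (so non-ASCII words stay whole) and ignore tokens
  // that consist only of ASCII punctuation or dashes, e.g. "#", "-", ">".
  const hasWordChar = /[^\s!-\/:-@\[-`{-~\u2013\u2014]/;
  return text.split(/\s+/).filter((token) => hasWordChar.test(token)).length;
}

const sample = [
  "---",
  "title: Draft",
  "---",
  "# Kapitel 1",
  "",
  "Ein **schöner** Tag.[^1] %%todo: fix this%%",
  "",
  "[^1]: Eine Fußnote.",
].join("\n");

console.log(countWords(sample, { includeComments: false, includeFootnotes: false })); // 5
console.log(countWords(sample, { includeComments: true, includeFootnotes: true }));   // 10
```

Note that this still counts things like the "1." of a numbered list as a word, so it is a starting point rather than an exact count.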
Sounds fairly painful, in that case. I wonder if it would make sense to add a separate plugin as a dependency that already does this well. In any case, I feel like I'm out of my depth here haha.
A very rough word count isn't too hard, but getting an accurate one that doesn't count markup/keywords can be challenging. Maybe a rough-ish one may still be useful?
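For reference, a very rough count can be as small as the sketch below: split on whitespace and count the pieces. The function name `roughWordCount` is just an illustrative choice, not an existing Dataview function. Because nothing is stripped, markup tokens such as "#" and "**dark**" are counted as words.

```typescript
// Rough word count: split on whitespace and count the pieces.
// No markup is stripped, so "#" and "**dark**" each count as one word.
function roughWordCount(text: string): number {
  const trimmed = text.trim();
  if (trimmed.length === 0) {
    return 0;
  }
  return trimmed.split(/\s+/).length;
}

const note = "# Chapter 1\n\nIt was a **dark** and stormy night.";
console.log(roughWordCount(note)); // 10
```

Since the split is on whitespace only, words with non-ASCII characters are counted like any other word.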
Well, while I said the word count isn't trivial, I didn't say that it is hard to implement – in the word count dashboard linked above (which is a big dataviewjs snippet, after all), I basically solved most related issues 😝
If you use the code base I have already written, it probably shouldn't be that hard to get an accurate word count. For the word count configuration (include/exclude comments, etc.), it will probably make sense either to have some toggles in the dataview plugin settings, or simply to set separate attributes, similar to dataview's current file.tags and file.etags. Also, if you have any questions about the details of getting an accurate word count, feel free to ask me. I have spent way more time on that topic than I should have 🙈
+1 for this feature
also +1 for this
Also +1, would love to see this.
Count me in too, this would unlock all sorts of useful possibilities (like using it to make target word count progress bars for instance)
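As a rough sketch of what such a progress bar could look like, assuming the per-note word counts are already available from somewhere: the function `progressBar`, the note names and the numbers below are all made up for illustration.

```typescript
// Render a plain-text progress bar for a word count target.
function progressBar(current: number, target: number, width: number): string {
  // Clamp to 100% so an overshoot does not overflow the bar.
  const ratio = target > 0 ? Math.min(current / target, 1) : 0;
  const filled = Math.round(ratio * width);
  let bar = "";
  for (let i = 0; i < width; i++) {
    bar += i < filled ? "#" : "-";
  }
  const percent = Math.round(ratio * 100);
  return "[" + bar + "] " + percent + "% (" + current + "/" + target + ")";
}

// Hypothetical per-note word counts for a project split across notes.
const chapters = [
  { note: "Chapter 1", words: 1200 },
  { note: "Chapter 2", words: 850 },
  { note: "Chapter 3", words: 450 },
];

let total = 0;
for (const chapter of chapters) {
  total += chapter.words;
}

console.log(progressBar(total, 10000, 20));
// [#####---------------] 25% (2500/10000)
```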
+1 for this feature :)
Also +1
Yes! +1