
Feature Request: Counting Citation without Brackets

Open Liong1976 opened this issue 1 year ago • 9 comments

First, thank you, @lukeleppan and @chrisgrieser, for the latest update, which allows this plugin to count footnotes and citations.

As I mentioned in the Discord, I use citations without brackets, @JOETheEffect1994, instead of [@JOETheEffect1994]. The plugin only shows that I have 1 citation, even though I have many more than one.


The format of the only citation that was counted was like this:

^[@JOETheEffect [p. 314]]

Mostly, I put citations in inline footnotes that look like this:

^[Plura mihi bona sunt, inclinet, amari petere vellent. Ab illo tempore, ab est sed immemorabili. Ullamco laboris nisi ut aliquid ex ea commodi consequat. See @JOETheEffect [p. 314].]

I wonder if this plugin can also count citations without brackets.

I appreciate your help.

Liong1976 avatar Mar 30 '23 23:03 Liong1976

All praise should go to @chrisgrieser.

I have never used footnotes or citations in Markdown, but I have looked at the formats that @chrisgrieser implemented.

First, the footnote formats (<> is not part of it):

  • Inline Footnotes: ^[<any text (minimum length of 2)>]
  • Regular Footnotes: [^<any text except whitespace (spaces)>]
  • Regular footnotes followed by ':' are not counted, e.g. [^<any text except whitespace>]:
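The footnote rules above can be sketched roughly as two regular expressions in TypeScript. This is only an illustration of the rules as described here, not the plugin's actual code. The names `INLINE_FOOTNOTE`, `REGULAR_FOOTNOTE`, and the two counting functions are made up for the example, and treating "any text" as "anything except a closing bracket" is an assumption.

```typescript
// Inline footnotes: ^[...] with at least two characters inside.
// Assumption: the text may not contain ']' (a nested bracket ends the match early).
const INLINE_FOOTNOTE = /\^\[[^\]]{2,}\]/g;

// Regular footnote references: [^label] with no whitespace in the label.
// The negative lookahead skips definitions such as "[^1]: some text".
const REGULAR_FOOTNOTE = /\[\^[^\s\]]+\](?!:)/g;

function countInlineFootnotes(text: string): number {
  return (text.match(INLINE_FOOTNOTE) || []).length;
}

function countRegularFootnotes(text: string): number {
  return (text.match(REGULAR_FOOTNOTE) || []).length;
}

const footnoteSample =
  "A claim.^[An inline note.] Another.[^1] And more.[^note]\n\n" +
  "[^1]: Definition.\n[^note]: Other definition.";

const inlineCount = countInlineFootnotes(footnoteSample);   // 1
const regularCount = countRegularFootnotes(footnoteSample); // 2 (definitions are skipped)
```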

Second, the citation formats (<> is not part of it):

  • @<text (only A-Z, a-z, 0-9)>]
  • @<text (only A-Z, a-z, 0-9)>,
  • @<text (only A-Z, a-z, 0-9)>;
  • Won't be counted if followed directly by '('
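As a rough TypeScript sketch of the citation rules listed above (again an illustration of the description, not the plugin's real regex; `CITATION` and `countCitations` are hypothetical names), a lookahead can require that the key is directly followed by `]`, `,`, or `;`:

```typescript
// '@' followed by letters/digits only, and then directly by ']', ',' or ';'.
// A key followed by '(' or a space never satisfies the lookahead, so it is not counted.
const CITATION = /@[A-Za-z0-9]+(?=[\],;])/g;

function countCitations(text: string): number {
  return (text.match(CITATION) || []).length;
}

// The key followed by a space is missed, which matches the behaviour reported in this issue.
const missed = countCitations("See @JOETheEffect [p. 314].");   // 0
// Adding a comma after the key makes it count.
const counted = countCitations("See @JOETheEffect, [p. 314]."); // 1
const mixed = countCitations("[@Smith2020] and @Doe1999; but not @today("); // 2
```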

So citations don't have to be in brackets. Also, are you sure about the square brackets around the page number? Obsidian seems to treat them as a link, at least in Live Preview. (p. 214) would probably be better, but it works either way.

It looks like you can make your citations work by putting a comma after the key, like this: ^[Plura mihi bona sunt, inclinet, amari petere vellent. Ab illo tempore, ab est sed immemorabili. Ullamco laboris nisi ut aliquid ex ea commodi consequat. See @JOETheEffect, [p. 314].]

Please let me know if I got anything wrong or if you believe this behaviour should change in any way.

lukeleppan avatar Mar 31 '23 10:03 lukeleppan

Hi @lukeleppan,

Thanks for your response.

First, I don't get your explanation about using <> because I don't use those symbols in my footnotes. Also, for my case, I don't see any issues with the footnote counting.

Second, if I add a comma after the citation key, Pandoc will not render it correctly.

Pandoc is supposed to render it as:

See Joe Donald, The Effect of Consumerism (New York: Penguin, 1994), 314.

With the comma, Pandoc renders it as:

See Joe Donald, The Effect of Consumerism (New York: Penguin, 1994), [p. 314].

Liong1976 avatar Mar 31 '23 12:03 Liong1976

Hey, thanks for the kind words, both of you 😊

As @Liong1976 says, citations without brackets are indeed valid. However, the reason why I didn't include them in the regex is that a lot of plugins use some syntax with @ – for example Natural Language Dates uses things like @today. To avoid counting all those into the citation count, I wrote the regex to only count citations in square brackets.

Like, since @something isn't part of the markdown standard, it's used for many different use cases, meaning there is no perfect solution for this. I think a reasonable approach would be to add a setting and let the user decide for themselves whether they want to count citations without brackets (and potentially having false positives due to other plugins) or only count citations with brackets.
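One way such a setting could behave, as a hedged TypeScript sketch: `CitationSettings`, `countUnbracketed`, and `countCitations` are hypothetical names that do not exist in the plugin, and the key pattern is deliberately simplified (letters, digits, and underscores only).

```typescript
// Hypothetical settings shape; the real plugin settings may look different.
interface CitationSettings {
  countUnbracketed: boolean;
}

// Simplified key pattern: '@' at the start of the text or after whitespace, '[' or ';'.
const KEY = /(^|[\s\[;])@[A-Za-z0-9_]+/g;
// Square-bracket groups without nested brackets, e.g. "[@a; @b, p. 3]".
const BRACKET_GROUP = /\[[^\[\]]*\]/g;

function countCitations(text: string, settings: CitationSettings): number {
  if (settings.countUnbracketed) {
    // Counts every @key, so plugin syntax such as @today becomes a false positive.
    return (text.match(KEY) || []).length;
  }
  // Strict mode: only keys that sit inside a bracket group are counted.
  let total = 0;
  for (const group of text.match(BRACKET_GROUP) || []) {
    total += (group.match(KEY) || []).length;
  }
  return total;
}

const note = "Meet @today. See [@Smith2020; @Doe1999, p. 3] and @Lee2001 [p. 5].";
const strictCount = countCitations(note, { countUnbracketed: false }); // 2
const looseCount = countCitations(note, { countUnbracketed: true });   // 4, including @today
```

The loose count shows the trade-off described above: the unbracketed @Lee2001 is found, but so is @today.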

chrisgrieser avatar Mar 31 '23 13:03 chrisgrieser

Hi @chrisgrieser,

Thanks for your response.

I understand why you didn't make this plugin include citations without brackets.

However, your proposal, letting the user choose whether they want to use brackets for the citations, looks good to me.

Liong1976 avatar Mar 31 '23 13:03 Liong1976

@chrisgrieser In addition to the case of citations without brackets being discussed here, I'd like to bring into this discussion the issue of citekeys with dots ([@cite.key]) not being recognized, which was raised by @Gewerd-Strauss in the PR at https://github.com/lukeleppan/better-word-count/pull/79#issuecomment-1492152966.

In my first document, Pandoc Reference List counted 14 citations, and Better Word Count only counted 4. The only problematic cases I found were the lack of brackets already discussed and then many cases of citekeys with dots in them.

FeralFlora avatar Mar 31 '23 21:03 FeralFlora

Changing what counts as a citekey is easy for me to do. The problem is rather that, as outlined above, a less strict definition creates a lot of potential incompatibilities with other plugins.

Adding a setting to customize what should be regarded as a citekey would require changes to the settings UI, which I am not familiar with. That is something @lukeleppan (or someone else familiar with Svelte) will have to implement.

In the meantime, it would be really helpful if someone knows of an authoritative source on exactly which characters are valid in a citekey. I have looked for something like that more than once but haven't found anything definitive.

chrisgrieser avatar Mar 31 '23 21:03 chrisgrieser

> In the meantime, it would be really helpful if someone knows of an authoritative source on exactly which characters are valid in a citekey. I have looked for something like that more than once but haven't found anything definitive.

See the Pandoc docs on Citation syntax. That's as authoritative as it gets: https://pandoc.org/MANUAL.html#citation-syntax

And perhaps also this discussion on the Pandoc repo: https://github.com/jgm/pandoc/issues/6026

FeralFlora avatar Mar 31 '23 21:03 FeralFlora

Okay, so it seems this is the most accurate information you can get on the topic, and it should be implemented by default:

> Unless a citation key starts with a letter, digit, or _, and contains only alphanumerics and single internal punctuation characters (:.#$%&-+?<>~/), it must be surrounded by curly braces, which are not considered part of the key. In @Foo_bar.baz., the key is Foo_bar.baz because the final period is not internal punctuation, so it is not included in the key. In @{Foo_bar.baz.}, the key is Foo_bar.baz., including the final period. In @Foo_bar--baz, the key is Foo_bar because the repeated internal punctuation characters terminate the key. The curly braces are recommended if you use URLs as keys: [@{https://example.com/bib?name=foobar&date=2000}, p. 33].

So not all examples shared here are actually valid citekeys. Nevertheless, the specification is quite a bit more complicated than I expected, with all those rules concerning punctuation, so implementing it might be a bit tricky (or at least require some scripting in addition to pattern matching).
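For what it's worth, the quoted rules may fit into a single pattern after all. The following TypeScript sketch is an attempt at that, under stated assumptions: it only handles ASCII letters and digits, it does not check what precedes the `@` (so e-mail addresses would be picked up), and `CITEKEY` and `extractCitekeys` are names invented for this example.

```typescript
// Either a braced key, @{...}, or a bare key: runs of word characters
// joined by SINGLE internal punctuation characters from the quoted list.
const CITEKEY =
  /@(?:\{([^}]+)\}|([A-Za-z0-9_]+(?:[:.#$%&\-+?<>~\/][A-Za-z0-9_]+)*))/g;

function extractCitekeys(text: string): string[] {
  const keys: string[] = [];
  CITEKEY.lastIndex = 0; // reset because the regex is global and reused
  let match: RegExpExecArray | null;
  while ((match = CITEKEY.exec(text)) !== null) {
    const key = match[1] || match[2]; // braced key first, otherwise bare key
    if (key) keys.push(key);
  }
  return keys;
}

// The three examples from the quoted passage:
const parsed = extractCitekeys("@Foo_bar.baz. and @{Foo_bar.baz.} and @Foo_bar--baz");
// parsed is ["Foo_bar.baz", "Foo_bar.baz.", "Foo_bar"]
```

A punctuation character is only consumed when a word character follows it, which is what drops the trailing period and stops at the doubled hyphen.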

chrisgrieser avatar Apr 01 '23 11:04 chrisgrieser

It might be easier to just require a .bib or .json bibliography file for this feature and then check the keys against that. Any citekey in the file is recognized regardless of syntax. This would be similar to how Pandoc Reference List works. There could even be some functionality convergence here.
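A minimal TypeScript sketch of that idea, with everything in it being an assumption rather than existing plugin or Pandoc Reference List code: `keysFromBibtex`, `keysFromCslJson`, and `countKnownCitations` are hypothetical helpers, the BibTeX handling is a simple regex rather than a real parser, and braced keys (`@{...}`) are not handled.

```typescript
// Collect entry keys from BibTeX text, e.g. "@article{smith2020," gives "smith2020".
function keysFromBibtex(bib: string): string[] {
  const entry = /@[A-Za-z]+\s*\{\s*([^,\s{}]+)\s*,/g;
  const keys: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = entry.exec(bib)) !== null) {
    const key = match[1];
    if (key) keys.push(key);
  }
  return keys;
}

// Collect ids from a CSL JSON bibliography (an array of objects with an "id" field).
function keysFromCslJson(json: string): string[] {
  const entries = JSON.parse(json) as Array<{ id: string | number }>;
  return entries.map((e) => String(e.id));
}

// Count "@something" tokens in the note whose key exists in the bibliography.
function countKnownCitations(text: string, knownKeys: string[]): number {
  const candidates = text.match(/@[^\s\[\];,{}]+/g) || [];
  let count = 0;
  for (const candidate of candidates) {
    let key = candidate.slice(1);
    // Strip trailing punctuation (such as a sentence-ending period) until the key is known.
    while (key.length > 0 && knownKeys.indexOf(key) === -1 && !/[A-Za-z0-9_]$/.test(key)) {
      key = key.slice(0, -1);
    }
    if (knownKeys.indexOf(key) !== -1) count++;
  }
  return count;
}

const bib = "@article{JOETheEffect1994,\n  title = {The Effect},\n}\n@book{cite.key,\n  title = {B},\n}";
const known = keysFromBibtex(bib);
const total = countKnownCitations(
  "See @JOETheEffect1994 [p. 314]. Also [@cite.key]. Due @today.",
  known
); // 2: @today is ignored because it is not in the bibliography
```

With this approach, the unbracketed key and the dotted key are both counted, while plugin syntax such as @today is ignored without needing a separate setting.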

FeralFlora avatar Apr 01 '23 11:04 FeralFlora