`Analyse()` should return the score as well

Open walles opened this issue 2 years ago • 5 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues

What problem does this feature solve?

Chroma can classify text by its contents:

lexer := lexers.Analyse("package main\n\nfunc main()\n{\n}\n")

With this API though, it's not possible for me to know if even the best match is bad.

I would like to find that out, so that I can skip highlighting when the classification is uncertain.

What feature do you propose?

One possible suggestion would be to change the API to this...

lexer, certainty := lexers.Analyse("package main\n\nfunc main()\n{\n}\n")

... where certainty is a number on a well-defined and documented scale.

Then, if I feel this number is too low, I could choose not to highlight anything.

walles, Oct 06 '23 05:10

Seems reasonable, how could we do this in a backwards compatible manner?

alecthomas, Oct 06 '23 06:10

Maybe this?

lexer, certainty := lexers.AnalyseScore("package main\n\nfunc main()\n{\n}\n")

Possibly in combination with deprecating the existing function since it's sort of unpredictable.
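To make the backwards-compatibility idea concrete, here is a minimal sketch of how a score-returning function could sit next to the existing one. It uses stand-in types rather than Chroma's real ones: the `Lexer` interface, `keywordLexer`, `registry`, and the scores are all hypothetical, and the 0 to 1 scale is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// Lexer is a hypothetical stand-in for a lexer type; it is not Chroma's interface.
type Lexer interface {
	Name() string
	// AnalyseText returns a score in [0, 1] (assumed scale); higher means a better match.
	AnalyseText(text string) float32
}

// keywordLexer is a toy lexer that scores text by the presence of one marker string.
type keywordLexer struct {
	name   string
	marker string
	score  float32
}

func (k keywordLexer) Name() string { return k.name }

func (k keywordLexer) AnalyseText(text string) float32 {
	if strings.Contains(text, k.marker) {
		return k.score
	}
	return 0
}

// registry is a made-up list of lexers used only for this illustration.
var registry = []Lexer{
	keywordLexer{name: "Go", marker: "package main", score: 0.9},
	keywordLexer{name: "C", marker: "#include", score: 0.8},
}

// AnalyseScore returns the best-scoring lexer and its score, or (nil, 0) if nothing matches.
func AnalyseScore(text string) (Lexer, float32) {
	var best Lexer
	var highest float32
	for _, l := range registry {
		if s := l.AnalyseText(text); s > highest {
			best, highest = l, s
		}
	}
	return best, highest
}

// Analyse keeps the existing single-value signature by discarding the score.
func Analyse(text string) Lexer {
	l, _ := AnalyseScore(text)
	return l
}

func main() {
	lexer, certainty := AnalyseScore("package main\n\nfunc main()\n{\n}\n")
	fmt.Println(lexer.Name(), certainty)
	fmt.Println(Analyse("plain prose") == nil)
}
```

Since `Analyse` only wraps `AnalyseScore`, existing callers would keep compiling unchanged, and deprecating it later would be a separate step.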

walles, Oct 06 '23 07:10

In what way is it unpredictable?

alecthomas, Oct 07 '23 21:10

Not sure if "unpredictable" is the right word, but let's say:

  1. I get a file
  2. lexers.Analyse() says it's a C program, with 1% confidence

This means that even though C is the "best" guess, it's still a bad guess, and it might be better to not highlight at all.

That's why I'd like to have the confidence number as well to be able to make this judgement.
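As a rough sketch of the judgement I have in mind on the caller's side: the `guess` type, the `bestGuess` helper, the 0 to 1 confidence scale, and the 0.5 cutoff below are all made up for illustration and are not part of Chroma.

```go
package main

import "fmt"

// guess pairs a language name with a confidence in [0, 1].
// Both the type and the scale are assumptions for illustration.
type guess struct {
	language   string
	confidence float32
}

// bestGuess returns the highest-confidence guess; ok is false when there is
// no guess with a positive confidence or the best one is below minConfidence.
func bestGuess(guesses []guess, minConfidence float32) (best guess, ok bool) {
	for _, g := range guesses {
		if g.confidence > best.confidence {
			best = g
		}
	}
	return best, best.language != "" && best.confidence >= minConfidence
}

func main() {
	// C is the "best" guess here, but at 1% confidence it is still a bad one.
	guesses := []guess{{"C", 0.01}, {"Python", 0.005}}
	if g, ok := bestGuess(guesses, 0.5); ok {
		fmt.Println("highlight as", g.language)
	} else {
		fmt.Println("no confident match, leaving text unhighlighted")
	}
}
```

The right cutoff would depend on how the scale ends up being defined, which is why documenting it matters.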

walles, Oct 08 '23 09:10