highlight.js
Prism vs Highlight.js: Why choose one over the other?
Related: #3619 #3621 #3623
In trying to figure out the direction of the project moving forward, where we go, how we should or shouldn't integrate with other parsers (at the 1st party level), it'd be helpful to know WHY people prefer Highlight.js to Prism - or how they think about the differences. No detail is too large or too small. Or if you evaluated other highlighting solutions (than just these two), those thoughts would be welcome here also - provided you have a few points of comparison.
The parsing engines are of course very different. But I've never really thought about which is "better" in an absolute sense... I've just always assumed that a mono-culture is bad - and that it's good we have two great choices for highlighting on the client-side - and that people obviously have reasons for preferring one over the other.
Suddenly I'm much more curious about the WHY, and what those reasons are. Anyone care to share?
Let's keep this civil and respectful... it's OK to share perceived negatives, but let's avoid things like "Highlight.js sucks because" or vice versa, etc...
I'll start:
- Highlight.js often has more readable grammar definitions (for non regex wizards) - this is one of the reasons that drew me to it.
- Prism seems to have more community interest/contributors.
- Highlight.js is currently largely closed to 1st party grammars (too few maintainers).
- Prism seems more open, though I'm not sure how often new grammars are added.
@RunDevelopment How would you feel about some sort of cross-posting of this?
Prism seems to have more community interest/contributors.
This is pure speculation, but I have the feeling that Prism's popularity is tied closely to that of its author.
Highlight.js often has more readable grammar definitions (for non regex wizards)
As a contributor to both projects, I wanted to share my opinion on this. For obvious reasons, I prefer reading (and writing!) grammar definitions as used by Highlight.js. I would love to see some numbers on how much optimized regex patterns contribute to the performance of Prism. If there's a significant gain, it makes sense, but developer experience is an important factor as well. Personally, I've been using retrie to optimize my contributions to Prism. In my opinion, that's clearly a task for a computer, and it kind of surprises me that contributions are higher for Prism, because optimized Regex patterns are requested by the maintainers and not part of an automatic process. I've noticed that I'm slower at updating my Prism grammar rules because of this extra step.
How would you feel about some sort of cross-posting of this?
I just did :)
It might be good to more clearly signpost this as a survey where we are collecting feedback from the community.
because optimized Regex patterns are requested by the maintainers and not part of an automatic process
Sorry, that's my fault.
I did request manually optimized keyword lists in the past, and that was honestly stupid. They were not only hard to read, write, and maintain, they were also slower than many "unoptimized" regexes. Turns out, browsers optimize character runs, so breaking up words too much can make regexes slower. Optimizing regexes based on tries sounds good in theory, but unfortunately rarely helps in practice. The old saying "premature optimization is the root of all evil" rang true once again. So I've changed course for a few years now and focus on readable regexes instead. Most of my usual nits and critiques when reviewing regexes were even automated by making a linter.
I know this isn't a Q&A, but I still wanted to respond to it, because it was directly my fault.
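To make the readability trade-off above concrete, here is a small sketch comparing a plain keyword alternation with a hand-factored, trie-like pattern for the same words. The keyword list and the names `readable`, `factored`, and `sameResult` are made up for illustration and are not taken from either project. The sketch only checks that both forms match the same text; it makes no claim about which one is faster.

```javascript
// Hypothetical keyword list, chosen only for illustration.
const keywords = ["for", "foreach", "function", "final", "finally"];

// Readable form: a plain alternation built straight from the list
// (longest words first, so a shorter prefix never wins by accident).
const readable = new RegExp(
  "\\b(?:" +
    [...keywords].sort((a, b) => b.length - a.length).join("|") +
    ")\\b"
);

// Hand-factored, trie-like form that is meant to match the same words.
const factored = /\bf(?:or(?:each)?|unction|inal(?:ly)?)\b/;

// Returns true when both patterns find the same first match in `text`.
function sameResult(text) {
  const a = readable.exec(text);
  const b = factored.exec(text);
  return (a ? a[0] : null) === (b ? b[0] : null);
}

const samples = [...keywords, "fore", "finalize", "format", "before", "for each"];
console.log(samples.every(sameResult));
```

The factored pattern is shorter for the engine to describe but harder to review: adding one keyword means re-deriving the shared prefixes by hand, while the readable form only needs a new list entry.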
For me, the number one reason to use Highlight.js over anything else is the automatic language detection. I maintain a PHP library that's mostly used in forum software such as phpBB and Flarum. In my experience, most users don't fill in the name of the programming language when posting code blocks. Without auto-detection or some kind of default, language-agnostic rules, there wouldn't be any highlighting.
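As a rough illustration of detection with a fallback, here is a toy sketch. It is not how Highlight.js computes its result: the `hints` table, the scoring rule, and the name `guessLanguage` are assumptions invented for this example. It simply counts a few telltale patterns per language and falls back to a default when nothing matches.

```javascript
// Toy language guesser. The patterns below are illustrative assumptions,
// not a real grammar and not Highlight.js's detection logic.
const hints = {
  php: [/<\?php/g, /\$\w+/g, /->/g],
  python: [/\bdef \w+\(/g, /\bimport \w+/g, /:\s*$/gm],
  javascript: [/\bconst \w+/g, /=>/g, /\bfunction\b/g],
};

// Scores each language by counting pattern matches and returns the best one,
// or the fallback when no pattern matches at all.
function guessLanguage(code, fallback = "plaintext") {
  let best = { language: fallback, score: 0 };
  for (const [language, patterns] of Object.entries(hints)) {
    const score = patterns.reduce(
      (sum, re) => sum + (code.match(re) || []).length,
      0
    );
    if (score > best.score) best = { language, score };
  }
  return best;
}

console.log(guessLanguage("<?php $x = $obj->name;").language);
console.log(guessLanguage("hello world").language);
```

The fallback is the important part for forum posts: an unlabeled block that matches nothing still gets a defined result instead of an error.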
I came to highlight.js because I use Asciidoc-revealjs to create my slide decks and it has the best support for highlight.js. I never worked with Prism and only contributed some new keywords to highlight.js, but I think I am better off with the "simple" approach taken by highlight.js ;-)
Hi, I'm here because I use highlight.js in my blog with line numbers and a copy button, but I could not find a way to also add a download button. So I searched for this and found Prism (with a download button, and it seems to work as expected). I searched for Prism and ... here I am in this repo.
So, is there any way to do something similar with highlight.js? 😅
Does Prism have all these themes?
I think a major weakness of both syntax highlighters is they're based on regular expressions. Most if not all of the languages they highlight are not regular languages. They run into issues with things such as nesting.
A syntax highlighter having proper context-free parsers would be a major reason to prefer one over all others. An Earley parser with grammar data for all supported languages, for example.
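The nesting issue can be shown with a small sketch. The input assumes a hypothetical language whose `/* */` comments may nest, and the names `regexEnd` and `commentEnd` are invented for this example. It contrasts one lazy regex with a depth counter; it does not describe how either library handles nesting.

```javascript
// Hypothetical language in which block comments may nest.
const source = "a /* outer /* inner */ still comment */ b";

// Single-regex approach: the lazy match stops at the first "*/",
// so the comment appears to end after "inner".
const firstMatch = /\/\*[\s\S]*?\*\//.exec(source);
const regexEnd = firstMatch.index + firstMatch[0].length;

// Counting approach: track the nesting depth and stop when it returns to 0.
// Returns the index just past the closing "*/", or -1 if unterminated.
function commentEnd(text, start) {
  let depth = 0;
  for (let i = start; i < text.length - 1; i++) {
    const pair = text.slice(i, i + 2);
    if (pair === "/*") {
      depth++;
      i++; // skip the second character of the opener
    } else if (pair === "*/") {
      depth--;
      i++; // skip the second character of the closer
      if (depth === 0) return i + 1;
    }
  }
  return -1;
}

console.log(source.slice(2, regexEnd));
console.log(source.slice(2, commentEnd(source, 2)));
```

The counter is the simplest piece of state beyond a regular language; a full context-free parser generalizes the same idea to every nested construct in a grammar.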
I am testing out react-syntax-highlighter, which offers both highlight.js and Prism, and for my use case highlight.js is much faster. I am basically trying to have editable code with live syntax highlighting, like how it's done on Discord.
Prism has better JSX/TSX support, but it's not worth it for the difference in speed.
highlight.js does seem to support JSX/TSX a fair amount, but a small thing that messes it up is something like this:
import { Comp_ } from "@component";
import { cn } from "../../Utils/cn";
import { useDarkMode } from "usehooks-ts";
const Comp = <T,/*<- this trailing comma as well as this comment*/>({className, ...props}: Comp_.Props<T>) => {
  return (
    <Comp_ {...props} classname={cn("w-full", className)} />
  );
}
export {DatePicker};
PrismJS
- Weakness: no automatic language detection (can be solved by "CodeDetectionAPI")
- Tokenization: every word and symbol becomes a token, which gives more flexibility for custom highlighting. E.g. I can give the operators (=, +, -, *) my own color.
- Because of tokenization, it might help with some special uses, like laying out tokens with customized CSS.
- Good for advanced usage, where you would like to define and identify the tokens and thus customize the CSS rendering.
- As the name suggests, this script is created to colorize something.
( I can even change the color of the ; (punctuation) with only CSS knowledge )
highlight.js
- Weakness: no line numbers, limited flexibility
- Wide support of predefined highlighting
- No tokenization for highlight.js, meaning that the behavior is predefined for each language. For example, if = + - * are not things that the author thinks should be colorized, then they are just plain text in the highlight.js rendering. Developers cannot change the color for just operators. Themes only apply to the words/chars that the author of highlight.js believes should be highlighted.
- As the name suggests, this script is created to highlight something.
- Good for plug and use: give me some highlighting rather than plain text. (no need to import other scripts)
( you need to redefine the entire JavaScript language script (+ CSS) to change the color for operators and ; )
Remarks
PrismJS can provide the following plugins but highlight.js can just highlight.
- https://prismjs.com/plugins/copy-to-clipboard/
- https://prismjs.com/plugins/line-numbers/
- https://prismjs.com/plugins/line-highlight/
- https://prismjs.com/plugins/show-invisibles/
- https://prismjs.com/plugins/jsonp-highlight/
- https://prismjs.com/plugins/remove-initial-line-feed/
- https://prismjs.com/plugins/command-line/
- https://prismjs.com/plugins/unescaped-markup/
I find this last comment a bit confusing/misleading on several points... so I'm hiding it. Neither library uses a simple lexer/tokenizer (that walks the code one character at a time).
- Both parsers tokenize based on regex scanning.
- Both can highlight operators/punctuation (see #2500). (depends on the grammar)
- Both have line number plugins.
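To illustrate what "tokenize based on regex scanning" means, and why operator highlighting is a grammar decision rather than an engine limitation, here is a toy scanner. It is neither library's engine: the `rules` table and the `tokenize` function are hypothetical. Removing the `operator` rule would leave `=` and `+` as plain text, which is the effect the previous comment observed.

```javascript
// Toy grammar: an ordered list of [tokenType, sticky regex] rules.
// The sticky flag (y) makes each regex match only at `lastIndex`.
const rules = [
  ["keyword", /\b(?:const|let|return)\b/y],
  ["number", /\b\d+\b/y],
  ["operator", /[=+\-*]/y],
  ["punctuation", /[;(){}]/y],
];

// Scans the code left to right, trying each rule at the current position.
// Characters that match no rule are collected into plain "text" tokens.
function tokenize(code) {
  const tokens = [];
  let pos = 0;
  let plain = "";
  const flush = () => {
    if (plain) {
      tokens.push(["text", plain]);
      plain = "";
    }
  };
  while (pos < code.length) {
    let matched = false;
    for (const [type, re] of rules) {
      re.lastIndex = pos;
      const m = re.exec(code);
      if (m) {
        flush();
        tokens.push([type, m[0]]);
        pos += m[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) plain += code[pos++];
  }
  flush();
  return tokens;
}

console.log(tokenize("let x = 1 + 2;"));
```

Each token type could then be rendered as an element with its own class, so a theme decides the color and the grammar decides what gets a class at all.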
Thanks for the information.
On https://highlightjs.org/, I cannot find any information about plugins. There is no demo with line numbers.
The discussion you mentioned seems to be lengthy. Is there an easy way to highlight operators, punctuation, etc. like PrismJS does?
~I chose to use highlight.js in my react application for the following reasons:~
- ~https://github.com/PrismJS/prism hasn't been updated in over 2 years and it doesn't look like issues are being looked at either~ Prism v2 was updated 2 weeks ago
- ~https://github.com/react-syntax-highlighter/react-syntax-highlighter hasn't been updated in over and it doesn't look like issues are being looked at either~
I liked highlight.js, but unfortunately it didn't end up fitting our needs, for the following reasons:
- Hard to find plugins and none listed on https://highlightjs.org/
- Plugin system was hard to use, lacked functionality, and I couldn't find any good examples.
Prism does not support language autodetection (https://github.com/PrismJS/prism/issues/1313), so highlight.js is the winner. I could not find anything else that does autodetection except this closed-source thing: https://torchlight.dev/
So the choice is easy for me.
Thank you, highlight.js, for your great work.
Plugin system was hard to use, lacked functionality,
I think we have a few good hooks, but I'd be open to adding more if they were well thought out. Open to suggestions.
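For readers looking for an example of the hooks, here is a minimal sketch of a line-number plugin, assuming the Highlight.js v11 plugin API (`hljs.addPlugin` with an `after:highlightElement` callback). The helper `numberLines`, the `line` class, and the `data-line` attribute are hypothetical names chosen for this example, not part of the library.

```javascript
// Wraps each line of already-highlighted HTML in a numbered <span>.
// Limitation: a highlighted span that crosses a line break (for example a
// multi-line comment) would be split mid-element; this sketch ignores that.
function numberLines(html) {
  return html
    .split("\n")
    .map((line, i) => `<span class="line" data-line="${i + 1}">${line}</span>`)
    .join("\n");
}

// Register the plugin only when the library is available as a global,
// so this file can also be loaded without Highlight.js present.
if (typeof hljs !== "undefined") {
  hljs.addPlugin({
    // Runs after an element has been highlighted; `result.value` is the
    // highlighted HTML that was written into `el`.
    "after:highlightElement": ({ el, result }) => {
      el.innerHTML = numberLines(result.value);
    },
  });
}
```

The numbers themselves could then be drawn with CSS from `data-line`, which keeps them out of the text that users copy.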