pygments
Add lexer for CSV
similar to text/plain
@birkenfeld, can you grant me access to the repo so that I can push this change? My plan is to add a lexer to the special lexers module, similar to TextLexer.
_mapping.py
'CSVLexer': ('pygments.lexers.special', 'CSV', ('csv',), ('*.csv',), ('text/csv',)),
special.py
__all__ = ['CSVLexer', 'TextLexer', 'OutputLexer', 'RawTokenLexer']

class CSVLexer(Lexer):
    """
    "Null" lexer, doesn't highlight anything.
    """
    name = 'CSV'
    aliases = ['csv']
    filenames = ['*.csv']
    mimetypes = ['text/csv']
    priority = 0.01

    def get_tokens_unprocessed(self, text):
        yield 0, Text, text

    def analyse_text(text):
        return CSVLexer.priority
You're welcome to submit a PR.
However, the change you're suggesting here won't be accepted, since it's just a copy of the TextLexer. At the very least, a CSV lexer should make an effort to separate fields and commas, and handle quoting (although that's already tricky, since there are different quoting conventions). All in all, I'm skeptical that a CSV lexer is worth it.
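To make the "separate fields and commas, and handle quoting" requirement concrete, here is a minimal sketch built on `RegexLexer`. It is not a proposal from the maintainers. The names `SketchCsvLexer` and `lex_csv` are hypothetical, and the sketch assumes only one quoting convention: double-quoted fields in which a doubled quote (`""`) stands for a literal quote. Other dialects would need different rules.

```python
from pygments.lexer import RegexLexer
from pygments.token import Punctuation, String, Text, Whitespace


class SketchCsvLexer(RegexLexer):
    """Hypothetical CSV lexer: separates fields, commas, and quoted values."""

    name = 'CSV (sketch)'
    aliases = ['csv-sketch']

    tokens = {
        'root': [
            # Quoted field; a doubled quote ("") is an escaped quote inside it.
            # Assumption: this is the only quoting convention supported.
            (r'"(?:[^"]|"")*"', String.Double),
            # Field separator.
            (r',', Punctuation),
            # Record separator.
            (r'\r?\n', Whitespace),
            # Unquoted field: everything up to the next comma, quote, or newline.
            (r'[^,"\r\n]+', Text),
            # A stray quote that does not open a well-formed quoted field.
            (r'"', Text),
        ],
    }


def lex_csv(source):
    """Return the (token type, value) pairs produced for a CSV string."""
    return list(SketchCsvLexer().get_tokens(source))
```

Calling `lex_csv('a,"b,""c"""\n')` yields one token for the unquoted field, one for the comma, and a single `String.Double` token for the quoted field, so the comma inside the quotes is not treated as a separator.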
@birkenfeld there is a lexer implemented here: https://github.com/fish2000/pygments-csv-lexer. The nice thing is that it highlights each column of the CSV, which is very useful for visualizing data. Is it possible to incorporate it as a builtin for pygments?
Is it possible to incorporate it as a builtin for pygments?
Feel free to submit a PR :-)
A possible alternative would be a filter that automatically adds tabs or such so that the columns end up aligned.
@Jean-Abou-Samra not sure I understand this filter. Could you give more details on how to accomplish that?
See https://pygments.org/docs/filterdevelopment/. Basically, you could write
- a CSV lexer that just recognizes common value formats such as numbers, and produces tokens such as Number, Text, and Punctuation, the latter used for the commas,
- a 'CSV tabulator' filter that is fed with lexer output and inserts a TAB character after each comma (or pads with spaces?) so that the columns end up aligned.
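The two pieces described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `SimpleCsvLexer`, `CsvTabulatorFilter`, and `align_csv` are hypothetical names, the filter pads with spaces rather than inserting a TAB, and quoted fields containing commas are not handled. Because column widths are only known once the whole input has been seen, the filter buffers the token stream before re-emitting it.

```python
from pygments.filter import Filter
from pygments.lexer import RegexLexer
from pygments.token import Number, Punctuation, Text, Whitespace


class SimpleCsvLexer(RegexLexer):
    """Hypothetical lexer: numbers, other field text, and commas."""

    name = 'Simple CSV (sketch)'
    aliases = ['simple-csv-sketch']

    tokens = {
        'root': [
            # A field that consists entirely of an integer or decimal number.
            (r'-?\d+(?:\.\d+)?(?=[,\n]|\Z)', Number),
            (r',', Punctuation),
            (r'\n', Whitespace),
            # Any other field content (no quoting support in this sketch).
            (r'[^,\n]+', Text),
        ],
    }


class CsvTabulatorFilter(Filter):
    """Hypothetical filter: pads after each comma so columns line up."""

    def filter(self, lexer, stream):
        tokens = list(stream)  # buffer everything: widths need the whole input

        # First pass: record the widest field seen in each column.
        widths = []
        column = width = 0
        for ttype, value in tokens:
            if ttype is Punctuation or value == '\n':
                if column == len(widths):
                    widths.append(0)
                widths[column] = max(widths[column], width)
                column = column + 1 if ttype is Punctuation else 0
                width = 0
            else:
                width += len(value)

        # Second pass: re-emit the tokens, padding after each comma.
        column = width = 0
        for ttype, value in tokens:
            yield ttype, value
            if ttype is Punctuation:
                yield Whitespace, ' ' * (widths[column] - width + 1)
                column, width = column + 1, 0
            elif value == '\n':
                column = width = 0
            else:
                width += len(value)


def align_csv(source):
    """Lex a CSV string and return its text with columns aligned."""
    lexer = SimpleCsvLexer()
    lexer.add_filter(CsvTabulatorFilter())
    return ''.join(value for _, value in lexer.get_tokens(source))
```

With this sketch, `align_csv('a,bb\nccc,d\n')` returns `'a,   bb\nccc, d\n'`, so the second column starts at the same offset on both lines. The padding is emitted as separate `Whitespace` tokens, which leaves the original field tokens untouched for the formatter.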