List of languages to potentially add

Open codesections opened this issue 5 years ago • 9 comments

I recently noticed the similar project loccount and it looks like it supports several languages that we currently don't. I'm dropping them here in case anyone is looking for good languages to add. (If this list isn't useful, please feel free to close this issue).

EDIT: there could be false positives on this list. For example, I just edited it to remove csh, which I'd originally listed even though we support C Shell. Other languages could appear on this list if we support them under a different name.

  • [ ] ABC
  • [ ] Algol60
  • [ ] Arc
  • [ ] Asciidoc
  • [x] Asm
  • [ ] Autotools
  • [ ] Awk
  • [ ] B
  • [ ] BASIC
  • [ ] BCPL
  • [ ] BLISS
  • [ ] Batchfile
  • [ ] CLU
  • [ ] CML
  • [ ] Chapel
  • [ ] ChucK
  • [ ] Cobra
  • [ ] Dylan
  • [ ] Eiffel
  • [x] Es6
  • [ ] Expect
  • [ ] Factor
  • [ ] Fantom
  • [ ] Frege
  • [ ] Hy
  • [ ] Icon
  • [ ] Io
  • [ ] J
  • [ ] Lex
  • [ ] Livescript
  • [ ] Logo
  • [ ] M4
  • [x] MATLAB
  • [ ] ML
  • [ ] MUMPS
  • [ ] Mal
  • [ ] Man
  • [ ] Metafont
  • [ ] Modula
  • [ ] Modula2
  • [ ] Modula3
  • [ ] Nroff/troff
  • [ ] Oberon
  • [ ] Occam
  • [ ] PL/1
  • [ ] POP-11
  • [ ] PostScript
  • [x] PowerShell
  • [ ] Rebol
  • [ ] Rexx
  • [ ] SETL
  • [ ] SGML
  • [ ] SNOBOL4
  • [ ] Sather
  • [ ] Sed
  • [ ] Seed7
  • [ ] Simula
  • [ ] Skew
  • [ ] Smalltalk
  • [ ] Texinfo
  • [ ] Turing
  • [ ] VRML
  • [ ] VisualBasic
  • [ ] Waf
  • [x] WebAssembly
  • [ ] Wish
  • [ ] Yacc
  • [ ] Yorick
  • [ ] Zephir

codesections avatar Mar 15 '19 14:03 codesections

I implemented wasm and llvm; see #323. I also found that Lua's multi_line is not correct.

yjhmelody avatar Mar 28 '19 06:03 yjhmelody

Add Template Toolkit File (.tt extension).

Abhinickz avatar May 13 '19 12:05 Abhinickz

Missing: cython (.pyx / .pxd / .pxi), which is currently completely invisible to tokei.

masklinn avatar May 18 '19 18:05 masklinn

Jupyter notebooks (.ipynb) might be nice to include too. At the moment, a project with a lot of notebooks can look empty, because they're skipped over. They're a bit weird though, given they're JSON that stores other source code, so a notebook with 42 lines of JSON might only have 1 real line of code. For example, a notebook with a single cell print("hello world") is rendered like:

image

but the saved form is much larger:

{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "hello world\n"
     ]
    }
   ],
   "source": [
    "print(\"hello world\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.9"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
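
To make the gap between the saved form and the real code concrete, here is a minimal sketch in Python (not tokei's implementation). It assumes the nbformat 4 layout shown above, where a cell's "source" is either a string or a list of strings; the helper names `source_lines` and `count_code_lines` are hypothetical, and the notebook is a trimmed version of the one above.

```python
import json


def source_lines(cell):
    """Return a cell's source as a list of lines.

    nbformat 4 stores "source" either as one string or as a list of strings.
    """
    src = cell.get("source", [])
    text = src if isinstance(src, str) else "".join(src)
    return text.splitlines()


def count_code_lines(notebook):
    """Count non-blank source lines across all code cells."""
    return sum(
        1
        for cell in notebook.get("cells", [])
        if cell.get("cell_type") == "code"
        for line in source_lines(cell)
        if line.strip()
    )


# Trimmed version of the notebook above (most metadata omitted).
raw = json.dumps(
    {
        "cells": [
            {
                "cell_type": "code",
                "execution_count": 1,
                "metadata": {},
                "outputs": [],
                "source": ['print("hello world")'],
            }
        ],
        "metadata": {"language_info": {"name": "python"}},
        "nbformat": 4,
        "nbformat_minor": 4,
    },
    indent=1,
)

json_lines = len(raw.splitlines())
code_lines = count_code_lines(json.loads(raw))
print(f"JSON lines: {json_lines}, code lines: {code_lines}")
```

Counting `raw` line by line is what a plain JSON counter would report; `count_code_lines` is closer to what someone reading the notebook would call its code.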

huonw avatar Mar 06 '20 05:03 huonw

@huonw This might be feasible, depending on what you would expect the output to be. From what I can tell there are a few options, which I have outlined below. Do any of these match what you would expect?

  1. Report and count the JSON.
  2. Report as Jupyter Notebooks and count the JSON.
  3. Report as Jupyter Notebooks and count the internal language.
  4. Report and count the internal language.

XAMPPRocky avatar Mar 06 '20 09:03 XAMPPRocky

I think that either 2 or 3 would make most sense for my use case. 2 gives a number that's likely approximately proportional to how much code there actually is in the notebook, and so allows for comparisons over time in a single project and between different directories and even projects. 3 seems like the best case, since it gives the user all the information (although I imagine it's one of the harder cases...). For either of these, I think it would be helpful to distinguish what is being measured, because I know I would forget/get confused with a label just like Jupyter Notebooks; for instance, 2 might be Jupyter (JSON) or JSON (Jupyter) and 3 might be Jupyter (Python) or Julia (Jupyter) (or whatever the language is).

I personally don't think that 1 is the best, since executable notebooks seem quite different from the static data/configuration that JSON is typically used for. Similarly, 4 seems misleading too, because notebooks are often quite different from the real code in the language (e.g. notebooks with examples or tutorials for a library, alongside the library's real code).

(One extra detail that you may be aware of with 3 and 4 is that Jupyter notebooks can have Markdown cells in addition to the code cells, and maybe these could be summed separately under Markdown/Markdown (Jupyter)... or maybe they just count as big comments. 😄 )
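
A rough sketch of that idea in Python, counting code cells and Markdown cells under separate labels. The label format ("Python (Jupyter)", "Markdown (Jupyter)") follows the suggestion above and is not necessarily what tokei reports; `tally_notebook` and `cell_lines` are hypothetical names, and reading the language from `metadata.language_info.name` is an assumption based on the saved notebook shown earlier.

```python
from collections import Counter


def cell_lines(cell):
    """Return the non-blank lines of a cell's source (a str or a list of str)."""
    src = cell.get("source", [])
    text = src if isinstance(src, str) else "".join(src)
    return [line for line in text.splitlines() if line.strip()]


def tally_notebook(notebook):
    """Sum lines per label, keeping code and Markdown cells apart."""
    # Assumption: the notebook language lives in metadata.language_info.name.
    language = notebook.get("metadata", {}).get("language_info", {}).get("name", "unknown")
    counts = Counter()
    for cell in notebook.get("cells", []):
        kind = cell.get("cell_type")
        if kind == "code":
            label = f"{language.capitalize()} (Jupyter)"
        elif kind == "markdown":
            label = "Markdown (Jupyter)"
        else:
            continue  # skip raw cells and anything unrecognised
        counts[label] += len(cell_lines(cell))
    return dict(counts)


example = {
    "cells": [
        {"cell_type": "markdown", "source": ["# Demo\n", "\n", "Prints a greeting."]},
        {"cell_type": "code", "source": ['print("hello world")']},
    ],
    "metadata": {"language_info": {"name": "python"}},
}

result = tally_notebook(example)
print(result)
```

Treating Markdown cells as comments instead would only change which counter they are added to; the traversal stays the same.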

huonw avatar Mar 06 '20 11:03 huonw

@huonw Support for Jupyter Notebooks has been added in version 12 https://github.com/XAMPPRocky/tokei/releases/tag/v12.0.0

XAMPPRocky avatar Jun 22 '20 12:06 XAMPPRocky

Hi, I noticed that Matlab has been ticked as added in the top comment of this issue; however, the corresponding PR was closed by its author, and judging from the languages.json file it has not been implemented since, as far as I can tell.

Any chance that #628 can be merged? It seems like the author had pretty much done it, and it was just a matter of details about the file extensions to associate with matlab.

BachoSeven avatar May 30 '21 16:05 BachoSeven

#960 adds support for Chapel; mentioning it here so it can be ticked off if that PR gets merged.

nathanielknight avatar Dec 17 '22 23:12 nathanielknight