There are no stats for files without lang

Open dpordomingo opened this issue 6 years ago • 2 comments

The sum of LANGS_FILES_COUNT is less than FILE_COUNT when a repository contains files with no language detected.

In such cases, there are no stats about the size, number of lines or number of empty lines of those files, i.e. the stats otherwise reported by LANGS_BYTE_COUNT, LANGS_LINES_COUNT, LANGS_FILES_COUNT and EMPTY_LINES_COUNT.

If an "unknown" language were added to the LANGS list (for example), those cases would be fully covered.

dpordomingo avatar Nov 06 '18 13:11 dpordomingo

@dpordomingo As far as I understand, unknown can be easily calculated with the basic arithmetic you described, so I'm not sure it adds any value.

vmarkovtsev avatar Nov 06 '18 13:11 vmarkovtsev

To do the maths, three extra properties would be needed, providing the totals for bytes, lines and empty lines, i.e. bytesCount, linesCount, emptyLinesCount.

Otherwise we cannot do the maths, can we?

Currently PGA returns this for one repo:

{
    "url": "https://github.com/heroku/heroku-buildpack-scala",
    "sivaFilenames": ["eb7aa1e50236c65bf44529ebb9a75fae68e1d6b0.siva"],
    "license": "MIT:0.994",
    "langs": ["JSON", "Markdown", "Ruby", "Scala", "Shell", "Text", "YAML"],
    "langsByteCount": [585, 5528, 6850, 494, 69595, 1070, 711],
    "langsLinesCount": [23, 160, 229, 22, 2425, 10, 32],
    "langsFilesCount": [1, 2, 5, 2, 18, 1, 2],
    "emptyLinesCount": [0, 58, 19, 2, 0, 0, 1],
    "codeLinesCount": [22, 100, 112, 18, 0, 0, 29],
    "commentLinesCount": [0, 0, 8, 0, 0, 0, 0],
    "fileCount": 33,
    "commitsCount": 560,
    "branchesCount": 206,
    "forkCount": 0
}
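
As a rough sketch of the arithmetic in question, the Python snippet below works on a subset of the record above. The file count of language-less files can be derived because fileCount is present. The same subtraction for bytes depends on a total such as bytesCount, which is only the property proposed in this thread and is not in the current output. The helper names unknown_files and unknown_total are made up for illustration.

```python
import json

# Subset of the PGA record quoted above (values copied verbatim).
record = json.loads("""
{
    "langs": ["JSON", "Markdown", "Ruby", "Scala", "Shell", "Text", "YAML"],
    "langsByteCount": [585, 5528, 6850, 494, 69595, 1070, 711],
    "langsFilesCount": [1, 2, 5, 2, 18, 1, 2],
    "fileCount": 33
}
""")


def unknown_files(rec):
    """Files with no detected language: overall total minus per-language sum."""
    return rec["fileCount"] - sum(rec["langsFilesCount"])


def unknown_total(rec, per_lang_key, total_key):
    """Same subtraction for another stat; None when the overall total is absent.

    total_key (e.g. "bytesCount") is the hypothetical property proposed in
    this thread, not a field of the current output.
    """
    if total_key not in rec:
        return None
    return rec[total_key] - sum(rec[per_lang_key])


print(unknown_files(record))                                  # 33 - 31 = 2
print(unknown_total(record, "langsByteCount", "bytesCount"))  # None
```

So for this repo 2 files have no language, but their bytes cannot be recovered from the record as it stands, and the same would hold for lines and empty lines.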

dpordomingo avatar Nov 06 '18 14:11 dpordomingo