truegitcodechurn

Improve analytics data by tracking removed and added counts per line in `files`

Open NewMountain opened this issue 4 years ago • 0 comments

Related to several other issues: analytics would be easier if the `files` structure, rather than storing data as it does now:

{
        "README.md": {
            2: 0,
            8: 0,
            10: 0,
            11: 0,
            ...
            24: 2,
            31: 0,
            33: 1,
            35: 1,
            37: 3,
            41: 12,
        },
        "gitcodechurn.py": {
            0: 0,
            1: 190,
            2: 4,
            4: 0,
            11: -1,
            15: 6,
            16: 5,
            37: 2,
            ...
            167: 1,
            172: 0,
            173: 2,
            189: 1,
            191: 1,
            192: 5,
            193: 0,
            196: 2,
            197: 0,
            198: 25,
            200: 1,
            217: 14,
            223: 1,
            224: 1,
        },
}

instead tracked the count of lines removed and the count of lines added. This additional data would allow more detailed analytics and add nuance to questions about user-specific churn, among others.

I propose we instead utilize a structure such as:

{
        "README.md": [
            {"added": 0, "removed": 0, "line_number": 2},
            {"added": 3, "removed": 1, "line_number": 42},
            ...
        ]
}
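
To make the proposal concrete, here is a minimal sketch of how such records could be accumulated and then summarized. The names `LineChange`, `record_change`, and `totals` are hypothetical and are not part of `gitcodechurn.py`; how the added and removed counts would actually be extracted from the git output is left out, since that depends on the existing parsing code.

```python
from collections import defaultdict
from typing import TypedDict


class LineChange(TypedDict):
    """One record in the proposed per-file list (hypothetical type name)."""
    added: int
    removed: int
    line_number: int


def record_change(files, path, line_number, added=0, removed=0):
    """Accumulate added/removed counts for one line of one file.

    Hypothetical helper: merges into an existing record for the same
    line number, otherwise appends a new record.
    """
    for entry in files[path]:
        if entry["line_number"] == line_number:
            entry["added"] += added
            entry["removed"] += removed
            return
    files[path].append(
        LineChange(added=added, removed=removed, line_number=line_number)
    )


def totals(files):
    """Sum added and removed counts per file, one possible analytics query."""
    return {
        path: {
            "added": sum(e["added"] for e in entries),
            "removed": sum(e["removed"] for e in entries),
        }
        for path, entries in files.items()
    }


# Illustrative values only, chosen to match the example records above.
files = defaultdict(list)
record_change(files, "README.md", 2)
record_change(files, "README.md", 42, added=2, removed=1)
record_change(files, "README.md", 42, added=1)

print(totals(files))
```

Keeping `added` and `removed` as separate fields means a query like `totals` can report both sides of the churn, which a single number per line cannot.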

I would be happy to submit a PR in support of this.

NewMountain, Nov 05 '21 04:11