Yegor Bugayenko issues

Results: 301 issues of Yegor Bugayenko

bibcop fails

At the moment, bibcop-action [fails](https://github.com/yegor256/sqm/actions/runs/8315547655/job/22754521556) due to many style violations in the `main.bib` file. Let's fix them all.

show the cost of dataset building, in dollars

In `report.tex`, let's show the cost of building the dataset, in dollars. We can estimate it from the build time, the number of CPUs, and the amount of memory.
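One possible way to turn time, CPU count, and memory into dollars is a simple linear pricing model, sketched below. The function name `estimate_cost_usd` and both hourly rates are hypothetical placeholders, not values from the project; real rates would come from whatever cloud provider or hardware budget is assumed.

```python
def estimate_cost_usd(hours, cpus, memory_gb,
                      usd_per_cpu_hour=0.04, usd_per_gb_hour=0.005):
    """Estimate the cost of a build with a linear pricing model.

    The default rates are illustrative assumptions only; replace
    them with the actual prices of the machine that built the dataset.
    """
    if hours < 0 or cpus < 0 or memory_gb < 0:
        raise ValueError("hours, cpus, and memory_gb must be non-negative")
    # Hourly price of the machine: CPU share plus memory share.
    hourly = cpus * usd_per_cpu_hour + memory_gb * usd_per_gb_hour
    return hours * hourly


if __name__ == "__main__":
    # A 10-hour build on 8 CPUs with 32 GB of memory.
    print(f"${estimate_cost_usd(10, 8, 32):.2f}")
```

The resulting number could then be written into the report next to the build time, so readers can see how the estimate was derived.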

enhancement

help wanted

create `data/summary/{metric}.csv` files

For each metric, let's make `steps/aggregate.sh` create `data/summary/{metric}.csv` files, which will have the following structure (for example, `data/summary/LOC.csv`):

```
repository,count,sum,average,mean,min,max
yegor256/cam,28,500,45.3,48.2,1,90
yegor256/cactoos,...
yegor256/takes,...
```

Here:

* `28` is the number...
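The per-repository aggregation could look roughly like the sketch below. It is written in Python for illustration, while the issue asks for the logic to live in `steps/aggregate.sh`. The helpers `summarize` and `write_summary` are hypothetical, the input is assumed to be a plain list of metric values per repository, and the `mean` column is left out because its definition is not visible in the text above.

```python
import csv
import io
import statistics

# Columns produced by this sketch; the `mean` column from the issue
# is intentionally omitted because its meaning is not specified here.
FIELDS = ["repository", "count", "sum", "average", "min", "max"]


def summarize(repository, values):
    """Build one summary row from the metric values of one repository."""
    if not values:
        raise ValueError(f"no values for {repository}")
    return {
        "repository": repository,
        "count": len(values),
        "sum": sum(values),
        "average": round(statistics.fmean(values), 1),
        "min": min(values),
        "max": max(values),
    }


def write_summary(rows, out):
    """Write summary rows as CSV into a file-like object."""
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


if __name__ == "__main__":
    buffer = io.StringIO()
    write_summary([summarize("yegor256/cam", [1, 45, 90])], buffer)
    print(buffer.getvalue(), end="")
```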

enhancement

help wanted

put CaM sources into a ZIP archive

When packaging everything together, let's put the CaM sources into the archive too. This will make the dataset more "reproducible."

enhancement

help wanted

good first issue

filter out unmaintained repositories

It's possible to detect whether a repository is being actively maintained; for example, see this study: https://dl.acm.org/doi/abs/10.1145/3239235.3240501. Let's implement such filtering (or something similar) inside `discover-repositories.rb`. See this one...

help wanted

good first issue

filter out repositories that are not in active development

Similar to #227. Let's filter out repositories that are not being maintained and are not in active development. Maybe [this study](https://dl.acm.org/doi/pdf/10.1145/3239235.3240501) can give a hint on how to do this, with...

enhancement

help wanted

good first issue

explain metrics with more details

Let's improve the details of some metrics:

* DOER
* FOUT
* HSD, HSE, HSV
* MIDX
* NCSS
* NULLS

Currently, they are very sketchy, which makes it hard...

bug

help wanted

good first issue

groups of metrics

There are already too many metrics in the repository, so it's hard to read the final report and the dataset. Let's introduce "groups". Every metric, when it's generated by a script in...

enhancement

help wanted

good first issue

enable Java 21 syntax

Currently, we only support Java 8, because we use the [javalang](https://github.com/c2nes/javalang) library, which supports Java 8 and has gone three years without any updates. Let's find a way to either replace it or maybe...

enhancement

help wanted

good first issue

filter out too small and too big repositories

Let's filter out the smallest and the largest repositories (by the number of files), maybe using this statistical approach: https://en.wikipedia.org/wiki/Percentile
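A percentile-based cut could be sketched as follows. The function `filter_by_size`, the 5th/95th percentile defaults, and the input shape (a mapping from repository name to its file count) are all assumptions made for illustration; the issue does not fix any of them.

```python
import statistics


def filter_by_size(file_counts, low_pct=5, high_pct=95):
    """Keep repositories whose file count lies between two percentiles.

    file_counts maps a repository name to its number of files.
    low_pct and high_pct are percentiles in the range 1..99; the
    defaults are illustrative, not values chosen by the project.
    """
    if not 1 <= low_pct <= high_pct <= 99:
        raise ValueError("percentiles must satisfy 1 <= low <= high <= 99")
    # quantiles(n=100) returns 99 cut points; index i-1 is the i-th percentile.
    cuts = statistics.quantiles(file_counts.values(), n=100, method="inclusive")
    low, high = cuts[low_pct - 1], cuts[high_pct - 1]
    return {repo: n for repo, n in file_counts.items() if low <= n <= high}


if __name__ == "__main__":
    counts = {f"repo{i}": i for i in range(1, 102)}
    kept = filter_by_size(counts)
    print(len(kept), "of", len(counts), "repositories kept")
```

Percentiles adapt to whatever distribution the discovered repositories have, so the thresholds would not need to be hard-coded as absolute file counts.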

enhancement

help wanted

good first issue