Extension popularity monitoring
I'm opening this issue to track the popularity of submitted file extensions (pull requests with the label Pending Popularity). I've used Harvester to count the number of repositories for all extensions where it makes sense (i.e., if the number of files is below 200, we already know there won't be hundreds of repositories as required). All numbers were updated today, in the last few hours.
I'll update the numbers in a few months and will probably stop counting for extensions whose number of repositories has stagnated or decreased.
Extension | PR | Files | Repo. | Date | +/- at Feb. 21 |
---|---|---|---|---|---|
~~.mxl~~ | #3651 | 106 | - | Aug. 4 18 | +0 files |
~~.exw~~ | #3754 | 618 | 41 | Aug. 4 18 | +8 repo. |
~~.exu~~ | #3754 | 54 | - | Aug. 4 18 | +4 files |
~~.sarl~~ | #3772 | 252 | 49 | Aug. 4 18 | +5 repo. |
~~.hsig~~ | #3855 | 459 | 64 | Aug. 4 18 | +21 repo. |
~~.imba~~ | #3869 | 611 | 58 | Aug. 4 18 | +57 repo. |
~~.coco~~ | #3872 | 641 | 79 | Aug. 4 18 | +11 repo. |
~~.pbf~~ | #3926 | 62 | - | Aug. 4 18 | +0 files |
~~.pbp~~ | #3926 | 133 | - | Aug. 4 18 | +11 files |
~~.smk~~ | #3953 | 1k | 157 | Aug. 4 18 | +76 repo. |
~~.zig~~ | #4005 | 904 | 80 | Aug. 4 18 | +144 repo. |
~~.rho~~ | #4071 | 222 | 19 | Aug. 4 18 | +33 repo. |
~~.wren~~ | #4088 | 112 | - | Aug. 4 18 | +110 files, 55 repo. |
~~.archimate~~ | #4128 | 581 | 258 | Aug. 4 18 | +49 repo. |
~~.varlink~~ | #4164 | 43 | - | Aug. 4 18 | +5 files |
~~.pq~~ | #4191 | 701 | 147 | Aug. 4 18 | +24 repo. |
~~.pqm~~ | #4191 | 51 | - | Aug. 4 18 | +26 files |
~~.m~~ | #4191 | 84 | - | Aug. 4 18 | -19 files |
~~.asx~~ | #4193 | 706 | 122 | Aug. 4 18 | +5 repo. |
~~.jsonnet~~ | #2653 | 7k | 93 | Oct. 3 18 | +18 repo. |
~~.htmlx~~ | #4323 | 5 | - | Nov. 14 18 | +0 files |
~~.rego~~ | #4371 | 296 | 80 | Jan. 20 19 | - |
~~cabal-ghcjs.project~~ | #4419 | 71 | 70 | Mar. 3 19 | - |
~~.asddls~~ | #4614 | 804 | 109 | Aug. 31 19 | +140 repo. |
.daml | #4523 | 155 | 37 | Aug. 31 19 | +122 repo. |
.aplf | #4526 | 2709 | 33 | Aug. 31 19 | +16 repo. |
.carp | #4530 | 362 | 75 | Aug. 31 19 | +466 files, -4 repo. \* |
.scilla | #4635 | 559 | 61 | Sept. 15 19 | +21 repo. |
~~.raku~~ | #4731 | 74 | 29 | Jan. 7 20 | - |
~~.rakumod~~ | #5168 | 200 | 9 | Jan. 7 20 | +259 repo. |
.curry | #5111 | 4437 | 56 | Jan. 6 21 | +403 files, -18 repo. \* |
.ispc | #5191 | 6790 | 142 | 9 Feb. 21 | - |
.isph | #5191 | 3074 | 158 | 9 Feb. 21 | - |
.hla | #5194 | 12618 | 71 | 9 Feb. 21 | - |
Please avoid discussing issues with specific extensions (such as an improved search query) here and prefer the associated pull request.
\* GitHub's search indexing changed at the end of 2020 to only index repositories active in the last year. A decrease in repositories indicates less activity and a potential decrease in overall popularity.
What's the acceptance criteria? archimate has over 200 repositories and yet it's in this list with its PR closed.
> What's the acceptance criteria?

That's going to depend on the extension. Hundreds of repositories is a rule of thumb; for a very specific extension with little chance of conflicts, such as `.archimate`, I'd be in favor of adding it now, with only 200-300 repositories; others will need a few hundred more (think `.m`, which already has 7 languages associated with it).
> archimate has over 200 repositories and yet it's in this list with its PR closed.

I'm working on a new XML Strategy that should handle the `.archimate` case. If it doesn't work out, I'll reopen the pull request and I'll invite its author to update the branch. In any case, we'll discuss it in #4128.
Excellent idea @pchaigno! And very useful too.
Thanks.
@pchaigno thanks for doing this!
Can we remove the .m file extension from consideration for #4191 then? The code change only includes .pq and .pqm file support (specifically because we didn't want to conflict with the existing .m file highlighters). I think .m was mentioned in the PR comments because that was our old convention (and no longer used).
Zig update - Nov. 4, 2018 - 1547 files, 151 repos

Zig update - Nov. 21, 2018 - 1812 files, 186 repos
Hello @pchaigno
I have just run harvester on Zig extension and came up with these results:
- 1861 files
- 203 repos
- 93 unique users
https://gist.github.com/andrewrk/33650712e74a65873dd84d82d25f449a
It is time to re-open #4005
It's been five and a half months. I've updated the numbers in the initial post with new counts from today. Those numbers are also below, with the decisions I think we should take as a consequence concerning support of these extensions in Linguist. The three codes for decisions are:
- Drop: I think we should stop monitoring adoption for that extension as it's growing too slowly. Of course, it doesn't mean we'll never add it to Linguist.
- Accept: I think that extension has achieved a high enough adoption to mandate support in Linguist.
- Wait: Adoption is growing and I think we should wait for another ~6 months and see then.
@lildude @Alhadis Could you take a look? If you agree with the "Decisions", I'll post an update for the accept on their original pull requests.
Extension | PR | Initial adoption | +/- at Jan. 20 | Decision? |
---|---|---|---|---|
.mxl | #3651 | 106 files | +0 files | Drop |
.exw | #3754 | 41 repo. | +8 repo. | Drop |
.exu | #3754 | 54 files | +4 files | Drop |
.sarl | #3772 | 49 repo. | +5 repo. | Drop |
.hsig | #3855 | 64 repo. | +21 repo. | Wait |
.imba | #3869 | 58 repo. | +57 repo. | Wait |
.coco | #3872 | 79 repo. | +11 repo. | Wait |
.pbf | #3926 | 62 files | +0 files | Drop |
.pbp | #3926 | 133 files | +11 files | Drop |
.smk | #3953 | 157 repo. | +76 repo. | Wait |
.zig | #4005 | 80 repo. | +144 repo. | Accept\* |
.rho | #4071 | 19 repo. | +33 repo. | Wait |
.wren | #4088 | 112 files | +110 files, 55 repo. | Wait |
.archimate | #4128 | 258 repo. | +49 repo. | Accept |
.varlink | #4164 | 43 files | +5 files | Drop |
.pq | #4191 | 147 repo. | +24 repo. | Wait |
.pqm | #4191 | 51 files | +26 files | Drop |
.m | #4191 | 84 files | -19 files | Drop |
.asx | #4193 | 122 repo. | +5 repo. | Drop |
.jsonnet | #2653 | 93 repo. | +18 repo. | Accept\*\* |
.htmlx | #4323 | 5 files | +0 files | Drop |
\* Support for the `.zig` extension was already merged.
\*\* That is 93 and 111 repositories among those I downloaded. I downloaded about 2k among the 7/11k `.jsonnet` files. I might be missing something here, but I don't see why we didn't add that extension 6 months ago...
Sounds like a reasonable approach to me. 👍 I say go for it.
@lildude Could you take a look at the above, please?
> Could you take a look at the above, please?

Only if you take a look at https://github.com/github-linguist/babel-sublime/pull/1 😜
> If you agree with the "Decisions", I'll post an update for the accept on their original pull requests.

Makes sense to me. 👍 Go for it.
> > Could you take a look at the above, please?
>
> Only if you take a look at github-linguist/babel-sublime#1

Ah, sorry! I'm not sure how I did not see this...
Little nitpick in the original list: the Carp PR was #4530, not #4526 (which was APL) :)
@hellerve Ah, thanks! I forgot to change from the previous line.
@pchaigno any chance you could review the numbers for #4371 again? We've watched sustained growth over the 8-9 months since it was originally raised. I believe there are over 200 repos now.
I ran the numbers again, after 8 months. +/- Sept. 13 corresponds to the delta since the last count (in January for most extensions).
As a reminder, the three codes for decisions are:
- Drop: I think we should stop monitoring adoption for that extension as it's growing too slowly. Of course, it doesn't mean we'll never add it to Linguist.
- Accept: I think that extension has achieved a high enough adoption to mandate support in Linguist.
- Wait: Adoption is growing and I think we should wait for another ~6 months and see then.
Extension | PR | Initial adoption | +/- Jan. 20 | +/- Sept. 13 | Decision? |
---|---|---|---|---|---|
.hsig | #3855 | 64 repo. | +21 repo. | +17 repo. | Drop |
.imba | #3869 | 58 repo. | +57 repo. | +50 repo. | Drop\* |
.coco | #3872 | 79 repo. | +11 repo. | +21 repo. | Drop |
.smk | #3953 | 157 repo. | +76 repo. | +107 repo. | Accept |
.rho | #4071 | 19 repo. | +33 repo. | +9 repo. | Drop |
.wren | #4088 | 112 files | +110 files, 55 repo. | +1 repo. | Drop |
.pq | #4191 | 147 repo. | +24 repo. | +60 repo. | Accept |
.rego | #4371 | 80 repo. | - | +137 repo. | Accept |
cabal-ghcjs.project | #4419 | 70 repo. | - | +18 repo. | Drop |
\* I hesitated between Wait (again) and Drop for the `.imba` file extension. Since adoption is slowing down (+50 in 8 months compared to +57 in 5.5 months before) I lean toward dropping the extension. It doesn't mean we won't reconsider it in the future of course.
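To make the slowdown explicit, the two deltas can be normalized to a per-month rate. This is only a rough sketch of the arithmetic behind the footnote: the helper name `repos_per_month` is made up for illustration, and the period lengths (5.5 and 8 months) are the ones quoted above.

```python
def repos_per_month(new_repos: int, months: float) -> float:
    """Average number of newly counted repositories per month."""
    return new_repos / months


# Figures for .imba taken from the table above.
first_period = repos_per_month(57, 5.5)   # first count to the January count
second_period = repos_per_month(50, 8)    # January count to the September count

print(f"{first_period:.2f} vs {second_period:.2f} repo./month")
# prints "10.36 vs 6.25 repo./month"
```

By this measure the monthly growth dropped by roughly 40% between the two periods, which is consistent with leaning toward Drop rather than Wait.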
@lildude @Alhadis Could you take a look? If you agree with the "Decisions", I'll post an update for the accept on their original pull requests.
> @lildude @Alhadis Could you take a look? If you agree with the "Decisions", I'll post an update for the accept on their original pull requests.

👍 from me on all your decisions.
Sorry @pchaigno, missed your /cc. 👍 from me too.
FYI, running Harvester on the given searches now shows about 78 repositories for each of .raku and .rakumod.
Could `.rakutest` be added to extension tracking? It initially wasn't, because it was used by "7 files, 3 repos, 3 users" (which is not surprising, as the Raku test framework did not support the `.rakutest` extension at that time).
I don't expect `rakutest` to be accepted at the same time as `raku` and `rakumod`, but tracking its popularity may make sense.
Also, low popularity of the extensions is to be expected at this point, as the RFC for the extension change (https://github.com/Raku/problem-solving/blob/master/solutions/language/Path-to-Raku.md) recommends not changing the extensions until the 6.e release (6.e hasn't been released yet).
@xfix It looks like `.rakutest` is still used by a handful of users only. The purpose of this issue is to track usage for extensions that we expect to reach our threshold soon. It doesn't look like that's the case for `.rakutest`. Don't hesitate to ping me if that changes.
Update on `.daml` (#4523) as of 2020.04.24:
- 1338 unique files (was 155, +1183 since 2019.08.31)
- 135 unique repositories (was 37, +98 since 2019.08.31)
Ran Harvester with `extension:daml Party`.
Raw data: https://gist.github.com/stefanobaghino/6439072413623b0a9b05aabeaeebb4e6
Unique repository count method
cat daml.txt | awk -F '/' '{print $1 "//www.github.com/" $4 "/" $5 }' | sort -u | wc -l
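For anyone who would rather not use awk, roughly the same count can be sketched in Python. This assumes each line of the raw data is a file URL of the form `https://github.com/<owner>/<repo>/...` (inferred from the field indices in the awk command, not confirmed); the function name `unique_repositories` and the sample URLs are made up for illustration.

```python
def unique_repositories(file_urls):
    """Collapse file URLs to the set of repositories they belong to."""
    repos = set()
    for url in file_urls:
        parts = url.strip().split("/")
        if len(parts) < 5:
            continue  # unlike the awk one-liner, skip blank or malformed lines
        # parts[0] is the scheme ("https:"), parts[3] the owner, parts[4] the repo,
        # matching $1, $4 and $5 in the awk command.
        repos.add(f"{parts[0]}//www.github.com/{parts[3]}/{parts[4]}")
    return repos


# Hypothetical sample input; real input would be the lines of daml.txt.
sample = [
    "https://github.com/alice/proj/blob/master/a.daml",
    "https://github.com/alice/proj/blob/master/b.daml",
    "https://github.com/bob/other/blob/master/c.daml",
]
print(len(unique_repositories(sample)))  # prints 2
```

To run it on the raw data, pass an open file instead of the sample list, e.g. `unique_repositories(open("daml.txt"))`.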
Disclaimer: with my other account @stefanobaghino-da, I'm a member of @digital-asset, which is the author of the DAML language. I ran Harvester with my private account (the one I'm using right now) to exclude private repositories.
@pchaigno I've added `.curry` to the list. Just an FYI in case you need to update any extension lists locally.
@lildude Could we pin this issue?
> @lildude Could we pin this issue?

✅ Done.
Can all of the entries be updated? It's been a year.
> Can all of the entries be updated? It's been a year.

I'd like this as well, specifically for Wren support.
Could `.rakutest` be added to extension tracking? It has 584 files.
Hi 👋🏻
How can we add a language to the pending popularity section?
> How can we add a language to the pending popularity section?

Languages are normally added when a PR is opened that doesn't quite yet meet the documented usage requirements, though we've not really been maintaining this list much as PRs no longer auto-close.
@lildude I would be interested in checking whether Daml now meets the popularity requirements. Should I report my findings back in https://github.com/github/linguist/pull/4523 and re-open it, or open a new one?