linguist
compute_stats: raise exception if tree.count_recursive > MAX_TREE_SIZE
On deep repos, Linguist fails silently, which is confusing. Raise an exception instead.
Signed-off-by: Alejandro del Castillo [email protected]
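To illustrate the change described above, here is a minimal sketch of the guard, not the actual diff. `TreeTooLargeError`, `FakeTree`, and the placeholder return value are hypothetical names introduced for illustration, and the `100_000` limit is an assumption based on the "100k files" figure mentioned later in this thread.

```ruby
# Assumed limit; the thread only mentions "more than 100k files".
MAX_TREE_SIZE = 100_000

# Hypothetical error class; the name used by the real patch is not shown here.
class TreeTooLargeError < StandardError; end

# Hypothetical stand-in for a Git tree that reports its recursive entry count.
FakeTree = Struct.new(:count_recursive)

# Raises instead of silently returning an empty hash when the tree is too big.
def compute_stats(tree)
  count = tree.count_recursive
  if count > MAX_TREE_SIZE
    raise TreeTooLargeError,
          "tree has #{count} entries, limit is #{MAX_TREE_SIZE}"
  end
  { entries: count } # placeholder for the real language statistics
end
```

With this shape, a caller sees an explicit error message naming the limit rather than an empty result it has to guess about.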
Checklist:
- [ ] I am associating a language with a new file extension.
  - [ ] The new extension is used in hundreds of repositories on GitHub.com
    - Search results for each extension:
      - https://github.com/search?utf8=%E2%9C%93&type=Code&ref=searchresults&q=extension%3AFOOBAR+KEYWORDS+NOT+nothack
  - [ ] I have included a real-world usage sample for all extensions added in this PR:
    - Sample source(s):
      - [URL to each sample source, if applicable]
    - Sample license(s):
  - [ ] I have included a change to the heuristics to distinguish my language from others using the same extension.
- [ ] I am adding a new language.
  - [ ] The extension of the new language is used in hundreds of repositories on GitHub.com.
    - Search results for each extension:
      - https://github.com/search?utf8=%E2%9C%93&type=Code&ref=searchresults&q=extension%3AFOOBAR+KEYWORDS+NOT+nothack
  - [ ] I have included a real-world usage sample for all extensions added in this PR:
    - Sample source(s):
      - [URL to each sample source, if applicable]
    - Sample license(s):
  - [ ] I have included a syntax highlighting grammar.
  - [ ] I have included a change to the heuristics to distinguish my language from others using the same extension.
- [ ] I am fixing a misclassified language
  - [ ] I have included a new sample for the misclassified language:
    - Sample source(s):
      - [URL to each sample source, if applicable]
    - Sample license(s):
  - [ ] I have included a change to the heuristics to distinguish my language from others using the same extension.
- [ ] I am changing the source of a syntax highlighting grammar
  - Old: https://github-lightshow.herokuapp.com/
  - New: https://github-lightshow.herokuapp.com/
- [ ] I am adding new or changing current functionality
  - [ ] I have added or updated the tests for the new or changed functionality.
LGTM, but this will almost certainly require changes on GitHub's side.
/cc @lildude
> LGTM, but this will almost certainly require changes on GitHub's side.
Yup. I'll need to find time to catch the exception as the code currently expects the empty hash.
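As a rough sketch of what catching the exception on the caller's side could look like: the wrapper below keeps the old "empty hash" contract for code that still expects it. `TreeTooLargeError` and `stats_or_empty` are hypothetical names, not part of Linguist or of GitHub's internal code.

```ruby
# Hypothetical error class standing in for whatever the patch raises.
class TreeTooLargeError < StandardError; end

# Runs the given stats computation and falls back to an empty hash,
# logging the reason, when the tree is too large to analyze.
def stats_or_empty
  yield
rescue TreeTooLargeError => e
  warn "skipping language stats: #{e.message}"
  {}
end
```

A caller would wrap its existing stats call in the block, so only the too-large case is converted back to `{}` and any other error still propagates.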
This pull request has been automatically marked as stale because it has not had recent activity, and will be closed if no further activity occurs. If this pull request was overlooked, forgotten, or should remain open for any other reason, please reply here to call attention to it and remove the stale status. Thank you for your contributions.
Still don't have the bandwidth for this work on the GitHub side - will label to stop the pings.
Hello,
We ran into this issue today, and it took us some time to figure out what was wrong.
The error message proposed by this PR would have saved us quite a lot of time.
Btw, what's the reason for this hardcoded MAX_TREE_SIZE limitation?
Since there is no way to override it, we need to patch the gem code to be able to use this project on our repo :/
@aallrd I'd welcome a pull request to make this threshold a parameter that defaults to the current value.
@pchaigno would you be open to a PR to simply remove this threshold limitation?
> @pchaigno would you be open to a PR to simply remove this threshold limitation?
I'd reject that 😁
The limit is in place because very large trees hurt performance and lead to timeouts on GitHub.com. One repo of more than 100k files on one machine is fine. Millions of them, like on GitHub.com, may become a problem very quickly.
@lildude makes sense, thank you for the explanation behind this limitation 👍 I'll see what I can do to add a flag to override this default value.
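One possible shape for such an override, as a sketch only: the threshold becomes a keyword argument that defaults to the current value. `StatsComputer`, `too_large?`, and `DEFAULT_MAX_TREE_SIZE` are hypothetical names for illustration, and the `100_000` default is assumed from the "100k files" figure above.

```ruby
# Hypothetical class; not Linguist's actual API.
class StatsComputer
  # Assumed current limit, kept as the default so existing callers are unaffected.
  DEFAULT_MAX_TREE_SIZE = 100_000

  attr_reader :max_tree_size

  def initialize(max_tree_size: DEFAULT_MAX_TREE_SIZE)
    @max_tree_size = max_tree_size
  end

  # True when a tree with this many entries exceeds the configured limit.
  def too_large?(entry_count)
    entry_count > max_tree_size
  end
end
```

For example, `StatsComputer.new(max_tree_size: 500_000)` would accept a larger repository on a single machine, while `StatsComputer.new` keeps the existing behavior.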