addlicense

use go-enry to identify generated, vendored and other types of code

Open: willnorris opened this issue 4 years ago • 1 comment

I just discovered go-enry, which GitHub is using in parts of their new code search. It looks like there may be a bit there that would be useful for automatically excluding generated, vendored, and binary files. I don't think they track comment style, so we couldn't replace that, but this other stuff looks useful.

willnorris, Dec 09 '21 00:12

This does seem useful. Detecting vendor directories would improve walk performance: we wouldn't need to traverse the whole node_modules tree, and users wouldn't each have to ignore it manually.

GetLanguage(filename, content) could be useful for formatting stdin to stdout (see #64) and for disambiguating file types that share an extension (.m could be Objective-C or MATLAB, for example). This would require some refactoring to work with "language" as a concept rather than simple filename matching.

flwyd, Apr 26 '25 18:04