Philippe Ombredanne

Results 987 comments of Philippe Ombredanne

@tardyp note that I have done quite a bit of research on how to parse Gradle builds, at least the Groovy kind, and we could likely consider the Kotlin kind...

@tardyp FYI @JonoYang is contributing some support for gradle in #2822

I think we should also first support the standard Gradle lockfile: https://docs.gradle.org/current/userguide/dependency_locking.html

- Names: `gradle.lockfile` and `buildscript-gradle.lockfile`
- Content: This is an ini or properties-like file:

> Each line still...
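As a rough illustration of the "properties-like" layout mentioned in this comment, here is a minimal Python sketch of reading such a lockfile. It assumes each non-comment line has the form `group:name:version=config1,config2`, plus an `empty=` line listing configurations with no locked dependencies. The function name `parse_gradle_lockfile` and the sample content are hypothetical and only for illustration; this is not ScanCode's actual implementation.

```python
def parse_gradle_lockfile(text):
    """
    Return a dict mapping each configuration name to a sorted list of
    'group:name:version' coordinates locked for that configuration.
    Assumption: lines look like 'coords=conf1,conf2'; '#' starts a comment.
    """
    by_config = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        coords, sep, configs = line.partition("=")
        if not sep:
            # Not a key=value line: ignore it in this sketch
            continue
        names = [c.strip() for c in configs.split(",") if c.strip()]
        for name in names:
            locked = by_config.setdefault(name, [])
            # 'empty' lists configurations that have no locked dependencies
            if coords != "empty":
                locked.append(coords)
    return {name: sorted(deps) for name, deps in by_config.items()}


# Hypothetical sample content, for illustration only
SAMPLE = """\
# This is a Gradle generated file for dependency locking.
org.springframework:spring-beans:5.0.5.RELEASE=compileClasspath,runtimeClasspath
org.springframework:spring-core:5.0.5.RELEASE=compileClasspath
empty=annotationProcessor
"""

locked = parse_gradle_lockfile(SAMPLE)
for config, deps in sorted(locked.items()):
    print(config, deps)
```

Grouping by configuration is one possible design choice; a package-centric view (coordinates mapped to their configurations) would work just as well for dependency reporting.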

That's an interesting class of errors! I guess that they all come from binaries? And because it is useful, we cannot stop detecting in binaries. Some remarks: These are at...

Note that the adoption of https://github.com/nexB/pygmars/ as a replacement for NLTK should allow easier reuse and integration of other libraries in the lexing process, including NER and gibberish detection....

Another candidate for gibberish that works quite well is https://github.com/domanchi/gibberish-detector

I ran this with [bad-copyright-detections.txt](https://github.com/nexB/scancode-toolkit/files/5985058/bad-copyright-detections.txt)

- `pip install gibberish-detector`
- `gibberish-detector train examples/big.txt > big.model`
- in Python:

```Python
from gibberish_detector import detector
Detector = detector.create_from_model('big.model')
data = sorted(set(open('bad-copyright-detections.txt').read().split()))
for...
```

very nice! what's your take on applicability to license then? Did you apply some boosting to legalese words?