nutch
NUTCH-3043 Generator: count URLs rejected by URL filters
- add counters URL_FILTERS_REJECTED and URL_FILTER_EXCEPTION
- simplify logging statement
- remove unnecessary cast
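
To illustrate the idea behind the two counters, here is a minimal, self-contained sketch in plain Java. It is not the code from this patch: it does not use the Hadoop counter API, and the `UrlFilter` interface, the `GeneratorCounter` enum and the `countRejections` method are hypothetical names introduced only for illustration. It assumes that a filter returns `null` to reject a URL and may throw an exception; only the counter names come from the commit message.

```java
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

public class FilterCounterSketch {

  /** Counter names from the commit message; the enum itself is hypothetical. */
  enum GeneratorCounter { URL_FILTERS_REJECTED, URL_FILTER_EXCEPTION }

  /** Hypothetical stand-in for a URL filter chain: returns null to reject a URL. */
  interface UrlFilter {
    String filter(String url) throws Exception;
  }

  /** Runs every URL through the filter and counts rejections and failures. */
  static Map<GeneratorCounter, Long> countRejections(List<String> urls, UrlFilter filter) {
    Map<GeneratorCounter, Long> counters = new EnumMap<>(GeneratorCounter.class);
    for (GeneratorCounter c : GeneratorCounter.values()) {
      counters.put(c, 0L);
    }
    for (String url : urls) {
      try {
        if (filter.filter(url) == null) {
          // The filter rejected the URL.
          counters.merge(GeneratorCounter.URL_FILTERS_REJECTED, 1L, Long::sum);
        }
      } catch (Exception e) {
        // The filter failed; count the failure separately from a rejection.
        counters.merge(GeneratorCounter.URL_FILTER_EXCEPTION, 1L, Long::sum);
      }
    }
    return counters;
  }

  public static void main(String[] args) {
    // Toy filter: accepts https URLs, rejects others, fails on empty input.
    UrlFilter filter = url -> {
      if (url.isEmpty()) {
        throw new IllegalArgumentException("empty URL");
      }
      return url.startsWith("https://") ? url : null;
    };
    List<String> urls = List.of("https://example.org/", "ftp://example.org/", "");
    System.out.println(countRejections(urls, filter));
  }
}
```

Keeping the two counts separate makes it possible to tell URLs dropped by configuration apart from URLs dropped because a filter failed.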
Hi @lewismc:
- "use parameterized logging": done
- "augment the metrics documentation once this is merged.": will do
- "we could also create a test for the counters.": for now, TestGenerator is not based on MRUnit. The various Generator::generate(...) methods return the number of generated segments without a way to access the counters (they're logged, however). I'd prefer to track this in a separate issue, because reading the counters would require too many code changes.
Excellent @sebastian-nagel 👍 I agree
Thanks, @lewismc! The metrics wiki page was updated.