NUTCH-3043 Generator: count URLs rejected by URL filters

Open sebastian-nagel opened this issue 1 year ago • 2 comments

  • add counters URL_FILTERS_REJECTED and URL_FILTER_EXCEPTION
  • simplify logging statement
  • remove unnecessary cast
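
For illustration only, the counting pattern behind the first bullet might look roughly like the sketch below. It is not the actual patch: it uses plain Java stand-ins instead of the Nutch and Hadoop classes, and the class name `GeneratorCounterSketch`, the method `filterAndCount`, the toy filter, and the `Map`-based counters are all hypothetical. The only assumptions taken from the change description are the two counter names, plus the convention that a filter signals rejection by returning `null` and failure by throwing.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustrative sketch only: plain Java stand-ins, not the Nutch or Hadoop API.
class GeneratorCounterSketch {

  // Counter names taken from the change description above.
  static final String REJECTED = "URL_FILTERS_REJECTED";
  static final String EXCEPTION = "URL_FILTER_EXCEPTION";

  /**
   * Applies a (hypothetical) filter to every URL and returns the accepted ones.
   * The filter signals rejection by returning null and failure by throwing.
   */
  static List<String> filterAndCount(List<String> urls,
      UnaryOperator<String> filter, Map<String, Long> counters) {
    List<String> accepted = new ArrayList<>();
    for (String url : urls) {
      try {
        String filtered = filter.apply(url);
        if (filtered == null) {
          // Rejected by the filter: count it and skip the URL.
          counters.merge(REJECTED, 1L, Long::sum);
          continue;
        }
        accepted.add(filtered);
      } catch (RuntimeException e) {
        // The filter itself failed: count it separately and skip the URL.
        counters.merge(EXCEPTION, 1L, Long::sum);
      }
    }
    return accepted;
  }

  public static void main(String[] args) {
    // Toy filter (assumption): accept only https URLs, fail on empty input.
    UnaryOperator<String> filter = url -> {
      if (url.isEmpty()) {
        throw new IllegalArgumentException("empty URL");
      }
      return url.startsWith("https://") ? url : null;
    };
    Map<String, Long> counters = new HashMap<>();
    List<String> accepted = filterAndCount(
        List.of("https://example.org/", "ftp://example.org/", ""),
        filter, counters);
    System.out.println(accepted + " " + counters);
  }
}
```

In a real Hadoop job the `Map` would presumably be replaced by the task context's counters, for example `context.getCounter(group, name).increment(1)`. The counter group to use is not stated in this thread.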

sebastian-nagel, Apr 25 '24 15:04

Hi @lewismc:

  • "use parameterized logging": done
  • "augment the metrics documentation once this is merged.": will do
  • "we could also create a test for the counters.": for now, TestGenerator is not based on MRUnit. The various Generator::generate(...) methods return the number of generated segments, with no way to access the counters (they are logged, however). I'd prefer to track this in a separate issue, because reading the counters would require too many code changes.

sebastian-nagel, Apr 27 '24 13:04

Excellent, @sebastian-nagel 👍 I agree.

lewismc, Apr 28 '24 17:04

Thanks, @lewismc! The metrics wiki page was updated.

sebastian-nagel, May 14 '24 15:05