Feature suggestion: GenericParser mixin
On second thought after implementing #2329 I think the implementation of time_format parameter (and possibly other parameters used in generic parsers) could be a reasonable base for a new GenericParser mixin. Therefore it would be only implemented once. (I realized this now because I copied the if condition for the PR from HTML table parser and broke the DRY rule.)
What do you think @sebix @kamil-certat ?
Other parameters possibly suitable for GenericParser mixin is type_translation, default_url_protocol, ignore_values, data_type.
If we have a GenericParser Mixin with the five parameters/features listed by you, how many could use this mixin == use all of these parameters? Which functionality would the Mixin implement?
At least the validation of the parameters.
Yes it would be used only for a few bots such as HTML Table, CSV, possibly JSON and KeyValue parser. But still, it's less code without repeating it.
Hey,
I'm unsure about the implementation - mixin, a module with functions handling conversion or even descriptors - but such things should be definitely extracted. But to be honest, in your case - why you didn't use DateTime.convert(...)? I think this is a static method intended to be used to avoid code duplication. You could eventually think about a helper or so that raises the InvalidArgument when the format is wrong (or convert method failed, this is also a pythonic way 'first try, then handle'). Or maybe I have missed something?
However, I would also point out, that we may think about switching from custom parsing on everything, to using e.g. marshmallow for schema validation.
Thank you, you are right, using the DateTime.convert() is definitely the better way to do it. I will update the PR.
Frankly, I just noticed that HTML Table parser uses TIME_CONVERSIONS from DateTime and CSV parser uses it's own TIME_CONVERSIONS. So I made it the same way as HTML Table (it was the most straightforward solution). Quick and (unfortunately) dirty.
As for using the marshmallow I was thinking the same thing that such thing would be really nice to have. However I was considering pydantic (which I think would also sit better with the IntelMQ API).
Yes it would be used only for a few bots such as HTML Table, CSV, possibly JSON and KeyValue parser.
But they don't use all of these parameters. Even if one Mixin provides these parameters, the parsers themselves would again need to remove some of them.
But still, it's less code without repeating it.
I fully comply with this goal :)
It was just a suggestion of parameters I think could be useful for all generic parsers. So yes, they don't use them know, but I think they could in the future and it would be without adding any code, just moving it around a bit. :) (probably even reducing the amount of code)