Add load parameter to capture skipped rows metadata
Hey,
I understand that this feature is outside the scope of tabulator (https://github.com/frictionlessdata/tabulator-py/issues/331), but I think it would be an important feature to implement in load. The proposed parameter works as follows:
It takes a list of dicts, each containing a regular expression string with one capture group and a string holding a column name. Each regular expression is matched against every skipped row in the data. When a row matches, a new column is created with column_name as its name and the captured group's value as its value.
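To make the proposal concrete, here is a rough standalone sketch of the behaviour in plain Python. It does not use the dataflows or tabulator APIs. The function names, the `regex` dict key, the use of the first matching row, and the assumption that each skipped row is available as a single string are all hypothetical choices made for illustration only.

```python
import re


def capture_skipped_rows(skipped_rows, captures):
    """Build {column_name: captured_value} from skipped rows.

    Hypothetical helper: `skipped_rows` is assumed to be a list of strings,
    and each item in `captures` a dict with the keys "regex" and "column_name".
    """
    extra = {}
    for spec in captures:
        pattern = re.compile(spec["regex"])
        # The proposal asks for exactly one capture group per expression.
        if pattern.groups != 1:
            raise ValueError("regex must contain exactly one capture group")
        for row in skipped_rows:
            match = pattern.search(row)
            if match:
                # Assumption: the first matching skipped row wins.
                extra[spec["column_name"]] = match.group(1)
                break
    return extra


def add_captured_columns(rows, extra):
    """Yield each data row (a dict) with the captured values added as columns."""
    for row in rows:
        yield {**row, **extra}


# Example input: two metadata lines that a loader would normally skip.
skipped = ["# Report generated 2019-05-01", "# Source: station 42"]
captures = [
    {"regex": r"generated (\d{4}-\d{2}-\d{2})", "column_name": "report_date"},
    {"regex": r"station (\d+)", "column_name": "station"},
]
extra = capture_skipped_rows(skipped, captures)
rows = list(add_captured_columns([{"temp": 12.5}, {"temp": 13.1}], extra))
```

In this sketch every data row gets the same `report_date` and `station` values, since the metadata comes from the skipped header lines rather than from the row itself. What to do when no skipped row matches (leave the column out, as here, or fill it with an empty value) is an open design question.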
If you don't think this would be useful for the general DPP/dataflows community, let me know and I can implement it in our own custom load processor.
@roll @akariv
@cschloer Let's discuss on Monday where best to put it (a PR to dataflows, a custom processor, etc.)