Any known Python bindings?
Hi! I was wondering if there are any known Python bindings or APIs for the validator? It would be a great help if there are any.
Thanks
I've got a very simplistic script (not currently public) that launches the Scala validator as a subprocess from Python 3, checks the return code and writes out the validator output. Inputs and outputs are mostly hard-coded at the moment, so it would need a little work to turn it into a proper module, which would be a nicer approach. I'm certainly not aware of any pure Python implementation or more sophisticated API being available, though.
The calling module was something I vaguely had in mind as a task for a kind of Python code club we've got running within the organisation at the moment, once we're a bit further on.
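For anyone wanting a starting point, here is a minimal sketch of what such a calling module might look like. It is not the script described above. The function name `run_validator`, its parameters, and the argument order (CSV file first, then schema) are assumptions made for illustration; `command` stands for whatever launches the validator on your system.

```python
import subprocess
from pathlib import Path
from typing import Sequence


def run_validator(command: Sequence[str], csv_path: str, schema_path: str,
                  report_path: str) -> subprocess.CompletedProcess:
    """Run the validator as a subprocess and save its output to report_path.

    `command` is the (hypothetical) launcher for the validator, passed in by
    the caller so that nothing about the command line is hard-coded here.
    """
    result = subprocess.run(
        [*command, csv_path, schema_path],  # assumed order: CSV, then schema
        capture_output=True,
        text=True,
        check=False,  # leave interpretation of the return code to the caller
    )
    # Write whatever the validator printed to the report file.
    Path(report_path).write_text(result.stdout, encoding="utf-8")
    return result
```

Passing the launcher in as `command` keeps paths out of the module and makes it easy to substitute a stand-in command when testing.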
Yeah, I use it in a similar way, as a subprocess. But using subprocess feels like a rather awkward approach to me.
Probably the most important thing is understanding the return codes. I have:
    import subprocess

    # csvvalidator is assumed to be the CompletedProcess returned by subprocess.run()
    try:
        csvvalidator.check_returncode()
    except subprocess.CalledProcessError as err:
        if csvvalidator.returncode == 3:
            # this is not unexpected (a data validation issue), flag to user and continue
            print(batch, "has CSV validation errors reported, see", reportFile, "for details")
        elif csvvalidator.returncode == 2:
            # something wrong with the schema file; all validation uses the same schema,
            # so stop the run, raising the error
            print("CSV schema parsing error:", csvvalidator.stdout)
            raise err
    except subprocess.SubprocessError as suberr:
        # some completely unexpected error has occurred with the subprocess
        raise suberr
    # write out the report file and, if no error, confirm pass/pass with warnings for the batch
    if csvvalidator.returncode == 0:
        if csvvalidator.stdout == "PASS\n":
            print(batch, "passed CSV validation")
        else:
            print(batch, "passed CSV validation with warnings, see", reportFile, "for details")
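The same return-code logic could be pulled into a small pure function, which is easier to test than code that prints and raises. This is only a sketch: the `Outcome` enum and `classify` function are hypothetical names, and the meanings of codes 0, 2 and 3 are taken from the snippet above rather than from any official documentation. Unlike the snippet, it strips whitespace before comparing against `PASS`.

```python
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    PASSED_WITH_WARNINGS = "passed with warnings"
    SCHEMA_ERROR = "schema parsing error"
    VALIDATION_ERRORS = "validation errors"
    UNEXPECTED = "unexpected"


def classify(returncode: int, stdout: str) -> Outcome:
    """Map a validator return code and its stdout to an Outcome."""
    if returncode == 0:
        # A clean pass prints just "PASS"; anything else is treated as warnings.
        if stdout.strip() == "PASS":
            return Outcome.PASSED
        return Outcome.PASSED_WITH_WARNINGS
    if returncode == 2:
        return Outcome.SCHEMA_ERROR
    if returncode == 3:
        return Outcome.VALIDATION_ERRORS
    # Any other code is not covered by the snippet above.
    return Outcome.UNEXPECTED
```

A caller could then stop the run on `Outcome.SCHEMA_ERROR` or `Outcome.UNEXPECTED` and carry on to the next batch for the other outcomes.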
Thanks @DavidUnderdown, I’m on the subprocess train as well but my code was not as sophisticated as yours. A native implementation would indeed be nice, not only because of the more idiomatic use or for performance reasons, but also because having more than one implementation around is always a good thing for something that tries to establish itself as a standard.