Success exit code even on fatal error
I have code that shells out to trafilatura for a given URL. It would be nice to be able to tell whether trafilatura succeeded. Currently, the binary's exit code does not reflect success. If I run trafilatura on a page that fails, e.g.,
trafilatura --json -u "https://www.sec.gov/Archives/edgar/data/1418091/000110465922078413/tm2220599d1_ex99-p.htm"
I see no output at all, and the exit code is 0, i.e., echo $? prints 0.
It would be better to have trafilatura do a sys.exit(-1) or something whenever a fatal error occurs. My current workaround is to treat JSON parsing errors as trafilatura extraction errors, since the empty string is invalid JSON.
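For reference, here is a minimal sketch of that workaround in Python. It assumes trafilatura is on the PATH and that a successful `--json` run prints a single JSON object to stdout. The function names are hypothetical, not part of trafilatura. It also checks the exit code first, so it would keep working if the CLI starts returning a nonzero code on failure.

```python
import json
import subprocess
from typing import Optional


def parse_trafilatura_output(returncode: int, stdout: str) -> Optional[dict]:
    """Return the parsed JSON document, or None if the run looks like a failure.

    Hypothetical helper: not part of trafilatura itself.
    """
    # A nonzero exit code is treated as a failure outright.
    if returncode != 0:
        return None
    # Workaround for an exit code of 0 on failure: empty output is invalid JSON.
    try:
        data = json.loads(stdout)
    except json.JSONDecodeError:
        return None
    # Assumption: a successful run prints one JSON object.
    return data if isinstance(data, dict) else None


def extract(url: str) -> Optional[dict]:
    """Shell out to the trafilatura CLI and return the parsed result or None."""
    proc = subprocess.run(
        ["trafilatura", "--json", "-u", url],
        capture_output=True,
        text=True,
    )
    return parse_trafilatura_output(proc.returncode, proc.stdout)
```

Calling `extract(url)` then gives either a dict or `None`, so the caller does not need to care which of the two failure signals fired.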
Thank you so much for your work!
Hi @rozbb, in this case the download seems to fail. Thanks for your suggestion, I agree that it would be best to return another exit code.
Hi @rozbb, the commit above should fix this. You can get the change by installing the latest version straight from the repository.
Fantastic, thank you! Seems to work in my test cases. Feel free to close.
I still need to add a line about it in the docs but will close the issue thereafter.