`list index out of range`
Hi,
I am trying to run auto-corpus on some HTML files downloaded from PMC (e.g. PMC1201259). The run fails with `list index out of range`. I am not sure what I might be doing wrong and would appreciate any help.
-DT
Hi,
I am running the following command on macOS in iTerm2: `auto-corpus -c autocorpus/configs/config_pmc.json -t "outputTest" -f tests/data/public/html/PMC/PMC8885717.html`. I have tried both this input file, which is the one provided in the repo, and my own input HTML files.
A couple of points:
a) For non-experts, it is frustrating to sift through the code to find the error. The log file contains exactly what standard output shows (INFO, ERROR, WARNING lines) and nothing more. This is counterproductive for troubleshooting.
b) There is no explanation of arguments such as -b. Sure, it is a config parameter, but what options can be given there other than PMC (beyond what is already shown in the examples)? IMO this is, again, counterintuitive.
c) Modifying the code does not help either. The changes are not reflected when I rebuild the package.
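For points a) and c), this is roughly the kind of diagnostic I was hoping the tool or its docs would point me to. It is only a minimal sketch using the Python standard library, not anything taken from the auto-corpus code. It assumes the package imports as `autocorpus` (my guess from the config path above); the logger name and the `first_table` function are made up for illustration.

```python
import importlib.util
import logging
import traceback

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("autocorpus_debug")


def module_origin(name):
    """Return the file Python would import `name` from, or None if not found."""
    spec = importlib.util.find_spec(name)
    return spec.origin if spec is not None else None


def run_with_traceback(func, *args, **kwargs):
    """Call func; on failure, log the full traceback and return it as a string."""
    try:
        func(*args, **kwargs)
    except Exception:
        # logger.exception records the message plus the complete traceback.
        logger.exception("call to %s failed", getattr(func, "__name__", func))
        return traceback.format_exc()
    return None


def first_table(tables):
    # Made-up stand-in for a parsing step; raises IndexError on an empty list.
    return tables[0]


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Assumed import name; prints None if the package is not importable.
    print(module_origin("autocorpus"))
    print(run_with_traceback(first_table, []))
```

If `module_origin` points at a copy under `site-packages` rather than at my checkout, that would explain why my edits never show up. The traceback from `run_with_traceback` would at least name the file and line where the index error is raised, which the current log does not.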
On top of that, if this really is a simple mistake in how I framed the command, or some other silly error on my part as an end user, it is even more frustrating: a lot of time has gone into troubleshooting without getting anywhere, when a bit of documentation or a better explanation of how to run the tool could have helped.
This tool is useful and, IMO, very much needed, since there aren't other tools that can help convert HTML to BioC, a format that is useful for parsing PMC full text. But even extensive documentation doesn't help if the crucial bits on how to troubleshoot the tool are missing.
-DT