
IndexError while running ccsmeth call_mods

Open suvi93 opened this issue 1 year ago • 10 comments

I'm getting the following error, after which the job seems to idle: it does not exit with an error but keeps running without producing any output. The error is below:

===============================================
2024-01-10 13:37:00 - INFO - [main]call_mods starts
2024-01-10 13:37:00 - INFO - cuda availability: True
2024-01-10 13:37:02 - INFO - format_features process-10612 starts
2024-01-10 13:37:02 - INFO - call_mods process-10615 starts
2024-01-10 13:37:02 - INFO - write_process-10617 starts
2024-01-10 13:37:02 - INFO - read_features process-10602 starts
2024-01-10 13:37:02 - INFO - format_features process-10608 starts
2024-01-10 13:37:02 - INFO - format_features process-10607 starts
2024-01-10 13:37:02 - INFO - call_mods process-10616 starts
Process Process-1:
Traceback (most recent call last):
  File "/crex/proj/snic2021-6-151/nobackup/Suvi/miniconda3/envs/ccsmethenv/lib/python3.8/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/crex/proj/snic2021-6-151/nobackup/Suvi/miniconda3/envs/ccsmethenv/lib/python3.8/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/crex/proj/snic2021-6-151/nobackup/Suvi/miniconda3/envs/ccsmethenv/lib/python3.8/site-packages/ccsmeth/_call_modifications_txt.py", line 74, in _read_features_file_to_str
    h_num_total = _count_holenum(features_file)
  File "/crex/proj/snic2021-6-151/nobackup/Suvi/miniconda3/envs/ccsmethenv/lib/python3.8/site-packages/ccsmeth/_call_modifications_txt.py", line 61, in _count_holenum
    holeid = words[3]
IndexError: list index out of range
2024-01-10 13:37:02 - INFO - format_features process-10614 starts
2024-01-10 13:37:02 - INFO - format_features process-10603 starts
2024-01-10 13:37:02 - INFO - format_features process-10609 starts
2024-01-10 13:37:02 - INFO - format_features process-10606 starts
2024-01-10 13:37:02 - INFO - format_features process-10613 starts
2024-01-10 13:37:02 - INFO - format_features process-10604 starts
2024-01-10 13:37:02 - INFO - format_features process-10605 starts
2024-01-10 13:37:02 - INFO - format_features process-10611 starts

How do I go about fixing this IndexError?

Thanks, Suvi

suvi93 avatar Jan 10 '24 12:01 suvi93
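
The traceback above ends at `holeid = words[3]` inside `_count_holenum`, which suggests that at least one line of the features file passed to call_mods splits into fewer than four fields. A minimal sketch for locating such lines follows. It assumes the file is tab-separated (an assumption, adjust `sep` if it is not), and `find_short_lines` is a hypothetical helper written for illustration, not part of ccsmeth.

```python
from typing import List, Tuple


def find_short_lines(path: str, min_fields: int = 4,
                     sep: str = "\t") -> List[Tuple[int, int]]:
    """Return (line_number, field_count) for every line with too few fields.

    Hypothetical diagnostic helper, not part of ccsmeth. The tab separator
    is an assumption; min_fields=4 mirrors the words[3] access shown in
    the traceback.
    """
    short = []
    with open(path, "r") as handle:
        for lineno, line in enumerate(handle, start=1):
            # Strip only the newline so that empty fields are still counted.
            words = line.rstrip("\n").split(sep)
            if len(words) < min_fields:
                short.append((lineno, len(words)))
    return short
```

Calling `find_short_lines("features.tsv")` (the file name is a placeholder) lists the offending line numbers. Blank or truncated lines there would explain the IndexError. The idling job is consistent with the reader process having died while the other processes keep waiting for input, though that is only an inference from the log.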

I'm getting the same error, but it's not at the same location. Here is my backtrace:

▆
  1. ├─paste0("who_atc_", atc_root) %>% ...
  2. ├─global wrapRDS(., by_name = TRUE, scrape_who_atc(atc_root))
  3. │ └─base::eval(substitute(exprs), envir = new.env(parent = parent.frame(n = 1)))
  4. │ └─base::eval(substitute(exprs), envir = new.env(parent = parent.frame(n = 1)))
  5. │ └─global scrape_who_atc(atc_root)
  6. │ └─... %>% html_text
  7. ├─rvest::html_text(.)
  8. │ └─xml2::xml_text(x, trim = trim)
  9. └─dplyr::nth(., 3)
  10. └─vctrs::vec_size(x)

Maria-M-Oliveira avatar Nov 02 '23 13:11 Maria-M-Oliveira

Hi @hametner, @Maria-M-Oliveira; I got the very same error.

A possible fix is to swap html_text() and nth(3) in the if statement right below the "# Add root node if needed" comment, i.e., lines https://github.com/TheJena/WHO-ATC-scraper/blob/16a65dcc142ccebd792de3a98c040b065419b98d/atcd.R#L190 and https://github.com/TheJena/WHO-ATC-scraper/blob/16a65dcc142ccebd792de3a98c040b065419b98d/atcd.R#L192, taking care to keep the %>% operator between the two swapped functions.

For simplicity, I forked the repository; this is the commit of interest: https://github.com/TheJena/WHO-ATC-scraper/commit/16a65dcc142ccebd792de3a98c040b065419b98d

TheJena avatar Feb 03 '24 22:02 TheJena

Thank you for the report. This bug has been fixed in the latest version I uploaded.

fabkury avatar Aug 01 '24 17:08 fabkury