caml-mimic
Error in concat_and_split.py function split_data
Everything runs fine in the notebook for mimic3 until this call:
tr, dv, te = concat_and_split.split_data(fname, base_name=base_name)
notes_labeled.csv and disch_full.csv are generated successfully, but the call fails at hadm_id = row[1]. It looks like there is an empty row somewhere near the header, no?
SPLITTING
0 read

IndexError Traceback (most recent call last)
~\Documents\GitHub\caml-mimic\dataproc\concat_and_split.py in split_data(labeledfile, base_name)
     75 print(str(i) + " read")
     76
---> 77 hadm_id = row[1]
     78
     79 if hadm_id in hadm_ids['train']:

IndexError: list index out of range
I'm facing the exact same issue. Did you figure out a solution for this?
I solved this issue by skipping the empty rows: only process a row if len(row) > 1.
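A minimal sketch of that workaround, written here as a standalone helper rather than as the actual change made to split_data. The helper name iter_nonempty_rows, the header names, and the sample data are made up for illustration; only the len(row) > 1 check comes from the comment above.

```python
import csv
import io


def iter_nonempty_rows(reader):
    """Yield only rows with at least two fields, so that row[1] is safe."""
    for row in reader:
        if len(row) > 1:
            yield row


# In-memory stand-in for notes_labeled.csv with a blank row after every
# record (header names and values are illustrative assumptions).
sample = (
    "SUBJECT_ID,HADM_ID,TEXT,LABELS\r\n"
    "\r\n"
    "1,100,some text,401.9\r\n"
    "\r\n"
    "2,200,more text,250.00\r\n"
)

rows = list(iter_nonempty_rows(csv.reader(io.StringIO(sample, newline=""))))
hadm_ids = [row[1] for row in rows[1:]]  # rows[0] is the header
print(hadm_ids)
```

Inside split_data the same effect could be had with an early `continue` for short rows at the top of the loop, before hadm_id = row[1] is evaluated.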
Same issue. Did anyone modify the function split_data? @Benzenoil, did you change this in the notebook or in concat_and_split.py?
@NeelKanwal
Yes, I just changed this in concat_and_split.py.
Or you may try Ubuntu, since I did not face this issue when I ran the code under Ubuntu.
Thanks,
I tried it but the error does not change. I tried running it in Jupyter on my local machine as well as in Google Colab. Everything else, such as the constants and the placement of the MIMIC data files, is correct, so this is strange. I can try it on Ubuntu, but the code seems to be system independent, as described in the readme.
If you check the generated notes_labeled.csv, you will find that there is an empty row between every two records. It is these empty rows that cause row[1] to raise an IndexError.
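One way to confirm this without opening the file in an editor is to count the empty rows that csv.reader returns. The sketch below uses a hypothetical helper count_rows and a small temporary demo file in place of the real notes_labeled.csv; the demo bytes are an assumed example of a file with a blank row after every record.

```python
import csv
import os
import tempfile


def count_rows(path):
    """Return (non_empty, empty) row counts for a CSV file."""
    non_empty = empty = 0
    # newline="" lets the csv module see the line endings exactly as stored
    with open(path, "r", newline="") as f:
        for row in csv.reader(f):
            if row:
                non_empty += 1
            else:
                empty += 1
    return non_empty, empty


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.csv")
    # Each record ends in \r\r\n, which reads back as a record plus a blank row
    with open(demo_path, "wb") as f:
        f.write(b"a,b\r\r\n1,2\r\r\n")
    counts = count_rows(demo_path)

print(counts)
```

Running count_rows on the generated notes_labeled.csv should report roughly as many empty rows as records if the file has the problem described above.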
But how were the empty rows generated? Due to line 38 in concat_and_split.py, I guess?
w.writerow([subj_id, str(hadm_id), text, ';'.join(cur_labels)])
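A possible explanation, not confirmed in this thread: Python's csv.writer ends each row with \r\n by default, and if the output file was opened in text mode without newline="", Windows translates the \n again and the row ends in \r\r\n, which reads back as an extra blank row. That would also fit the report that the problem does not appear on Ubuntu. The sketch below imitates the Windows translation on any platform by passing newline="\r\n"; the helper write_rows and the sample rows are illustrative only.

```python
import csv
import os
import tempfile


def write_rows(path, rows, newline):
    """Write rows with csv.writer, opening the file with the given newline mode."""
    with open(path, "w", newline=newline) as f:
        w = csv.writer(f)
        for row in rows:
            w.writerow(row)


rows = [["1", "100", "some text", "401.9"], ["2", "200", "more text", "250.00"]]

with tempfile.TemporaryDirectory() as tmp:
    bad_path = os.path.join(tmp, "bad.csv")
    good_path = os.path.join(tmp, "good.csv")
    # newline="\r\n" mimics the default text-mode translation on Windows
    write_rows(bad_path, rows, newline="\r\n")
    # newline="" turns translation off, as the csv documentation recommends
    write_rows(good_path, rows, newline="")
    with open(bad_path, "rb") as f:
        bad_bytes = f.read()
    with open(good_path, "rb") as f:
        good_bytes = f.read()

print(bad_bytes)
print(good_bytes)
```

If this is indeed the cause, opening the output file with newline="" wherever w is created would stop the blank rows at the source, and the len(row) > 1 check would only be needed for files that were already written.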