
Unicode error when installing DGIdb

Open jackn11 opened this issue 2 years ago • 0 comments

When running python migrator.py -n 4 --force True, I get the following error:

Opening DGIdb...

  Downloading dataset
100% [..........................................................................] 2809387 / 2809387  Finished downloading
  Starting with drugs.
  Drugs inserted! (57497 entries)
  Downloading drug-gene interactions dataset
100% [..........................................................................] 9512574 / 9512574  Finished downloading
Traceback (most recent call last):
  File "C:\Users\jackn\TypeDBBio\typedb-bio\migrator.py", line 60, in <module>
    migrate_dgibd(session, NUM_DR, NUM_INT, args.num_threads, args.commit_batch)
  File "C:\Users\jackn\TypeDBBio\typedb-bio\Migrators\DGIdb\DGIdbMigrator.py", line 18, in migrate_dgibd
    insert_interactions(session, num_int, num_threads, batch_size)
  File "C:\Users\jackn\TypeDBBio\typedb-bio\Migrators\DGIdb\DGIdbMigrator.py", line 68, in insert_interactions
    raw_file = openFile(file, num_int)
  File "C:\Users\jackn\TypeDBBio\typedb-bio\Migrators\Helpers\open_file.py", line 9, in openFile
    for row in csvreader:
  File "C:\Users\jackn\AppData\Local\Programs\Python\Python310\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 4295: character maps to <undefined>
(.venv) PS C:\Users\jackn\TypeDBBio\typedb-bio>
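
The traceback passes through cp1252.py, which suggests the file is being opened with the Windows default encoding rather than UTF-8. Byte 0x90 has no mapping in cp1252, so decoding fails at that point. The following is only a minimal sketch of a possible fix, assuming the downloaded dataset is UTF-8 encoded and tab-separated (neither is confirmed here). open_rows is a hypothetical stand-in for openFile in Migrators\Helpers\open_file.py, not the project's actual code.

```python
import csv
import os
import tempfile


def open_rows(path, limit=None, encoding="utf-8"):
    """Hypothetical stand-in for openFile: read rows with an explicit encoding."""
    rows = []
    # Passing encoding= avoids falling back to the platform default (cp1252 on Windows).
    with open(path, newline="", encoding=encoding) as handle:
        for index, row in enumerate(csv.reader(handle, delimiter="\t")):
            if limit is not None and index >= limit:
                break
            rows.append(row)
    return rows


# Build a tiny sample file whose UTF-8 bytes include 0x90 (U+0390 encodes as CE 90).
sample = "drug\tgene\nalpha-\u0390\tTP53\n"
with tempfile.NamedTemporaryFile("wb", suffix=".tsv", delete=False) as tmp:
    tmp.write(sample.encode("utf-8"))
    sample_path = tmp.name

# Decoding as cp1252 reproduces the UnicodeDecodeError from the traceback.
try:
    open_rows(sample_path, encoding="cp1252")
    cp1252_failed = False
except UnicodeDecodeError:
    cp1252_failed = True

# Decoding as UTF-8 reads the same file without error.
rows = open_rows(sample_path)
os.remove(sample_path)
print(cp1252_failed, rows)
```

If the dataset turns out not to be UTF-8, the same approach applies with whatever encoding it actually uses.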

jackn11, Jul 18 '22 22:07