Chanjo failed to link ccds.15.grch37p13.extended.bed
Chanjo version 4.2.0, installed via pip. This is the log:
chanjo init -a -f
2018-08-31 11:38:49 cybertron chanjo.init.cli[10816] INFO setting up chanjo under: /data/coverage180830
2018-08-31 11:38:49 cybertron chanjo.init.bootstrap[10816] INFO downloading... [https://s3.eu-central-1.amazonaws.com/chanjo/ccds15.grch37p13.extended.bed.zip]
2018-08-31 11:38:50 cybertron chanjo.init.bootstrap[10816] INFO extracting BED file...
2018-08-31 11:38:50 cybertron chanjo.init.bootstrap[10816] INFO removing BED archive...
2018-08-31 11:38:50 cybertron chanjo.init.cli[10816] INFO configure new chanjo database: sqlite:////data/coverage180830/chanjo.coverage.sqlite3
2018-08-31 11:38:51 cybertron chanjo.store.api[10816] INFO created tables: sample, transcript, transcript_stat
2018-08-31 11:38:51 cybertron chanjo.init.cli[10816] INFO writing config file: /data/coverage180830/chanjo.yaml
Chanjo bootstrap successful! Now run:
chanjo --config /data/coverage180830/ccds.15.grch37p13.extended.bed
(chanjo) coverage180830 ➤ chanjo --config /data/coverage180830/ccds.15.grch37p13.extended.bed
adding transcripts [------------------------------------] 0%
Traceback (most recent call last):
File "/home/user/bin/opt/anaconda/envs/chanjo/bin/chanjo", line 11, in <module>
sys.exit(root())
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 722, in __call__
return self.main(*args, **kwargs)
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 697, in main
rv = self.invoke(ctx)
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 895, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 535, in invoke
return callback(*args, **kwargs)
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/decorators.py", line 17, in new_func
return f(get_current_context(), *args, **kwargs)
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/chanjo/load/cli.py", line 60, in link
for tx_model in bar:
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/_termui_impl.py", line 259, in next
rv = next(self.iter)
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/chanjo/load/link.py", line 24, in <genexpr>
models = (make_model(tx_id, exons) for tx_id, exons in transcripts.items())
File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/chanjo/load/link.py", line 40, in make_model
gene_id = int(exons[0]['elements'][transcript_id]['gene_id'])
ValueError: invalid literal for int() with base 10: '-'
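This appears to be because the current version is intended to work with an hgnc.bed
file, which has a different format from ccds.15.grch37p13.extended.bed. In the CCDS file the sixth column holds the strand (-), where the HGNC file has a numeric gene ID, which would explain why int() fails on '-'.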
head ccds.15.grch37p13.extended.bed
X 100075402 100075464 X-100075405-100075462 0 - CCDS14473.1 CSTF2
X 100076512 100076595 X-100076515-100076593 0 - CCDS14473.1 CSTF2
X 100077236 100077410 X-100077239-100077408 0 - CCDS14473.1 CSTF2
X 100078277 100078418 X-100078280-100078416 0 - CCDS14473.1 CSTF2
X 100078871 100078995 X-100078874-100078993 0 - CCDS14473.1 CSTF2
X 100079105 100079247 X-100079108-100079245 0 - CCDS14473.1 CSTF2
X 100081619 100081747 X-100081622-100081745 0 - CCDS14473.1 CSTF2
X 100083025 100083092 X-100083028-100083090 0 - CCDS14473.1 CSTF2
X 100086500 100086646 X-100086503-100086644 0 - CCDS14473.1 CSTF2
X 100087719 100087899 X-100087722-100087897 0 - CCDS14473.1 CSTF2
And a sample from the HGNC BED file (https://github.com/Clinical-Genomics/chanjo/blob/master/chanjo/init/demo-files/hgnc.min.bed):
1 955550 955755 1-955552-955753 NM_198576 329 AGRN
1 957579 957844 1-957581-957842 NM_198576 329 AGRN
1 970655 970706 1-970657-970704 NM_198576 329 AGRN
1 976043 976262 1-976045-976260 NM_198576 329 AGRN
1 976551 976779 1-976553-976777 NM_198576 329 AGRN
1 976856 977084 1-976858-977082 NM_198576 329 AGRN
1 977334 977544 1-977336-977542 NM_198576 329 AGRN
1 978617 978839 1-978619-978837 NM_198576 329 AGRN
1 978916 979114 1-978918-979112 NM_198576 329 AGRN
1 979201 979405 1-979203-979403 NM_198576 329 AGRN
1 979487 979639 1-979489-979637 NM_198576 329 AGRN
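The difference between the two samples above can be checked with a small script. This is only a minimal sketch: it assumes the gene ID is read from the sixth whitespace-separated column, which is inferred from the traceback and the hgnc.min.bed layout rather than confirmed from chanjo's source. The function name `gene_id_column_is_numeric` is hypothetical and not part of chanjo.

```python
def gene_id_column_is_numeric(bed_line, column=5):
    """Return True if the given 0-based column of a BED line parses as an int.

    Assumption: the gene ID sits in the sixth column (index 5), as in the
    hgnc.min.bed sample. This helper is hypothetical, not part of chanjo.
    """
    fields = bed_line.split()
    if len(fields) <= column:
        return False
    try:
        int(fields[column])
    except ValueError:
        return False
    return True


# First data line of each file, copied from the samples above.
ccds_line = "X 100075402 100075464 X-100075405-100075462 0 - CCDS14473.1 CSTF2"
hgnc_line = "1 955550 955755 1-955552-955753 NM_198576 329 AGRN"

print(gene_id_column_is_numeric(ccds_line))  # False: column 6 is the strand "-"
print(gene_id_column_is_numeric(hgnc_line))  # True: column 6 is the gene ID 329
```

Running a check like this on the first line of the downloaded file before calling the link step would flag the format mismatch without going through the full traceback.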
So, if the HGNC format is preferred, please make the init process download an HGNC BED file instead.