Chanjo failed to link ccds.15.grch37p13.extended.bed

Open biocyberman opened this issue 6 years ago • 0 comments

Chanjo version 4.2.0 installed via pip. This is the log.

chanjo init -a -f                                                                                                                  
2018-08-31 11:38:49 cybertron chanjo.init.cli[10816] INFO setting up chanjo under: /data/coverage180830
2018-08-31 11:38:49 cybertron chanjo.init.bootstrap[10816] INFO downloading... [https://s3.eu-central-1.amazonaws.com/chanjo/ccds15.grch37p13.extended.bed.zip]
2018-08-31 11:38:50 cybertron chanjo.init.bootstrap[10816] INFO extracting BED file...
2018-08-31 11:38:50 cybertron chanjo.init.bootstrap[10816] INFO removing BED archive...
2018-08-31 11:38:50 cybertron chanjo.init.cli[10816] INFO configure new chanjo database: sqlite:////data/coverage180830/chanjo.coverage.sqlite3
2018-08-31 11:38:51 cybertron chanjo.store.api[10816] INFO created tables: sample, transcript, transcript_stat
2018-08-31 11:38:51 cybertron chanjo.init.cli[10816] INFO writing config file: /data/coverage180830/chanjo.yaml
Chanjo bootstrap successful! Now run: 
chanjo --config /data/coverage180830/ccds.15.grch37p13.extended.bed
(chanjo) coverage180830 ➤ chanjo --config /data/coverage180830/ccds.15.grch37p13.extended.bed                                                                             
adding transcripts  [------------------------------------]    0%
Traceback (most recent call last):
  File "/home/user/bin/opt/anaconda/envs/chanjo/bin/chanjo", line 11, in <module>
    sys.exit(root())
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 1066, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/chanjo/load/cli.py", line 60, in link
    for tx_model in bar:
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/click/_termui_impl.py", line 259, in next
    rv = next(self.iter)
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/chanjo/load/link.py", line 24, in <genexpr>
    models = (make_model(tx_id, exons) for tx_id, exons in transcripts.items())
  File "/home/user/bin/opt/anaconda/envs/chanjo/lib/python3.5/site-packages/chanjo/load/link.py", line 40, in make_model
    gene_id = int(exons[0]['elements'][transcript_id]['gene_id'])
ValueError: invalid literal for int() with base 10: '-'

This is because the current version is intended to work with the hgnc.bed file, which has a different format from ccds.15.grch37p13.extended.bed.

head ccds.15.grch37p13.extended.bed                                                                                                                                                                                                                                                               
X       100075402       100075464       X-100075405-100075462   0       -       CCDS14473.1     CSTF2
X       100076512       100076595       X-100076515-100076593   0       -       CCDS14473.1     CSTF2
X       100077236       100077410       X-100077239-100077408   0       -       CCDS14473.1     CSTF2
X       100078277       100078418       X-100078280-100078416   0       -       CCDS14473.1     CSTF2
X       100078871       100078995       X-100078874-100078993   0       -       CCDS14473.1     CSTF2
X       100079105       100079247       X-100079108-100079245   0       -       CCDS14473.1     CSTF2
X       100081619       100081747       X-100081622-100081745   0       -       CCDS14473.1     CSTF2
X       100083025       100083092       X-100083028-100083090   0       -       CCDS14473.1     CSTF2
X       100086500       100086646       X-100086503-100086644   0       -       CCDS14473.1     CSTF2
X       100087719       100087899       X-100087722-100087897   0       -       CCDS14473.1     CSTF2

And here is a sample of the HGNC BED file: https://github.com/Clinical-Genomics/chanjo/blob/master/chanjo/init/demo-files/hgnc.min.bed

1	955550	955755	1-955552-955753	NM_198576	329	AGRN
1	957579	957844	1-957581-957842	NM_198576	329	AGRN
1	970655	970706	1-970657-970704	NM_198576	329	AGRN
1	976043	976262	1-976045-976260	NM_198576	329	AGRN
1	976551	976779	1-976553-976777	NM_198576	329	AGRN
1	976856	977084	1-976858-977082	NM_198576	329	AGRN
1	977334	977544	1-977336-977542	NM_198576	329	AGRN
1	978617	978839	1-978619-978837	NM_198576	329	AGRN
1	978916	979114	1-978918-979112	NM_198576	329	AGRN
1	979201	979405	1-979203-979403	NM_198576	329	AGRN
1	979487	979639	1-979489-979637	NM_198576	329	AGRN
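
To illustrate the mismatch, here is a minimal sketch of how the same column position gives an integer for the HGNC layout and the failing '-' for the CCDS layout. It assumes the loader takes the sixth column as the numeric gene ID, which is my reading of the traceback and the two samples, not something confirmed from chanjo's source. The function `gene_id_from_sixth_column` is hypothetical and is not part of chanjo.

```python
# Hypothetical sketch: this is NOT chanjo's parser, only an illustration
# of the column mismatch between the two BED layouts shown in this issue.

# First line of each sample, copied from the excerpts in this issue.
CCDS_LINE = "X\t100075402\t100075464\tX-100075405-100075462\t0\t-\tCCDS14473.1\tCSTF2"
HGNC_LINE = "1\t955550\t955755\t1-955552-955753\tNM_198576\t329\tAGRN"


def gene_id_from_sixth_column(line):
    """Read the sixth field as an integer gene ID (assumed layout)."""
    fields = line.split()
    return int(fields[5])


# HGNC layout: the sixth field is "329", so the conversion works.
print(gene_id_from_sixth_column(HGNC_LINE))

# CCDS layout: the sixth field is "-", so int() raises ValueError,
# with the same message as at the end of the traceback above.
try:
    gene_id_from_sixth_column(CCDS_LINE)
except ValueError as error:
    print(error)
```

In the CCDS file the fifth and sixth fields are 0 and -, which pushes the transcript ID and gene symbol two columns to the right compared with the HGNC file.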

So, if HGNC.bed is the preferred format, please make the init process download HGNC.bed instead.

biocyberman, Aug 31 '18 09:08