gnomad_methods
UnboundLocalError: local variable 'error_rate' referenced before assignment
Hello Developers,
I am following the short tutorial from here for ancestry classification. It was working fine with earlier versions of gnomad and hail. I did a fresh install yesterday with updated versions, and now I get the following error. It appears to be a bug.
ht, rf_model = assign_population_pcs(
... ht,
... pc_cols=ht.scores,
... fit=fit,
... )
2022-09-29 14:55:46 Hail: INFO: Coerced sorted dataset
INFO (gnomad.sample_qc.ancestry 224): Found the following sample count after population assignment: sas: 1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/bioinfoRD/ARCdata/Projects_AMT/conda_envs/hail/lib/python3.10/site-packages/gnomad/sample_qc/ancestry.py", line 235, in assign_population_pcs
min_assignment_prob=min_prob, error_rate=error_rate
UnboundLocalError: local variable 'error_rate' referenced before assignment
Edit:
As a debugging step, I added error_rate: float = 0.0 at line 163.
That error went away, but a new one popped up:
File "/bioinfoRD/ARCdata/Projects_AMT/conda_envs/hail/lib/python3.10/site-packages/gnomad/sample_qc/ancestry.py", line 241, in assign_population_pcs
evaluation_sample=hl.literal(list(evaluate_fit.s)).contains(pops_ht.s),
UnboundLocalError: local variable 'evaluate_fit' referenced before assignment
I then ended up commenting out the annotate call at line 250, and it worked.
But I think this needs to be fixed properly.
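For readers unfamiliar with this class of error, here is a minimal sketch of how an UnboundLocalError like the one above can arise. It is not the gnomad_methods source: the function names, the toy return value, and the assumption that error_rate is only assigned on the branch where no fit is passed are all hypothetical, chosen to mirror the symptoms in the traceback.

```python
def assign_labels_buggy(scores, fit=None):
    """Toy stand-in (hypothetical, not the gnomad code) for a function
    that either trains a model or reuses a pre-fitted one."""
    if fit is None:
        # Training branch: the only place error_rate is ever assigned.
        error_rate = 0.25  # placeholder value for illustration
    # If a fit was supplied, error_rate was never bound, so the next
    # line raises UnboundLocalError.
    return {"n_samples": len(scores), "error_rate": error_rate}


def assign_labels_fixed(scores, fit=None):
    """Same toy function, with the variable bound on every code path."""
    error_rate = None  # default when no training/evaluation is done
    if fit is None:
        error_rate = 0.25  # placeholder value for illustration
    return {"n_samples": len(scores), "error_rate": error_rate}
```

Under this assumption, giving the variable a default (as in the error_rate = 0.0 workaround above) only silences the first failure. Any other variable that is set only on the training branch, such as evaluate_fit, will fail the same way until it is also initialised or its use is guarded.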
Hi,
I also just received this error while following the gnomad ancestry inference vignette. I am using the terra-jupyter-hail Docker image us.gcr.io/broad-dsp-gcr-public/terra-jupyter-hail:1.0.20. The release notes mention updating hail to 0.2.98.
Thank you @PoisonAlien and @sabrinacamp2. It looks like this error was fixed with this commit, so it should be resolved in the newest gnomad_methods release. However, as described in this other issue, the code in the blog post won't work as is.
I have put in a PR to modify the new code so the blog post example can work.
If you try this again with the newest version of gnomad_methods (v0.6.3), you will need to modify the code as follows. This:
htt, rf_model = assign_population_pcs(
ht,
pc_cols=ht.scores,
fit=fit,
)
should change to:
htt, rf_model = assign_population_pcs(
ht,
pc_cols=range(num_pcs),
fit=fit,
)
This should be fixed in the next release so the original blog post code can work as expected.
Thank you, Julia
The newest release v0.6.4 has fixed the assign_population_pcs function so that it still works with the code in the blog post. Closing the issue since I think this is now fixed, but please reach out if it's still a problem.
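For anyone pinned to a specific release, the version-dependent advice in this thread can be sketched as a small helper. This is only an illustration: the helper names are hypothetical, the distribution name "gnomad" passed to importlib.metadata is an assumption, and the mapping encodes only what is stated above (v0.6.3 needs pc_cols=range(num_pcs); v0.6.4 accepts pc_cols=ht.scores again). Other versions are reported as unverified rather than guessed.

```python
from importlib import metadata
from typing import Optional


def installed_gnomad_version() -> Optional[str]:
    """Return the installed version string, or None if not installed.
    Assumes the distribution is named "gnomad"."""
    try:
        return metadata.version("gnomad")
    except metadata.PackageNotFoundError:
        return None


def pc_cols_style(version: str) -> str:
    """Map a version string to the pc_cols form discussed in this thread."""
    # Keep only the leading numeric components, e.g. "v0.6.3" -> (0, 6, 3).
    parts = tuple(int(p) for p in version.lstrip("v").split(".")[:3])
    if parts == (0, 6, 3):
        return "range"       # use pc_cols=range(num_pcs)
    if parts >= (0, 6, 4):
        return "scores"      # use pc_cols=ht.scores, as in the blog post
    return "unverified"      # older releases are not covered by this thread
```

For example, pc_cols_style("v0.6.3") returns "range", and pc_cols_style("0.6.4") returns "scores". Releases after v0.6.4 are assumed, not confirmed, to keep the v0.6.4 behaviour.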