
UnboundLocalError: local variable 'error_rate' referenced before assignment

Open PoisonAlien opened this issue 3 years ago • 1 comments

Hello Developers,

I am following the short tutorial from here for ancestry classification. It worked fine with earlier versions of gnomad and hail. I did a fresh install yesterday with updated versions, and now I get the following error. It appears to be a bug.

ht, rf_model = assign_population_pcs(
...     ht,
...     pc_cols=ht.scores,
...     fit=fit,
... )

2022-09-29 14:55:46 Hail: INFO: Coerced sorted dataset              (0 + 1) / 1]
INFO (gnomad.sample_qc.ancestry 224): Found the following sample count after population assignment: sas: 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/bioinfoRD/ARCdata/Projects_AMT/conda_envs/hail/lib/python3.10/site-packages/gnomad/sample_qc/ancestry.py", line 235, in assign_population_pcs
    min_assignment_prob=min_prob, error_rate=error_rate
UnboundLocalError: local variable 'error_rate' referenced before assignment

Edit: Just for debugging, I added error_rate: float = 0.0 at line 163. The above error went away, but a new error popped up:

File "/bioinfoRD/ARCdata/Projects_AMT/conda_envs/hail/lib/python3.10/site-packages/gnomad/sample_qc/ancestry.py", line 241, in assign_population_pcs
    evaluation_sample=hl.literal(list(evaluate_fit.s)).contains(pops_ht.s),
UnboundLocalError: local variable 'evaluate_fit' referenced before assignment

Then I ended up commenting out the annotate function at line 250 and it worked.

But I think this needs to be fixed properly.
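For readers unfamiliar with this error class, the traceback above matches a common Python pattern: a local variable is assigned in only one branch and then read unconditionally. The sketch below is a hypothetical stand-in, not the gnomad source; the function names `summarize` and `summarize_fixed` and the placeholder value 0.1 are made up for illustration, and whether `assign_population_pcs` had exactly this shape is an assumption.

```python
from typing import Optional


def summarize(values: list[float], fit: Optional[dict] = None) -> float:
    """Hypothetical stand-in (not gnomad code) that reproduces the error pattern."""
    if fit is None:
        # Only this branch binds error_rate.
        fit = {"mean": sum(values) / len(values)}
        error_rate = 0.1  # placeholder value, purely illustrative
    # If the caller supplied a fit, error_rate was never bound,
    # so reading it here raises UnboundLocalError.
    return error_rate


def summarize_fixed(values: list[float], fit: Optional[dict] = None) -> Optional[float]:
    """Same logic, but the name is bound before branching."""
    error_rate = None  # default used when no new fit is trained
    if fit is None:
        fit = {"mean": sum(values) / len(values)}
        error_rate = 0.1  # placeholder value, purely illustrative
    return error_rate
```

Calling `summarize([1.0, 2.0], fit={"mean": 1.5})` raises `UnboundLocalError`, while `summarize_fixed` with the same arguments returns `None`. Binding a default up front is one way to avoid the error; guarding the later read with the same condition is another.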

PoisonAlien, Sep 30 '22 07:09

Hi,

I also just received this error while following the gnomad ancestry inference vignette. I am using the terra-jupyter-hail docker image us.gcr.io/broad-dsp-gcr-public/terra-jupyter-hail:1.0.20. The release notes mention updating hail to 0.2.98.

sabrinacamp2, Oct 27 '22 16:10

Thank you @PoisonAlien and @sabrinacamp2. It looks like this error was fixed with this commit, so it should be resolved in the newest gnomad_methods release. However, as described in this other issue, the code in the blog post won't work as is.

I have put in a PR to modify the new code so the blog post example can work.

If you try this again with the newest version of gnomad_methods (v0.6.3), you will need to modify the code as follows. This:

htt, rf_model = assign_population_pcs(
    ht,
    pc_cols=ht.scores,
    fit=fit,
)

should change to:

htt, rf_model = assign_population_pcs(
    ht,
    pc_cols=range(num_pcs),
    fit=fit,
)

This should be fixed in the next release so the original blog post code can work as expected.

Thank you, Julia

jkgoodrich, Oct 28 '22 14:10

The newest release v0.6.4 has fixed the assign_population_pcs function so that it still works with the code in the blog post. Closing the issue since I think this is now fixed, but please reach out if it's still a problem.
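To confirm that an environment has the fixed release, a rough check along these lines may help. It assumes the package is installed under the distribution name `gnomad`; the helpers `parse_release`, `has_fix`, and `installed_gnomad_version` are hypothetical names for this sketch, and the parser only reads the leading numeric components, so pre-release suffixes are ignored.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def parse_release(text: str) -> tuple[int, ...]:
    """Turn '0.6.4' or 'v0.6.4' into (0, 6, 4); stop at the first non-numeric part."""
    parts = []
    for piece in text.lstrip("v").split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def has_fix(installed: str, minimum: str = "0.6.4") -> bool:
    """Return True if the installed version is at least the release named in this thread."""
    return parse_release(installed) >= parse_release(minimum)


def installed_gnomad_version() -> Optional[str]:
    """Look up the installed version; 'gnomad' as the distribution name is an assumption."""
    try:
        return version("gnomad")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_gnomad_version()
    if found is None:
        print("gnomad does not appear to be installed in this environment")
    else:
        print(f"gnomad {found}: fix present = {has_fix(found)}")
```

If the check reports an older version, upgrading the package and rerunning the blog post code is the path described in this thread.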

jkgoodrich, Nov 08 '22 15:11