
KeyError when running the FDR step separately with a custom name for the 'n' column

Open • nievergeltlab opened this issue 6 years ago • 1 comment

Hi,

I downloaded and ran MTAG today.

First I calculated the test statistics:

python mtag-master/mtag.py --sumstats eur_dec28_2017_maf01_info6.results_nefff_pos2.mtag,data2_sumstats --snp_name SNP --z_name z --beta_name BETA --se_name SE --n_name N --eaf_name MAF --a1_name A1 --a2_name A2 --chr_name CHR --bpos_name BP --drop_ambig_snps --fdr --skip_mtag --out ./pt_md

This step ran fine. However, when I then tried running the FDR step:

python mtag-master/mtag.py --fdr --skip_mtag --n_approx --out ./pt_md

I got an error:

Traceback (most recent call last):
  File "mtag-master/mtag.py", line 1503, in <module>
    N_mat[:,t] = df_d[t]['n']
  File "/sara/sw/python-2.7.9/lib/python2.7/site-packages/pandas/core/frame.py", line 2059, in __getitem__
    return self._getitem_column(key)
  File "/sara/sw/python-2.7.9/lib/python2.7/site-packages/pandas/core/frame.py", line 2066, in _getitem_column
    return self._get_item_cache(key)
  File "/sara/sw/python-2.7.9/lib/python2.7/site-packages/pandas/core/generic.py", line 1386, in _get_item_cache
    values = self._data.get(item)
  File "/sara/sw/python-2.7.9/lib/python2.7/site-packages/pandas/core/internals.py", line 3543, in get
    loc = self.items.get_loc(item)
  File "/sara/sw/python-2.7.9/lib/python2.7/site-packages/pandas/indexes/base.py", line 2136, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))
  File "pandas/index.pyx", line 132, in pandas.index.IndexEngine.get_loc (pandas/index.c:4433)
  File "pandas/index.pyx", line 154, in pandas.index.IndexEngine.get_loc (pandas/index.c:4279)
  File "pandas/src/hashtable_class_helper.pxi", line 732, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13742)
  File "pandas/src/hashtable_class_helper.pxi", line 740, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:13696)
KeyError: 'n'

Notice that in the first command I passed --n_name N. The error occurred because mtag was looking through the trait_1/trait_2 files for a column called 'n', and there was none because my column was called 'N'.
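To illustrate the mismatch, here is a minimal sketch of the same failure mode. It uses the standard-library csv module instead of MTAG's pandas code, the two-SNP table is made up, and sample_sizes is a hypothetical helper, not a function from mtag.py. The point is only that column labels are matched exactly, case included.

```python
import csv
import io

# Made-up table whose sample-size column is named 'N' (upper case),
# standing in for one of the intermediate trait files.
text = "SNP\tz\tN\nrs1\t1.5\t1000\nrs2\t-0.3\t1200\n"
rows = list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def sample_sizes(rows, n_col):
    """Return the sample-size column, looked up by its exact name."""
    return [int(row[n_col]) for row in rows]


try:
    sample_sizes(rows, "n")       # hard-coded lower-case name
    lookup_failed = False
except KeyError:
    lookup_failed = True          # 'n' and 'N' are different labels

sizes = sample_sizes(rows, "N")   # the name that was actually supplied
```

Looking the column up under the name given on the command line succeeds, while the hard-coded 'n' raises KeyError.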

I worked around the problem by going to line 1503 in mtag.py and changing "N_mat[:,t] = df_d[t]['n']" to "N_mat[:,t] = df_d[t]['N']", after which the program ran successfully.

So I guess I could have avoided this by running the FDR step as part of the initial mtag command, or by modifying the input files. Anyway, I just thought I would document this.

Thanks! Adam

nievergeltlab, Aug 21 '18 20:08

Hi @nievergeltlab ,

Thanks for your feedback. I think a better way to solve the issue you mentioned is to put in a header conformer that allows for customized column names as input, but forces output to have uniform column names. I've updated the program with this feature. Now running MTAG and then FDR right after should work. Please feel free to re-pull the master branch and try this. Let me know if you have any other questions!
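As a rough illustration of the header-conformer idea (a sketch only, not the code that was added to mtag.py): custom input names are mapped to one fixed set of output names, so later steps can always look up the same labels. The function names, the uniform name 'snpid', and the example header are assumptions made for this example. Only the uniform name 'n' is taken from the traceback above.

```python
def build_rename_map(header, user_names):
    """Map each user-supplied column name found in `header` to its uniform name.

    `user_names` maps uniform name -> custom name (hypothetically filled from
    flags such as --n_name and --snp_name).
    """
    rename = {}
    for uniform, custom in user_names.items():
        if custom not in header:
            raise KeyError("column %r not found in header %r" % (custom, header))
        rename[custom] = uniform
    return rename


def conform_header(header, user_names):
    """Return the header with custom names replaced by uniform names."""
    rename = build_rename_map(header, user_names)
    # Columns without a mapping are passed through unchanged.
    return [rename.get(col, col) for col in header]


# Hypothetical example: the user called the columns 'SNP' and 'N'.
user_names = {"snpid": "SNP", "n": "N", "z": "z"}
header = ["SNP", "CHR", "BP", "z", "N"]
conformed = conform_header(header, user_names)
```

With pandas, the same mapping could be passed to DataFrame.rename(columns=...) before the intermediate files are written, so that a later lookup of 'n' succeeds whatever the input column was called.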

Best, Hui

huilisabrina, Aug 24 '18 14:08