
Extracting Reference profile

Open RolantusdataExp opened this issue 2 years ago • 5 comments

Hi developers, Thanks for a great tool! I was wondering if you could extract the Reference Profile prior to the actual deconvolution. I am currently trying to make the deconvolution work, however, it fails to estimate certain cell types (probably due to an insufficient estimation of marker genes). Can I extract the matrix and change it? Best, Peter

RolantusdataExp • Oct 24 '23 11:10

Hi Peter,

Thanks for using our tool. You can get the selected solution of the optimizer using autogenes.Interface.selection, which should include the selected genes and their mean expression across cell types. You can then manually add your markers to that matrix and run the regression on the new matrix.
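A minimal sketch of the manual step described above, using NumPy and toy data only. It assumes the selected solution is available as a boolean mask over genes and that the reference profile is a cell types × genes matrix of mean expression. The names `centroids`, `selected`, and `extra_markers`, and the ordinary least-squares fit, are illustrative assumptions; they are not AutoGeneS's own API or its regression model.

```python
import numpy as np

# Toy reference profile: mean expression per cell type (rows) and gene (columns).
genes = np.array(["G1", "G2", "G3", "G4", "G5"])
centroids = np.array([
    [10.0, 0.0, 1.0, 5.0, 0.0],   # cell type A
    [0.0, 8.0, 1.0, 5.0, 0.0],    # cell type B
    [0.0, 0.0, 1.0, 5.0, 6.0],    # cell type C
])

# Hypothetical boolean mask standing in for the optimizer's selected solution.
selected = np.array([True, True, False, False, False])

# Manually add known markers that the optimizer did not pick.
extra_markers = ["G5"]
mask = selected | np.isin(genes, extra_markers)

# Edited signature matrix: genes x cell types, restricted to the final gene set.
signature = centroids[:, mask].T

# Toy bulk sample built as 2*A + 1*B + 1*C, restricted to the same genes.
bulk = (2 * centroids[0] + centroids[1] + centroids[2])[mask]

# Ordinary least squares as a stand-in for the regression step.
coef, *_ = np.linalg.lstsq(signature, bulk, rcond=None)
proportions = coef / coef.sum()
```

With only the two originally selected genes, cell type C would have no signal in the signature at all, which mirrors the situation where some cell types cannot be estimated until suitable markers are added.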

Hope this helps. Hana

lila167 • Oct 25 '23 08:10

Love the tool, I just have a follow up question.

I assume that the result from autogenes.Interface.selection is the boolean array equivalent to key_added='autogenes' from autogenes.Interface.select.

However, I'm having trouble finding out which of the selected genes originate from which cell type, as defined earlier in autogenes.Interface.init.

Moreover, is there a way of saving only the selected solution after autogenes.Interface.optimize, rather than all solutions, through Interface.save?

Best, Christian Andersen

Aeget1000 • Oct 27 '23 11:10

Hi Christian,

Thanks for your interest!

I'm not sure I understood your question. Are you looking for the markers per cell type? If so, this is not what autogenes offers. It finds sets of genes that optimize two objectives simultaneously: minimizing the correlation across cell types and maximizing their distance. So autogenes doesn't select a fixed set of markers per cell type, and that's the whole idea. Instead, it is completely flexible to select as many markers per cell type as it wishes, and as long as the gene set satisfies the objectives and constraints, it's a good set.
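To make the two objectives concrete, here is a small NumPy sketch that scores a candidate gene set on a toy cell types × genes matrix. Summing the pairwise Pearson correlations and the pairwise Euclidean distances between cell types is one plausible reading of the description above; whether autogenes computes them in exactly this way is an assumption, and the function name `objectives` is made up for this example.

```python
import numpy as np

def objectives(centroids, mask):
    """Return (correlation, distance) for the genes picked by a boolean mask.

    centroids: array of shape (n_cell_types, n_genes) with mean expression.
    Both values are summed over all pairs of cell types.
    """
    sub = centroids[:, mask]
    upper = np.triu_indices(sub.shape[0], k=1)   # each pair of cell types once
    corr = np.corrcoef(sub)[upper].sum()         # lower is better
    diff = sub[:, None, :] - sub[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))[upper].sum()  # higher is better
    return corr, dist

# Toy data: genes 0-2 differ between cell types, genes 3-5 do not.
centroids = np.array([
    [9.0, 1.0, 1.0, 5.0, 6.0, 7.0],
    [1.0, 9.0, 1.0, 5.0, 6.0, 7.0],
    [1.0, 1.0, 9.0, 5.0, 6.0, 7.0],
])
informative = np.array([True, True, True, False, False, False])
uninformative = ~informative

corr_good, dist_good = objectives(centroids, informative)
corr_bad, dist_bad = objectives(centroids, uninformative)
```

The informative set scores a low summed correlation and a large summed distance, while the uninformative set gives perfectly correlated, indistinguishable cell types. Note that no gene in the good set is assigned to a particular cell type; the set is scored as a whole.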

You can select a solution from the Pareto front using autogenes.Interface.select; autogenes.Interface.selection should then return only the selected solution.
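As an illustration of what picking one solution from a Pareto front can look like, the sketch below chooses the row with the best weighted sum of min-max scaled objectives. The fitness values, the helper name `pick`, the scaling, and the weight convention (negative to minimize, positive to maximize) are assumptions for this example, not a description of how autogenes implements selection.

```python
import numpy as np

# Hypothetical Pareto front: one row per solution, columns = (correlation, distance).
fitness = np.array([
    [0.2, 10.0],
    [0.5, 25.0],
    [0.9, 40.0],
])

def pick(fitness, weights):
    """Index of the solution with the best weighted sum of min-max scaled objectives.

    Assumes every objective takes at least two distinct values on the front.
    """
    low = fitness.min(axis=0)
    span = fitness.max(axis=0) - low
    scaled = (fitness - low) / span
    return int(np.argmax(scaled @ np.asarray(weights, dtype=float)))

lowest_correlation = pick(fitness, (-1, 0))   # negative weight: minimize correlation
largest_distance = pick(fitness, (0, 1))      # positive weight: maximize distance
```

Every row on the front is a valid trade-off; the weights only express which end of the trade-off you prefer.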

I hope these help. Let me know if you have any further questions. Best, Hana

lila167 • Oct 30 '23 10:10

Hi Hana

Thank you for the great answer; it solved my problem. However, I have another question.

I have an AnnData object called data with n_obs × n_vars = 23 × 12997 (cell states × genes). It contains:

  • a boolean array in data.var denoting which genes are HVGs (highly variable genes), with 1718 of the 12997 being True
  • a list of strings containing the cell states in data.obs ["cellstate1","cellstate1","cellstate5","cellstate23" .... ]

I'm struggling to find the optimal settings for the function:

  • Interface.optimize(ngen = 2, mode = 'standard', nfeatures = None, weights = None, objectives = None, seed = 0, verbose = True, **kwargs)

It is my understanding that if I set:

  • ngen=5000, nfeatures=1718-1, population_size=400, offspring_size=300, mode="fixed"

(1) The algorithm will select the 1718 genes in data.var that are marked True.
(2) 400 solution sets will be generated, each of which will include the 1717 selected marker genes. (I subtract 1 from the number of HVGs because the correlation would not go down when I used all of them.)
(3) The minimization of correlation and the maximization of distance then take place.
(4) 300 random solution sets are selected.

Steps 3 and 4 then loop 5000 times.
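For comparison with steps (1) to (4), here is a generic (mu + lambda) evolutionary loop over fixed-size gene subsets, written in plain Python. It is a deliberately simplified, single-objective sketch: the function `evolve`, the swap mutation, and the toy score are assumptions for illustration, and the thread does not state which scheme autogenes uses internally. The parameter names are borrowed from the settings above only to show the role each one could play.

```python
import random

def evolve(n_genes, nfeatures, population_size, offspring_size, ngen, score, seed=0):
    """Generic (mu + lambda) loop over fixed-size gene subsets (single objective)."""
    rng = random.Random(seed)
    # Each individual is a set of exactly `nfeatures` gene indices.
    population = [frozenset(rng.sample(range(n_genes), nfeatures))
                  for _ in range(population_size)]
    for _ in range(ngen):
        offspring = []
        for _ in range(offspring_size):
            parent = rng.choice(population)
            # Mutation: swap one selected gene for one unselected gene.
            removed = rng.choice(sorted(parent))
            added = rng.choice([g for g in range(n_genes) if g not in parent])
            offspring.append((parent - {removed}) | {added})
        # Survivors are the best individuals among parents and offspring.
        population = sorted(population + offspring, key=score, reverse=True)[:population_size]
    return population

# Toy score: prefer low gene indices (a real run would score correlation and distance).
final = evolve(n_genes=20, nfeatures=5, population_size=8, offspring_size=6,
               ngen=30, score=lambda s: -sum(s))
best = final[0]
```

In this sketch, `nfeatures` is the number of genes inside each candidate solution, not the size of the gene pool. If `nfeatures` equalled `n_genes`, every individual would be the same set and there would be nothing to optimize; with `nfeatures = n_genes - 1`, any two individuals differ by at most one gene, so the objectives can barely move.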

Please let me know if I have misunderstood something. I would love to know how to choose these parameters well, both for this data set and in general.

Best, Christian Andersen

Aeget1000 • Nov 08 '23 10:11

Hello, Thank you so much for this tool. Just following up on this thread, could you provide an example of how to use the ag.Interface.selection function? It is unclear to me what parameters to input with this function. Thank you!

soueryw • Jul 29 '24 18:07