Different results

Open xmcr opened this issue 2 years ago • 4 comments

Hello, I used your tool mmvec to analyze the co-occurrence probabilities of microorganisms and metabolites. I used the default parameters of the qiime2 command. I have run it many times, but the rank values are different each time. Why does this happen, and how do I choose among the results? Thank you.

xmcr avatar Dec 07 '22 09:12 xmcr

The ranks should be fairly reproducible. How different are we talking about? Do you have histograms of the ranks across different runs? Did you run the model to convergence?

If the results are that different, I’d stick with the model that has the lowest cross-validation error.
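
One way to put a number on "how different" is to correlate the rank matrices from two runs. The following is a minimal sketch, not part of mmvec: it assumes each run's ranks have already been loaded as a NumPy array with rows and columns in the same order, and it uses synthetic arrays in place of real output. The function name `rank_agreement` is hypothetical.

```python
import numpy as np


def rank_agreement(ranks_a, ranks_b):
    """Pearson correlation between two flattened rank matrices of equal shape."""
    a = np.asarray(ranks_a, dtype=float).ravel()
    b = np.asarray(ranks_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("rank matrices must have the same shape")
    return float(np.corrcoef(a, b)[0, 1])


# Synthetic stand-ins for three runs; these are NOT real mmvec results.
rng = np.random.default_rng(0)
run1 = rng.normal(size=(20, 10))
run2 = run1 + rng.normal(scale=0.1, size=run1.shape)  # a rerun that nearly agrees
run3 = rng.normal(size=(20, 10))                      # a rerun that does not agree

agree_similar = rank_agreement(run1, run2)
agree_unrelated = rank_agreement(run1, run3)
print(f"run1 vs run2: {agree_similar:.2f}")
print(f"run1 vs run3: {agree_unrelated:.2f}")
```

A correlation close to 1 between runs suggests the ranks are reproducible. Values near 0 suggest the runs have not settled on the same solution.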

mortonjt avatar Dec 07 '22 13:12 mortonjt

Mmvec.zip

Thank you for your reply! I have three groups of data, and each group has six samples. I used the qiime command to run mmvec on each group. The attached file is the result of running the same command three times on one group. Although the data and the command are the same, the resulting ranks do not seem to be consistent. Could you help me understand why, and tell me how I should choose? Thank you!

xmcr avatar Dec 08 '22 02:12 xmcr

Hi @xmcr, it doesn't look like any of your models has reached convergence; there aren't any cross-validation results for any of your models. Chances are you are running too few iterations. The default is --p-epochs 100, but you may want to bump this up to something like --p-epochs 5000. If you are impatient, you could also try increasing the learning rate to --p-learning-rate 1e-2.

Regarding how to choose between the models, you want to choose the one that has the best fit (i.e. the most predictive on held-out data). If you had to choose one of those 3 models, Model 1 is the best, since it has the largest Pseudo Q-squared score (Q2=0.056540). But I think you can do much better if you run the model longer.
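
The selection rule above can be sketched in a few lines. This is an illustration only. It assumes the pseudo Q-squared score is computed as one minus the ratio of the model's cross-validation error to that of a null (baseline) model. The helper names are hypothetical, and the scores for Models 2 and 3 are placeholders rather than values from this thread.

```python
def pseudo_q2(model_cv_error, null_cv_error):
    """Assumed definition: Q2 = 1 - (model CV error) / (null model CV error)."""
    if null_cv_error <= 0:
        raise ValueError("null model CV error must be positive")
    return 1.0 - model_cv_error / null_cv_error


def best_model(q2_by_model):
    """Return the name of the model with the largest Q2 score."""
    return max(q2_by_model, key=q2_by_model.get)


q2_scores = {
    "Model 1": 0.056540,  # value quoted in this thread
    "Model 2": 0.03,      # hypothetical placeholder
    "Model 3": 0.01,      # hypothetical placeholder
}
best = best_model(q2_scores)
print(best)
```

Under this definition, a Q2 near zero or below means the model predicts held-out data no better than the baseline, which is another sign that longer training or more samples are needed.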

That being said, I only see 5 samples in your table, so I think you would be hard-pressed to obtain reasonable co-occurrence estimates (with any method for that matter).

mortonjt avatar Dec 08 '22 14:12 mortonjt

Oh, thank you so much for your suggestion. I will try changing the parameter to --p-epochs 5000 and run it several more times. If the results are still not good enough, I may not use mmvec as my main method. What a pity!

xmcr avatar Dec 08 '22 15:12 xmcr