Examples and/or more precise information on "good enough" data re: model fitting

Open fedarko opened this issue 5 years ago • 2 comments

Talked about this with @cameronmartino. Essentially, the red sea dataset used in the README is a really "nice" example of model fitting. This is great, but it can be confusing to people with noisier data, where you will still get some sort of model fit, but it isn't nearly as nice (e.g. your pseudo-Q2 score is less than 0.73...).

Having more precise information about when you're "done" would be beneficial to users.

forum xref

fedarko avatar Feb 06 '20 21:02 fedarko

It's important to take the Q2 value with a grain of salt: we know that it is using the wrong distance metric. And the stats behind R2 for multinomial regression are still an outstanding problem in the statistical community. I doubt we'd be able to gain traction on that (it would be a major feat).

So here anything above 0 is potentially considered reasonable.
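
To make the "above 0" rule of thumb concrete, here is a minimal sketch. It assumes the pseudo-Q2 is one minus the ratio of the model's cross-validation error to a baseline (null) model's cross-validation error. That definition, and the names `pseudo_q2` and `beats_baseline`, are assumptions for illustration only and are not part of Songbird's API.

```python
def pseudo_q2(model_cv_error: float, baseline_cv_error: float) -> float:
    """Assumed definition: 1 - (model CV error / baseline CV error)."""
    if baseline_cv_error <= 0:
        raise ValueError("baseline_cv_error must be positive")
    return 1.0 - model_cv_error / baseline_cv_error


def beats_baseline(q2: float) -> bool:
    """Rule of thumb from this thread: anything above 0 may be reasonable."""
    return q2 > 0


# Made-up error values, purely to show the sign of the score.
noisy_fit = pseudo_q2(model_cv_error=90.0, baseline_cv_error=100.0)
overfit = pseudo_q2(model_cv_error=120.0, baseline_cv_error=100.0)
print(beats_baseline(noisy_fit))  # model error is below the baseline error
print(beats_baseline(overfit))    # model error is above the baseline error
```

Under this assumed definition, a positive score only says the model predicts held-out samples better than the baseline does, so a noisy dataset with a small positive score can still be a usable fit.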

mortonjt avatar Feb 07 '20 16:02 mortonjt

For reference, the README section on Q^2 values has been updated to be less strict. I guess we could still add example(s) with less perfect data :), so I'm going to leave this issue open for now.

fedarko avatar Feb 08 '20 00:02 fedarko