songbird
Examples and/or more precise information on "good enough" data re: model fitting
Talked about this with @cameronmartino. Essentially, the Red Sea dataset used in the README is a really "nice" example of a model fit. This is great, but it can be confusing to people with noisier data, where you will still get some sort of model fit that isn't nearly as nice (e.g. your pseudo-Q2 score is less than 0.73...)
Having more precise information about when you're "done" would be beneficial to users.
It's important to take the Q2 value with a grain of salt: we know that it is using the wrong distance metric. And the stats behind R2 for multinomial regression are still an outstanding problem in the statistical community. I doubt we'd be able to gain traction on that (it'd be a major feat).
So here, any Q2 value above 0 can potentially be considered reasonable.
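To make the "above 0" rule of thumb concrete, here is a rough sketch. It assumes the pseudo-Q2 has the form 1 - (model cross-validation error / baseline cross-validation error), so a positive value just means the model predicts held-out samples better than the baseline does. The function names and the numbers below are hypothetical and are not part of songbird's API.

```python
def pseudo_q2(model_cv_error, baseline_cv_error):
    """Hypothetical helper (not songbird's API).

    Assumes pseudo-Q2 = 1 - model_cv_error / baseline_cv_error, where both
    errors are average cross-validation errors on the same held-out samples.
    """
    if baseline_cv_error <= 0:
        raise ValueError("baseline_cv_error must be positive")
    return 1.0 - model_cv_error / baseline_cv_error


def beats_baseline(q2):
    """Loose check from this thread: anything above 0 is potentially reasonable."""
    return q2 > 0


# Made-up error values, purely for illustration.
for model_err, base_err in [(50.0, 100.0), (100.0, 100.0), (120.0, 100.0)]:
    q2 = pseudo_q2(model_err, base_err)
    print(model_err, base_err, round(q2, 3), beats_baseline(q2))
```

Under that assumption, a noisy dataset with a small positive score is still doing better than the baseline, even if it looks far less clean than the README example.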
For reference, the README section on Q^2 values has been updated to be less strict. I guess we could still add example(s) with less perfect data :), so I'm going to leave this issue open for now.