
Better handling of -999 values and ranges

Open matthiaskoenig opened this issue 4 years ago • 8 comments

Updated the issue to reflect the discussion below.

matthiaskoenig avatar Jun 30 '20 16:06 matthiaskoenig

Is this related to errors like:

ERROR:root:data could not be converted to float: 2-8 {NADH}
ERROR:root:data could not be converted to float: -999 {more} # Sometimes also associated with metabolite name

What do -999 for the rate and {more} in place of the metabolite mean?

Approximately when will this be fixed?

Hrovatin avatar Aug 23 '21 17:08 Hrovatin

@Hrovatin

-999 means there is no value reported for the corresponding entry (metabolite concentration, Km, or Vmax). Hope this helps!

DeepaMahm avatar Aug 24 '21 05:08 DeepaMahm

Thank you very much. Do you know if any fixes that will solve these errors are planned?

Hrovatin avatar Aug 24 '21 06:08 Hrovatin

@Hrovatin These are issues with the BRENDA entries, i.e. the entries exist in the database but have no numerical values. You can easily filter these out, or I could provide an option to filter them. I will have a look ASAP, but I am pretty busy today and tomorrow with other things.
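
A minimal sketch of such filtering, assuming the parsed entries are available as a list of dicts with a string `value` field. The record layout and the `drop_missing` helper are hypothetical and not part of brendapy's API:

```python
MISSING = -999.0  # BRENDA placeholder meaning "no value reported"


def drop_missing(records):
    """Return only the records whose 'value' is not the -999 placeholder.

    `records` is assumed to be a list of dicts with a string 'value' key
    (hypothetical layout, for illustration only).
    """
    kept = []
    for rec in records:
        raw = str(rec.get("value", "")).strip()
        try:
            if float(raw) == MISSING:
                continue  # skip entries without a reported value
        except ValueError:
            pass  # non-numeric strings such as ranges ("2-8") are kept
        kept.append(rec)
    return kept


records = [
    {"value": "0.5", "substrate": "NADH"},
    {"value": "-999", "substrate": "more"},
    {"value": "2-8", "substrate": "NADH"},
]
filtered = drop_missing(records)
```

Note that this drops the references attached to the -999 entries as well, so it only suits use cases that need numerical values.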

matthiaskoenig avatar Aug 24 '21 08:08 matthiaskoenig

I would appreciate it if you could fix the following. I think there are two cases:

  • -999 is, as @DeepaMahm said, a filler for a missing value. So instead of logging an error, you could log an info message that no value was found, or something similar.
  • A range (e.g. 2-8) is a valid value, and I do not want to filter it out. In this case a tuple of two floats could be returned instead of a single float (or some similar representation of the range).

Hrovatin avatar Aug 24 '21 08:08 Hrovatin

The reason I kept the -999 entries is the associated references, which could provide valuable information to modelers, i.e. here is a publication about the kinetics or parameters of this enzyme. The solution I prefer is to

  • clearly indicate missing values via NaN, which would also make these entries very easy to filter
  • return ranges as a tuple, as suggested (the tuple idea is great)
  • add a type attribute with the values MissingValue, Value, or Range, which would make it easy to distinguish the different types programmatically.
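
A rough sketch of how these three points could fit together. The names `ValueType`, `ParsedValue`, and `parse_value` are hypothetical and not the existing brendapy API, and the raw value is assumed to be already separated from the `{...}` substrate part:

```python
import math
import re
from dataclasses import dataclass
from enum import Enum
from typing import Tuple, Union


class ValueType(Enum):
    MISSING_VALUE = "MissingValue"
    VALUE = "Value"
    RANGE = "Range"


@dataclass(frozen=True)
class ParsedValue:
    type: ValueType
    value: Union[float, Tuple[float, float]]


# Unsigned number, optionally with decimals and an exponent (e.g. 2, 0.5, 1e-3).
_NUM = r"\d+(?:\.\d+)?(?:[eE][+-]?\d+)?"
# Two such numbers joined by a dash, e.g. "2-8".
_RANGE = re.compile(rf"^({_NUM})\s*-\s*({_NUM})$")


def parse_value(raw: str) -> ParsedValue:
    """Classify a raw value string as missing, single value, or range."""
    text = raw.strip()
    match = _RANGE.match(text)
    if match:
        low, high = float(match.group(1)), float(match.group(2))
        return ParsedValue(ValueType.RANGE, (low, high))
    number = float(text)  # raises ValueError for anything unparseable
    if number == -999.0:
        # -999 is the placeholder for "no value reported"; expose it as NaN.
        return ParsedValue(ValueType.MISSING_VALUE, math.nan)
    return ParsedValue(ValueType.VALUE, number)
```

With this, `parse_value("2-8")` gives a `Range` with the tuple `(2.0, 8.0)`, and `parse_value("-999")` gives a `MissingValue` whose value is NaN, so entries stay in the result together with their references and can be filtered by type.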

matthiaskoenig avatar Aug 24 '21 08:08 matthiaskoenig

That sounds great.

Hrovatin avatar Aug 24 '21 09:08 Hrovatin

Have these changes been implemented already?

Hrovatin avatar Oct 23 '21 09:10 Hrovatin