brendapy
Better handling of -999 values and ranges
Updated the issue to reflect the discussion below.
Is this related to errors like:
ERROR:root:data could not be converted to float: 2-8 {NADH}
ERROR:root:data could not be converted to float: -999 {more}
# Sometimes also associated with metabolite name
What do -999 for the rate and {more} instead of a metabolite name mean?
When will this be fixed approx?
@Hrovatin
-999 means there is no value reported for the corresponding entry (metabolite concentration, Km, or Vmax). Hope this helps!
Thank you very much. Do you know if any fixes that will solve these errors are planned?
@Hrovatin These are issues with the Brenda entries, i.e. the entries exist in the database but have no numerical values. You can easily filter these out, or I could provide an option to filter them. I will have a look asap, but I am pretty busy today and tomorrow with other things.
I would appreciate it if you made a fix for the following. I think there are 2 cases:
- -999 is, as @DeepaMahm said, a filler for a missing value. So instead of raising an error you could log an info message that no value was found, or something similar.
- a range (e.g. 2-8) is a valid value and I do not want to filter this out. In this case a tuple of 2 floats instead of single float could be returned (or some similar solution to represent the range).
The reason I kept the -999 entries is the associated references, which could provide valuable information to modelers, i.e. here is a publication about the kinetics or parameters of this enzyme. The solution I prefer is to
- clearly indicate these are missing values via NaN, which would also make it very easy to filter these entries
- the tuple idea is great
- I would add an additional type attribute which would be MissingValue, Value, or Range, which would make it easy to distinguish the different types programmatically.
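A rough sketch of how that proposal could look, for discussion only. This is not brendapy's actual code: the names ValueType, ParsedValue and parse_brenda_value are hypothetical, and the input format ("number or range, then an optional {name}") is only inferred from the error messages quoted above.

```python
import logging
import math
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple, Union


class ValueType(Enum):
    """Hypothetical type attribute with the three proposed kinds."""
    MISSING_VALUE = "MissingValue"
    VALUE = "Value"
    RANGE = "Range"


@dataclass
class ParsedValue:
    type: ValueType
    value: Union[float, Tuple[float, float]]  # NaN, a float, or (low, high)
    name: Optional[str] = None                # text found inside {...}, if any


_NUM = r"\d+(?:\.\d+)?"
# Assumed layout: "<low>[-<high>] [{name}]", e.g. "2-8 {NADH}" or "-999 {more}".
_PATTERN = re.compile(
    r"^\s*(?P<low>-?" + _NUM + r")"
    r"(?:\s*-\s*(?P<high>" + _NUM + r"))?"
    r"\s*(?:\{(?P<name>[^}]*)\})?"
)


def parse_brenda_value(raw: str) -> ParsedValue:
    """Parse one raw value string into a typed result instead of failing."""
    match = _PATTERN.match(raw)
    if match is None:
        raise ValueError(f"unrecognized value string: {raw!r}")
    name = match.group("name")
    low = float(match.group("low"))
    high = match.group("high")
    if high is not None:
        # A range such as "2-8" is kept as a tuple of two floats.
        return ParsedValue(ValueType.RANGE, (low, float(high)), name)
    if low == -999:
        # -999 is the filler for "no value reported": log it and return NaN.
        logging.info("no value reported in entry: %r", raw)
        return ParsedValue(ValueType.MISSING_VALUE, math.nan, name)
    return ParsedValue(ValueType.VALUE, low, name)


entries = [parse_brenda_value(s) for s in ("2-8 {NADH}", "-999 {more}", "0.5 {ATP}")]
# Missing entries stay in the list and can be dropped via the type attribute.
reported = [e for e in entries if e.type is not ValueType.MISSING_VALUE]
```

With something like this, the -999 entries are kept (so their references stay available) but are trivial to filter, and ranges are distinguishable from single values without inspecting the value itself.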
That sounds great.
Have these changes been implemented already?