
Error in recache_compound.py

Open gandrill opened this issue 5 years ago • 6 comments

Hi Elad,

I tried to use recache_compound.py to add new components that are not in the KEGG database. An error is raised in the chemaxon.py file (see attached screenshot) stating that the value being added to the pKa list cannot be converted into a float. The same error appears when I try to recache known components from the database (C00288 in the example). I don't expect the problem to come from the dependencies, as I have all of them installed. My first guess is that the decimal separator is wrong (a "," where a "." is expected), but I'm not comfortable making this change myself, at the risk of breaking the code.

Would you have any idea?

Léo

[Screenshot: error while recaching]

gandrill avatar Nov 16 '18 10:11 gandrill

Hi Léo,

I think you are right about the separator. Your ChemAxon output format is not supported by the Python float() function. I would suggest that you try to change the settings in your ChemAxon installation, or perhaps change the language settings in your OS to something that uses the standard decimal separator. I don't think that changing the code to support non-standard number formats is the right answer in this case.

eladnoor avatar Nov 18 '18 14:11 eladnoor

It might be fixable by setting the Python locale to the system locale.

Midnighter avatar Nov 18 '18 15:11 Midnighter
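
For illustration, here is a minimal sketch of what locale-aware parsing could look like with the standard library's `locale` module. It does not show how chemaxon.py parses its input, and the helper name `parse_localized_float` is hypothetical. Whether a given locale name (other than "C") is available depends on the operating system.

```python
import locale


def parse_localized_float(text, locale_name=""):
    """Parse a numeric string using the conventions of the given locale.

    Hypothetical helper, not part of component-contribution.
    An empty locale_name selects the user's default (system) locale.
    """
    # Remember the current LC_NUMERIC setting so it can be restored.
    previous = locale.setlocale(locale.LC_NUMERIC)
    try:
        # Raises locale.Error if the requested locale is not installed.
        locale.setlocale(locale.LC_NUMERIC, locale_name)
        # atof() applies the locale's decimal point before calling float().
        return locale.atof(text)
    finally:
        locale.setlocale(locale.LC_NUMERIC, previous)


# The "C" locale always exists and uses "." as the decimal point.
value = parse_localized_float("7.21", "C")
```

Note that `locale.setlocale` changes process-wide state and is not thread-safe, so a sketch like this is best suited to a single-threaded script.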

Thank you both for your answers. Changing the regional settings of my OS to English (US) solved my problem. Using the function replace() to convert "," into "." worked as well. recache_compound.py and compound_cacher now run with the InChIs of unknown components (from kegg_additions.tsv) and write data to the cache. However, when I try to launch an example reaction with a new component (say 91001), the component doesn't seem to be cached and the following message is displayed. Yet in the cache folder, component.json.gz is updated to today's date, which would suggest that the component data has been cached.

I understand this is an entirely different problem. Do you have an idea of where to start?

[Screenshot: not_caching]

gandrill avatar Nov 19 '18 12:11 gandrill
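
The replace() workaround mentioned in the comment above can be sketched roughly as follows. This is an assumption about its general shape, not the actual change made to chemaxon.py; the helper name `parse_decimal` and the sample strings are hypothetical.

```python
def parse_decimal(text):
    """Convert a numeric string to float, accepting ',' or '.' as the
    decimal separator.

    Hypothetical helper. It assumes the input has no thousands
    separators: a string such as "1,234.5" would still fail.
    """
    return float(text.strip().replace(",", "."))


# Made-up sample strings in both notations.
samples = ["7,21", "7.21", " 12,3 "]
values = [parse_decimal(s) for s in samples]
```

A helper like this accepts both notations, which is convenient, but it also hides the locale mismatch rather than fixing it at the source.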

Make sure that you use the ID with the "C" prefix in your reaction formulas (i.e. C91001). You can also try opening the .json.gz file to check whether the compound is indeed cached there (the file is quite large, so this could be challenging if you don't use shell scripts for it). By the way, the name should be "compounds.json.gz", not "component.json.gz".

eladnoor avatar Nov 19 '18 13:11 eladnoor
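
As an alternative to shell scripts, the check could be done with a few lines of Python. The sketch below assumes only that the file is gzip-compressed JSON. Since the internal layout of compounds.json.gz is not described in this thread, it searches every key and value recursively instead of relying on specific field names. The function names are hypothetical.

```python
import gzip
import json


def contains_value(node, target):
    """Recursively search decoded JSON for a key or value equal to target."""
    if isinstance(node, dict):
        # Check the keys first, then descend into the values.
        return target in node or any(
            contains_value(value, target) for value in node.values()
        )
    if isinstance(node, list):
        return any(contains_value(item, target) for item in node)
    return node == target


def is_cached(path, compound_id):
    """Return True if compound_id appears anywhere in the gzipped JSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        data = json.load(handle)
    return contains_value(data, compound_id)
```

For example, `is_cached("compounds.json.gz", "C91001")` would load the whole file into memory, which should be acceptable for a one-off check even if the file is large.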

Thank you for your answer. The compound was indeed in "compounds.json.gz" but somehow couldn't be used in my reactions. I had to run component-contribution-master\setup.py once again (with the new compounds specified in kegg_additions.tsv before launching the installation).

gandrill avatar Nov 26 '18 14:11 gandrill

Hi, I am not sure from your answer whether the issue was solved by reinstalling the package or not. If not, can you try uninstalling the package completely and running it from a local directory, so it will be easier to debug?

eladnoor avatar Nov 26 '18 15:11 eladnoor