ChEMBL_Structure_Pipeline icon indicating copy to clipboard operation
ChEMBL_Structure_Pipeline copied to clipboard

Fix installation conflict and add RDKit-pypi as a versioned requirement

Open djm-healx opened this issue 2 years ago • 3 comments

This PR fixes some circular import that is encountered during setup/installation, where the version of the structure pipeline is imported from module code. Now it is hard-coded into setup.py together with a pinned version of rdkit.

This was encountered in Python 3.9.8 when running:

pip install -e .
>>...
>>ModuleNotFoundError: No module named 'rdkit'

This was due to the setup script imported the version of this pipeline from chembl_structure_pipeline/__init__.py, which subsequently required the import of rdkit for the other code in __init__.py.

Also added rdkit-pypi as an optional extra dependency in setup.py, since I see in the CI script for this repo that conda is used . This should make the PR less likely to cause issues, and enables pip to handle versioning, if installed as pip install -e ".[rdkit]"

The version of rdkit-pypi is pinned to lie between 2019.03.1 (as specified in the module code) and 2021.3.5.1 (which is the last non-failing version where the tests in #39 still pass - see not below).

djm-healx avatar Apr 12 '22 19:04 djm-healx

These changes seem to make Python 3.9.8 happy with installation and being able to run tests.

However, unittest fails on one of the tests, which feels separate to this PR:

python -m unittest discover chembl_structure_pipeline

FAIL: testDoubleBondStereoProblem1 (test.test_standardizer.TestCase)
a problem found in testing
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/ChEMBL_Structure_Pipeline/chembl_structure_pipeline/test/test_standardizer.py", line 1655, in testDoubleBondStereoProblem1
    self.assertNotEqual(omolb.find('13 12  1  6'), -1)
AssertionError: -1 == -1

I'll raise an issue

djm-healx avatar Apr 12 '22 20:04 djm-healx

Hi Dan, many thanks for opening the issue and the pull request.

There are 3 issues in here:

Adding rdkit-pypi as an extras_require: That's a good idea but what it doesn't look that nice is the need of changing the version in 2 files instad of one. We were using option 6 from here which was simple and worked for distributing the package with conda but it doesn't work with the rdkit-pypi package. We could switch to option 1 adding the setup.cfg file with the following code

[metadata]
version = attr: chembl_structure_pipeline.__version__

Tests: You can run them within the project directory running python -m unittest discover or nosetests --with-doctest like we do in the CI or running python -m unittest discover chembl_structure_pipeline if you're ouside the project directory. Relative paths should be fine.

issue https://github.com/chembl/ChEMBL_Structure_Pipeline/issues/39. Which I've just confirmed and it will need some work.

eloyfelix avatar Apr 13 '22 17:04 eloyfelix

Many thanks for your prompt reply @eloyfelix - much appreciated. No problem - happy to help.

I've added a Makefile for the test command, in case this is useful in the future.

After implementing your suggestion of setup.cfg, I thought this could potentially be taken one step further by adding bump2version for automatic version increments (so it could be that make increment-minor is used after changes are made in the future). More than happy to roll back this part though if preferred, as it is perhaps unnecessary.

djm-healx avatar Apr 13 '22 20:04 djm-healx

I'm closing because I just merged: https://github.com/chembl/ChEMBL_Structure_Pipeline/pull/42 but many thanks for reporting the issues and thank you for the patience. I'll soon make a build of the package so it can be pip installed.

eloyfelix avatar Nov 04 '22 16:11 eloyfelix