ChEMBL_Structure_Pipeline
ChEMBL_Structure_Pipeline copied to clipboard
Fix installation conflict and add RDKit-pypi as a versioned requirement
This PR fixes some circular import that is encountered during setup/installation, where the version of the structure pipeline is imported from module code. Now it is hard-coded into setup.py
together with a pinned version of rdkit
.
This was encountered in Python 3.9.8 when running:
pip install -e .
>>...
>>ModuleNotFoundError: No module named 'rdkit'
This was due to the setup script imported the version of this pipeline from chembl_structure_pipeline/__init__.py
, which subsequently required the import of rdkit
for the other code in __init__.py
.
Also added rdkit-pypi
as an optional extra dependency in setup.py
, since I see in the CI script for this repo that conda
is used . This should make the PR less likely to cause issues, and enables pip
to handle versioning, if installed as pip install -e ".[rdkit]"
The version of rdkit-pypi is pinned to lie between 2019.03.1 (as specified in the module code) and 2021.3.5.1 (which is the last non-failing version where the tests in #39 still pass - see not below).
These changes seem to make Python 3.9.8 happy with installation and being able to run tests.
However, unittest
fails on one of the tests, which feels separate to this PR:
python -m unittest discover chembl_structure_pipeline
FAIL: testDoubleBondStereoProblem1 (test.test_standardizer.TestCase)
a problem found in testing
----------------------------------------------------------------------
Traceback (most recent call last):
File "/ChEMBL_Structure_Pipeline/chembl_structure_pipeline/test/test_standardizer.py", line 1655, in testDoubleBondStereoProblem1
self.assertNotEqual(omolb.find('13 12 1 6'), -1)
AssertionError: -1 == -1
I'll raise an issue
Hi Dan, many thanks for opening the issue and the pull request.
There are 3 issues in here:
Adding rdkit-pypi as an extras_require: That's a good idea but what it doesn't look that nice is the need of changing the version in 2 files instad of one. We were using option 6 from here which was simple and worked for distributing the package with conda but it doesn't work with the rdkit-pypi package. We could switch to option 1 adding the setup.cfg
file with the following code
[metadata]
version = attr: chembl_structure_pipeline.__version__
Tests: You can run them within the project directory running python -m unittest discover
or nosetests --with-doctest
like we do in the CI or running python -m unittest discover chembl_structure_pipeline
if you're ouside the project directory. Relative paths should be fine.
issue https://github.com/chembl/ChEMBL_Structure_Pipeline/issues/39. Which I've just confirmed and it will need some work.
Many thanks for your prompt reply @eloyfelix - much appreciated. No problem - happy to help.
I've added a Makefile for the test command, in case this is useful in the future.
After implementing your suggestion of setup.cfg
, I thought this could potentially be taken one step further by adding bump2version
for automatic version increments (so it could be that make increment-minor
is used after changes are made in the future). More than happy to roll back this part though if preferred, as it is perhaps unnecessary.
I'm closing because I just merged: https://github.com/chembl/ChEMBL_Structure_Pipeline/pull/42 but many thanks for reporting the issues and thank you for the patience. I'll soon make a build of the package so it can be pip installed.