
dj.Diagram raises ValueError if it was installed from Conda (but works if it was installed from PyPI)

Open felixsmli opened this issue 2 years ago • 10 comments

Bug Report

Description

When trying to plot the schema diagram with dj.Diagram(schema), it raises an exception:

ValueError: Node names and attributes should not contain ":" unless they are quoted with "". For example the string 'attribute:data1' should be written as '"attribute:data1"'. Please refer https://github.com/pydot/pydot/issues/258

Reproducibility

Include:

  • OS: WSL2 running Ubuntu 20.04.5 (can also reproduce this in Ubuntu 22.04 Docker)
  • Python Version: Python 3.10.6
  • MySQL Version: 8.0
  • MySQL Deployment Strategy: remote
  • DataJoint Version: 0.13.7
  • Minimum number of steps to reliably reproduce the issue: here is a Dockerfile for installing DataJoint from Conda:
FROM jupyter/scipy-notebook:ubuntu-22.04
RUN conda install -y datajoint

Use it to run a Jupyter Notebook and try to plot the diagram of a schema with dj.Diagram(schema); it raises the ValueError. I know the diagram function depends on Graphviz, which was installed automatically while installing DataJoint:

conda list | grep graphviz
graphviz                  6.0.1                h5abf519_0    conda-forge

Also I have tried:

  1. Installing Graphviz system-wide with APT (on Ubuntu 22.04 this gives graphviz/now 2.42.2-6 amd64) before running conda install; conda still installs its own Graphviz.
  2. Installing Graphviz 2.42.3 with conda, then installing DataJoint.

Same issue persists.
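
One way to see which Graphviz the environment actually picks up is to look for the dot executable on PATH. This is only a diagnostic sketch: find_graphviz_dot is a hypothetical helper name, not part of DataJoint, and it assumes Graphviz exposes its command-line tool as dot.

```python
import shutil
import subprocess


def find_graphviz_dot():
    """Return (path, version_banner) for the `dot` executable, or (None, None)."""
    path = shutil.which("dot")
    if path is None:
        return None, None
    # `dot -V` prints a version banner; Graphviz normally writes it to stderr.
    result = subprocess.run([path, "-V"], capture_output=True, text=True)
    banner = (result.stderr or result.stdout).strip()
    return path, banner


if __name__ == "__main__":
    path, banner = find_graphviz_dot()
    print("dot executable:", path)
    print("version banner:", banner)
```

Running it inside the conda environment and again outside of it shows whether the conda-provided or the APT-provided Graphviz is the one being found.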

  • Complete error stack as a result of evaluating the above steps
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
File /opt/conda/lib/python3.10/site-packages/IPython/core/formatters.py:342, in BaseFormatter.__call__(self, obj)
    340     method = get_real_method(obj, self.print_method)
    341     if method is not None:
--> 342         return method()
    343     return None
    344 else:

File /opt/conda/lib/python3.10/site-packages/datajoint/diagram.py:440, in Diagram._repr_svg_(self)
    439 def _repr_svg_(self):
--> 440     return self.make_svg()._repr_svg_()

File /opt/conda/lib/python3.10/site-packages/datajoint/diagram.py:428, in Diagram.make_svg(self)
    425 def make_svg(self):
    426     from IPython.display import SVG
--> 428     return SVG(self.make_dot().create_svg())

File /opt/conda/lib/python3.10/site-packages/datajoint/diagram.py:373, in Diagram.make_dot(self)
    310 label_props = {  # http://matplotlib.org/examples/color/named_colors.html
    311     None: dict(
    312         shape="circle",
   (...)
    366     ),
    367 }
    368 node_props = {
    369     node: label_props[d["node_type"]]
    370     for node, d in dict(graph.nodes(data=True)).items()
    371 }
--> 373 dot = nx.drawing.nx_pydot.to_pydot(graph)
    374 for node in dot.get_nodes():
    375     node.set_shape("circle")

File /opt/conda/lib/python3.10/site-packages/networkx/drawing/nx_pydot.py:309, in to_pydot(N)
    298 raise_error = (
    299     _check_colon_quotes(u)
    300     or _check_colon_quotes(v)
   (...)
    306     )
    307 )
    308 if raise_error:
--> 309     raise ValueError(
    310         f'Node names and attributes should not contain ":" unless they are quoted with "".\
    311         For example the string \'attribute:data1\' should be written as \'"attribute:data1"\'.\
    312         Please refer https://github.com/pydot/pydot/issues/258'
    313     )
    314 edge = pydot.Edge(u, v, **str_edgedata)
    315 P.add_edge(edge)

ValueError: Node names and attributes should not contain ":" unless they are quoted with "".                    For example the string 'attribute:data1' should be written as '"attribute:data1"'.                    Please refer https://github.com/pydot/pydot/issues/258

Expected Behavior

The schema diagram can be plotted correctly.

Additional Research and Context

If you install DataJoint with pip install and use the system Graphviz from APT, the diagram function works properly. To verify this, I created two Docker images: the one above installs DataJoint with conda, and the following one installs it with pip. The pip Dockerfile works as expected, but the Conda one does not.

FROM jupyter/scipy-notebook:ubuntu-22.04
USER root
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -yq \
    graphviz \
    && rm -rf /var/lib/apt/lists/*
USER jovyan
RUN python -m pip install --no-cache-dir \
    datajoint

Another difference I noticed is that conda gets DataJoint 0.13.7 while pip gets DataJoint 0.13.8. So I also tried installing DataJoint 0.13.7 with pip, and the diagram still works.
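
To compare the conda and pip environments directly, the installed versions of the packages involved can be listed from Python. A minimal sketch using only the standard library; installed_versions is a hypothetical helper, and the default package list is an assumption based on the modules that appear in the traceback (datajoint, networkx, pydot).

```python
from importlib import metadata


def installed_versions(packages=("datajoint", "networkx", "pydot")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

Running this in both images and diffing the output should show any dependency that differs between them, not just DataJoint itself.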

felixsmli, Nov 11 '22 03:11

I can confirm the error, however it only appeared for me when I created a table with a dependency. Dropping the related table resolved the issue.

datajoint 0.13.7 pyhd8ed1ab_0 conda-forge
graphviz 2.50.0 hdb8b0d4_0

@schema
class Mouse(dj.Manual):
    definition = """
    # Experimental animals
    mouse_id             : int                          # Unique animal ID
    ---
    dob=null             : date                         # date of birth
    sex="unknown"        : enum('M','F','unknown')      # sex
    """

@schema
class Experimenter(dj.Manual):
    definition = """
    # Experimenter
    experimenter_id      : int                          # Unique experimenter ID
    ---
    name=null            : varchar(255)                 # name of experimenter
    sex="unknown"        : enum('M','F','unknown')      # sex
    """

@schema
class Model(dj.Manual):
    definition = """
    # Model info
    model_id      : int                        # Unique model ID
    ---
    name           : varchar(255)              # name of model
    type           : varchar(255)              # model architecture
    training_date  : date                      # date the model was trained
    description    : varchar(255)              # description of the model (optional)
    """

Works as intended. [screenshot of the rendered diagram]

But adding another table (with dependency) results in the same error.

@schema
class Session(dj.Manual):
    definition = """
    #Session
    -> Mouse
    session_id           : int                         # id of experiment
    ---
    session_time         : time                         # time of experiment #todo: change to datetime
    experimenter_id      : int              # id of experimenter, linking to experimenter table
    video_path           : varchar(255)                 # path to video file
    pose_path            : varchar(255)                 # path to pose file
    pose_origin          : varchar(255)              # origin of pose estimation (e.g. SLEAP)
    annotation_path      : varchar(255)                 # path to annotation file
    annotation_origin    : varchar(255)        # origin of annotation files (e.g. BORIS)
    """

ValueError                                Traceback (most recent call last)
File ~\anaconda3\envs\datajoint_test\lib\site-packages\IPython\core\formatters.py:344, in BaseFormatter.__call__(self, obj)
    342     method = get_real_method(obj, self.print_method)
    343     if method is not None:
--> 344         return method()
    345     return None
    346 else:

File ~\anaconda3\envs\datajoint_test\lib\site-packages\datajoint\diagram.py:440, in Diagram._repr_svg_(self)
    439 def _repr_svg_(self):
--> 440     return self.make_svg()._repr_svg_()

File ~\anaconda3\envs\datajoint_test\lib\site-packages\datajoint\diagram.py:428, in Diagram.make_svg(self)
    425 def make_svg(self):
    426     from IPython.display import SVG
--> 428     return SVG(self.make_dot().create_svg())

File ~\anaconda3\envs\datajoint_test\lib\site-packages\datajoint\diagram.py:373, in Diagram.make_dot(self)
    310 label_props = {  # http://matplotlib.org/examples/color/named_colors.html
    311     None: dict(
    312         shape="circle",
   (...)
    366     ),
    367 }
    368 node_props = {
    369     node: label_props[d["node_type"]]
    370     for node, d in dict(graph.nodes(data=True)).items()
    371 }
--> 373 dot = nx.drawing.nx_pydot.to_pydot(graph)
    374 for node in dot.get_nodes():
    375     node.set_shape("circle")

File ~\anaconda3\envs\datajoint_test\lib\site-packages\networkx\drawing\nx_pydot.py:309, in to_pydot(N)
    298 raise_error = (
    299     _check_colon_quotes(u)
    300     or _check_colon_quotes(v)
   (...)
    306     )
    307 )
    308 if raise_error:
--> 309     raise ValueError(
    310         f'Node names and attributes should not contain ":" unless they are quoted with "".\
    311         For example the string \'attribute:data1\' should be written as \'"attribute:data1"\'.\
    312         Please refer https://github.com/pydot/pydot/issues/258'
    313     )
    314 edge = pydot.Edge(u, v, **str_edgedata)
    315 P.add_edge(edge)

ValueError: Node names and attributes should not contain ":" unless they are quoted with "".                    For example the string 'attribute:data1' should be written as '"attribute:data1"'.                    Please refer https://github.com/pydot/pydot/issues/258
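
The rule stated in the error message can be illustrated without networkx installed. The sketch below is a simplified restatement of that rule, not the actual networkx code, and has_unquoted_colon is a hypothetical name. The last two sample strings are assumptions chosen for illustration (a backquoted table name and a stringified dict of the kind that might be attached to an edge once a foreign key exists); it has not been confirmed in this thread that these are the values DataJoint passes in.

```python
def has_unquoted_colon(text):
    """True if `text` contains ":" and is not wrapped in double quotes."""
    quoted = len(text) >= 2 and text.startswith('"') and text.endswith('"')
    return ":" in text and not quoted


# The example from the error message itself.
print(has_unquoted_colon("attribute:data1"))     # True
print(has_unquoted_colon('"attribute:data1"'))   # False

# Hypothetical inputs, for illustration only.
print(has_unquoted_colon("`lab`.`mouse`"))                 # False, no colon
print(has_unquoted_colon(str({"mouse_id": "mouse_id"})))   # True, dict repr has ":"
```

If something like the stringified dict is involved, that would be consistent with the error only showing up once a table with a dependency is added.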

The tables work as intended as far as I was able to check.


Installing datajoint again with pip install datajoint fixes the issue.

(datajoint_test) C:\Users\JSchw\PycharmProjects\Datajoint_test>pip install datajoint
Requirement already satisfied: datajoint in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (0.13.7)
Requirement already satisfied: tqdm in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (4.65.0)
Requirement already satisfied: matplotlib in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (3.4.3)
Requirement already satisfied: otumat in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (0.3.1)
Requirement already satisfied: cryptography in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (3.4.8)
Collecting networkx<=2.6.3
  Downloading networkx-2.6.3-py3-none-any.whl (1.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 6.8 MB/s eta 0:00:00
Requirement already satisfied: pyparsing in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (3.0.9)
Requirement already satisfied: pydot in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.4.2)
Requirement already satisfied: pandas in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.5.3)
Requirement already satisfied: numpy in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.22.3)
Requirement already satisfied: minio>=7.0.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (7.1.14)
Requirement already satisfied: ipython in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (8.12.0)
Requirement already satisfied: urllib3 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.26.15)
Requirement already satisfied: pymysql>=0.7.2 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from datajoint) (1.0.3)
Requirement already satisfied: certifi in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from minio>=7.0.0->datajoint) (2022.12.7)
Requirement already satisfied: cffi>=1.12 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from cryptography->datajoint) (1.15.1)
Requirement already satisfied: stack-data in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.6.2)
Requirement already satisfied: typing-extensions in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (4.5.0)
Requirement already satisfied: matplotlib-inline in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.1.6)
Requirement already satisfied: prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (3.0.38)
Requirement already satisfied: traitlets>=5 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (5.9.0)
Requirement already satisfied: pickleshare in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.7.5)
Requirement already satisfied: backcall in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.2.0)
Requirement already satisfied: decorator in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (5.1.1)
Requirement already satisfied: colorama in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.4.6)
Requirement already satisfied: pygments>=2.4.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (2.15.0)
Requirement already satisfied: jedi>=0.16 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from ipython->datajoint) (0.18.2)
Requirement already satisfied: python-dateutil>=2.7 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from matplotlib->datajoint) (2.8.2)
Requirement already satisfied: cycler>=0.10 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from matplotlib->datajoint) (0.11.0)
Requirement already satisfied: pillow>=6.2.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from matplotlib->datajoint) (9.4.0)
Requirement already satisfied: kiwisolver>=1.0.1 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from matplotlib->datajoint) (1.4.4)
Requirement already satisfied: flask in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from otumat->datajoint) (2.2.3)
Requirement already satisfied: watchdog in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from otumat->datajoint) (3.0.0)
Requirement already satisfied: appdirs in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from otumat->datajoint) (1.4.4)
Requirement already satisfied: pytz>=2020.1 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from pandas->datajoint) (2023.3)
Requirement already satisfied: pycparser in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from cffi>=1.12->cryptography->datajoint) (2.21)
Requirement already satisfied: parso<0.9.0,>=0.8.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from jedi>=0.16->ipython->datajoint) (0.8.3)
Requirement already satisfied: wcwidth in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from prompt-toolkit!=3.0.37,<3.1.0,>=3.0.30->ipython->datajoint) (0.2.6)
Requirement already satisfied: six>=1.5 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from python-dateutil>=2.7->matplotlib->datajoint) (1.16.0)   
Requirement already satisfied: itsdangerous>=2.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (2.1.2)
Requirement already satisfied: click>=8.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (8.1.3)
Requirement already satisfied: Werkzeug>=2.2.2 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (2.2.3)
Requirement already satisfied: Jinja2>=3.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (3.1.2)
Requirement already satisfied: importlib-metadata>=3.6.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from flask->otumat->datajoint) (6.4.1)      
Requirement already satisfied: asttokens>=2.1.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from stack-data->ipython->datajoint) (2.2.1)
Requirement already satisfied: executing>=1.2.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from stack-data->ipython->datajoint) (1.2.0)
Requirement already satisfied: pure-eval in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from stack-data->ipython->datajoint) (0.2.2)
Requirement already satisfied: zipp>=0.5 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from importlib-metadata>=3.6.0->flask->otumat->datajoint) (3.15.0)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\jschw\anaconda3\envs\datajoint_test\lib\site-packages (from Jinja2>=3.0->flask->otumat->datajoint) (2.1.1)
Installing collected packages: networkx
  Attempting uninstall: networkx
    Found existing installation: networkx 3.1
    Uninstalling networkx-3.1:
      Successfully uninstalled networkx-3.1
Successfully installed networkx-2.6.3

Minor Edit:

I did a clean install, this time with only pip install datajoint, and now I am running into issue #1033.

JensBlack, Apr 20 '23 09:04

I'm running into the same issue after upgrading to 0.14.1 with pip3 install --upgrade datajoint, Jens.

troselab-setup, Jul 18 '23 07:07

I ran into the same error. It's related to a breaking change in networkx. On it.

dimitri-yatsenko, Jul 18 '23 12:07

Great! thx!

troselab-setup, Jul 19 '23 07:07

Hi, have there been any solutions for this issue? I just did a fresh install of datajoint on macOS 13.6, and trying out the shapes schema example gives this same error. Note this is with a conda install, not pip. Should I reinstall with pip? Is there an alternative way of plotting the ERD?

noahpettit, Oct 02 '23 19:10

I will prioritize this to fix before our Harvard workshop. See you there. The current fix is to downgrade networkx.

dimitri-yatsenko, Oct 02 '23 22:10

Thanks Dimitri. I was able to fix it by upgrading datajoint to 0.14.1 with pip, instead of the conda-forge version, which appears to be 0.13.7 (thanks to Tobias Rose for the tip). Tested with Python 3.9.18. I wasn't able to test on python>=3.10 because of a separate, unrelated issue. Looking forward to the workshop, see you there!

noahpettit, Oct 03 '23 02:10

> I will prioritize this to fix before our Harvard workshop. See you there. The current fix is to downgrade networkx.

For reference, networkx is already a somewhat sticky transitive dependency due to incompatible version constraints with even fairly old versions of scikit-image. datajoint, as of 0.14.1, still requires networkx<2.6.3, while scikit-image>=0.20 requires networkx>=2.8.
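
The conflict can be made concrete by testing a few candidate versions against both constraints as quoted above. A minimal sketch: the helper names are hypothetical, the version parsing is deliberately naive (plain dotted integers only), and this is not how pip actually resolves dependencies.

```python
def parse(version):
    """Turn '2.6.3' into (2, 6, 3); handles plain dotted integers only."""
    return tuple(int(part) for part in version.split("."))


def allowed_by_datajoint(version):
    # Constraint as stated above: networkx<2.6.3
    return parse(version) < parse("2.6.3")


def allowed_by_scikit_image(version):
    # Constraint as stated above: networkx>=2.8
    return parse(version) >= parse("2.8")


candidates = ["2.5", "2.6.2", "2.6.3", "2.8", "3.1"]
both = [
    v for v in candidates
    if allowed_by_datajoint(v) and allowed_by_scikit_image(v)
]
print(both)  # [] : no candidate satisfies both constraints
```

Since one constraint is an upper bound below the other's lower bound, no networkx release can satisfy both in the same environment.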

simon-ball, Oct 17 '23 09:10

To be explicit, downgrading networkx to 2.6.2 worked in my case
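
A small guard along these lines can flag the problem before calling dj.Diagram. It is a sketch, not an official check: networkx_hint and release_tuple are hypothetical helpers, and the 2.6.3 ceiling is taken from the networkx<=2.6.3 requirement visible in the pip output earlier in this thread.

```python
import re
from importlib import metadata


def release_tuple(version):
    """Leading numeric part of a version string, e.g. '3.1rc0' -> (3, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(p) for p in match.group().split(".")) if match else ()


def networkx_hint(installed=None, ceiling="2.6.3"):
    """Return a warning string if networkx is newer than `ceiling`, else None."""
    if installed is None:
        try:
            installed = metadata.version("networkx")
        except metadata.PackageNotFoundError:
            return "networkx is not installed"
    if release_tuple(installed) > release_tuple(ceiling):
        return (f"networkx {installed} is newer than {ceiling}; "
                f'consider: pip install "networkx<={ceiling}"')
    return None


print(networkx_hint("3.1"))    # warning text suggesting a downgrade
print(networkx_hint("2.6.2"))  # None
```

Calling networkx_hint() with no arguments checks whatever version is installed in the current environment.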

CBroz1, Mar 25 '24 19:03

Yes, we understand this backward incompatibility of networkx. A fix is coming.

dimitri-yatsenko, Mar 25 '24 19:03