pymc icon indicating copy to clipboard operation
pymc copied to clipboard

Added networkx export functionality

Open jonititan opened this issue 2 years ago β€’ 12 comments

Apologies for delay but finally got around to updating the latest version of model_graph.py to include networkx export function as discussed in 5677 This creates a method for exporting the graph network structure to networkx. This is useful as it allows direct conversion instead of exporting to graphviz, saving to dot file, importing the dotfile via pydot before finally creating the networkx graph.

This is a new method so there should be no breaking changes. Docstrings have all been updated.

The same approach could be used in v4. I did it in v3 because that's what I'm currently using.

Editted This was copy paste error from previous pull request. This current pull request is for v4 main and was tested on v4.0.0

What is this PR about? ...

Checklist

Major / Breaking Changes

  • ...

Bugfixes / New features

  • ...

Docs / Maintenance

  • ...

jonititan avatar Aug 10 '22 13:08 jonititan

Thanks for the follow up.

It seems there are several unrelated whitespace changes. Could you remove them?

The same approach could be used in v4. I did it in v3 because that's what I'm currently using.

The current PR is directed at main, which is V4.

ricardoV94 avatar Aug 10 '22 13:08 ricardoV94

I think that sorted the whitespace issues? Sorry I actually realised just after submitting.

jonititan avatar Aug 10 '22 14:08 jonititan

and yes this is directed at main v4 and was tested on v4.0.0

jonititan avatar Aug 10 '22 14:08 jonititan

(I tagged @ericmjl even though he's busy in case he has wisdom to share here, but no pressure!)

ColCarroll avatar Aug 10 '22 14:08 ColCarroll

i'm not familiar with networkx (and a little surprised it has such a similar api!), but this seems like a nice addition! A few asks, and feel free to push back if these are onerous:

1. Can you post a screenshot of the output of the sample code from `model_to_networkx` in a notebook?

2. can you add a test in `model_to_graphviz.py`?

3. it looks like there are some edits (maybe from an autoformatter?) on unrelated lines. if it isn't terrible, can you remove those?

as a side conversation that maybe you already had -- does networkx know how to directly read in a dot file? would it be possible to have model_to_networkx be (something like) nx.dot_to_nx(model_to_graphviz(model).source)? that would be very attractive, since it would not require extra code or tests here!

  1. yes I'll do that shortly

  2. Did you mean add it to this? test_model_graph.py

  3. I think i fixed the whitespace but happy to do it if not.

networkx plays just fine with dot files https://networkx.org/documentation/stable/reference/drawing.html There are also examples here on plotting using graphviz from networkx

It's just very annoying as I wanted to export directly and manipulate the network graph and add things directly from the pymc model without having to pass the graph to string and back as dot first. I also ran into errors using that method where some links were not retained but this direct method I've added preserves all of that. When I patched it locally back in April it was so I could plot my pymc model interactively and inspect all the node values, distributions, etc Made it easier to show others. Please see attached. networkdiagram

it uses d3graph

jonititan avatar Aug 10 '22 15:08 jonititan

Here's a screenshot. I saved my code as alt_model_graph.py and you can see where I import it. The model is just a draft I was messing around with actionscreenshot .

jonititan avatar Aug 10 '22 15:08 jonititan

Sorry just realised you meant the code in the docstring. Wait one

jonititan avatar Aug 10 '22 15:08 jonititan

Ok that throws an error I will fix and update the pull request.

jonititan avatar Aug 10 '22 15:08 jonititan

Codecov Report

Merging #6046 (23c9ccc) into main (7a5074e) will decrease coverage by 3.03%. The diff coverage is 18.91%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #6046      +/-   ##
==========================================
- Coverage   89.27%   86.23%   -3.04%     
==========================================
  Files          72       72              
  Lines       12890    12924      +34     
==========================================
- Hits        11507    11145     -362     
- Misses       1383     1779     +396     
Impacted Files Coverage Ξ”
pymc/model_graph.py 78.31% <16.66%> (-17.15%) :arrow_down:
pymc/__init__.py 100.00% <100.00%> (ΓΈ)
pymc/sampling_jax.py 0.00% <0.00%> (-97.06%) :arrow_down:
pymc/distributions/transforms.py 59.63% <0.00%> (-40.37%) :arrow_down:
pymc/distributions/dist_math.py 57.71% <0.00%> (-29.72%) :arrow_down:
pymc/data.py 60.16% <0.00%> (-19.92%) :arrow_down:
pymc/distributions/logprob.py 93.93% <0.00%> (-3.79%) :arrow_down:
pymc/aesaraf.py 88.53% <0.00%> (-3.66%) :arrow_down:
pymc/step_methods/hmc/base_hmc.py 89.76% <0.00%> (-0.79%) :arrow_down:
pymc/parallel_sampling.py 85.47% <0.00%> (-0.34%) :arrow_down:
... and 2 more

codecov[bot] avatar Aug 10 '22 16:08 codecov[bot]

Interestingly I fixed that and found a bug that I didn't understand. In that example there is a self loop obs -> obs However when iterating over the same data networkx is never passed a parent child pair of obs and obs to create that edge. Turns out it was removed from main but isn't in the version I have installed yet(v4.0.0) https://github.com/pymc-devs/pymc/commit/ce07e7066ff6e08c62280974f55ecbd5e15fa6f9

New update fixes the issue and matches behaviour of latest version(no self loop for observed) screenshot2

jonititan avatar Aug 10 '22 17:08 jonititan

OK I tested my fix and attributes aern't preserved for subgraphs correctly as i'm using compose to combine subgraphs and it overwrites attributes by default. https://networkx.org/documentation/stable/reference/algorithms/generated/networkx.algorithms.operators.binary.compose.html?highlight=compose I've got to put the kids to bed but afterwards i'll think of a nice way to sort it that works for all attributes instead of just the ones I see currently.

jonititan avatar Aug 10 '22 17:08 jonititan

Right i've sorted out the attribute behaviour. networkx doesn't have a way to add attributes to a subgraph you either make them graph, node, or edge attributes. I've made them node attributes and every node in a cluster can be identifield by its cluster attribute which storesthe cluster/plate name

jonititan avatar Aug 10 '22 18:08 jonititan

Did you mean add it to this? test_model_graph.py

Yes! apologies. A test or two would be super helpful here.

Right now formatting is causing the test suite to fail -- the docstrings normally need a little more white space than you have here (probably running pre-commit run --all-files after installing https://pre-commit.com/ in your environment will fix it)

ColCarroll avatar Aug 11 '22 18:08 ColCarroll

ok pre-commit has run successfully and i've committed those changes.

I'll have a think re tests

jonititan avatar Aug 11 '22 19:08 jonititan

Right i've added a simple test in to test_model_graph.py. I think i've understood how that test function file works correctly. I extended SeededTest as used for the modelgraph tests but didn't add as many different tests.

How many test types are needed?

jonititan avatar Aug 11 '22 19:08 jonititan

Test looks good -- it looks like it mostly verifies that this works on certain inputs, and I'm ok for now with that (this seems like a cool feature, and if it gets use that reveals bugs, we can fix those and add more tests).

The linting is still broken, and I'll let @OriolAbril or @ricardoV94 approve/merge once that is fixed, but I approve of the functionality and the tests, for what that's worth!

ColCarroll avatar Aug 11 '22 20:08 ColCarroll

I think i've solved all the linting issues now

jonititan avatar Aug 11 '22 21:08 jonititan

:( thanks for bearing with it -- it looks like it is still a little mucked up. Sorry I can't offer more guidance - I haven't been contributing a ton lately!

ColCarroll avatar Aug 12 '22 01:08 ColCarroll

It passed pre-commit though? And it has the docstring spacing?

Or at least it passed on my pc looks like it's failing here.

I'll spend some time on my setup. Maybe I'm introducing formatting issues when I ctrl+c ctrl+v This time I'll make sure to pull direct and then push back to try to avoid those issues.

On Fri, 12 Aug 2022, 2:21 am Colin Carroll, @.***> wrote:

:( thanks for bearing with it -- it looks like it is still a little mucked up. Sorry I can't offer more guidance - I haven't been contributing a ton lately!

β€” Reply to this email directly, view it on GitHub https://github.com/pymc-devs/pymc/pull/6046#issuecomment-1212641532, or unsubscribe https://github.com/notifications/unsubscribe-auth/AB3T4MQDZG32CBDMFPUBKJDVYWRLBANCNFSM56EUI2UQ . You are receiving this because you authored the thread.Message ID: @.***>

jonititan avatar Aug 12 '22 06:08 jonititan

Well I pulled from main, carefully applied my changes, ran pre-commit only on the four files I changed, and then pushed to my patch-2 branch.

pymc_init_.py pymc\tests\test_model_graph.py pymc\model_graph.py pymc\docs\source\api\models.rst

Could you confirm the formatting is ok? Pre-commit seems to still be failing on github but it passed locally on my machine so i don't know whats happening there. Sorry for all this hassle.

jonititan avatar Aug 12 '22 09:08 jonititan

Looks like everything passes in Github except pre-commit itself which gives an opaque error message. Error: The process '/opt/hostedtoolcache/Python/3.10.6/x64/bin/pre-commit' failed with exit code 1

jonititan avatar Aug 12 '22 13:08 jonititan

I took a guess it might be because networkx needs to be added to the requirements so i've done that and when it re runs hopefully that clears it all up.

jonititan avatar Aug 12 '22 14:08 jonititan

Can we re-run the pre-commit action with debug enabled? Or does anyone have an idea what is causing it to fail?

jonititan avatar Aug 12 '22 15:08 jonititan

I am not sure about adding a new dependency. Can we make it optional? You can use pytest.importorskip for the tests

ricardoV94 avatar Aug 12 '22 15:08 ricardoV94

Can we re-run the pre-commit action with debug enabled? Or does anyone have an idea what is causing it to fail?

So see the failure reason click the Details of the failed ❌ pre-commit job.

It failed the black step which normalizes code style. You have two options:

  • Make the same changes it shows in the log by hand,
  • or run the pre-commit locally (recommended).

First pip install pre-commit and second pre-commit run --all to manually run it. This second step makes the necessary changes which you'll have to commit. You can also run pre-commit install (vs. pre-commit uninstall) to set it up to automatically run every time you make/amend a commit.

michaelosthege avatar Aug 12 '22 15:08 michaelosthege

I tried running it twice this time before committing. I push it to my path repo and am running the same tests on it there. If that all works I will report back here for workflows to be retriggered.

jonititan avatar Aug 15 '22 09:08 jonititan

Ok all tests passed on my patch repo so It should surely pass this time here.

jonititan avatar Aug 15 '22 09:08 jonititan

I assume the goal is for the function to be used as pymc.model_to_networkx (which is what is being documented) which won't work yet. For this to work you need to add it to https://github.com/pymc-devs/pymc/blob/main/pymc/init.py#L68

I've made that change you requested @OriolAbril could you confirm please?

jonititan avatar Aug 15 '22 12:08 jonititan

Right i've fixed the trailing whitespace issue and it is passing pre-commit tests again.

jonititan avatar Aug 15 '22 13:08 jonititan

Great. Thanks likewise for bearing with me while I learned on the go.

jonititan avatar Aug 15 '22 17:08 jonititan