migrate to PyMC v4.0

Open AndrewAnnex opened this issue 3 years ago • 5 comments

PyMC v4.0 was released recently, and the migration to it from pymc3 looks fairly easy. Migrating could also reduce gempy's installation issues and make it easier to maintain. A number of speed increases are also reported.

announcement: https://www.pymc.io/blog/v4_announcement.html

migration guide: https://www.pymc-labs.io/blog-posts/the-quickest-migration-guide-ever-from-pymc3-to-pymc-v40/

AndrewAnnex avatar Jun 07 '22 16:06 AndrewAnnex

Hi Andrew,

great suggestion. We're currently preparing the next major iteration of gempy, which won't rely on pymc3 anymore (at least, that's the idea). While the roadmap towards this "gempy3" is taking shape, I cannot give a robust estimate as to when that version will be released. In the meantime, it may be a worthwhile project to try to migrate the current gempy.

Japhiolite avatar Jun 09 '22 11:06 Japhiolite

I think it would be worth making a branch that attempts the migration directly; it could be that a lot of the existing code can be retained.

AndrewAnnex avatar Jun 14 '22 17:06 AndrewAnnex

There's now a branch dev_pymc4 where we can attempt it https://github.com/cgre-aachen/gempy/tree/dev_pymc4

Japhiolite avatar Jun 15 '22 06:06 Japhiolite

@Japhiolite I've made a surprising amount of progress with #706: only 5 tests are failing. Some fail because serialized models still use 'theano_optimizer' in their _options.csv files, and a few models hit a complicated shape error that I don't know how to diagnose.

I think, however, that some work may be needed to ensure the code is running on JAX.

AndrewAnnex avatar Jul 01 '22 22:07 AndrewAnnex

Thanks for the update Andrew. Sounds great, and I'm somewhat baffled that there're only 5 tests failing now :D

Japhiolite avatar Jul 04 '22 11:07 Japhiolite

@AndrewAnnex, @Japhiolite any progress on that here?

AlexanderJuestel avatar Dec 29 '22 18:12 AlexanderJuestel

@AlexanderJuestel See PR #706

The PR is ready but we still have to figure out how to flag this. A proposition I made to @Leguark was to move to GemPy 2.3 with the switch from Theano to Aesara. However, we were also thinking of putting in a config to switch between Theano and Aesara, as the syntax of both is practically the same. We'd just have to use something along the lines of:

if some condition:
    import theano as ta
else:
    import aesara as ta
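A minimal sketch of how such a switch might look if it were driven by a preference list rather than a hard-coded condition. The helper name `import_backend` is hypothetical and not part of GemPy; the only library call used is the standard-library `importlib.import_module`.

```python
import importlib
from types import ModuleType
from typing import Sequence


def import_backend(candidates: Sequence[str]) -> ModuleType:
    """Return the first importable module from `candidates`.

    Hypothetical helper for illustration; not part of GemPy.
    """
    errors = []
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError as exc:
            # Remember why this candidate failed and try the next one.
            errors.append(f"{name}: {exc}")
    raise ImportError("no tensor backend available (" + "; ".join(errors) + ")")


# Possible usage: prefer aesara, fall back to theano if only that is installed.
# ta = import_backend(["aesara", "theano"])
```

Whether the two backends are interchangeable behind a single alias everywhere in the code base would still need to be verified by the test suite.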

Japhiolite avatar Jan 02 '23 08:01 Japhiolite

@AlexanderJuestel @Japhiolite some thoughts:

  1. We need to run some tests to ensure aesara is able to utilize GPUs. I have access to a server with two nvidia hpc gpus so I am happy to run these, but it would be great to see all the tests pass first.
  2. The failing tests are due to the issue I mentioned above about the name of the optimizer class. I propose that instead of just replacing theano with aesara blindly across the whole repo, we go ahead and rename the class to just always be 'optimizer' and not include theano/aesara in the name at all.
  3. I don't favor the conditional import you specify above. Since theano is effectively defunct, I think it just makes it harder to package and ship gempy with no real benefits.
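As a rough illustration of point 2, older serialized models could be handled by mapping the backend-specific name to a neutral one at load time. This is only a sketch: it assumes the options are available as a dictionary keyed by name (the actual layout of the `_options.csv` files is not shown in this thread), and `normalize_option_names`, `LEGACY_OPTIMIZER_KEYS` and the key `'aesara_optimizer'` are hypothetical.

```python
from typing import Any, Dict

# Backend-specific names that older serialized models might carry.
# 'theano_optimizer' is mentioned in this thread; 'aesara_optimizer' is an assumption.
LEGACY_OPTIMIZER_KEYS = ("theano_optimizer", "aesara_optimizer")


def normalize_option_names(options: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `options` with legacy optimizer keys renamed to 'optimizer'."""
    normalized = dict(options)
    for legacy in LEGACY_OPTIMIZER_KEYS:
        if legacy in normalized:
            value = normalized.pop(legacy)
            # Keep an explicit backend-neutral value if one is already present.
            normalized.setdefault("optimizer", value)
    return normalized
```

With a shim like this, existing saved models would not need to be rewritten on disk for the renamed option to be picked up.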

AndrewAnnex avatar Jan 02 '23 20:01 AndrewAnnex

In PR #706 all tests are passing. So you could pick some, modify them for GPU utilization, and run them on the server you mention? Concerning point 2: yes, we should do that. I was hinting at this with the ta acronym, but something like optimizer sounds good to me. With point 3 you're most likely right too, as we're already running into problems with numpy versions due to theano. So it's likely best to leave theano behind with version 2.2.X and start with aesara from version 2.3 onward.

Japhiolite avatar Jan 04 '23 10:01 Japhiolite

@Japhiolite okay sounds good, I should have some time late next week to run initial tests and from there I can work on points 2 & 3

AndrewAnnex avatar Jan 06 '23 00:01 AndrewAnnex

Hi, is there any progress on the switch?

I'd like to experiment a bit with gempy for a company, so Anaconda isn't available. I'm having installation issues with theano and numpy using pip, despite building the older numpy version.

Happy to work with an experimental branch, although the dev_pymc4 branch for example still seems to reference theano?

Michael-P-Crisp avatar Jan 18 '23 00:01 Michael-P-Crisp

Hi Michael,

hopefully in the near future (February). I'm currently a bit occupied with work, but I plan to push PR #706 forward asap. dev_pymc4 is far behind the current main. To test with aesara, try Andrew's fork: https://github.com/AndrewAnnex/gempy/tree/dev_pymc4_aesara

Japhiolite avatar Jan 20 '23 14:01 Japhiolite

Thanks for the info.

I downloaded that branch and initially got an error:

    File "C:\Users\MC\Documents\gempy\lib\site-packages\scipy\spatial\_kdtree.py", line 4, in <module>
        from ._ckdtree import cKDTree, cKDTreeNode
    File "_ckdtree.pyx", line 1, in init scipy.spatial._ckdtree
    ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject

However after updating to the latest numpy version (1.24.1), it now imports without errors. I'll play around with it a bit more next week.

Keep up the good work, it's appreciated!

Michael-P-Crisp avatar Jan 21 '23 00:01 Michael-P-Crisp

Good catch! I assume our numpy version freeze, which became a necessity with Theano, is still active in that branch: "Using theano, GemPy requires numpy version < 1.22.0 as blas_opt_info was deprecated in newer numpy versions." I will remove that from the branch. Thanks, Michael!
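To make a pin like this easier to diagnose, a small check could compare the installed numpy version against the quoted bound. The sketch below is illustrative only: the function names are hypothetical, the bound 1.22.0 is taken from the note quoted above, and pre-release suffixes such as `rc1` are simply ignored.

```python
import re
from typing import Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Extract the leading numeric release, e.g. '1.24.1' -> (1, 24, 1).

    Pads to at least three components so that '1.22' compares like '1.22.0'.
    """
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(part) for part in match.group(0).split("."))
    return parts + (0,) * max(0, 3 - len(parts))


def satisfies_theano_pin(numpy_version: str, upper=(1, 22, 0)) -> bool:
    """True if `numpy_version` is below the upper bound quoted for the Theano build."""
    return parse_release(numpy_version) < upper
```

In practice the argument would be `numpy.__version__`; a result of `False` for 1.24.1 matches the situation described above, where the newer numpy only works once the freeze is lifted.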

Japhiolite avatar Jan 24 '23 16:01 Japhiolite

Happy to help!

Trying to make a model, I've come across a different issue using Python 3.10. Specifically, when running "geo_model = gp.create_model('Model1')", I get a traceback:

    File "C:\Users\CRI92972\Documents\gempy\lib\site-packages\numpy\core\getlimits.py", line 481, in __new__
        dtype = numeric.dtype(type(dtype))
    TypeError: 'NoneType' object is not callable

According to this answer on Stack Overflow, updating Pandas to 1.4.3 will solve the issue, although I see GemPy currently requires Pandas < 1.4. It also suggests downgrading to Python 3.9, which I'll do next, although it takes forever to get anything new installed on these work computers! Otherwise, I'm not seeing anything in the gempy documentation to suggest it wouldn't work in 3.10. https://stackoverflow.com/questions/69998276/how-to-fix-the-numpy-dtype-nonetype-error

Michael-P-Crisp avatar Jan 25 '23 00:01 Michael-P-Crisp

For what it is worth, after making a new environment with pandas 1.3.4, I am now seeing this error in unconformity_model_topo:

```
self =
Lithology ids
  [4. 4. 4. ... 1. 1. 1.]

values = [array([[4.       , 4.       , 4.       , ..., 2.5003058, 2.5003058,
        2.5003058]]), array([[[2.       , 2.     ...        0.67693603]]), array([[1.1507    , 0.        , 0.
  ],
       [0.        , 0.54057858, 0.67693605]]), ...]

    def set_solution_to_topography(self, values: Union[list, np.ndarray]):
        l0, l1 = self.grid.get_grid_args('topography')
>       self.geological_map = np.array(
            [values[0][:, l0: l1], values[4][:, l0: l1].astype(float)])
E       ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

gempy/core/solution.py:180: ValueError
```

so something is going wrong now with the tests, in fact all the tests are failing for me, maybe a numpy version issue?
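The error message says that the two slices passed to `np.array` do not have the same shape. A hedged sketch of a check that would surface the mismatch directly is shown below; `stack_checked` is a hypothetical helper, not GemPy code, and it does not identify why the shapes differ in the failing test.

```python
import numpy as np


def stack_checked(arrays):
    """Stack arrays along a new first axis, failing early with their shapes.

    Hypothetical diagnostic helper; not part of GemPy.
    """
    arrays = [np.asarray(a) for a in arrays]
    shapes = [a.shape for a in arrays]
    if len(set(shapes)) != 1:
        # Report every shape so the mismatching input is obvious.
        raise ValueError(f"cannot stack arrays with differing shapes: {shapes}")
    return np.stack(arrays)
```

Printing the shapes of `values[0][:, l0: l1]` and `values[4][:, l0: l1]` in the failing test would serve the same purpose. Recent numpy releases raise this `ValueError` for ragged input where older ones only emitted a deprecation warning, which is consistent with a version-related cause.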

AndrewAnnex avatar Jan 25 '23 00:01 AndrewAnnex

Just following up from my last post about the "ValueError: numpy.ndarray size changed, may indicate binary incompatibility." error in the current release version of gempy.

Installing via pip with Python 3.10 causes the issue; however, it works with Python 3.9. When I do a separate install via Anaconda, it works fine with Python 3.10.

Perhaps a note can be added somewhere suggesting Python 3.9 is required when installing with pip, until the new Aesara version comes out?

Michael-P-Crisp avatar Jan 27 '23 01:01 Michael-P-Crisp

I'll put that warning in for the time being. I should really allocate some time to get the pandas dependency out of the freezer. I think within February we'll just move to 2.3 using aesara and work from there. As other libraries progress, the current state of GemPy just becomes more and more "fragile".

Japhiolite avatar Feb 02 '23 20:02 Japhiolite

GemPy 2.3 has been released and the issue should have been solved now! Please reopen if you still have issues with the installation.

See https://github.com/cgre-aachen/gempy/releases/tag/v2.3.0 for more information

AlexanderJuestel avatar Jun 20 '23 15:06 AlexanderJuestel