[DEP] IndexError when using pandas==2.0.2

Open AlexanderJuestel opened this issue 1 year ago • 9 comments

Describe the bug After updating pandas from 2.0.1 to 2.0.2 using pip, an IndexError is raised. The error already occurs when creating a new model using gp.create_model('Model1').

IndexError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 geo_model = gp.create_model('Model1')
      2 geo_model

File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\gempy_api.py:114, in create_model(project_name)
     99 def create_model(project_name='default_project') -> Project:
    100     """Create a Project object.
    101 
    102     Args:
   (...)
    112         TODO: Adding saving address
    113     """
--> 114     return Project(project_name)

File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\model.py:1628, in Project.__init__(self, project_name)
   1625 def __init__(self, project_name='default_project'):
   1627     self.meta = MetaData(project_name=project_name)
-> 1628     super().__init__()

File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\model.py:85, in ImplicitCoKriging.__init__(self)
     78 self._rescaling = ScalingSystem(self._surface_points, self._orientations,
     79                                 self._grid)
     80 self._additional_data = AdditionalData(self._surface_points,
     81                                        self._orientations, self._grid,
     82                                        self._faults,
     83                                        self._surfaces, self._rescaling)
---> 85 self._interpolator = InterpolatorModel(self._surface_points,
     86                                        self._orientations, self._grid,
     87                                        self._surfaces,
     88                                        self._stack, self._faults,
     89                                        self._additional_data)
     91 self.solutions = Solution(self._grid, self._surfaces, self._stack)
     93 # Previous values of sfai.

File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\interpolator.py:650, in InterpolatorModel.__init__(self, surface_points, orientations, grid, surfaces, series, faults, additional_data, **kwargs)
    646 def __init__(self, surface_points: "SurfacePoints", orientations: "Orientations", grid: "Grid",
    647              surfaces: "Surfaces", series, faults: "Faults", additional_data: "AdditionalData",
    648              **kwargs):
--> 650     super().__init__(surface_points, orientations, grid, surfaces, series, faults,
    651                      additional_data, **kwargs)
    652     self.len_series_i = np.zeros(1)
    653     self.len_series_o = np.zeros(1)

File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\interpolator.py:66, in Interpolator.__init__(self, surface_points, orientations, grid, surfaces, series, faults, additional_data, **kwargs)
     63 self.aesara_graph = self.create_aesara_graph(additional_data, inplace=False)
     64 self.aesara_function = None
---> 66 self._compute_len_series()

File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\interpolator.py:822, in InterpolatorModel._compute_len_series(self)
    817 self.len_series_f = np.atleast_1d(len_series_f_.astype(
    818     'int32'))  # [:self.additional_data.get_additional_data()['values']['Structure', 'number series']]
    820 self._old_len_series = self.len_series_i
--> 822 self.len_series_i = self.len_series_i[non_zero]
    823 self.len_series_o = self.len_series_o[non_zero]
    824 # self.len_series_f = self.len_series_f[non_zero]

IndexError: invalid index to scalar variable.

To Reproduce

Update pandas from 2.0.1 to 2.0.2 and use the latest version of the development branch ...

AlexanderJuestel avatar Jun 17 '23 07:06 AlexanderJuestel

The error remains in a freshly installed environment and with all packages installed manually.

AlexanderJuestel avatar Jun 19 '23 06:06 AlexanderJuestel

What is the workaround for this? I ran into it after having a working install and now can't seem to get around this issue, even by downgrading pandas.

phasyn8 avatar Jul 21 '23 13:07 phasyn8

So, you still have this issue after downgrading pandas? Or does the downgrade not work in your environment? GemPy won't work with pandas 2.0.2, but it does work with 2.0 and 2.0.1.

Japhiolite avatar Jul 24 '23 08:07 Japhiolite

Tracking down the issue further: the function that is broken is _compute_len_series

    def _compute_len_series(self):


        self.len_series_i = self.additional_data.structure_data.df.loc[
                                'values', 'len series surface_points'] - \
                            self.additional_data.structure_data.df.loc[
                                'values', 'number surfaces per series']


        self.len_series_o = self.additional_data.structure_data.df.loc[
            'values', 'len series orientations'].astype(
            'int32')

self.len_series_i and self.len_series_o are both of type np.int32 and both equal to 0 when running the model for the first time

        # Remove series without data
        non_zero_i = self.len_series_i.nonzero()[0]
        non_zero_o = self.len_series_o.nonzero()[0]
        non_zero = np.intersect1d(non_zero_i, non_zero_o)


        self.non_zero = non_zero

non_zero is therefore array([], dtype=int64), an empty array

        self.len_series_u = self.additional_data.kriging_data.df.loc[
            'values', 'drift equations'].astype('int32')
        try:
            len_series_f_ = self.faults.faults_relations_df.values[non_zero][:, non_zero].sum(
                axis=0)


        except np.AxisError:
            print('np.axis error')
            len_series_f_ = self.faults.faults_relations_df.values.sum(axis=0)


        self.len_series_f = np.atleast_1d(len_series_f_.astype(
            'int32'))  # [:self.additional_data.get_additional_data()['values']['Structure', 'number series']]


        self._old_len_series = self.len_series_i


        self.len_series_i = self.len_series_i[non_zero]
        self.len_series_o = self.len_series_o[non_zero]
        # self.len_series_f = self.len_series_f[non_zero]
        self.len_series_u = self.len_series_u[non_zero]

Indexing self.len_series_i (type np.int32) with non_zero results in the error seen in this issue.

Reproducing the error locally: image
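
The same failure can be reproduced without GemPy or pandas. This is a minimal sketch that assumes only NumPy; the variable names mirror the ones in `_compute_len_series`, but the values are hard-coded stand-ins for what the DataFrame lookup returns, and `index_with` is a hypothetical helper used only for this illustration:

```python
import numpy as np

# Stand-in for the value returned by the .loc lookup when the error occurs:
# a NumPy scalar rather than a 1-D array (assumption based on the notes above).
len_series_i = np.int32(0)

# Stand-in for the empty index array stored in `non_zero`.
non_zero = np.array([], dtype=np.int64)


def index_with(values, idx):
    """Return values[idx], or the IndexError raised while indexing."""
    try:
        return values[idx]
    except IndexError as err:
        return err


# Indexing the scalar fails, as in the traceback of this issue.
scalar_result = index_with(len_series_i, non_zero)

# Wrapping the scalar in a 1-D array first makes the same indexing valid.
array_result = index_with(np.atleast_1d(len_series_i), non_zero)

print(type(scalar_result).__name__, array_result.shape)
```

Here the scalar case yields an IndexError, while the wrapped case yields an empty array, which suggests the problem is the scalar type rather than the empty index.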

AlexanderJuestel avatar Aug 12 '23 17:08 AlexanderJuestel


# Index Error raised since pandas==2.0.2
        try:
            self.len_series_i = self.len_series_i[non_zero]
            self.len_series_o = self.len_series_o[non_zero]
            # self.len_series_f = self.len_series_f[non_zero]
            self.len_series_u = self.len_series_u[non_zero]

            if self.len_series_i.shape[0] == 0:
                self.len_series_i = np.zeros(1, dtype=int)
                self._old_len_series = self.len_series_i

            if self.len_series_o.shape[0] == 0:
                self.len_series_o = np.zeros(1, dtype=int)
            if self.len_series_u.shape[0] == 0:
                self.len_series_u = np.zeros(1, dtype=int)
            if self.len_series_f.shape[0] == 0:
                self.len_series_f = np.zeros(1, dtype=int)

        except IndexError:
            self.len_series_i = np.array([self.len_series_i])
            self.len_series_o = np.array([self.len_series_o])
            # self.len_series_f = np.array([self.len_series_f])
            self.len_series_u = np.array([self.len_series_u])
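
An alternative to catching the IndexError might be to normalise the values to 1-D arrays before indexing. The following is only a sketch of that idea, not the patch above: `select_series` is a hypothetical helper that does not exist in GemPy, and it assumes that an empty selection should fall back to a single zero, as the `np.zeros(1, dtype=int)` branches above do.

```python
import numpy as np


def select_series(values, non_zero):
    """Hypothetical helper: pick the entries of `values` listed in `non_zero`.

    Scalars are promoted to 1-D arrays first, so indexing never hits
    "invalid index to scalar variable". An empty selection falls back
    to a single zero entry.
    """
    values = np.atleast_1d(values)
    selected = values[non_zero]
    if selected.shape[0] == 0:
        return np.zeros(1, dtype=int)
    return selected


empty = np.array([], dtype=np.int64)

# Scalar input with nothing selected: falls back to a single zero.
from_scalar = select_series(np.int32(0), empty)

# Regular array input: behaves like plain fancy indexing.
from_array = select_series(np.array([3, 0, 2]), np.array([0, 2]))
```

Whether this matches the intended behaviour for every series configuration has not been checked against GemPy itself.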

AlexanderJuestel avatar Aug 12 '23 17:08 AlexanderJuestel


# Type Error raised since pandas==2.0.2
        try:
            if len(self.kriging_data.df.loc['values', 'drift equations']) < \
                    self.structure_data.df.loc['values', 'number series']:
                self.kriging_data.set_u_grade()
        except TypeError:
            if int(self.kriging_data.df.loc['values', 'drift equations']) < \
                    self.structure_data.df.loc['values', 'number series']:
                self.kriging_data.set_u_grade()
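
The TypeError has the same root cause: `len()` is not defined for a NumPy scalar. A small sketch, assuming only NumPy; `entry_count` is a hypothetical helper and not part of GemPy. Note that it counts entries, whereas the `except` branch above compares the integer value itself, so the two are not interchangeable:

```python
import numpy as np


def entry_count(value):
    """Hypothetical helper: number of entries, counting a scalar as one."""
    return np.atleast_1d(value).shape[0]


# len() on a NumPy scalar raises TypeError, which is what the
# try/except above works around.
try:
    len(np.int32(3))
    scalar_has_len = True
except TypeError:
    scalar_has_len = False

count_scalar = entry_count(np.int32(3))
count_array = entry_count(np.array([3, 3]))
```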

AlexanderJuestel avatar Aug 12 '23 18:08 AlexanderJuestel

There seems to be a change in data type from pandas 2.0.1 to 2.0.2. So we need to track down the location where the DataFrame is constructed.

image

image

This can be traced to the following value:

image

image

image

image

Now I just need to find the place where the values are assigned to the DataFrame....

AlexanderJuestel avatar Aug 12 '23 20:08 AlexanderJuestel

Opened an issue in the pandas repo: https://github.com/pandas-dev/pandas/issues/54519

AlexanderJuestel avatar Aug 13 '23 14:08 AlexanderJuestel

As a fix or workaround, it might already work to put the int representing the geo_model object into a numpy.ndarray.

Japhiolite avatar Aug 14 '23 09:08 Japhiolite

GemPy v3 does not depend on pandas anymore

Leguark avatar Apr 16 '24 12:04 Leguark