gempy
[DEP] IndexError when using pandas==2.0.2
Describe the bug
After updating pandas from 2.0.1 to 2.0.2 using pip, an IndexError is raised as soon as a new model is created with gp.create_model('Model1').
IndexError Traceback (most recent call last)
Cell In[8], line 1
----> 1 geo_model = gp.create_model('Model1')
2 geo_model
File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\gempy_api.py:114, in create_model(project_name)
99 def create_model(project_name='default_project') -> Project:
100 """Create a Project object.
101
102 Args:
(...)
112 TODO: Adding saving address
113 """
--> 114 return Project(project_name)
File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\model.py:1628, in Project.__init__(self, project_name)
1625 def __init__(self, project_name='default_project'):
1627 self.meta = MetaData(project_name=project_name)
-> 1628 super().__init__()
File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\model.py:85, in ImplicitCoKriging.__init__(self)
78 self._rescaling = ScalingSystem(self._surface_points, self._orientations,
79 self._grid)
80 self._additional_data = AdditionalData(self._surface_points,
81 self._orientations, self._grid,
82 self._faults,
83 self._surfaces, self._rescaling)
---> 85 self._interpolator = InterpolatorModel(self._surface_points,
86 self._orientations, self._grid,
87 self._surfaces,
88 self._stack, self._faults,
89 self._additional_data)
91 self.solutions = Solution(self._grid, self._surfaces, self._stack)
93 # Previous values of sfai.
File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\interpolator.py:650, in InterpolatorModel.__init__(self, surface_points, orientations, grid, surfaces, series, faults, additional_data, **kwargs)
646 def __init__(self, surface_points: "SurfacePoints", orientations: "Orientations", grid: "Grid",
647 surfaces: "Surfaces", series, faults: "Faults", additional_data: "AdditionalData",
648 **kwargs):
--> 650 super().__init__(surface_points, orientations, grid, surfaces, series, faults,
651 additional_data, **kwargs)
652 self.len_series_i = np.zeros(1)
653 self.len_series_o = np.zeros(1)
File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\interpolator.py:66, in Interpolator.__init__(self, surface_points, orientations, grid, surfaces, series, faults, additional_data, **kwargs)
63 self.aesara_graph = self.create_aesara_graph(additional_data, inplace=False)
64 self.aesara_function = None
---> 66 self._compute_len_series()
File ~\Documents\02_Weisweiler\00_Geomechanical_Model\02_Python\../../../gempy_v2023.1.0/gempy-gempy_v2023.1.0/gempy-gempy_v2023.1.0\gempy\core\interpolator.py:822, in InterpolatorModel._compute_len_series(self)
817 self.len_series_f = np.atleast_1d(len_series_f_.astype(
818 'int32')) # [:self.additional_data.get_additional_data()['values']['Structure', 'number series']]
820 self._old_len_series = self.len_series_i
--> 822 self.len_series_i = self.len_series_i[non_zero]
823 self.len_series_o = self.len_series_o[non_zero]
824 # self.len_series_f = self.len_series_f[non_zero]
IndexError: invalid index to scalar variable.
To Reproduce
Update pandas from 2.0.1 to 2.0.2 and use the latest version of the development branch ...
Additional context
The error remains in a freshly installed environment and with all packages installed manually.
What is the workaround for this? I ran into it after having a working install, and now I can't seem to get around the issue by downgrading pandas.
So, you still have this issue after downgrading pandas? Or does the downgrade not work in your environment? GemPy won't work with pandas 2.0.2, but it does work with 2.0 and 2.0.1.
Tracking down the issue further: the function that breaks is _compute_len_series.
def _compute_len_series(self):
    self.len_series_i = self.additional_data.structure_data.df.loc[
        'values', 'len series surface_points'] - \
        self.additional_data.structure_data.df.loc[
            'values', 'number surfaces per series']
    self.len_series_o = self.additional_data.structure_data.df.loc[
        'values', 'len series orientations'].astype(
        'int32')
self.len_series_i and self.len_series_o are both of type np.int32 and both equal 0 when the model is run for the first time.
    # Remove series without data
    non_zero_i = self.len_series_i.nonzero()[0]
    non_zero_o = self.len_series_o.nonzero()[0]
    non_zero = np.intersect1d(non_zero_i, non_zero_o)
    self.non_zero = non_zero
non_zero is therefore array([], dtype=int64), an empty array.
    self.len_series_u = self.additional_data.kriging_data.df.loc[
        'values', 'drift equations'].astype('int32')
    try:
        len_series_f_ = self.faults.faults_relations_df.values[non_zero][:, non_zero].sum(
            axis=0)
    except np.AxisError:
        print('np.axis error')
        len_series_f_ = self.faults.faults_relations_df.values.sum(axis=0)
    self.len_series_f = np.atleast_1d(len_series_f_.astype(
        'int32'))  # [:self.additional_data.get_additional_data()['values']['Structure', 'number series']]
    self._old_len_series = self.len_series_i
    self.len_series_i = self.len_series_i[non_zero]
    self.len_series_o = self.len_series_o[non_zero]
    # self.len_series_f = self.len_series_f[non_zero]
    self.len_series_u = self.len_series_u[non_zero]
Indexing self.len_series_i (type np.int32) with non_zero results in the error seen in this issue.
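The failure mode described above can be shown without GemPy. The following is only a minimal sketch that assumes NumPy alone; the two variables are stand-ins for the values reported in this thread (a np.int32 scalar and an empty index array), not values read from the actual DataFrame.

```python
import numpy as np

# Stand-ins for the values described in this thread (assumed, not taken
# from GemPy's structure_data DataFrame).
len_series_i = np.int32(0)               # a NumPy scalar, not a 1-D array
non_zero = np.array([], dtype=np.int64)  # empty index array

# Indexing a NumPy scalar with an index array is not allowed.
try:
    len_series_i[non_zero]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

If the lookup had returned a 1-D array instead of a scalar, the same index would simply produce an empty array, which matches the report that the code worked with pandas 2.0.1.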
Reproducing the error locally:
# Index Error raised since pandas==2.0.2
try:
    self.len_series_i = self.len_series_i[non_zero]
    self.len_series_o = self.len_series_o[non_zero]
    # self.len_series_f = self.len_series_f[non_zero]
    self.len_series_u = self.len_series_u[non_zero]
    if self.len_series_i.shape[0] == 0:
        self.len_series_i = np.zeros(1, dtype=int)
        self._old_len_series = self.len_series_i
    if self.len_series_o.shape[0] == 0:
        self.len_series_o = np.zeros(1, dtype=int)
    if self.len_series_u.shape[0] == 0:
        self.len_series_u = np.zeros(1, dtype=int)
    if self.len_series_f.shape[0] == 0:
        self.len_series_f = np.zeros(1, dtype=int)
except IndexError:
    self.len_series_i = np.array([self.len_series_i])
    self.len_series_o = np.array([self.len_series_o])
    # self.len_series_f = np.array([self.len_series_f])
    self.len_series_u = np.array([self.len_series_u])
# Type Error raised since pandas==2.0.2
try:
    if len(self.kriging_data.df.loc['values', 'drift equations']) < \
            self.structure_data.df.loc['values', 'number series']:
        self.kriging_data.set_u_grade()
except TypeError:
    if int(self.kriging_data.df.loc['values', 'drift equations']) < \
            self.structure_data.df.loc['values', 'number series']:
        self.kriging_data.set_u_grade()
There seems to be a change in data type from pandas 2.0.1 to 2.0.2. So we need to track down the location where the DataFrame is constructed.
This can be traced to the following value:
Now I just need to find the place where the values are assigned to the DataFrame....
Opened an issue in the pandas repo: https://github.com/pandas-dev/pandas/issues/54519
As a fix or workaround, it might already be enough to put the int representing the geo_model object into a numpy.ndarray.
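One way to read that suggestion is to coerce the looked-up value to a 1-D array before it is indexed. The sketch below assumes NumPy only; as_1d_int32 is a hypothetical helper name used for illustration and is not part of GemPy.

```python
import numpy as np


def as_1d_int32(value):
    """Hypothetical helper: coerce a scalar or array-like to a 1-D int32 array."""
    return np.atleast_1d(np.asarray(value, dtype="int32"))


non_zero = np.array([], dtype=np.int64)  # empty index array, as in this thread

# A scalar becomes a length-1 array, so indexing and len() both work.
from_scalar = as_1d_int32(np.int32(0))
filtered = from_scalar[non_zero]

# An input that is already an array keeps its values.
from_array = as_1d_int32(np.array([3, 0, 2]))

print(from_scalar.shape, filtered.shape, len(from_scalar), from_array)
```

Applied to the values read from the DataFrame, a coercion like this would make the result independent of whether pandas returns a scalar or an array, so neither the IndexError nor the TypeError branch above would be needed. Whether it is the right fix for GemPy itself is not confirmed in this thread.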
GemPy v3 does not depend on pandas anymore