kliff icon indicating copy to clipboard operation
kliff copied to clipboard

`VisibleDeprecationWarning` in weight class

Open ipcamit opened this issue 1 year ago • 4 comments

Line 177 in kliff.dataset.weight module is giving the following error in Python > 3.9, and error in Python >3.10

ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

originating here:

    def _compute_weight_one_property(data_norm, property_weight_params, property_type):
        c1, c2 = property_weight_params
->        sigma = np.linalg.norm([c1, (c2 * data_norm)])
        weight = 1 / sigma
...

@yonatank93 I am currently adapting it for new workflow, can you please tell what is the expected behavior of this, so I can modify it accordingly?

Does this captures your intent correctly? :

    c1_arr = np.ones_like(data_norm) * c1
    c2_arr = data_norm * c2
    sigma = np.linalg.norm(np.vstack((c1_arr, c2_arr)), axis=0) 

Both functions give identical results.

But as per the equation defined in MagnitudeInverseWeight here: https://kliff.readthedocs.io/en/latest/modules/dataset.html#magnitude-inverse-weight

I think it should be:

    sigma = np.linalg.norm([c1, c2 * np.linalg.norm(data_norm)]) 

ipcamit avatar Apr 02 '23 04:04 ipcamit

Hi @ipcamit. I think we should modify L177 as

sigma = np.array([np.linalg.norm(c1, c2 * dn) for dn in data_norm])

The equation in the documentation is for per data point. And, for example, if the configuration has 3 atoms in it, the function _compute_weight_one_property will return 3 numbers, one corresponding to each atom.

If we apply this change, I think we also need to apply additional minor change, for example to L146-L148

self._energy_weight = self._compute_weight_one_property(
        [energy_norm], self._weight_params["energy_weight_params"], "energy"
)[0]

and to L165-L167

self._stress_weight = self._compute_weight_one_property(
        [stress_norm], self._weight_params["stress_weight_params"], "stress"
)[0]

Note that I haven't tested these suggestions.

The reason is that for a configuration with N atoms, self._forces_weight needs to be a 1D array with 3N elements, and self._energy_weight and self._stress_weight each needs to be a float.

yonatank93 avatar Apr 03 '23 13:04 yonatank93

@yonatank93 you seem updated the Weight class in the just merged PR. Is the issue reported by @ipcamit resolved there?

mjwen avatar Apr 04 '23 01:04 mjwen

Yes, I did. I think it is resolved.

yonatank93 avatar Apr 04 '23 01:04 yonatank93

Also, we need to update the Weight class such that the input shape of forces_weight and stress_weight do not need to be a float. For example, the force weight need to accommodate for three cases

  • a float. It will be multiple by all (N, 3) force components, where N is the number of atoms
  • 1d array with shape (N,). It will be broadcasted into a shape of (N, 3) and then multiple the forces. Physically, this enables using a different weight for each atom
  • 2d array with shape (N,3). This will be directly multiplied by the forces, enabling the use of a different weight for each individual component

Since @ipcamit is adapting for the new workflow, we do not need to modify it now. But once the new stuff is functioning, we need to get back to this.

mjwen avatar Apr 04 '23 01:04 mjwen