Missing matches from ZSLGenerator
Describe the bug
I noticed that some of the lattice matches I was getting with MPInterfaces' matching algorithm were missing in pymatgen's implementation with the same matching parameters. After a closer look, I narrowed the issue down to the transformation indices generated in ZSLGenerator.generate_sl_transformation_sets(). This happens when int(self.max_area / film_area) (or substrate_area, same issue) rounds down. Since this value is used as the second argument in the range() function, the list comprehension only iterates up to int(self.max_area / film_area) - 1 and as a result, we miss some indices at the upper limit.
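To make the off-by-one concrete, here is a minimal sketch in plain Python. The value of film_area is hypothetical and chosen only for illustration; it is not the actual area from my structures.

```python
import math

max_area = 100.0   # same max_area as in the match parameters
film_area = 31.4   # hypothetical unit-cell area, for illustration only

ratio = max_area / film_area  # roughly 3.18

# int() truncates, and range() excludes its stop value,
# so the largest multiple that still fits is never generated.
truncated = list(range(1, int(ratio)))
# math.ceil() pushes the stop value up by one for non-integer ratios.
ceiled = list(range(1, math.ceil(ratio)))

print(truncated)  # [1, 2]
print(ceiled)     # [1, 2, 3]
print(3 * film_area <= max_area)  # True: the skipped multiple is within max_area
```

With these numbers, the multiple 3 gives a supercell area within max_area, but it only appears once the stop value is rounded up.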
In the specific example I was working on (script below), the matched area is just below my max_area, and this missing index results in no matches. Since int() truncates any fractional part, this can happen for any non-integer ratio, not only when value % 1 < 0.5. Replacing int() with np.ceil() (cast back to int so that range() accepts it) would, and does, solve this issue. Here's a small script to reproduce this:
To Reproduce
Simply run this script, and currently, it will not return any matches. Converting the int() to np.ceil(), we get 4 matches below the max area of 100 A^2.
import numpy as np
from pymatgen.analysis.interfaces import ZSLGenerator
from pymatgen.core.surface import SlabGenerator
from pymatgen.ext.matproj import MPRester

m = MPRester()

match_params = {'max_area': 100,
                'max_angle_tol': 0.02,
                'max_length_tol': 0.05,
                'max_area_ratio_tol': 0.1,
                'bidirectional': False}

film_bulk = m.get_structure_by_material_id('mp-13', conventional_unit_cell=True)
substrate_bulk = m.get_structure_by_material_id('mp-134', conventional_unit_cell=True)

miller_index = (1, 1, 1)
film_slab = SlabGenerator(film_bulk, miller_index, 10, 10).get_slabs()[0]
substrate_slab = SlabGenerator(substrate_bulk, miller_index, 10, 10).get_slabs()[0]

zsl_gen = ZSLGenerator(**match_params)
matches = list(zsl_gen(film_vectors=film_slab.lattice.matrix[:2],
                       substrate_vectors=substrate_slab.lattice.matrix[:2]))


def get_area(vec1, vec2):
    vec3 = np.cross(vec1, vec2)
    return np.sqrt(np.dot(vec3, vec3))


if matches:
    for i, match in enumerate(matches):
        subs_sl_vecs = match.substrate_sl_vectors
        film_sl_vecs = match.film_sl_vectors
        print(f'For match {i}, substrate SC has area {get_area(subs_sl_vecs[0], subs_sl_vecs[1])} '
              f'and film SC has area {get_area(film_sl_vecs[0], film_sl_vecs[1])}')
Expected behavior
There should be a match with roughly 94 A^2 average supercell area with the given match_params.
I can fix this & write some unit tests and do a PR if needed.
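As a rough idea of what the fix and a test could look like, here is a sketch. The function transformation_indices, its use_ceil flag, and the areas are all hypothetical: it is a standalone stand-in for the list comprehension described above, assuming the area-ratio filter compares film_area / substrate_area against j / i. It is not the actual pymatgen code.

```python
import math


def transformation_indices(film_area, substrate_area, max_area,
                           max_area_ratio_tol, use_ceil=True):
    """Hypothetical stand-in for the (i, j) index generation.

    use_ceil=False mimics the int() truncation reported here;
    use_ceil=True mimics the proposed ceil-based stop value.
    """
    stop = math.ceil if use_ceil else int
    return [
        (i, j)
        for i in range(1, stop(max_area / film_area))
        for j in range(1, stop(max_area / substrate_area))
        # Assumed filter: keep pairs whose area ratio is within tolerance.
        if abs(film_area / substrate_area - j / i) < max_area_ratio_tol
    ]


# Hypothetical areas, chosen so that three unit cells just fit under max_area.
old = transformation_indices(31.4, 31.0, 100, 0.1, use_ceil=False)
new = transformation_indices(31.4, 31.0, 100, 0.1, use_ceil=True)

print(old)  # [(1, 1), (2, 2)]
print(new)  # [(1, 1), (2, 2), (3, 3)]
```

One caveat with the ceil approach: if the ratio happens to be an exact integer, the stop value is unchanged and the last index is still excluded. Using int(...) + 1 as the stop value might be a more robust alternative, so a unit test should probably cover that case as well.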