
Missing matches from ZSLGenerator

Open fyalcin opened this issue 3 years ago • 0 comments

Describe the bug

I noticed that some of the lattice matches I was getting with MPInterfaces' matching algorithm were missing from pymatgen's implementation when using the same matching parameters. After a closer look, I narrowed the issue down to the transformation indices generated in ZSLGenerator.generate_sl_transformation_sets(). int(self.max_area / film_area) (or the equivalent with substrate_area) truncates the quotient, and this value is used as the second (stop) argument of range(). Since range() excludes its stop value, the list comprehension only iterates up to int(self.max_area / film_area) - 1, and as a result we miss some indices at the upper limit.
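The off-by-one can be seen in isolation with a minimal sketch. The area value below is made up for illustration and is not taken from the structures in this issue; math.ceil stands in for np.ceil so the snippet needs no third-party packages.

```python
import math

max_area = 100.0  # same max_area as in the match parameters of this issue
cell_area = 24.0  # hypothetical unit-cell area, chosen only for illustration

ratio = max_area / cell_area  # 4.1666..., so a 4x supercell (96 A^2) still fits

# int() truncates to 4, and range() excludes its stop value, so 4 is never tried.
truncated = list(range(1, int(ratio)))
# Rounding up gives a stop value of 5, so 4 is included.
ceiled = list(range(1, math.ceil(ratio)))

print(truncated)  # [1, 2, 3]
print(ceiled)     # [1, 2, 3, 4]

# Edge case: when the quotient is a whole number, rounding up changes nothing.
exact = 100.0 / 25.0  # exactly 4.0
ceiled_exact = list(range(1, math.ceil(exact)))
plus_one_exact = list(range(1, int(exact) + 1))

print(ceiled_exact)    # [1, 2, 3]
print(plus_one_exact)  # [1, 2, 3, 4]
```

Two things to keep in mind if this is carried over to NumPy: np.ceil() returns a float, so it has to be wrapped in int() before it is passed to range(), and whether a supercell with area exactly equal to max_area should count as a match is a design decision that this sketch does not settle.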

In the specific example I was working on (script below), the matched area is just below my max_area, and this missing index results in no matches. Since int() always truncates, the largest index is dropped whenever the quotient is not a whole number, and a simple np.ceil() would (and does) solve this issue. Here's a small script to reproduce this:

To Reproduce

Simply run this script. Currently, it does not return any matches. After replacing int() with np.ceil(), we get 4 matches below the max area of 100 A^2.

import numpy as np
from pymatgen.analysis.interfaces import ZSLGenerator
from pymatgen.core.surface import SlabGenerator
from pymatgen.ext.matproj import MPRester

m = MPRester()

match_params = {'max_area': 100,
                'max_angle_tol': 0.02,
                'max_length_tol': 0.05,
                'max_area_ratio_tol': 0.1,
                'bidirectional': False}

film_bulk = m.get_structure_by_material_id('mp-13', conventional_unit_cell=True)
substrate_bulk = m.get_structure_by_material_id('mp-134', conventional_unit_cell=True)

miller_index = (1, 1, 1)

film_slab = SlabGenerator(film_bulk, miller_index, 10, 10).get_slabs()[0]
substrate_slab = SlabGenerator(substrate_bulk, miller_index, 10, 10).get_slabs()[0]

zsl_gen = ZSLGenerator(**match_params)
matches = list(zsl_gen(film_vectors=film_slab.lattice.matrix[:2],
                       substrate_vectors=substrate_slab.lattice.matrix[:2]))

def get_area(vec1, vec2):
    vec3 = np.cross(vec1, vec2)
    return np.sqrt(np.dot(vec3, vec3))

# Report the supercell areas of each match, or state explicitly that none were found.
if matches:
    for i, match in enumerate(matches):
        subs_sl_vecs = match.substrate_sl_vectors
        film_sl_vecs = match.film_sl_vectors
        print(f'For match {i}, substrate SC has area {get_area(subs_sl_vecs[0], subs_sl_vecs[1])} '
              f'and film SC has area {get_area(film_sl_vecs[0], film_sl_vecs[1])}')
else:
    print('No matches found')

Expected behavior

There should be a match with roughly 94 A^2 average supercell area with the given match_params.
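To illustrate how a single missing index can remove every match, here is a small standalone sketch. It is not pymatgen's code: index_pairs() and its stop parameter are hypothetical, the area-ratio filter is an assumption about the general shape of such a check, and the two cell areas are invented so that the only acceptable pair sits at the upper limit.

```python
import math


def index_pairs(film_area, substrate_area, max_area, ratio_tol, stop):
    """Return (i, j) multipliers whose supercell areas agree within ratio_tol.

    `stop` converts max_area / cell_area into the stop argument of range().
    Hypothetical helper for illustration only.
    """
    i_stop = stop(max_area / film_area)
    j_stop = stop(max_area / substrate_area)
    return [
        (i, j)
        for i in range(1, i_stop)
        for j in range(1, j_stop)
        # i film cells and j substrate cells should cover about the same area.
        if abs(film_area / substrate_area - j / i) < ratio_tol
    ]


# Invented areas: 3 film cells = 93 A^2 and 4 substrate cells = 94 A^2.
film_area, substrate_area = 31.0, 23.5

with_int = index_pairs(film_area, substrate_area, 100.0, 0.1, int)
with_ceil = index_pairs(film_area, substrate_area, 100.0, 0.1, math.ceil)

print(with_int)   # []
print(with_ceil)  # [(3, 4)]
```

With truncation, i stops at 2 and j stops at 3, so the pair (3, 4) is never generated even though both supercells are below the 100 A^2 limit.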

I can fix this & write some unit tests and do a PR if needed.

fyalcin, Mar 09 '22 14:03