Most fields not cleaned from '<' characters

Open ronny-rentner opened this issue 10 months ago • 1 comment

I've been using the MRZ() class to parse a German passport MRZ string. It gives me 'D<<' as the country and nationality. Is it intentional to keep those filler characters?

I've noticed you're stripping the '<' characters from name and surname, but not from most other fields.

As a side note, self.names.replace('<', ' ').strip() might be replaced with self.names.rstrip('<').
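
One caveat on that swap: the two expressions only give the same result when all the fillers sit at the end of the value. A quick sketch with a made-up names value (not taken from a real document):

```python
# Hypothetical raw value of a names field: fillers between and after the given names.
names = "ANNA<MARIA<<<<<<"

# Current approach: every filler becomes a space, then outer whitespace is trimmed.
via_replace = names.replace("<", " ").strip()

# Suggested approach: only the trailing fillers are removed.
via_rstrip = names.rstrip("<")

print(repr(via_replace))  # 'ANNA MARIA'
print(repr(via_rstrip))   # 'ANNA<MARIA'
```

So rstrip('<') suits fields where fillers only pad the end, while a names value with several parts would still need the inner '<' turned into spaces.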

I am using such a _clean() function now:

    def _clean(self, attribute_list=None):
        """Strips trailing '<' characters from specified attributes."""
        if attribute_list is None:
            attribute_list = ['type', 'country', 'number', 'optional1', 'nationality', 'optional2', 'personal_number']
        for attr_name in attribute_list:
            if hasattr(self, attr_name):
                current_value = getattr(self, attr_name)
                if isinstance(current_value, str): # Ensure we are only stripping strings
                    setattr(self, attr_name, current_value.rstrip('<'))
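
To show the effect in isolation, here is a self-contained sketch that uses a stand-in class rather than the library's MRZ class. FakeMRZ and all of its attribute values are made up for illustration:

```python
class FakeMRZ:
    """Hypothetical stand-in for a parsed MRZ; not the library's MRZ class."""

    def __init__(self):
        # Illustrative values only, padded with '<' the way fixed-width fields are.
        self.type = "P<"
        self.country = "D<<"
        self.nationality = "D<<"
        self.valid_score = 100  # non-string attribute, must be left untouched

    def _clean(self, attribute_list=None):
        """Strips trailing '<' characters from specified attributes."""
        if attribute_list is None:
            attribute_list = ['type', 'country', 'number', 'optional1',
                              'nationality', 'optional2', 'personal_number']
        for attr_name in attribute_list:
            # Attributes missing on this object (e.g. 'number') are skipped.
            if hasattr(self, attr_name):
                current_value = getattr(self, attr_name)
                if isinstance(current_value, str):
                    setattr(self, attr_name, current_value.rstrip('<'))


mrz = FakeMRZ()
mrz._clean()
print(mrz.type, mrz.country, mrz.nationality)  # P D D

# Passing a non-string attribute explicitly is harmless: it is left as-is.
mrz._clean(['valid_score'])
print(mrz.valid_score)  # 100
```

Only trailing fillers are removed, so a '<' inside a value would be preserved.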

Feb 28 '25 08:02 ronny-rentner

Yeah, it was intentional. As far as I remember, I was relying on specs which said that those fields must have a fixed number of characters, while the name field could be of arbitrary length.

If that is the wrong treatment, I'm happy to change it, but it would be good to check first what the standard actually says.

Feb 28 '25 16:02 konstantint