pypact Reading values missing "E" character for exponentials

Hello,

I am using FISPACT and noticed that the extract_boundaries_and_values function in gammaspectrum.py, which parses the output, fails on values written in exponential format without the "E" character. Attachment: gammaspectrum.txt

I added a "patch" based on my Python experience, which is of course not optimal.

I hope this is the correct way of contributing. If this is not welcomed, please notify me. If this is welcomed but clearly not optimal, I'll be happy to know a better method.

Best regards

Dec 12 '22 14:12 MrDeoth

Hi, thanks for finding this issue.

Looking at the code, that could indeed be the case, although I have not yet seen a test case showing it. Could you perhaps share the file that is causing the issue, so I could add it as a test case? If the spectrum is sensitive, then perhaps you can mutate the numbers but preserve the failing format.

From your attachment, I assume you're proposing the following fix.

import re

from pypact.output.tags import GAMMA_SPECTRUM_SUB_HEADER
from pypact.util.decorators import freeze_it
from pypact.util.jsonserializable import JSONSerializable

FLOAT_NUMBER = r"[0-9]+(?:\.(?:[0-9]+))?(?:e?(?:[-+]?[0-9]+)?)?"
GAMMA_SPECTRUM_LINE = \
    r"[^(]*\(\s*(?P<lb>{FN})\s*-\s*(?P<ub>{FN})\s*MeV\)\s*(?P<value>{FN})\D*(?P<vr>{FN}).*".format(
        FN=FLOAT_NUMBER,
    )
GAMMA_SPECTRUM_LINE_MATCHER = re.compile(GAMMA_SPECTRUM_LINE, re.IGNORECASE)


@freeze_it
class GammaSpectrum(JSONSerializable):
    """
        The gamma spectrum type from the output
    """

    def __init__(self):
        self.boundaries = []  # TODO dvp: should be numpy arrays (or even better xarrays)
        self.values = []
        self.volumetric_rates = []

    def fispact_deserialize(self, file_record, interval):
        self.__init__()

        lines = file_record[interval]

        def extract_boundaries_and_values(_lines):
            header_found = False
            for line in _lines:
                if not header_found:
                    if GAMMA_SPECTRUM_SUB_HEADER in line:
                        header_found = True
                if header_found:
                    if line.strip() == "":
                        return
                    match = GAMMA_SPECTRUM_LINE_MATCHER.match(line)
                    lower_boundary = float(match.group("lb"))
                    upper_boundary = float(match.group("ub"))
                    value_str = match.group("value")
                    if "E" not in value_str:
                        splitted_value_str = value_str.split("-")
                        splitted_value_str = [splitted_value_str[0], "E-", splitted_value_str[1]]
                        value_str = "".join(splitted_value_str)
                    value = float(value_str)
                    volumetric_rate_str = match.group("vr")
                    if "E" not in volumetric_rate_str:
                        splitted_volumetric_rate_str = volumetric_rate_str.split("-")
                        splitted_volumetric_rate_str = [splitted_volumetric_rate_str[0], "E-", splitted_volumetric_rate_str[1]]
                        volumetric_rate_str = "".join(splitted_volumetric_rate_str)
                    volumetric_rate = float(volumetric_rate_str)
                    yield lower_boundary, upper_boundary, value, volumetric_rate

        boundaries = []
        values = []
        volumetric_rates = []

        for lb, ub, v, vr in extract_boundaries_and_values(lines):
            if not boundaries:
                boundaries.append(lb)
            boundaries.append(ub)
            values.append(v)
            volumetric_rates.append(vr)

        if values:
            self.boundaries = boundaries
            self.values = values
            self.volumetric_rates = volumetric_rates

This could work, but I am now thinking we should probably use the utility function to handle this: https://github.com/fispact/pypact/blob/master/pypact/util/numerical.py#L12
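For illustration, here is a minimal sketch of what such a tolerant conversion could look like. It is not the code behind the link above: parse_fortran_float and _MISSING_E are hypothetical names, and the sketch assumes the only irregularities are a missing exponent letter (as Fortran E-format output can produce when the exponent needs three digits) or a "D" exponent marker.

```python
import re

# Mantissa followed directly by a signed exponent, with no "E"/"D" marker,
# e.g. "1.234-308" or "-2.5+03". (Hypothetical pattern, for illustration only.)
_MISSING_E = re.compile(r"^([-+]?(?:\d+\.?\d*|\.\d+))([-+]\d+)$")


def parse_fortran_float(text):
    """Hypothetical helper (not pypact's get_float): parse a Fortran-formatted real."""
    token = text.strip()
    match = _MISSING_E.match(token)
    if match:
        # Re-insert the exponent marker that the Fortran output dropped.
        token = match.group(1) + "E" + match.group(2)
    # Fortran may also write "D" instead of "E" for double precision.
    token = token.replace("D", "E").replace("d", "e")
    return float(token)


print(parse_fortran_float("1.234-05"))      # 1.234e-05
print(parse_fortran_float("-2.34321-308"))  # -2.34321e-308
print(parse_fortran_float("1.0E-03"))       # 0.001
```

Anchoring the pattern at both ends keeps ordinary inputs such as "-5" or "1.0E-03" on the plain float() path, so only strings with a sign in the middle of the number are rewritten.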

There are some tests already to try and cover this case - is your failing float an example of one of these tests? https://github.com/fispact/pypact/blob/master/tests/util/numericaltest.py

Dec 12 '22 21:12 thomasms

Hi, thanks for answering. I'd rather not send you my files because I don't know to what extent I am allowed to share anything, even with artificial data. The number format which causes this issue is indeed the Fortran float style "-2.34321-308" (which I didn't know existed until now). Using the utility function is clearly a better option, since mine would cause issues with negative values in Fortran format. I successfully tested it in my case by replacing the float() calls with get_float() from numerical.py.
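To make the negative-value problem concrete, the snippet below reproduces only the string handling of the patch quoted earlier in the thread. The function name patch_missing_e is made up for this illustration, and the inputs are invented strings rather than values from a real FISPACT output.

```python
def patch_missing_e(value_str):
    """Same string manipulation as the quoted patch, isolated for testing."""
    if "E" not in value_str:
        parts = value_str.split("-")
        value_str = "".join([parts[0], "E-", parts[1]])
    return value_str


# Positive mantissa with a negative exponent: the patch works.
print(patch_missing_e("2.34321-308"))   # 2.34321E-308

# Negative mantissa: split("-") yields ["", "2.34321", "308"], so the
# mantissa is mistaken for the exponent and the result is not a number.
print(patch_missing_e("-2.34321-308"))  # E-2.34321

# A plain decimal without any "-" has no second part to index.
try:
    patch_missing_e("0.5")
except IndexError as err:
    print("IndexError:", err)
```

Splitting on "-" cannot distinguish the sign of the mantissa from the sign of the exponent, and it also assumes that every value without an "E" has a negative exponent.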

Thanks

Dec 13 '22 09:12 MrDeoth

PS: That is, replacing the float() calls in gammaspectrum.py.

Dec 13 '22 09:12 MrDeoth

Going to reopen this to fix as suggested.

Dec 22 '22 21:12 thomasms