
Feature request: support hex floats

Open ghost opened this issue 5 years ago • 13 comments

Hexadecimal floats are a format for floating-point values supported in C since C99. The mantissa is written in hex, followed by a power-of-two exponent. This is useful because it shows the exact value with no rounding or decimal approximation.

E.g. 0x1.eff2bp+6 is exactly 123.98699951171875 (about 123.987).
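As a quick illustration of the notation, here is a minimal sketch that round-trips a double through its hex-float text form using the standard `%a` conversion and `strtod`. The helper names `to_hex_float` and `from_hex_float` are made up for this example and are not part of Calc.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical helper: format a double as a C99 hex float via "%a".
// The exact spelling of the output (e.g. the leading digit) is
// implementation-defined, so it is not compared against a fixed string.
std::string to_hex_float(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%a", value);
    return std::string(buffer);
}

// Hypothetical helper: parse hex-float text; strtod accepts the C99 hex form.
double from_hex_float(const std::string& text) {
    return std::strtod(text.c_str(), nullptr);
}
```

For instance, `from_hex_float("0x1.eff2bp+6")` yields exactly 123.98699951171875, and `from_hex_float(to_hex_float(x))` gives back `x` for any finite double, which is the "no rounding" property mentioned above.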

ghost avatar May 07 '20 10:05 ghost

I like this idea. Anyone want to take a crack at modifying the parser to permit such hexadecimal floats?

lcn2 avatar Feb 03 '21 10:02 lcn2

`
    // char* hex_float_c_str;
    // PCRE validator of input:
    // "[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?"
    // Test examples
    // char* example_hex_float = "+0x1.921fb54442d18p+0001";
    // char* example_hex_float = "+0x0.0000000000000p+0000";
    // char* example_hex_float = "+0x0.0000000000001p+0000";
    // char* example_hex_float = "+0x1.0p+0000";
    // char* example_hex_float = "+0x1.0";
    // char* example_hex_float = "+0x1";

    // Keeping sign separate makes mantissa testing simpler
    int sign = 1;
    long long numerator = 0;
    int mantissa_power_of_2 = 0;

    // Keeping sign separate makes power testing simpler
    int power_sign = 1;
    int power_of_2 = 0;

    // Introduce some lenience
    bool has_fraction = false;
    bool has_power = false;

    // Keeping it simpler later in code
    for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
        hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);

    int ch_idx = 0;
    if ((hex_float_c_str[ch_idx] == '+') or
        (hex_float_c_str[ch_idx] == '-')) {
        if (hex_float_c_str[ch_idx] == '-') sign = -1;
        ch_idx++;
    }

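    // hex_float_c_str is a plain C string, so test the "0x" prefix and the
    // end of input directly instead of calling QString methods
    if ((hex_float_c_str[ch_idx] == '0') and
        (hex_float_c_str[ch_idx + 1] == 'x')) {
        ch_idx += 2;
        while (hex_float_c_str[ch_idx] != '\0') {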
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                ch_idx++;
            }
            else break;
        }
    }

    if ((hex_float_c_str[ch_idx] != '\0') and
        (hex_float_c_str[ch_idx] == '.')) {
        ch_idx++;
        // Require fractional part after radix point
        has_fraction = true;

        if (numerator == 0) {
            // Literal must have started with "0x0." ...
            // ... i.e. not normalized, therefore ...
            mantissa_power_of_2 -= 1;
        }

        while (hex_float_c_str[ch_idx] != '\0') {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                mantissa_power_of_2 -= 4;
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                mantissa_power_of_2 -= 4;
                ch_idx++;
            }
            else break;
        }
    }

    if ((hex_float_c_str[ch_idx] != '\0') and
        (hex_float_c_str[ch_idx] == 'p')) {
        ch_idx++;
        // Not lenient here, must finish power if started
        has_power = true;

        if ((hex_float_c_str[ch_idx] == '+') or
            (hex_float_c_str[ch_idx] == '-')) {
            if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
            ch_idx++;
        }

        while (hex_float_c_str[ch_idx] != '\0') {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else break;
        }
    }

    // Assemble numerator & denominator
    long long denominator = 1;

    // Reduction is easy here since the only ...
    // ... prime factor of denominator is two
    if (has_fraction) {
        // Otherwise denominator must already be one
        while ((numerator > 1) and
               ((numerator & 1) == 0) and
               // OK, maybe a little lenience
               (mantissa_power_of_2 != 0)) {
            numerator = numerator >> 1;
            mantissa_power_of_2 += 1;
        }
     }

    if ((numerator > 0) and
        (has_fraction or has_power)) {
        // Only way denominator can be other than one
        power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);

        long long power_multiplier = 1;
        power_multiplier = power_multiplier << abs(power_of_2);
        if (power_of_2 > 0) {
            numerator = numerator * power_multiplier;
        }
        else if (power_of_2 < 0) {
            denominator = power_multiplier;
        }
    }

    numerator = sign * numerator;

`
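The reduction step in the code above relies on the denominator being a power of two, so lowest terms can be reached by stripping trailing zero bits from the numerator. A minimal standalone sketch of that idea follows; the `DyadicFraction` type and `reduce` function are hypothetical names for illustration and do not appear in the posted code.

```cpp
#include <cstdint>

// Hypothetical type: the value is numerator * 2^power_of_2.
struct DyadicFraction {
    std::uint64_t numerator;
    int power_of_2;
};

// Strip common factors of two while the exponent is still negative,
// i.e. while there is a power-of-two denominator left to cancel against.
DyadicFraction reduce(DyadicFraction f) {
    while (f.numerator > 1 && (f.numerator & 1) == 0 && f.power_of_2 < 0) {
        f.numerator >>= 1;
        f.power_of_2 += 1;
    }
    return f;
}
```

For example, `reduce({0x18, -4})` (that is, 24/16) gives numerator 3 with exponent -1, i.e. 3/2.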

kcrossen avatar Mar 01 '22 16:03 kcrossen

Testing usefulness in Calc (my version):

    QString test_commands = "";
    test_commands += QString("test_value=") + QString::number(numerator) + "/" + QString::number(denominator) + ";";
    RPN_Commands_Execute(test_commands);

    QString test_result = Trim_Calc_Result(Calc_Evaluate("round(test_value, 32);"));

    qDebug() << test_result;

kcrossen avatar Mar 01 '22 17:03 kcrossen

The above code will "overflow" or "underflow" because of the limitations of long long, so:
`
    // char* hex_float_c_str;
    // PCRE validator of input:
    // "[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?"
    // Test examples
    // char* hex_float_c_str = "+0x1.921fb54442d18p+0001";
    // char* hex_float_c_str = "+0x0.0000000000000p+0000";
    // char* hex_float_c_str = "+0x0.0000000000001p+0000";
    // char* hex_float_c_str = "+0x1.0p+0000";
    // char* hex_float_c_str = "+0x1.0";
    // char* hex_float_c_str = "+0x1";
    // Max:
    // char* hex_float_c_str = "+0x1.fffffffffffffp+1023";
    // Min:
    // char* hex_float_c_str = "+0x1.0000000000000p-1074";

    int hex_float_c_str_length = strlen(hex_float_c_str);
    // Keeping sign separate makes mantissa testing simpler
    int sign = 1;
    long long numerator = 0;
    int mantissa_power_of_2 = 0;
    int mantissa_digit_count = 0;

    // Keeping sign separate makes power testing simpler
    int power_sign = 1;
    int power_of_2 = 0;

    // Introduce some lenience
    bool has_fraction = false;
    bool has_power = false;

    for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
        hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);

    int ch_idx = 0;
    if ((hex_float_c_str[ch_idx] == '+') or
        (hex_float_c_str[ch_idx] == '-')) {
        if (hex_float_c_str[ch_idx] == '-') sign = -1;
        ch_idx++;
    }

    if ((ch_idx < (hex_float_c_str_length - 1)) and
        (hex_float_c_str[ch_idx] == '0') and
        (hex_float_c_str[ch_idx + 1] == 'x')) {
        ch_idx += 2;
        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == '.')) {
        ch_idx++;
        // Require fractional part after radix point
        has_fraction = true;

        if (numerator == 0) {
            // Literal must have started with "0x0." ...
            // ... i.e. not normalized, therefore ...
            mantissa_power_of_2 -= 1;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < 14) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == 'p')) {
        ch_idx++;
        // Not lenient here, must finish power if started
        has_power = true;

        if ((hex_float_c_str[ch_idx] == '+') or
            (hex_float_c_str[ch_idx] == '-')) {
            if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
            ch_idx++;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else break;
        }
    }

    // Assemble numerator & denominator
    long long denominator = 1;

    // Reduction is easy here since the only ...
    // ... prime factor of denominator is two
    if (has_fraction) {
        // Otherwise denominator must already be one
        while ((numerator > 1) and
               ((numerator & 1) == 0) and
               // OK, maybe a little lenience
               (mantissa_power_of_2 != 0)) {
            numerator = numerator >> 1;
            mantissa_power_of_2 += 1;
        }
     }

    if ((numerator > 0) and
        (has_fraction or has_power)) {
        // Only way denominator can be other than one
        power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);
    }

    QString test_commands = "test_value=(";
    if (sign < 0) test_commands += "-1*";
    test_commands += QString::number(numerator);
    if (power_of_2 > 0) test_commands += "*2^" + QString::number(power_of_2);
    test_commands += ")/(";
    test_commands += QString::number(denominator);
    if (power_of_2 < 0) test_commands += "*2^" + QString::number(-power_of_2);
    test_commands += ");";
    RPN_Commands_Execute(test_commands);

    QString test_result = Trim_Calc_Result(Calc_Evaluate("estr(test_value);"));

    qDebug() << test_result;

`

The individual components, mantissa and power, will stay within the range of long long if the input follows the standard. Excess mantissa digits are ignored if they come after the radix point, or effectively replaced with zeros if they come before it.
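The digit-capping rule described here can be sketched in isolation. The following is a simplified illustration, not the posted parser: `Mantissa`, `hex_digit_value` and `accumulate_mantissa` are hypothetical names, and the input is assumed to be only the mantissa text (no sign, `0x` prefix or exponent).

```cpp
#include <cstdint>
#include <string>

// Hypothetical result type: the mantissa value is digits * 2^power_of_2.
struct Mantissa {
    std::uint64_t digits = 0;
    int power_of_2 = 0;
};

// Value of one hex digit, or -1 if the character is not a hex digit.
int hex_digit_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return 10 + (c - 'a');
    if (c >= 'A' && c <= 'F') return 10 + (c - 'A');
    return -1;
}

// Accumulate at most max_digits hex digits. Extra digits before the radix
// point are treated as zeros (exponent grows by 4 each); extra digits after
// it are dropped.
Mantissa accumulate_mantissa(const std::string& text, int max_digits = 15) {
    Mantissa m;
    int count = 0;
    bool after_point = false;
    for (char c : text) {
        if (c == '.' && !after_point) { after_point = true; continue; }
        int v = hex_digit_value(c);
        if (v < 0) break;
        if (count < max_digits) {
            m.digits = (m.digits << 4) | static_cast<std::uint64_t>(v);
            ++count;
            if (after_point) m.power_of_2 -= 4;
        } else if (!after_point) {
            m.power_of_2 += 4;
        }
    }
    return m;
}
```

With a cap of 14 digits, `accumulate_mantissa("1.921fb54442d18abcdef", 14)` keeps `0x1921fb54442d18` with exponent -52 and silently drops `abcdef`, which truncates rather than rounds the value.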

kcrossen avatar Mar 01 '22 18:03 kcrossen

Added test for "overflow": // char* hex_float_c_str = "+0x1.921fb54442d18abcdefp+0001";

kcrossen avatar Mar 01 '22 19:03 kcrossen

Expand value range of tolerated hex floats by 16X:
`
void Test_Hexadecimal_Float_Parse ( QString Hexadecimal_Float_String ) {
// PCRE validator of input:
QRegExp validate_hexadecimal_float = QRegExp("[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?", Qt::CaseInsensitive);

if (validate_hexadecimal_float.exactMatch(Hexadecimal_Float_String)) {
    QByteArray example_hex_float_ba = Hexadecimal_Float_String.toLocal8Bit();
    char* hex_float_c_str = (char*) malloc(example_hex_float_ba.count() + 10);
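    strncpy(hex_float_c_str, example_hex_float_ba.data(), example_hex_float_ba.count());
    // strncpy adds no terminator when it copies exactly count bytes, so add one
    hex_float_c_str[example_hex_float_ba.count()] = '\0';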

    int hex_float_c_str_length = strlen(hex_float_c_str);
    // Keeping sign separate makes mantissa testing simpler
    int sign = 1;
    unsigned long long numerator = 0;
    int mantissa_power_of_2 = 0;

#define maximum_mantissa_digit_count 15
    int mantissa_digit_count = 0;

    // Keeping sign separate makes power testing simpler
    int power_sign = 1;
    int power_of_2 = 0;

    // Introduce some lenience
    bool has_fraction = false;
    bool has_power = false;

    for (int lo_idx = 0; hex_float_c_str[lo_idx]; lo_idx++)
        hex_float_c_str[lo_idx] = tolower(hex_float_c_str[lo_idx]);

    int ch_idx = 0;
    if ((hex_float_c_str[ch_idx] == '+') or
        (hex_float_c_str[ch_idx] == '-')) {
        if (hex_float_c_str[ch_idx] == '-') sign = -1;
        ch_idx++;
    }

    if ((ch_idx < (hex_float_c_str_length - 1)) and
        (hex_float_c_str[ch_idx] == '0') and
        (hex_float_c_str[ch_idx + 1] == 'x')) {
        ch_idx += 2;
        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, this seriously violates standard ...
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_digit_count += 1;
                }
                else {
                    // ... but cope by effectively treating ...
                    // ... remaining digits as zero
                    mantissa_power_of_2 += 4;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == '.')) {
        ch_idx++;
        // Require fractional part after radix point
        has_fraction = true;

        if (numerator == 0) {
            // Literal must have started with "0x0." ...
            // ... i.e. not normalized, therefore ...
            mantissa_power_of_2 -= 1;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + int(hex_float_c_str[ch_idx]) - int('0');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else if ((hex_float_c_str[ch_idx] >= 'a') and
                     (hex_float_c_str[ch_idx] <= 'f')) {
                // Prevent overflow, parses what standard allows
                if (mantissa_digit_count < maximum_mantissa_digit_count) {
                    numerator = (numerator << 4) + 10 + int(hex_float_c_str[ch_idx]) - int('a');
                    mantissa_power_of_2 -= 4;
                    mantissa_digit_count += 1;
                }
                ch_idx++;
            }
            else break;
        }
    }

    if ((ch_idx < hex_float_c_str_length) and
        (hex_float_c_str[ch_idx] == 'p')) {
        ch_idx++;
        // Not lenient here, must finish power if started
        has_power = true;

        if ((hex_float_c_str[ch_idx] == '+') or
            (hex_float_c_str[ch_idx] == '-')) {
            if (hex_float_c_str[ch_idx] == '-') power_sign = -1;
            ch_idx++;
        }

        while (ch_idx < hex_float_c_str_length) {
            if ((hex_float_c_str[ch_idx] >= '0') and
                (hex_float_c_str[ch_idx] <= '9')) {
                power_of_2 = (power_of_2 * 10) + int(hex_float_c_str[ch_idx]) - int('0');
                ch_idx++;
            }
            else break;
        }
    }

    // Assemble numerator & denominator
    unsigned long long denominator = 1;

    // Reduction is easy here since the only ...
    // ... prime factor of denominator is two
    if (has_fraction) {
        // Otherwise denominator must already be one
        while ((numerator > 1) and
               ((numerator & 1) == 0) and
               // OK, maybe a little lenience
               (mantissa_power_of_2 != 0)) {
            numerator = numerator >> 1;
            mantissa_power_of_2 += 1;
        }
     }

    if ((numerator > 0) and
        (has_fraction or has_power)) {
        // Only way denominator can be other than one
        power_of_2 = mantissa_power_of_2 + (power_sign * power_of_2);
    }

    QString test_commands = "test_value=(";
    if (sign < 0) test_commands += "-1*";
    test_commands += QString::number(numerator);
    if (power_of_2 > 0) test_commands += "*2^" + QString::number(power_of_2);
    test_commands += ")/(";
    test_commands += QString::number(denominator);
    if (power_of_2 < 0) test_commands += "*2^" + QString::number(-power_of_2);
    test_commands += ");";

    // RPN_Commands_Execute executes the argument command string for its side effects ...
    // ... i.e. Calc's internal state (variables).
    // It doesn't care about the results unless there is an error.
    RPN_Commands_Execute(test_commands);

    // Calc_Evaluate executes the argument command string and returns the result
    // Trim_Calc_Result strips the syntactic "sugar" from the returned result.
    QString test_result_internal = Trim_Calc_Result(Calc_Evaluate("estr(test_value);"));
    QString test_result = Trim_Calc_Result(Calc_Evaluate("round(test_value, 32);"));

    qDebug() << "/*--------------------*/";
    qDebug() << Hexadecimal_Float_String;
    qDebug() << test_result;
    qDebug() << test_result_internal;
    qDebug() << "/*--------------------*/";

    free(hex_float_c_str);
}
else {
    qDebug() << "Validation Error: " + Hexadecimal_Float_String;
}

}`
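The validation step at the top of this function can also be tried without Qt. The sketch below swaps `QRegExp` for the standard library's `std::regex` (a substitution made for this example, not what the posted code uses) while keeping the same pattern; `is_valid_hex_float` is a hypothetical name.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when the whole string matches the hex-float
// pattern from the post. std::regex_match requires a full match, which
// corresponds to QRegExp::exactMatch.
bool is_valid_hex_float(const std::string& text) {
    static const std::regex pattern(
        "[+-]?0x[0-9a-f]+([.][0-9a-f]+)?(p[+-]?[0-9]+)?",
        std::regex::icase);
    return std::regex_match(text, pattern);
}
```

Note that this pattern is stricter than C99 in one respect: it requires at least one hex digit on both sides of the radix point, so forms such as `0x1.p0` or `0x.8p0` are rejected.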

Test code:

    Test_Hexadecimal_Float_Parse("+0x1.921fb54442d18p+0001");
    Test_Hexadecimal_Float_Parse("+0x0.0000000000000p+0000");
    Test_Hexadecimal_Float_Parse("+0x0.0000000000001p+0000");
    Test_Hexadecimal_Float_Parse("+0x1.0p+0000");
    Test_Hexadecimal_Float_Parse("+0x1.0");
    Test_Hexadecimal_Float_Parse("+0x1");
    // Defined maximum allowable value:
    Test_Hexadecimal_Float_Parse("+0x1.fffffffffffffp+1023");
    // Defined minimum allowable value:
    Test_Hexadecimal_Float_Parse("+0x1.0000000000000p-1074");
    // Test too many hex digits in mantissa:
    Test_Hexadecimal_Float_Parse("+0x1.921fb54442d18abcdefp+0001");

Test results:

    /*--------------------*/
    "+0x1.921fb54442d18p+0001"
    "3.14159265358979311599796346854419"
    "884279719003555/281474976710656"
    /*--------------------*/
    /*--------------------*/
    "+0x0.0000000000000p+0000"
    "0"
    "0"
    /*--------------------*/
    /*--------------------*/
    "+0x0.0000000000001p+0000"
    "0.00000000000000011102230246251565"
    "1/9007199254740992"
    /*--------------------*/
    /*--------------------*/
    "+0x1.0p+0000"
    "1"
    "1"
    /*--------------------*/
    /*--------------------*/
    "+0x1.0"
    "1"
    "1"
    /*--------------------*/
    /*--------------------*/
    "+0x1"
    "1"
    "1"
    /*--------------------*/
    /*--------------------*/
    "+0x1.fffffffffffffp+1023"
    "179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368"
    "179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368"
    /*--------------------*/
    /*--------------------*/
    "+0x1.0000000000000p-1074"
    "0"
    "1/202402253307310618352495346718917307049556649764142118356901358027430339567995346891960383701437124495187077864316811911389808737385793476867013399940738509921517424276566361364466907742093216341239767678472745068562007483424692698618103355649159556340810056512358769552333414615230502532186327508646006263307707741093494784"
    /*--------------------*/
    /*--------------------*/
    "+0x1.921fb54442d18abcdefp+0001"
    "3.14159265358979311599796346854419"
    "884279719003555/281474976710656"
    /*--------------------*/
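One way to cross-check fractions like these is to compare them against what the C library's own hex-float parser returns. The sketch below assumes the fraction is given as an integer numerator times a power of two; `matches_strtod` is a hypothetical helper and the numerator is assumed to fit exactly in a double (below 2^53).

```cpp
#include <cmath>
#include <cstdlib>

// Hypothetical helper: true when numerator * 2^power_of_2 is exactly the
// double that strtod produces for the given hex-float text.
bool matches_strtod(const char* hex_text, long long numerator, int power_of_2) {
    double expected = std::ldexp(static_cast<double>(numerator), power_of_2);
    return std::strtod(hex_text, nullptr) == expected;
}
```

For the first test case, `matches_strtod("+0x1.921fb54442d18p+0001", 884279719003555LL, -48)` holds, since 281474976710656 is 2^48. Under C99 rules `0x0.0000000000001p+0` denotes 2^-52 (1/4503599627370496), so a check of this kind may be useful for reviewing the extra exponent adjustment the parser applies when the integer part is zero.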

I've tried to use something approximating usual C style (excepting the use of array notation).

Looking at Calc parsing, it looks to be well beyond my competence to fully integrate this code at the parsing level. And of course, integrated at that level, one could support quadruple hex floats, etc.

Have fun.

kcrossen avatar Mar 03 '22 14:03 kcrossen

@kcrossen May I ask why you submitted all this code as comments? There is a pull request facility.

pmetzger avatar Mar 03 '22 15:03 pmetzger

I think Fabrice Bellard did this long ago with his NumCalc app, along with a lot of other features. Check it out here: http://numcalc.com/

Saldef avatar Mar 18 '22 09:03 Saldef

I don't understand the bulk of (nearly any of) the parsing code, which makes the usual form of posting this problematic.

kcrossen avatar Mar 23 '22 20:03 kcrossen

Furthermore, I don't know how to use the relevant GitHub tools (which I use for my own code in about the same way I used to use SourceForge).

kcrossen avatar Mar 23 '22 21:03 kcrossen

Calc is maintained on GitHub, not SourceForge. GitHub has lots of good documentation that you should consider reading.

lcn2 avatar Mar 24 '22 03:03 lcn2

@kcrossen GitHub is mostly just git plus a web interface.

pmetzger avatar Apr 05 '22 19:04 pmetzger

We hope to address this, perhaps sometime next month, in a 2.14.1.x non-production release.

lcn2 avatar Apr 08 '22 07:04 lcn2

This issue will be part of calc v3: see issue #103. Closing this issue so that any further discussion may occur under issue #103

lcn2 avatar Oct 04 '23 02:10 lcn2