jdk 8341260: Add Float16 to jdk.incubator.vector

Port of Float16 from java.lang in the lworld+fp16 branch to jdk.incubabor.vector.

Progress

[ ] Change must be properly reviewed (1 review required, with at least 1 Reviewer)
[x] Change must not contain extraneous whitespace
[x] Commit message must refer to an issue
[ ] Change requires CSR request JDK-8342567 to be approved

Issues

JDK-8341260: Add Float16 to jdk.incubator.vector (Enhancement - P4)
JDK-8342567: Add Float16 to jdk.incubator.vector (CSR)

Contributors

Raffaello Giulietti <[email protected]>
Jatin Bhateja <[email protected]>

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jdk.git pull/21574/head:pull/21574
$ git checkout pull/21574

Update a local copy of the PR:
$ git checkout pull/21574
$ git pull https://git.openjdk.org/jdk.git pull/21574/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 21574

View PR using the GUI difftool:
$ git pr show -t 21574

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jdk/pull/21574.diff

Webrev

Link to Webrev Comment

Oct 17 '24 23:10 jddarcy

:wave: Welcome back darcy! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

Oct 17 '24 23:10 bridgekeeper[bot]

@jddarcy This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8341260: Add Float16 to jdk.incubator.vector

Co-authored-by: Raffaello Giulietti <[email protected]>
Co-authored-by: Jatin Bhateja <[email protected]>
Reviewed-by: rgiulietti

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 26 new commits pushed to the master branch:

79345bbbae2564f9f523859d1227a1784293b20f: 8339783: Implement JEP 479: Remove the Windows 32-bit x86 Port
2eeaa57b19780723ad7c74b1a62dea491241b686: 8343944: C2: MinLNode::add_ring() computes _widen wrongly leading to an endless widening/compilation
e9ede464b2be84af676dc64bd3595b304bfe818d: 8343508: Parallel: Use ordinary klass accessor in verify_filler_in_dense_prefix
c78de7bf5fc5a4da50c6c64e181abf02a5b12630: 8343964: RISC-V: Improve PrintOptoAssembly output for loadNKlassCompactHeaders node
eb40a88f4076360708402454a494907e8c0c845d: 8343430: RISC-V: C2: Remove old trampoline call
b26e4952e971a3cd027291f7f823140aeb5e5074: 8343801: Change string printed by nsk_null_string for null strings
a4e2c20849008d5b560f94b58fe70ef8e58c8d4c: 8343344: Windows attach logic failed to handle a failed open on a pipe
63eb4853f6782f350f67b6bcf25d83bc4480be71: 8343883: Cannot resolve user specified toolchain-path for lld after JDK-8338304
db85090553ab14a84c3ed0a2604dd56c5b6e6982: 8338411: Implement JEP 486: Permanently Disable the Security Manager
c12b386d1916af9a04b4c6698838c2b40c6cdd86: 8338007: [JVMCI] ResolvedJavaMethod.reprofile can crash ciMethodData
... and 16 more: https://git.openjdk.org/jdk/compare/889f906235e99b7207f2e30e1f6f5771188f5a56...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

Oct 17 '24 23:10 openjdk[bot]

@jddarcy The following label will be automatically applied to this pull request:

core-libs

When this pull request is ready to be reviewed, an "RFR" email will be sent to the corresponding mailing list. If you would like to change these labels, use the /label pull request command.

Oct 17 '24 23:10 openjdk[bot]

/contributor add @rgiulietti

Oct 17 '24 23:10 jddarcy

/contributor add @jatin-bhateja

Oct 17 '24 23:10 jddarcy

@jddarcy Contributor Raffaello Giulietti <[email protected]> successfully added.

Oct 17 '24 23:10 openjdk[bot]

@jddarcy Contributor Jatin Bhateja <[email protected]> successfully added.

Oct 17 '24 23:10 openjdk[bot]

Add as contributors other engineers who worked on Float16 on the lworld+fp16 branch in Valhalla.

Oct 17 '24 23:10 jddarcy

To ease review, diffs of corresponding files from the current, at time of writing, state of files in lworld+fp16 vs a port to jdk.incubator.vector, starting with Float16:

$ diff src/java.base/share/classes/java/lang/Float16.java  ~/jdk24/open/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java 
26c26
< package java.lang;
---
> package jdk.incubator.vector;
28a29
> import java.math.BigInteger;
30,31c31,32
< import jdk.internal.math.*;
< import jdk.internal.vm.annotation.IntrinsicCandidate;
---
> // import jdk.internal.math.*;
> // import jdk.internal.vm.annotation.IntrinsicCandidate;
37c38
<  * The {@code Float16} is a primitive value class holding 16-bit data
---
>  * The {@code Float16} is a class holding 16-bit data
46,48d46
<  * <p>This is a <a href="https://openjdk.org/jeps/401">primitive value class</a> and its objects are
<  * identity-less non-nullable value objects.
<  *
52a51,56
>  * <p>This is a <a href="{@docRoot}/java.base/java/lang/doc-files/ValueBased.html">value-based</a>
>  * class; programmers should treat instances that are
>  * {@linkplain #equals(Object) equal} as interchangeable and should not
>  * use instances for synchronization, or unpredictable behavior may
>  * occur. For example, in a future release, synchronization may fail.
>  *
62,64d65
<  *
<  * @author Jatin Bhateja
<  * @since 20.00
69,70c70,71
< @jdk.internal.MigratedValueClass
< @jdk.internal.ValueBased
---
> // @jdk.internal.MigratedValueClass
> //@jdk.internal.ValueBased
213c214,215
<         return FloatToDecimal.toString(f16.floatValue());
---
>         return Float.toString(f16.floatValue());
>         // return FloatToDecimal.toString(f16.floatValue());
420d421
<      * @see BigDecimal#float16Value()
423c424,539
<         return v.float16Value();
---
>         return BigDecimalConversion.float16Value(v);
>     }
> 
>     private class BigDecimalConversion {
>         /*
>          * Let l = log_2(10).
>          * Then, L < l < L + ulp(L) / 2, that is, L = roundTiesToEven(l).
>          */
>         private static final double L = 3.321928094887362;
> 
>         private static final int P_F16 = PRECISION;  // 11
>         private static final int Q_MIN_F16 = MIN_EXPONENT - (P_F16 - 1);  // -24
>         private static final int Q_MAX_F16 = MAX_EXPONENT - (P_F16 - 1);  // 5
> 
>         /**
>          * Powers of 10 which can be represented exactly in {@code
>          * Float16}.
>          */
>         private static final Float16[] FLOAT16_10_POW = {
>             Float16.valueOf(1), Float16.valueOf(10), Float16.valueOf(100),
>             Float16.valueOf(1_000), Float16.valueOf(10_000)
>         };
> 
>         public static Float16 float16Value(BigDecimal bd) {
> //             int scale = bd.scale();
> //             BigInteger unscaledValue = bd.unscaledValue();
> 
> //              if (unscaledValue.abs().compareTo(BigInteger.valueOf(Long.MAX_VALUE)) <= 0) {
> //                 long intCompact = bd.longValue();
> //                 Float16 v = Float16.valueOf(intCompact);
> //                 if (scale == 0) {
> //                     return v;
> //                 }
> //                 /*
> //                  * The discussion for the double case also applies here. That is,
> //                  * the following test is precise for all long values, but here
> //                  * Long.MAX_VALUE is not an issue.
> //                  */
> //                 if (v.longValue() == intCompact) {
> //                     if (0 < scale && scale < FLOAT16_10_POW.length) {
> //                         return Float16.divide(v, FLOAT16_10_POW[scale]);
> //                     }
> //                     if (0 > scale && scale > -FLOAT16_10_POW.length) {
> //                         return Float16.multiply(v, FLOAT16_10_POW[-scale]);
> //                     }
> //                 }
> //             }
>             return fullFloat16Value(bd);
>         }
> 
>         private static BigInteger bigTenToThe(int scale) {
>             return BigInteger.TEN.pow(scale);
>         }
> 
>         private static Float16 fullFloat16Value(BigDecimal bd) {
>             if (BigDecimal.ZERO.compareTo(bd) == 0) {
>                 return Float16.valueOf(0);
>             }
>             BigInteger w = bd.unscaledValue().abs();
>             int scale = bd.scale();
>             long qb = w.bitLength() - (long) Math.ceil(scale * L);
>             Float16 signum = Float16.valueOf(bd.signum());
>             if (qb < Q_MIN_F16 - 2) {  // qb < -26
>                 return Float16.multiply(signum, Float16.valueOf(0));
>             }
>             if (qb > Q_MAX_F16 + P_F16 + 1) {  // qb > 17
>                 return Float16.multiply(signum, Float16.POSITIVE_INFINITY);
>             }
>             if (scale < 0) {
>                 return Float16.multiply(signum, valueOf(w.multiply(bigTenToThe(-scale))));
>             }
>             if (scale == 0) {
>                 return Float16.multiply(signum, valueOf(w));
>             }
>             int ql = (int) qb - (P_F16 + 3);
>             BigInteger pow10 =  bigTenToThe(scale);
>             BigInteger m, n;
>             if (ql <= 0) {
>                 m = w.shiftLeft(-ql);
>                 n = pow10;
>             } else {
>                 m = w;
>                 n = pow10.shiftLeft(ql);
>             }
>             BigInteger[] qr = m.divideAndRemainder(n);
>             /*
>              * We have
>              *      2^12 = 2^{P+1} <= i < 2^{P+5} = 2^16
>              * Contrary to the double and float cases, where we use long and int, resp.,
>              * here we cannot simply declare i as short, because P + 5 < Short.SIZE
>              * fails to hold.
>              * Using int is safe, though.
>              *
>              * Further, as Math.scalb(Float16) does not exists, we fall back to
>              * Math.scalb(double).
>              */
>             int i = qr[0].intValue();
>             int sb = qr[1].signum();
>             int dq = (Integer.SIZE - (P_F16 + 2)) - Integer.numberOfLeadingZeros(i);
>             int eq = (Q_MIN_F16 - 2) - ql;
>             if (dq >= eq) {
>                 return Float16.valueOf(bd.signum() * Math.scalb((double) (i | sb), ql));
>             }
>             int mask = (1 << eq) - 1;
>             int j = i >> eq | (Integer.signum(i & mask)) | sb;
>             return Float16.valueOf(bd.signum() * Math.scalb((double) j, Q_MIN_F16 - 2));
>         }
> 
>         public static Float16 valueOf(BigInteger bi) {
>             int signum = bi.signum();
>             return (signum == 0 || bi.bitLength() <= 31)
>                 ? Float16.valueOf(bi.longValue())  // might return infinities
>                 : signum > 0
>                 ? Float16.POSITIVE_INFINITY
>                 : Float16.NEGATIVE_INFINITY;
>         }
577,578c693
<      * according to the IEEE 754 floating-point binary16 bit layout,
<      * preserving Not-a-Number (NaN) values.
---
>      * according to the IEEE 754 floating-point binary16 bit layout.
591,607d705
<      * Returns a representation of the specified floating-point value
<      * according to the IEEE 754 floating-point binary16 bit layout.
<      *
<      * @param   fp16   a {@code Float16} floating-point number.
<      * @return the bits that represent the floating-point number.
<      *
<      * @see Float#floatToIntBits(float)
<      * @see Double#doubleToLongBits(double)
<      */
<     public static short float16ToShortBits(Float16 fp16) {
<         if (!isNaN(fp16)) {
<             return float16ToRawShortBits(fp16);
<         }
<         return 0x7e00;
<     }
< 
<     /**
694c792
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
714c812
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
783c881
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
806c904
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
829c927
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
852c950
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
873c971
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
907c1005
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
1109c1207
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
1131c1229
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
1319a1418
>         int DoubleConsts_EXP_BIAS = 1023;
1328c1427
<                 * Double.longBitsToDouble((long) (scaleFactor + DoubleConsts.EXP_BIAS) << Double.PRECISION - 1));
---
>                 * Double.longBitsToDouble((long) (scaleFactor + DoubleConsts_EXP_BIAS) << Double.PRECISION - 1));
1330d1428
< 
1374d1471
<

Oct 17 '24 23:10 jddarcy

$ diff src/java.base/share/classes/jdk/internal/math/Float16Consts.java  ~/jdk24/open/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16Consts.java 
26c26
< package jdk.internal.math;
---
> package jdk.incubator.vector;
28,30c28,30
< import static java.lang.Float16.MIN_EXPONENT;
< import static java.lang.Float16.PRECISION;
< import static java.lang.Float16.SIZE;
---
> import static jdk.incubator.vector.Float16.MIN_EXPONENT;
> import static jdk.incubator.vector.Float16.PRECISION;
> import static jdk.incubator.vector.Float16.SIZE;
37c37
< public class Float16Consts {
---
> class Float16Consts {

Oct 17 '24 23:10 jddarcy

$ diff test/jdk/java/lang/Float16/BasicFloat16ArithTests.java  ~/jdk24/test/jdk/jdk/incubator/vector/BasicFloat16ArithTests.java 
26c26,27
<  * @bug 8329817 8334432 8339076
---
>  * @bug 8329817 8334432 8339076 8341260
>  * @modules jdk.incubator.vector
30c31,32
< import static java.lang.Float16.*;
---
> import jdk.incubator.vector.Float16;
> import static jdk.incubator.vector.Float16.*;

$ diff test/jdk/java/math/BigDecimal/DoubleFloatValueTests.java  ~/jdk24/test/jdk/java/math/BigDecimal/DoubleFloatValueTests.java 
26c26
<  * @bug 8205592 8339252
---
>  * @bug 8205592 8339252 8341260
27a28
>  * @modules jdk.incubator.vector
37a39
> import jdk.incubator.vector.Float16;
110c112
<         Float16 res = bv.float16Value();
---
>         Float16 res =  Float16.valueOf(bv); // bv.float16Value();

Oct 17 '24 23:10 jddarcy

$ diff test/jdk/java/lang/Float16/BasicFloat16ArithTests.java  ~/jdk24/test/jdk/jdk/incubator/vector/BasicFloat16ArithTests.java 
26c26,27
<  * @bug 8329817 8334432 8339076
---
>  * @bug 8329817 8334432 8339076 8341260
>  * @modules jdk.incubator.vector
30c31,32
< import static java.lang.Float16.*;
---
> import jdk.incubator.vector.Float16;
> import static jdk.incubator.vector.Float16.*;

$ diff test/jdk/java/math/BigDecimal/DoubleFloatValueTests.java  ~/jdk24/test/jdk/java/math/BigDecimal/DoubleFloatValueTests.java 
26c26
<  * @bug 8205592 8339252
---
>  * @bug 8205592 8339252 8341260
27a28
>  * @modules jdk.incubator.vector
37a39
> import jdk.incubator.vector.Float16;
110c112
<         Float16 res = bv.float16Value();
---
>         Float16 res =  Float16.valueOf(bv); // bv.float16Value();

Initially, I left the tests for the BigDecimal -> Float16 conversion in the tests for the base module to ease review compared to their location in the Valhalla branch. The tests can be extracted and moved to a jdk.incubator.vector area subsequently.

Oct 17 '24 23:10 jddarcy

To ease review, diffs of corresponding files from the current, at time of writing, state of files in lworld+fp16 vs a port to jdk.incubator.vector, starting with Float16:

$ diff src/java.base/share/classes/java/lang/Float16.java  ~/jdk24/open/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/Float16.java 
26c26
< package java.lang;
---
> package jdk.incubator.vector;
28a29
> import java.math.BigInteger;
30,31c31,32
< import jdk.internal.math.*;
< import jdk.internal.vm.annotation.IntrinsicCandidate;
---
> // import jdk.internal.math.*;
> // import jdk.internal.vm.annotation.IntrinsicCandidate;
37c38
<  * The {@code Float16} is a primitive value class holding 16-bit data
---
>  * The {@code Float16} is a class holding 16-bit data
46,48d46
<  * <p>This is a <a href="https://openjdk.org/jeps/401">primitive value class</a> and its objects are
<  * identity-less non-nullable value objects.
<  *
52a51,56
>  * <p>This is a <a href="{@docRoot}/java.base/java/lang/doc-files/ValueBased.html">value-based</a>
>  * class; programmers should treat instances that are
>  * {@linkplain #equals(Object) equal} as interchangeable and should not
>  * use instances for synchronization, or unpredictable behavior may
>  * occur. For example, in a future release, synchronization may fail.
>  *
62,64d65
<  *
<  * @author Jatin Bhateja
<  * @since 20.00
69,70c70,71
< @jdk.internal.MigratedValueClass
< @jdk.internal.ValueBased
---
> // @jdk.internal.MigratedValueClass
> //@jdk.internal.ValueBased
213c214,215
<         return FloatToDecimal.toString(f16.floatValue());
---
>         return Float.toString(f16.floatValue());
>         // return FloatToDecimal.toString(f16.floatValue());
420d421
<      * @see BigDecimal#float16Value()
423c424,539
<         return v.float16Value();
---
>         return BigDecimalConversion.float16Value(v);
>     }
> 
>     private class BigDecimalConversion {
>         /*
>          * Let l = log_2(10).
>          * Then, L < l < L + ulp(L) / 2, that is, L = roundTiesToEven(l).
>          */
>         private static final double L = 3.321928094887362;
> 
>         private static final int P_F16 = PRECISION;  // 11
>         private static final int Q_MIN_F16 = MIN_EXPONENT - (P_F16 - 1);  // -24
>         private static final int Q_MAX_F16 = MAX_EXPONENT - (P_F16 - 1);  // 5
> 
>         /**
>          * Powers of 10 which can be represented exactly in {@code
>          * Float16}.
>          */
>         private static final Float16[] FLOAT16_10_POW = {
>             Float16.valueOf(1), Float16.valueOf(10), Float16.valueOf(100),
>             Float16.valueOf(1_000), Float16.valueOf(10_000)
>         };
> 
>         public static Float16 float16Value(BigDecimal bd) {
> //             int scale = bd.scale();
> //             BigInteger unscaledValue = bd.unscaledValue();
> 
> //              if (unscaledValue.abs().compareTo(BigInteger.valueOf(Long.MAX_VALUE)) <= 0) {
> //                 long intCompact = bd.longValue();
> //                 Float16 v = Float16.valueOf(intCompact);
> //                 if (scale == 0) {
> //                     return v;
> //                 }
> //                 /*
> //                  * The discussion for the double case also applies here. That is,
> //                  * the following test is precise for all long values, but here
> //                  * Long.MAX_VALUE is not an issue.
> //                  */
> //                 if (v.longValue() == intCompact) {
> //                     if (0 < scale && scale < FLOAT16_10_POW.length) {
> //                         return Float16.divide(v, FLOAT16_10_POW[scale]);
> //                     }
> //                     if (0 > scale && scale > -FLOAT16_10_POW.length) {
> //                         return Float16.multiply(v, FLOAT16_10_POW[-scale]);
> //                     }
> //                 }
> //             }
>             return fullFloat16Value(bd);
>         }
> 
>         private static BigInteger bigTenToThe(int scale) {
>             return BigInteger.TEN.pow(scale);
>         }
> 
>         private static Float16 fullFloat16Value(BigDecimal bd) {
>             if (BigDecimal.ZERO.compareTo(bd) == 0) {
>                 return Float16.valueOf(0);
>             }
>             BigInteger w = bd.unscaledValue().abs();
>             int scale = bd.scale();
>             long qb = w.bitLength() - (long) Math.ceil(scale * L);
>             Float16 signum = Float16.valueOf(bd.signum());
>             if (qb < Q_MIN_F16 - 2) {  // qb < -26
>                 return Float16.multiply(signum, Float16.valueOf(0));
>             }
>             if (qb > Q_MAX_F16 + P_F16 + 1) {  // qb > 17
>                 return Float16.multiply(signum, Float16.POSITIVE_INFINITY);
>             }
>             if (scale < 0) {
>                 return Float16.multiply(signum, valueOf(w.multiply(bigTenToThe(-scale))));
>             }
>             if (scale == 0) {
>                 return Float16.multiply(signum, valueOf(w));
>             }
>             int ql = (int) qb - (P_F16 + 3);
>             BigInteger pow10 =  bigTenToThe(scale);
>             BigInteger m, n;
>             if (ql <= 0) {
>                 m = w.shiftLeft(-ql);
>                 n = pow10;
>             } else {
>                 m = w;
>                 n = pow10.shiftLeft(ql);
>             }
>             BigInteger[] qr = m.divideAndRemainder(n);
>             /*
>              * We have
>              *      2^12 = 2^{P+1} <= i < 2^{P+5} = 2^16
>              * Contrary to the double and float cases, where we use long and int, resp.,
>              * here we cannot simply declare i as short, because P + 5 < Short.SIZE
>              * fails to hold.
>              * Using int is safe, though.
>              *
>              * Further, as Math.scalb(Float16) does not exists, we fall back to
>              * Math.scalb(double).
>              */
>             int i = qr[0].intValue();
>             int sb = qr[1].signum();
>             int dq = (Integer.SIZE - (P_F16 + 2)) - Integer.numberOfLeadingZeros(i);
>             int eq = (Q_MIN_F16 - 2) - ql;
>             if (dq >= eq) {
>                 return Float16.valueOf(bd.signum() * Math.scalb((double) (i | sb), ql));
>             }
>             int mask = (1 << eq) - 1;
>             int j = i >> eq | (Integer.signum(i & mask)) | sb;
>             return Float16.valueOf(bd.signum() * Math.scalb((double) j, Q_MIN_F16 - 2));
>         }
> 
>         public static Float16 valueOf(BigInteger bi) {
>             int signum = bi.signum();
>             return (signum == 0 || bi.bitLength() <= 31)
>                 ? Float16.valueOf(bi.longValue())  // might return infinities
>                 : signum > 0
>                 ? Float16.POSITIVE_INFINITY
>                 : Float16.NEGATIVE_INFINITY;
>         }
577,578c693
<      * according to the IEEE 754 floating-point binary16 bit layout,
<      * preserving Not-a-Number (NaN) values.
---
>      * according to the IEEE 754 floating-point binary16 bit layout.
591,607d705
<      * Returns a representation of the specified floating-point value
<      * according to the IEEE 754 floating-point binary16 bit layout.
<      *
<      * @param   fp16   a {@code Float16} floating-point number.
<      * @return the bits that represent the floating-point number.
<      *
<      * @see Float#floatToIntBits(float)
<      * @see Double#doubleToLongBits(double)
<      */
<     public static short float16ToShortBits(Float16 fp16) {
<         if (!isNaN(fp16)) {
<             return float16ToRawShortBits(fp16);
<         }
<         return 0x7e00;
<     }
< 
<     /**
694c792
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
714c812
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
783c881
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
806c904
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
829c927
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
852c950
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
873c971
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
907c1005
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
1109c1207
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
1131c1229
<     @IntrinsicCandidate
---
>     // @IntrinsicCandidate
1319a1418
>         int DoubleConsts_EXP_BIAS = 1023;
1328c1427
<                 * Double.longBitsToDouble((long) (scaleFactor + DoubleConsts.EXP_BIAS) << Double.PRECISION - 1));
---
>                 * Double.longBitsToDouble((long) (scaleFactor + DoubleConsts_EXP_BIAS) << Double.PRECISION - 1));
1330d1428
< 
1374d1471
<

As initially done here, the port of Float16 omits any VM intrinsification; that would need to be done by subsequent work. Subsequent work may also adapt the Java implementations to be more amenable to being intrinsified.

Oct 17 '24 23:10 jddarcy

Webrevs

Oct 19 '24 21:10 mlbridge[bot]

Joe, our revised and now-current thinking about JIT support for the Float16 (both as a pre-Valhalla VBC and Valhalla value class) is that there should be zero need for any @IntrinsicCandidate markers in Float16.java.

The two existing intrinsics in Float.java should be fully sufficient to cue the JIT to introduce FP16 types and operations in its IR. There is no need to help the JIT either recognize Float16 objects as special types, nor recognize the special interpretation of the short value field inside. It is very routine in the Valhalla JIT to strip off the boxing (i.e., the VBC-like Float16 details) and just deal directly with the payload (i.e., the short value). And it is also routine to change perspective on live values in the IR, depending on how they are used (i.e, via the two Float intrinsics), so that the IR ends up storing FP16 temps in whatever register class the CPU likes best.

So, unless there is some reason I am not aware of, there does not need to be any intrinsic definition in Float16.java, nor even a comment that gestures at the possible existence of an intrinsic.]

Note the economy of the resulting design: Two intrinsics at the Java level provide the JIT with ample opportunity (if it so desires) to organize an IR with a full FP16 type and all its typed operations. There is no need to surface such complexity in the Java source code.

Oct 19 '24 22:10 rose00

Somebody might ask as a followup, "But what about calling sequences? Without intrinsics, how does the JIT know to store Float16 values in the correct type of argument register?" Presumably some platforms may wish to store arguments (and return values) in floating point registers, not in the integer registers used by a typical value class (such as Float16, containing a single short value).

That question might be motivated by the observation that, while FP16 stored in temps might well be stored in FPU registers, simply due to proximity to decode/encode intrinsics, changing the calling sequence requires that we give the Float16 class a little extra magic. (More detail: There are fundamentally 3 different places that FP16 values might need type-specific storage: IR temps, argument/return registers, and heap variables. The first can often be FPU regs. The last can be untyped memory words and a short is just fine. The middle are driven by the type of the value field, but can also take the field container Float16 into account.)

In any case, getting those details right does not require any @IntrinsicCandidate annotation, or any other source-level annotation. It just requires the VM to know about Float16 and treat it a little bit specially. Wholesale marking of methods is overkill.

Oct 19 '24 23:10 rose00

Comparing with https://github.com/openjdk/jdk/pull/21490 we can see that there are more than minimum number of intrinsics I recommended above, but (crucially) the intrinsics are decoupled from the box type, and refactored into Float16Math.java.

https://github.com/openjdk/jdk/pull/21490/files#diff-105a2bf4929174c594b5bcc54f749e93d9fca1b5371ca301fff02badd0e8da5a

For example, see the max intrinsics line 53. These intrinsics are better structured for the JIT, because there is no extraneous boxing to deal with. (Boxes involve null checks and heap allocation and other details which are just extra overhead.)

Apart from intrinsic classification, a simpler thing to do that may help the box methods to fold up correctly would be a @ForceInline annotation. But for the simple methods in this PR that is usually overkill.

Oct 20 '24 00:10 rose00

Comparing with #21490 we can see that there are more than minimum number of intrinsics I recommended above, but (crucially) the intrinsics are decoupled from the box type, and refactored into Float16Math.java.

https://github.com/openjdk/jdk/pull/21490/files#diff-105a2bf4929174c594b5bcc54f749e93d9fca1b5371ca301fff02badd0e8da5a

For example, see the max intrinsics line 53. These intrinsics are better structured for the JIT, because there is no extraneous boxing to deal with. (Boxes involve null checks and heap allocation and other details which are just extra overhead.)

Apart from intrinsic classification, a simpler thing to do that may help the box methods to fold up correctly would be a @ForceInline annotation. But for the simple methods in this PR that is usually overkill.

I'm agnostic toward how the intrinsification of Float16 is implemented. I've removed the commented-out intrinsics annotations in a subsequent push. As noted, for review purposes I wanted to have a least one iteration of Float16 in the incubator that was as close as possible to the version of Float16 that has been in use in the lworld+fp16 branch.

Oct 20 '24 18:10 jddarcy

Updated the PR with new code and tests for properly rounded String -> Float16 conversion only using public JDK/Java SE APIs. Whenever Float16 support is moved to the base module, this implementation can be abandoned in favor of augmenting the existing support for String -> float and String -> double conversion.

Oct 29 '24 04:10 jddarcy

I believe the only remaining functionality to add is porting over the minimal-length Float16 to String conversion from JDK-8341245 in the Valhalla repo.

Oct 29 '24 04:10 jddarcy

Updated the implementation with Float16 -> string conversion code from @rgiulietti and added various supporting tests.

Nov 12 '24 05:11 jddarcy

Looks good. Approving the current status. Are there plans for other small additions?

Subsequent to this comment, I did a light clean-up pass over the tests. I'll do a pass over the Float16 implementation too to make sure previous comments are all addressed.

Nov 12 '24 21:11 jddarcy

Looks good. Approving the current status. Are there plans for other small additions?

Subsequent to this comment, I did a light clean-up pass over the tests. I'll do a pass over the Float16 implementation too to make sure previous comments are all addressed.

PS Pushed another clean-up pass over the implementation and tests.

Nov 13 '24 05:11 jddarcy

/integrate

Nov 13 '24 18:11 jddarcy

Going to push as commit dbf23466aff902836838f06bcbbf3c9e7c5e9c6a. Since your change was applied there have been 40 commits pushed to the master branch:

a5f11b5f775be6c1d9593562ba65912261efdf52: 8343483: Remove unnecessary @SuppressWarnings annotations (serviceability)
7be77725eab6f45d8f8d23f2ba0d18d2d89a40aa: 8344112: Remove code to support security manager execution mode from DatagramChannel implementation
bd3fec3075829efc0afe7a99d7a684cf81cc5bbb: 8344086: Remove security manager dependency in FFM
916694f2c1e7fc8d6a88e7026bc2d29ba2923849: 8343317: Extend test generation tool to handle APX NDD/NF flavor of instructions
eb240a7df9a029bb762def86b805bdfdfa3e4625: 8344051: Problemlist jdk/jfr/event/runtime/TestNativeMemoryUsageEvents.java with ZGC until JDK-8343893 is fixed
c00e20c399cf9b3b21258bd5654a92d703c8fcd2: 8343285: java.lang.Process is unresponsive and CPU usage spikes to 100%
cc2acd14b13ada243fc13dc4d9007c4e2df56148: 8343286: Missing unchecked cast warning in polymorphic method call
b80ca4902af71938b32634d3fd230f4d65cde454: 8344124: JDK-8341411 Broke the build
a08d67c2a9d0bbc6f38c6280efd19b60303eb5e8: 8344080: Return type mismatch for jfr_unregister_stack_filter
4c5bc5f2f091ae861d5329cdae42fe7fa295544b: 8343923: GHA: Switch to Xcode 15 on MacOS AArch64 runners
... and 30 more: https://git.openjdk.org/jdk/compare/889f906235e99b7207f2e30e1f6f5771188f5a56...master

Your commit was automatically rebased without conflicts.

Nov 13 '24 18:11 openjdk[bot]

@jddarcy Pushed as commit dbf23466aff902836838f06bcbbf3c9e7c5e9c6a.

:bulb: You may see a message that your pull request was closed with unmerged commits. This can be safely ignored.

Nov 13 '24 18:11 openjdk[bot]

jdk jdk copied to clipboard

8341260: Add Float16 to jdk.incubator.vector

Progress

Issues

Contributors

Reviewing

Webrev

Webrevs

jdk
jdk copied to clipboard