Jeff Hammond

414 comments by Jeff Hammond

@loveshack Your suggestion of OpenBLAS here is total garbage. Unless there has been a rewrite for SKX, it's nowhere near as fast on SKX. See [SkylakeX](https://github.com/flame/blis/blob/master/docs/Performance.md#skylakex) on the Performance Wiki...

@mratsim You are right that people buy the 2 FMA parts when they are building HPC systems, but there are a lot of academics and software developers at small firms...

> > @loveshack Your suggestion of OpenBLAS here is total garbage. Unless there has been a rewrite for SKX, it's nowhere near as fast on SKX. ...

Jim Cownie wrote about this topic here: https://cpufun.substack.com/p/to-sched_yield-or-not-to-sched_yield.

> @stepannassyr Is the `time.h` header (`clock_gettime()` and friends) and corresponding `-lrt` link option available in your environment?
> I’m not sure why this is in BLIS...

Some useful information from https://github.com/mpi-forum/mpi-issues/issues/65:

* [Half-precision floating-point format](https://en.wikipedia.org/wiki/Half-precision_floating-point_format) on Wikipedia.
* [ISO/IEC JTC 1/SC 22/WG14 N1945](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1945.pdf) (ISO C proposal)
* [ISO/IEC JTC1 SC22 WG14 N2017](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2016.pdf) (ISO C++ proposal)
* ...

I recommend that BLIS not support float16 but rather [bfloat16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format). The latest research in machine learning suggests that float16 is inferior to bfloat16 for training because of the software and...
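As background on the format itself: bfloat16 keeps the sign bit and the full 8-bit exponent of IEEE float32 and shortens the significand to 7 explicit bits, so a bfloat16 value is simply the upper 16 bits of the corresponding float32. The sketch below illustrates that relationship. The function names are hypothetical and not part of BLIS, and the conversion truncates (rounds toward zero) for simplicity; production code typically rounds to nearest-even and treats NaN specially.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: float32 -> bfloat16 by truncation.
 * Keeps the sign, the 8 exponent bits, and the top 7 significand bits. */
uint16_t float_to_bf16_trunc(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);   /* well-defined type punning */
    return (uint16_t)(bits >> 16);
}

/* Hypothetical helper: bfloat16 -> float32.
 * Always exact, because the discarded low 16 bits are refilled with zeros. */
float bf16_to_float(uint16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

Because the exponent field is unchanged, a value such as 2^32 survives the round trip exactly, whereas it lies far beyond the largest finite float16 value (65504). The price is precision: only about two to three decimal digits are retained.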

int8 and int16 are usually employed for inference, although I’m aware of some efforts to use them in training. I'm not sure that is worth the software pain, though.

https://arxiv.org/pdf/1904.06376.pdf ("Leveraging the bfloat16 Artificial Intelligence Datatype For Higher-Precision Computations") is relevant reading for anyone following this thread.

@jacobgorm I have spoken to @dnparikh and @fgvanzee about this on a number of occasions and I am confident that this is a priority for them.