Frédéric Bastien

Results 28 issues of Frédéric Bastien

This asks less work of the other optimization passes. This is a safe subset of another PR. In case of a revert, fewer changes will be reverted. The main changes are: -...

comp:xla
size:M

I'm opening this issue here in DLPack, as I cannot think of a better place for it. This issue spans many software projects, and as this is the goal...

https://github.com/mila-udem/blocks/blob/master/blocks/serialization.py#L226 This dumps to a temporary file and then moves it to the destination. This is safer in case of a crash during the dump. But it causes...
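The dump-to-a-temporary-file-then-move pattern described above can be sketched as follows. This is a minimal illustration, not the Blocks implementation: the function name `atomic_pickle_dump` is hypothetical, and creating the temporary file in the destination directory is an assumption made here so that the final `os.replace` stays on one filesystem.

```python
import os
import pickle
import tempfile


def atomic_pickle_dump(obj, destination):
    """Pickle `obj` to a temp file, then move it over `destination`.

    Hypothetical helper for illustration only. If the process crashes
    during the dump, any existing file at `destination` is left untouched.
    """
    dest_dir = os.path.dirname(os.path.abspath(destination))
    # Assumption: the temp file lives next to the destination so the
    # final rename does not cross filesystems.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp_file:
            pickle.dump(obj, tmp_file)
            tmp_file.flush()
            os.fsync(tmp_file.fileno())  # push the data to disk before the move
        os.replace(tmp_path, destination)  # overwrites the destination in one step
    except BaseException:
        # Clean up the partial temp file if the dump or the move failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

A call such as `atomic_pickle_dump(model_state, "checkpoint.pkl")` would then either leave the old checkpoint in place or replace it with a complete new one.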

CCW

This allows not dumping the fusion internals. It works around (WAR) the issue that HTML files that are "too big" aren't rendered by the browser. We can manually look at the text HLO...

awaiting review
comp:xla
size:S

If I use:
```
jax.pmap(f_pmap, in_axes=0, out_axes=0, axis_name='x', donate_argnums=0)
```
it works. But when this code is ported to jax.array + shardmap to support multi-node, it runs, but warns that...

enhancement
NVIDIA GPU
GPU

This in-progress PR modifies the docs/Custom_Operation_for_GPUs.py tutorial to use custom_partitioning instead of xmap. Don't review now; there is still much work to be done. - [x] finish the forward code. -...

If we add 3 tensors and 2 of them are C-contiguous but the last is not, it seems wasteful to compute the index twice, once for each C-contiguous array.
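The idea can be sketched in plain Python. This is only an illustration of the index sharing, not the project's generated kernel code: the helper names (`c_contiguous_strides`, `offset`, `add3_shared_index`) are hypothetical, and arrays are modelled as flat lists with strides counted in elements.

```python
from itertools import product


def c_contiguous_strides(shape):
    """Element strides of a C-contiguous (row-major) array of this shape."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def offset(index, strides):
    """Flat buffer offset of a multi-dimensional index."""
    return sum(i * s for i, s in zip(index, strides))


def add3_shared_index(a, b, c, shape, c_strides):
    """Add three arrays stored as flat lists.

    `a` and `b` are C-contiguous with the same shape, so one offset
    serves both (and the output); only `c` needs its own offset.
    """
    ab_strides = c_contiguous_strides(shape)
    out = [0] * len(a)
    for index in product(*(range(d) for d in shape)):
        ab_off = offset(index, ab_strides)  # computed once, reused for a, b, out
        c_off = offset(index, c_strides)    # separate offset for the odd layout
        out[ab_off] = a[ab_off] + b[ab_off] + c[c_off]
    return out
```

Here the shared offset replaces two identical computations per element; with more C-contiguous inputs of the same shape, the saving would grow accordingly.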

Optimization

We could use PyOpenCL: http://documen.tician.de/pyopencl/array.html#complex-numbers In addition, I've added a rudimentary facility for translating Fortran kernels to OpenCL; see here: https://github.com/inducer/pyopencl/tree/master/contrib/fortran-to-opencl