Maxim Yurkin

Results: 172 comments by Maxim Yurkin

Yes, you hit issue #226. So the scattered fields are actually computed not on the GPU, but on a single core, which is much slower than in MPI...

Do you have any more questions concerning this issue? If not, I will close it.

/cc @stefaniagl

Proof-of-principle is available at https://github.com/stefaniagl/adda

After the pull request (#304) is merged, we still need the following to finalize this issue (to be submitted as a new pull request): - new tests in `tests/2exec`...

Concerning the spherical-harmonics expansion of the internal field: Csca is then a sum of squares of the multipole coefficients. The latter is well known, but interestingly it also follows from Csca...

Is it possible to split the Dmatrix into several parts? It has six independent components anyway, doesn't it? I even think that long ago those six components were split into...
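To illustrate the idea of splitting, here is a minimal sketch in plain Python. It is not taken from the ADDA sources. It assumes only that each Dmatrix element is a symmetric 3x3 tensor, which is where the six independent components come from. The storage order in `COMPONENTS` and the helper names `split_symmetric` and `apply_split` are hypothetical, chosen for illustration. Each of the six lists could then live in its own, smaller buffer.

```python
# Assumed (hypothetical) storage order for the six independent components
# of a symmetric 3x3 tensor: xx, xy, xz, yy, yz, zz.
COMPONENTS = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]


def split_symmetric(tensors):
    """Split a list of symmetric 3x3 tensors into six separate component lists."""
    return [[t[i][j] for t in tensors] for i, j in COMPONENTS]


def apply_split(parts, vectors):
    """Compute y[n] = D[n] x[n] using only the six component lists."""
    result = [[0j, 0j, 0j] for _ in vectors]
    for part, (i, j) in zip(parts, COMPONENTS):
        for n, x in enumerate(vectors):
            result[n][i] += part[n] * x[j]
            if i != j:  # the mirrored element D[j][i] equals D[i][j]
                result[n][j] += part[n] * x[i]
    return result


# Small usage example with one complex symmetric tensor and one vector.
d = [[[1, 2j, 3], [2j, 4, 5], [3, 5, 6]]]
x = [[1, 1, 1]]
parts = split_symmetric(d)
print(len(parts), apply_split(parts, x))
```

The product only ever touches one component list at a time, so in this sketch no single allocation needs to hold the whole matrix.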

Concerning the maximum allowed number of buffers as kernel arguments, is there some number which is always available as mandated by the particular OpenCL standard? If yes, it is best...

I agree with @mapclyps that the problem is still present, but probably not so severe. I have just run a few simulations with the current version of the code compiled...

To summarize the previous data, Nvidia definitely allows larger single objects than it declares. The total memory may be the real limiting factor, e.g. at the level of 3/4 times the...