Erik Faulhaber

240 comments by Erik Faulhaber

Cool! But then I would have to do something like
````
# ```@cast
# my_code()
# ```
my_code() #!md
````
so that `my_code()` is 1. run when I include the...

> We should investigate whether our current atomics are functional when used on unified memory that's being used from different devices (they probably aren't). Here's a MWE to show that...

@vchuravy wrote a workaround for an atomic add:
```julia
function atomic_system_add(ptr::CUDA.LLVMPtr{Int64, CUDA.AS.Global}, val::Int64)
    CUDA.LLVM.Interop.@asmcall(
        "atom.sys.global.add.u64 \$0, [\$1], \$2;",
        "=l,l,l,~{memory}",
        true, Int64, Tuple{CUDA.LLVMPtr{Int64, CUDA.AS.Global}, Int64},
        ptr, val
    )
end
```
Or,...

Also see https://github.com/trixi-framework/TrixiParticles.jl/pull/722 for the TrixiParticles version of this.

> This has the same effect as setting thread = True() for time integration schemes that support this option

It appears it is actually not equivalent, for reasons that I...

Good idea, but I don't like the fact that a different data type would silently change the underlying NHS implementation, which could be confusing, especially for performance comparisons. @LasNikas what...

Also note that `FullGridCellList` is slightly faster on CPUs as well, and the fully parallel update is significantly faster on a lot of threads.

No, I think this will just cause a lot of conflicts every time we merge main into dev. Since this is not urgent, we can just wait until we get...

I see. I think they're using a different kind of shifting, and I don't think they're using TIC in DualSPHysics, but we'll have to check the paper and probably some...