accel
Compile the entire crate for the nvptx64-nvidia-cuda target
A proposal to resolve #61
Motivation
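A #[kernel] function cannot use any variable, function, or other item defined elsewhere in the crate, because it is compiled as stand-alone device code. For example, the kernel below fails to compile because add_2 is not visible to it: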
fn add_2(a: &mut f32) {
    *a = *a + 2.0;
}
#[kernel]
pub fn add_2_all(a: *mut f32, n: usize) {
    let i = ::accel_core::index();
    unsafe { add_2(&mut *a.offset(i)) }; // error: add_2 cannot be found
}
Resolution
Compile the entire crate for both the x86_64 and nvptx64 targets.
- rust-ptx-linker will eliminate any code that is not called from a PTX kernel
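A minimal sketch of what this could allow, assuming the whole crate is built once per target. It is not accel's actual API: `add_2_all_at` and `run_on_host` are hypothetical names, and the thread index is passed as an explicit parameter instead of coming from `::accel_core::index()` so that the same body can be exercised on the host. The only real mechanism used is the standard `target_arch` cfg, which is `"nvptx64"` when compiling for nvptx64-nvidia-cuda.

```rust
/// Ordinary helper. If the whole crate is compiled for both targets,
/// this function exists in the host build and in the device build.
fn add_2(a: &mut f32) {
    *a += 2.0;
}

/// Kernel body (hypothetical name). `i` stands in for the value that
/// `::accel_core::index()` would return on the device.
pub fn add_2_all_at(a: *mut f32, n: usize, i: usize) {
    if i < n {
        // Safety: the caller must guarantee that `a` points to at least
        // `n` valid, writable f32 values.
        unsafe { add_2(&mut *a.add(i)) };
    }
}

/// Host-only emulation (hypothetical): runs the kernel body once per index.
/// The cfg attribute drops this function from the nvptx64 build.
#[cfg(not(target_arch = "nvptx64"))]
pub fn run_on_host(data: &mut [f32]) {
    let n = data.len();
    let ptr = data.as_mut_ptr();
    for i in 0..n {
        add_2_all_at(ptr, n, i);
    }
}
```

In this arrangement `add_2` needs no special annotation: it is reachable from the kernel in the device build, so the linker step described above would be expected to keep it there, while host-only items are either gated by `cfg` or removed as unreachable.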
Problems
- [ ] std must be compiled for the nvptx target
- [ ] Compile flow (how to trigger the nvptx build instead of the proc-macro?)