dinhv
Thank you for your quick response. After reading it, my current idea looks like this by adapting the code in https://github.com/moskewcz/boda/blob/master/src/rtc_fwd.cc#L479 : ``` for( map_str_p_conv_op_t::iterator i = cp->convs->begin(); i !=...
To keep the integrated tuning with rtc_prof easy, is it advisable to call ops_prof with only the following parameters: ` ops-prof --kg-tune-tag=ocl-def --func-mrd-toler="(cudnn_conv=4e-4)" --ops-fn="%(boda_test_dir)/sgemm-ops-debug.txt" --gen-data="(str_vals=(type=gen_data),nda_vals=(vi=(tn=float,v=0.0),mode=(tn=uint32_t,v=5)))" --op-tunes="(ocl-def=(use_be=ocl,),ocl-4-16-4-lm0=(use_be=ocl,MNt=4:4,MNb=16:16,Kb=4,use_local_mem=0))" `?
I would like to call the `ops_prof_t::main` function from the `conv_pipe_fwd_t::init` function to profile a convolution function. I used to call `ops-prof` with the following parameters for profiling: ```...
That's what I was asking for. Thank you!
Yes, I did, because I could answer it myself. I was simply looking at the wrong line of code, but now I have another question about that same line....
> so, in short, the error you're seeing is due to trying to call add_cnn_codegen_annotations() on an already-annotated operation (since the parent conv was annotated in init() as per my...
So this is the code snippet: ``` for( map_str_p_conv_op_t::iterator i = cp->convs->begin(); i != cp->convs->end(); ++i ) { p_conv_op_t const & oi = must_find( *op_infos, i->first ); //integration of profiling...
I found the bug. `auto_tuner.auto_tuning(nia, op_copy);` calls `add_cnn_codegen_annotations` twice on op_copy, once for the known-good parameters and once for the tuning parameters I've set. Solved it by making a second...
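To illustrate the kind of bug described above, here is a minimal sketch, not Boda's actual code. `op_t`, `annotate_once` and `annotate_variants` are hypothetical stand-ins: `annotate_once` plays the role of an annotation pass that rejects an already-annotated operation, and `annotate_variants` gives each parameter set its own copy so that no object is annotated twice.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an operation descriptor (not Boda's conv_op_t).
struct op_t {
  std::map<std::string, std::string> str_vals;
  bool annotated = false;
};

// Hypothetical annotation pass: it must run at most once per operation and
// throws if the operation has already been annotated.
void annotate_once(op_t & op, std::string const & tune_name) {
  if (op.annotated) { throw std::runtime_error("operation already annotated"); }
  op.str_vals["tune"] = tune_name;
  op.annotated = true;
}

// Annotate one fresh copy per parameter set. The base operation is left
// untouched, so it can be copied again for further tuning variants.
std::pair<op_t, op_t> annotate_variants(op_t const & base,
                                        std::string const & known_good,
                                        std::string const & tuned) {
  op_t kg_copy = base;    // copy for the known-good parameters
  op_t tuned_copy = base; // second, independent copy for the tuned parameters
  annotate_once(kg_copy, known_good);
  annotate_once(tuned_copy, tuned);
  return {kg_copy, tuned_copy};
}
```

Copying is the simplest way to keep the two annotation runs independent; the cost is one extra descriptor per tuning variant.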
Thank you for the answers. So `filts_smem_sz = filts.dstride("y")`. Using the same input for Boda as I described above, `filts_smem_sz = filts.dstride("y") = rcg.op.nda_vals["filts"].stride = 1056`, but where is this...
Ok, so currently I'm trying to derive a simple constraint for the usage of shared memory in tiled convolution. Do you have any hints on which parameters I have to consider...