marlin
performance
Could this use one big kernel rather than many small kernels? Maybe a single kernel would be faster?