fastLink
Improving gamma.R funs
This PR imports collapse::qF for the quick creation of factors, then refactors the gamma funs to use those factors and references, which decreases the number of parallel calls and the memory pressure. This improves performance quite a bit.
My benchmark used first names from two datasets (one voter file and another list). I tested with exponentially larger sets, and the improvement in speed and memory usage was especially noticeable on the expensive gammaCKpar.R files.
One additional note: matrices are often used where vectors would do. In my most recent main branch the cpp file doesn't take matrices as input. I don't know if multidimensional vars were considered at one point, but there are a bunch of calls and coercions to form these matrices that could be removed.
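For readers following along, the factor idea described above can be sketched roughly as follows. This is not the code from the PR: the data, the `to_factor` helper, and the exact-match comparison are all made up for illustration, and the base `factor()` fallback is only there so the snippet runs without collapse installed. The point is just that each field is encoded once, distinct values are compared once, and rows are recovered through the integer codes.

```r
# Hypothetical toy data standing in for the first-name fields of two datasets.
names_a <- c("anna", "bob", "anna", "carl", "bob")
names_b <- c("bob", "dana", "anna", "bob")

# Hypothetical helper: use collapse::qF when available, base factor() otherwise.
to_factor <- function(x) {
  if (requireNamespace("collapse", quietly = TRUE)) {
    collapse::qF(x)
  } else {
    factor(x)
  }
}

fa <- to_factor(names_a)
fb <- to_factor(names_b)

# Compare the distinct values only once (exact matching here, as an assumption;
# a string-distance comparison over the levels would slot in at this step).
shared <- intersect(levels(fa), levels(fb))

# Map each shared value back to row indices through the integer codes.
matches <- lapply(shared, function(v) {
  list(
    value  = v,
    rows_a = which(as.integer(fa) == match(v, levels(fa))),
    rows_b = which(as.integer(fb) == match(v, levels(fb)))
  )
})
```

With many repeated names, the comparison step scales with the number of distinct values rather than the number of rows, which is presumably where most of the savings come from.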
Thanks so much for sharing this with us @jw2249a! This is fantastic! I am checking the new functions as we speak. I will report back soon.
Re matrices vs vectors: your intuition is correct. We left the door open for linkage fields that could be compared in more complex ways.