renjin
Speed issue on merging data frames
Merging two data frames (121K and 31K rows) takes a few minutes to complete in Renjin. The same merge takes a few seconds in RStudio.
SalesOrderDetail: 121K records
SalesOrderHeader: 31K records
SalesOrderDetail <- read.csv('PATH', sep='\t', header = TRUE)
SalesOrderHeader <- read.csv('PATH', sep='\t', header = TRUE)
merged <- merge(x = SalesOrderHeader, y = SalesOrderDetail, by.x = 'SalesOrderID', by.y = 'SalesOrderID')
The data is from the Microsoft Adventure Works database; the CSV files can be downloaded below.
Renjin currently uses a version of merge() written in pure R to replace the (internal) C implementation from GNU R. See #10. This is probably why performance is not optimal.
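In the meantime, a possible workaround is to do the join by hand with match(). This is only a sketch: it assumes SalesOrderID is unique in SalesOrderHeader, it uses small synthetic data frames in place of the real CSV files (the CustomerID and OrderQty columns are just illustrative), and it has not been benchmarked in Renjin, so it may or may not be faster there.

```r
# Synthetic stand-ins for the two tables; only SalesOrderID comes from the issue.
header <- data.frame(SalesOrderID = 1:5,
                     CustomerID = c(10L, 11L, 12L, 13L, 14L))
detail <- data.frame(SalesOrderID = c(1L, 1L, 2L, 4L, 4L, 4L),
                     OrderQty = c(2L, 1L, 5L, 3L, 1L, 7L))

# For each detail row, find the row position of its order in the header table.
# Assumption: each SalesOrderID appears at most once in header.
idx <- match(detail$SalesOrderID, header$SalesOrderID)

# Keep only detail rows that have a matching header row (inner join).
keep <- !is.na(idx)

# Put the matching header columns next to the detail columns,
# dropping the duplicate key column from the detail side.
joined <- cbind(
  header[idx[keep], , drop = FALSE],
  detail[keep, setdiff(names(detail), "SalesOrderID"), drop = FALSE]
)
rownames(joined) <- NULL
```

Unlike merge(), this keeps the rows in the order of the detail table instead of sorting by key, and it does not handle duplicate keys on the header side.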