For large tables in R, dplyr's function inner_join() is much faster than merge()
Using the merge() function in R on big tables can be time-consuming. Luckily, the join functions in the new package dplyr are much faster. The package offers four different joins: inner_join (similar to merge with all.x=F and all.y=F), left_join (similar to merge with all.x=T and all.y=F), semi_join (not really an equivalent in merge() unless […]
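The correspondence between merge() and the dplyr joins can be sketched on two toy tables. This is a minimal illustration, not a benchmark: the data frames x and y and the key column id are hypothetical names chosen for the example, and the tables are far too small to show any speed difference.

```r
library(dplyr)

# Two small hypothetical tables sharing the key column "id"
x <- data.frame(id = c(1, 2, 3), a = c("x1", "x2", "x3"), stringsAsFactors = FALSE)
y <- data.frame(id = c(2, 3, 4), b = c("y2", "y3", "y4"), stringsAsFactors = FALSE)

# Inner join: keep only the ids present in both tables
inner_base  <- merge(x, y, by = "id", all.x = FALSE, all.y = FALSE)
inner_dplyr <- inner_join(x, y, by = "id")

# Left join: keep every row of x; b is NA where y has no matching id
left_base  <- merge(x, y, by = "id", all.x = TRUE, all.y = FALSE)
left_dplyr <- left_join(x, y, by = "id")

# Semi join: rows of x that have a match in y, keeping only x's columns
semi_dplyr <- semi_join(x, y, by = "id")
```

To compare speed on your own data, each call can be wrapped in system.time(). Note that merge() sorts the result by the key by default, while the dplyr joins keep the row order of x, so results on unsorted tables may differ in row order.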