Faster Versions of R


As you know, R is widely used among statisticians and data scientists for statistical computing, data analysis and graphing. One major reason for its popularity is that it is free and open source; another is its huge collection of external packages, which cover almost any statistical problem. In spite of its popularity and numerous advantages, however, R users often complain about the vexing issue of memory. Anyone who works with large datasets on a typical laptop, with a 64-bit processor and 4 GB of RAM, faces long computation times and needs additional memory (e.g., 18 GB of RAM) for smooth processing.
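To get a feel for why memory runs out so quickly, here is a minimal sketch in base R that estimates how much RAM a numeric dataset would need before you try to load it. The helper name `estimate_gb` and the dataset dimensions are made up for illustration; the only fact it relies on is that R stores each double-precision value in 8 bytes.

```r
# Estimate the in-memory size of a numeric (double) matrix before creating it.
# Each double takes 8 bytes; the small fixed header of the R object is ignored.
# estimate_gb is a hypothetical helper, not part of any package.
estimate_gb <- function(n_rows, n_cols, bytes_per_value = 8) {
  n_rows * n_cols * bytes_per_value / 1024^3
}

# Hypothetical dataset: 50 million rows by 40 numeric columns.
needed_gb <- estimate_gb(5e7, 40)
print(round(needed_gb, 1))

# Compare the estimate with what R actually reports for a small matrix.
m <- matrix(0, nrow = 1000, ncol = 100)
print(object.size(m), units = "MB")
```

A dataset of that hypothetical size works out to roughly 15 GB, far beyond a 4 GB laptop, and that is before R makes any temporary copies during computation.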

One way of making R more memory efficient is to use R packages such as ff, filehash, R.huge, or bigmemory, which are designed to store objects on the hard drive rather than in RAM. However, using these packages can be tedious because of the learning effort involved. So you may be wondering whether there is any solution to R's memory and computation-time problems that involves only running the same R scripts, but in less time. The answer comes in two options: pqR and Revolution R Open.

pqR

pqR, a "pretty quick" version of R, is a new version of the R interpreter based on R-2.15.0. It improves on Base R in many ways, mostly by speeding up the run times of scripts. One of the core features of pqR is the ability to execute numeric computations in parallel with each other on systems with multiple processors or processor cores. Currently it is available only for Unix/Linux/Mac systems and is not supported on Windows. According to comparison studies, pqR is generally faster than R, with about a 30% improvement in performance.

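[Figure: relative run times of nine simple test programs under pqR and under R releases 2.11.1 to 3.0.1]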

The figure above shows the relative run times (on an Intel X5680 processor) of nine simple test programs using pqR and using all releases of R by the R Core Team from 2.11.1 to 3.0.1 [1]. These programs mostly operate on small objects, doing simple operations, so this is a test of general interpretive overhead. As you can see, the speed of interpreted programs has changed little across the Base R releases when compared with the pqR version.
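If you want to see interpretive overhead on your own machine, the sketch below times a scalar loop against the equivalent vectorized call. It is not one of the nine test programs from the comparison; the function name `sum_of_squares_loop` and the problem size are chosen only for illustration.

```r
# A scalar loop: most of the time goes to interpreter overhead
# (dispatching each operation), not to the arithmetic itself.
# sum_of_squares_loop is a hypothetical example, not a pqR benchmark.
sum_of_squares_loop <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    x <- as.numeric(i)      # avoid integer overflow in x * x
    total <- total + x * x
  }
  total
}

n <- 1e6

# Time the interpreted loop and the vectorized equivalent.
loop_time <- system.time(result_loop <- sum_of_squares_loop(n))
vec_time  <- system.time(result_vec  <- sum(as.numeric(seq_len(n))^2))

print(loop_time["elapsed"])
print(vec_time["elapsed"])
```

Running the same script under different interpreters and comparing the elapsed time of the loop gives a rough measure of the overhead that a faster interpreter is meant to reduce. The actual numbers depend on your hardware and R version.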

Download pqR at:

http://www.pqr-project.org/

Revolution R Open (RRO)

Revolution R Open (RRO) by Revolution Analytics is a 100% open-source, drop-in replacement for the Base distribution of R, with several significant performance enhancements. You can use all existing R packages and, at the same time, run R code faster. Currently, RRO 8.0.1 includes R 3.1.2 and supports Windows, Mac OS X, and Linux platforms. One of the RRO enhancements is the inclusion of high-performance linear algebra libraries, specifically the Intel MKL. This library significantly speeds up many statistical calculations, e.g., the matrix algebra that forms the basis of many statistical algorithms.

[Figure: results of 5 tests on matrix operations, with and without the Intel MKL]

The figure above shows the results of 5 tests on matrix operations, run on a Samsung laptop with an Intel i7 4-core CPU [2]. From the graphic you can see that matrix multiplication runs 27 times faster with the MKL than without, and linear discriminant analysis is 3.6 times faster.
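A simple way to check how much the linear algebra library matters on your own setup is to time a matrix multiplication in each R distribution. The sketch below uses only base R; the matrix size of 500 is an arbitrary assumption and is not the size used in the benchmark cited above.

```r
# Time dense matrix operations that are handed off to the BLAS library.
# The matrix size (n = 500) is an assumption chosen to run quickly.
set.seed(1)
n <- 500
a <- matrix(rnorm(n * n), nrow = n)
b <- matrix(rnorm(n * n), nrow = n)

# Matrix multiplication.
mm_time <- system.time(ab <- a %*% b)

# Cross-product t(a) %*% a, another BLAS-heavy operation.
cp_time <- system.time(ata <- crossprod(a))

print(mm_time["elapsed"])
print(cp_time["elapsed"])
```

Running the identical script under Base R and under RRO should show the difference that a tuned library makes. Expect the ratio to vary with matrix size and the number of cores, so it may not match the figures quoted above.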

Download RRO at:

http://mran.revolutionanalytics.com/open/

Sources:

  1. https://radfordneal.wordpress.com/2013/06/22/announcing-pqr-a-faster-version-of-r/
  2. http://blog.revolutionanalytics.com/2014/10/revolution-r-open-mkl.html
