Sunday, December 28, 2014

Installing Packages in R

R has thousands of packages that extend its use to almost every area of research. You will probably never need most of them, but you will likely need several (perhaps 5 to 20), depending on what you plan to do. Most of these packages are available on the CRAN website:

http://www.cran.r-project.org/web/packages/available_packages_by_name.html

Bioconductor is another source of R packages, aimed at bioinformatics-related research. Its packages can be downloaded from its website:

http://www.bioconductor.org/packages/release/bioc/

To use a package, there are two steps:

First Step: Download and Install a Package – you can download a package to a local directory and install it from there using the drop-down menu, OR you can install it directly from the CRAN repository. The exact steps may differ slightly depending on the R GUI that you use. Often, you first have to specify which mirror you want to use; choose a mirror that is geographically close to you for faster downloads. Then you can choose a package from the package list.

I use the RStudio GUI. In RStudio, click on the tab labeled ‘Packages’. If this tab is not visible, press Ctrl+7 and it will become visible (usually in the lower right quadrant of the window). From there, choose Install, then type in the name of the package (for multiple packages, enter the names separated by a space or a comma). Make sure that the ‘Install dependencies’ box is checked and that the correct repository and installation location are selected. Then click Install. You can also run the installation command directly from the console; the command below will install the ggplot2 package:

install.packages("ggplot2")

Note that you need to install a package only once.
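
Because installation is needed only once, a script can check whether a package is already present before trying to install it. The following is a minimal sketch, not a standard recipe: install_if_missing is a hypothetical helper name, and the repository URL is just one possible CRAN mirror chosen for illustration.

```r
# Hypothetical helper (not part of base R): install only the packages
# that cannot already be found in the library.
install_if_missing <- function(pkgs, repos = "https://cloud.r-project.org") {
  # requireNamespace() returns TRUE when the package is installed
  present <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  to_install <- pkgs[!present]
  if (length(to_install) > 0) {
    # dependencies = TRUE mirrors the 'Install dependencies' checkbox
    install.packages(to_install, repos = repos, dependencies = TRUE)
  }
  invisible(to_install)
}

# 'stats' and 'utils' ship with R, so this call installs nothing
not_found <- install_if_missing(c("stats", "utils"))
```

Calling it with c("ggplot2") instead would install ggplot2 on the first run and do nothing on later runs.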

Second Step: Loading a Package – installing a package makes it available for later use, but packages are not automatically loaded during a session. Once you have installed a package, you will need to load it whenever you need it in a session. To load a package, use the function ‘library()’. The following command will load the package ggplot2:

library(ggplot2)

Some may like to use the ‘require()’ function instead of ‘library()’. However, see this post for the differences between the two and why one should prefer ‘library()’ over ‘require()’.
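
The main practical difference can be seen by asking both functions for a package that is not installed. This is a small sketch; the package name notARealPackage123 is made up and assumed not to exist on your system.

```r
# Made-up package name, assumed NOT to be installed
pkg <- "notARealPackage123"

# library() stops with an error when the package cannot be found
res_library <- tryCatch(
  {
    library(pkg, character.only = TRUE)
    "loaded"
  },
  error = function(e) "error"
)

# require() only issues a warning and returns FALSE, so a script
# would carry on and fail later at the first missing function
res_require <- suppressWarnings(
  require(pkg, character.only = TRUE, quietly = TRUE)
)

print(res_library)
print(res_require)
```

Failing early with ‘library()’ usually makes the cause of the problem easier to spot.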

Personally, I try to load all needed packages at the beginning of a script. However, this strategy can cause trouble if two packages contain functions with the same name: the package loaded later hides the earlier one's function, and you will see a message such as “The following objects were masked from ‘package:xyz’:”.
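
When names clash, the :: operator lets you say which package's function you mean. The sketch below imitates masking without installing anything: it defines a throwaway median() in the workspace, which hides stats::median() in the same way a later-loaded package would.

```r
# A throwaway function that masks stats::median() for this illustration
median <- function(x) "masked version"

# A plain call now finds the masking function first
masked_result <- median(c(1, 2, 10))

# package::function always reaches the version from the named package
qualified_result <- stats::median(c(1, 2, 10))

# Remove the throwaway function so median() behaves normally again
rm(median)
```

The same pattern, for example packageA::fun() versus packageB::fun(), works for any two loaded packages that share a function name.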

Some other useful commands to know:

.libPaths() # shows the location(s) of the library where packages are installed
library() # shows all installed packages
search() # shows currently loaded (attached) packages
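
These commands can also be used inside a script to check on a particular package. A short sketch, using packages that ship with R so that it runs on any installation:

```r
# Is the package installed? installed.packages() returns a matrix
# whose row names are the package names
is_installed <- "stats" %in% rownames(installed.packages())

# Which version is installed?
stats_version <- packageVersion("stats")

# Is the package attached in this session? Entries on the search
# path are named "package:<name>"
is_attached <- "package:base" %in% search()

print(is_installed)
print(stats_version)
print(is_attached)
```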
