Sunday, June 30, 2013

Downloading and Merging NHANES datasets in R

The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews and physical examinations. The data files for more recent surveys are provided in SAS Export (XPT) format. To read these files into R, use the read.xport() function from the foreign package. If you don’t have this package, install it first. The first step is to download each file; the second is to import it into R.
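
A minimal sketch of the install check, assuming a CRAN mirror is reachable if the package turns out to be missing (foreign normally ships with R, so the install line often never runs):

```r
# Install 'foreign' only if it is not already available in this R library
if (!requireNamespace("foreign", quietly = TRUE)) {
  install.packages("foreign")
}
```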

# Load the foreign package (reads data files from other statistical software into R)
library(foreign)

# Set your working directory
setwd("<YOUR WORKING DIRECTORY>")

### Download the demographics file of the NHANES 2005-2006 dataset
### (the URL must be a quoted string; mode = "wb" keeps the binary file intact on Windows)
download.file(
  "ftp://ftp.cdc.gov/pub/Health_Statistics/nchs/nhanes/2005-2006/DEMO_D.XPT",
  "Demo0506.xpt", mode = "wb")

### Read the downloaded file
Demo56 <- read.xport("Demo0506.xpt")

### Download the blood pressure file of the NHANES 2005-2006 dataset
download.file(
  "ftp://ftp.cdc.gov/pub/Health_Statistics/nchs/nhanes/2005-2006/BPX_D.XPT",
  "BP0506.xpt", mode = "wb")

### Read the downloaded file
BP56 <- read.xport("BP0506.xpt")

### Merge the two files on SEQN (the respondent sequence number), keeping all rows
N_05_06 <- merge(Demo56, BP56, by = "SEQN", all = TRUE)

You can download several files and then merge them together to get your dataset.
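
As a rough sketch of merging more than two files at once, the example below uses Reduce() to apply merge() across a list of data frames. The three small data frames are made-up stand-ins for imported NHANES files, and merge_by_seqn is a hypothetical helper name; only the SEQN key is assumed to be shared between files.

```r
# Toy stand-ins for imported NHANES files; the values are invented for illustration
demo <- data.frame(SEQN = c(1, 2, 3), RIAGENDR = c(1, 2, 1))
bp   <- data.frame(SEQN = c(1, 2),    BPXSY1   = c(120, 118))
bmx  <- data.frame(SEQN = c(2, 3),    BMXBMI   = c(24.5, 31.2))

# Hypothetical helper: full outer join of two data frames on the respondent ID
merge_by_seqn <- function(x, y) merge(x, y, by = "SEQN", all = TRUE)

# Fold the helper over the list, merging one data frame at a time
combined <- Reduce(merge_by_seqn, list(demo, bp, bmx))

print(combined)
```

Respondents missing from a given file get NA in that file's columns, so it is worth checking the row count of the result against the demographics file.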

Saturday, June 29, 2013

Updating R – in Windows 7

R is great statistical software with tremendous flexibility. However, there is no straightforward (point-and-click) way of updating it. R users have developed several different methods of updating R along with its packages, including one described on CRAN.

I came across a post about updating R on a Mac, tried it on Windows 7 with minor changes, and it worked fine.

So here is what I did:
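First, in the older version of R, I ran the following commands to save the names of the packages I had installed myself (those with no "Priority", i.e. neither base nor recommended):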
tmp <- installed.packages()
installedpkgs <- as.vector(tmp[is.na(tmp[,"Priority"]), 1])
save(installedpkgs, file="installed_old.rda")

Then I downloaded and installed the newer version of R. In the newer version, I ran the following commands:
source("http://bioconductor.org/biocLite.R")
biocLite()
load("installed_old.rda")
tmp <- installed.packages()
installedpkgs.new <- as.vector(tmp[is.na(tmp[,"Priority"]), 1])
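# Packages present in the old installation but not yet in the new one
missing_pkgs <- setdiff(installedpkgs, installedpkgs.new)
# Looping over the names directly is safe even when nothing is missing
for (pkg in missing_pkgs) biocLite(pkg)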

All the missing packages were installed automatically in the newer version. Then I went to the Windows Control Panel and uninstalled the older version of R.
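
The comparison step can also be sketched with base R alone. This is an illustration rather than the method from the original post: it assumes the missing packages are all on CRAN, so install.packages() stands in for biocLite(), and the helper names user_packages and packages_to_restore are hypothetical.

```r
# Hypothetical helper: names of user-installed packages (no "Priority" value)
user_packages <- function(pkg_matrix = installed.packages()) {
  as.vector(pkg_matrix[is.na(pkg_matrix[, "Priority"]), "Package"])
}

# Hypothetical helper: packages in the old list that the current library lacks
packages_to_restore <- function(old_pkgs, current_pkgs = user_packages()) {
  setdiff(old_pkgs, current_pkgs)
}

# Example with made-up package lists
old_pkgs <- c("ggplot2", "plyr", "survey")
new_pkgs <- c("plyr")
to_install <- packages_to_restore(old_pkgs, new_pkgs)
print(to_install)

# In the new R version one could then run, for example:
# if (length(to_install) > 0) install.packages(to_install)
```

Passing the whole vector to install.packages() in one call avoids the explicit loop.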

That’s it!