Where Business Meets the Blogroll

RStudio and Amazon Web Services EC2

There’s a problem dogging statistical researchers all over the Internet…

How can we use RStudio to run R on Amazon Web Services EC2?

After heading down several blind alleys, I can happily report an answer.

1) I am going to start with the assumption that you have used Amazon Web Services EC2 already. Read Running R on AWS if you need an introduction to the basic steps. (Ajay Ohri’s post mentions several AMIs, but none of them work for our needs.)

2) To run RStudio Server, you need an Amazon Machine Image with fairly recent versions of both the server OS and R. This is where I can save you a lot of time: as of 12 March 2011, I have found only one image that is new enough. It goes by the very catchy name of: akoya-ubuntu-10.04-amd64-server-20101114 (ami-e42cdb8d).

One nice thing about this 64-bit image is that you can run it with the dirt cheap “t1.micro” option while you are testing your configuration. Then, once it all works, you can pick a more powerful (and more expensive) configuration.

3) After you start your instance, make sure its security group allows inbound TCP traffic on port 8787, the port RStudio Server listens on.
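If you would rather not click through the console, the same rule can probably be added from the command line. This is only a sketch, assuming the current AWS CLI (the aws command, which is newer than this post) is installed and configured, and that your security group can be addressed by name. The group name rstudio-sg and the address range below are placeholders I made up, not values from my setup.

```shell
# Sketch: open RStudio Server's port 8787 in an EC2 security group.
# Arguments (both hypothetical examples, substitute your own):
#   $1  security group name, e.g. rstudio-sg
#   $2  source address range in CIDR form, e.g. 203.0.113.0/24
open_rstudio_port() {
    aws ec2 authorize-security-group-ingress \
        --group-name "$1" \
        --protocol tcp \
        --port 8787 \
        --cidr "$2"
}

# Example call (uncomment after filling in real values):
# open_rstudio_port rstudio-sg 203.0.113.0/24
```

Restricting the rule to your own address range, rather than 0.0.0.0/0, keeps the RStudio login page from being exposed to the whole Internet. For a group in a non-default VPC you would likely need --group-id instead of --group-name.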

4) Connect to your instance via SSH.

ssh -i mykeys.pem ubuntu@ec2-50-16-35-73.compute-1.amazonaws.com

Use ubuntu (instead of root) as the login name because this is an Ubuntu OS instance. Replace the ec2-50-16-35-73.compute-1.amazonaws.com portion with the public DNS name of your own EC2 instance.

5) After logging in to your Ubuntu instance, there are two additional steps: installing RStudio Server and creating a new user. To install RStudio Server, execute these two commands:

wget https://s3.amazonaws.com/rstudio-server/rstudio-server-0.92.44-amd64.deb
sudo dpkg -i rstudio-server-0.92.44-amd64.deb

To create a new user, type:

sudo adduser rwebuser
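The two install commands above can be wrapped in a small function so the version number lives in one place. Treat this as a sketch rather than a tested installer: the variable and function names are my own, and the apt-get fallback is an assumption about what you may need if dpkg stops on missing dependencies (dpkg installs the package file only and does not fetch dependencies itself).

```shell
# Parameterized version of the two install commands from step 5.
# RSTUDIO_VERSION is the release used in this post; newer releases
# may use a different file name or download location.
RSTUDIO_VERSION="0.92.44"
RSTUDIO_DEB="rstudio-server-${RSTUDIO_VERSION}-amd64.deb"
RSTUDIO_URL="https://s3.amazonaws.com/rstudio-server/${RSTUDIO_DEB}"

install_rstudio_server() {
    # Download the package; stop here if the download fails.
    wget "$RSTUDIO_URL" || return 1
    # Install it. If dpkg reports unmet dependencies, ask apt-get
    # to pull them in and finish configuring the package.
    sudo dpkg -i "$RSTUDIO_DEB" || sudo apt-get -f install
}

# Example call (run on the EC2 instance):
# install_rstudio_server
```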

6) Finally, return to a web browser. Enter your instance’s public DNS followed by “:8787”. For example…

http://ec2-50-16-35-73.compute-1.amazonaws.com:8787

If all goes well, RStudio will prompt you for a userid and password. Use the account credentials you created in step 5 (e.g., rwebuser).
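If the login page does not appear, it helps to check from your own machine whether anything answers on port 8787 at all. The following is a rough sketch, assuming curl is installed locally; the function names rstudio_url and check_rstudio are mine, not part of RStudio or AWS.

```shell
# Build the RStudio Server URL from a public DNS name.
#   $1  public DNS name of the instance
#   $2  port (optional, defaults to 8787)
rstudio_url() {
    printf 'http://%s:%s\n' "$1" "${2:-8787}"
}

# Report whether an HTTP server answers at that URL.
# --max-time stops curl from hanging when the port is blocked.
check_rstudio() {
    if curl --silent --output /dev/null --max-time 10 "$(rstudio_url "$1")"; then
        echo "reachable"
    else
        echo "not reachable"
    fi
}

# Example call (substitute your own public DNS name):
# check_rstudio ec2-50-16-35-73.compute-1.amazonaws.com
```

A "not reachable" result usually points back to step 3 (the security group rule) or step 5 (the server not installed or not running); "reachable" only means something answered on the port, not that your login will succeed.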

Other Notes

  • Back in Nov. 2009, I wrote about using biocep for statistical analysis via cloud computing. The required AMIs are no longer available, and RStudio has a far superior interface.
  • I tried out the AMIs from http://www.cloudbiolinux.com/, but their version of R is too old for RStudio. Also, I was unable to get a Mac NX client working on the first try, so I abandoned that option.
  • R is a tricky topic to Google for. If you haven’t found it already, I highly recommend the R-bloggers website, which is also available as a handy daily digest. That’s a good place to search for help.

Conclusion

Eventually someone will add RStudio to an AMI, eliminating several of these steps. Let me know if you find other AMIs that also support RStudio.
