RStudio and Amazon Web Services EC2

There’s a problem dogging statistical researchers all over the Internet…
How can we use RStudio to run R on Amazon Web Services EC2?
After heading down several blind alleys, I can happily report an answer.
1) I am going to start with the assumption that you have used Amazon Web Services EC2 already. Read Running R on AWS if you need an introduction to the basic steps. (Ajay Ohri’s post mentions several AMIs, but none of them work for our needs.)
2) To run RStudio Server, you need an Amazon Machine Image with fairly recent server OS and R versions. This is where I can save you a lot of time and say: as of 12 March 2011, I only found one image that is new enough. It goes by the very catchy name of: akoya-ubuntu-10.04-amd64-server-20101114 (ami-e42cdb8d).
One nice thing about this 64-bit image is that you can run it with the dirt cheap “t1.micro” option while you are testing your configuration. Then, once it all works, you can pick a more powerful (and more expensive) configuration.
3) After you start your instance, make sure its security group has a rule allowing inbound TCP traffic on port 8787.
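If you would rather script the rule from step 3 than click through the console, a sketch along these lines may work. It assumes the AWS CLI is installed and configured, which this post does not cover. The function name open_rstudio_port, the group ID, and the address range are hypothetical placeholders, not values from the setup described here.

```shell
#!/bin/sh
# Sketch only: open TCP port 8787 (the port from step 3) in a security group.
# Assumes the AWS CLI is installed and has credentials configured.

open_rstudio_port() {
    group_id="$1"   # hypothetical, e.g. sg-0123456789abcdef0
    cidr="$2"       # hypothetical, e.g. 203.0.113.25/32 (ideally just your own IP)
    aws ec2 authorize-security-group-ingress \
        --group-id "$group_id" \
        --protocol tcp \
        --port 8787 \
        --cidr "$cidr"
}

# Example call (placeholders, replace with your own values):
#   open_rstudio_port sg-0123456789abcdef0 203.0.113.25/32
```

Restricting the range to your own address rather than 0.0.0.0/0 keeps the RStudio login page from being exposed to the whole Internet.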
4) Connect to your instance via SSH.
ssh -i mykeys.pem ubuntu@ec2-50-16-35-73.compute-1.amazonaws.com
Use ubuntu (instead of root) because this is an Ubuntu OS instance. Replace the ec2-50-16-35-73.compute-1.amazonaws.com portion with the public DNS for your EC2 instance.
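The connection in step 4 can be wrapped in a small helper, sketched below. The function name connect_ec2 is hypothetical. The chmod line is an addition not mentioned above: OpenSSH typically refuses a private key file that other users can read, so tightening the permissions first avoids that error.

```shell
#!/bin/sh
# Sketch only: connect to an Ubuntu EC2 instance as the "ubuntu" user.

connect_ec2() {
    key="$1"    # path to your key file, e.g. mykeys.pem
    host="$2"   # public DNS of your instance
    # Make the key readable by its owner only; ssh may reject it otherwise.
    chmod 400 "$key"
    ssh -i "$key" "ubuntu@$host"
}

# Example call, using the values from step 4:
#   connect_ec2 mykeys.pem ec2-50-16-35-73.compute-1.amazonaws.com
```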
5) After logging in to your Ubuntu instance, there are two additional steps: installing RStudio Server and creating a new user. The first two commands below install RStudio Server; the third creates the user:
wget https://s3.amazonaws.com/rstudio-server/rstudio-server-0.92.44-amd64.deb
sudo dpkg -i rstudio-server-0.92.44-amd64.deb
sudo adduser rwebuser
6) Finally, return to a web browser. Enter your instance’s public DNS followed by “:8787”. For example…
http://ec2-50-16-35-73.compute-1.amazonaws.com:8787
If all goes well, RStudio will prompt you for a userid and password. Use the account credentials you created in step 5 (e.g., rwebuser).
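Before opening the browser, it can help to confirm from your own machine that the server answers on port 8787. The sketch below builds the URL from step 6 and asks curl for the HTTP status code only. The function names rstudio_url and check_rstudio are hypothetical, and the sketch assumes curl is installed locally.

```shell
#!/bin/sh
# Sketch only: build the RStudio Server URL and probe it with curl.

rstudio_url() {
    # $1 is the instance's public DNS name
    printf 'http://%s:8787\n' "$1"
}

check_rstudio() {
    # -s: silent, -o /dev/null: discard the page body,
    # --max-time 10: give up after ten seconds,
    # -w '%{http_code}': print only the HTTP status code
    curl -s -o /dev/null --max-time 10 -w '%{http_code}\n' "$(rstudio_url "$1")"
}

# Example call, using the host name from step 6:
#   check_rstudio ec2-50-16-35-73.compute-1.amazonaws.com
```

If the request times out, the security group rule from step 3 is the first thing to recheck.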
Other Notes
- Back in Nov. 2009, I wrote about using biocep for statistical analysis via cloud computing. The required AMIs are no longer available, and RStudio has a far superior interface.
- I tried out the AMIs from http://www.cloudbiolinux.com/, but their version of R is too old for RStudio. Also, I was unable to get a Mac NX client working on the first try, so I abandoned that option.
- R is a tricky topic to Google for. If you haven’t found it already, I highly recommend the R-bloggers website, which is also available as a handy daily digest. That’s a good place to search for help.
Conclusion
Eventually someone will add RStudio to an AMI, eliminating several of these steps. Let me know if you find other AMIs that also support RStudio.

Thanks for the info, Steven. I’ve been playing with RStudio for a couple of weeks now, but only locally.
BTW, another place to search for help with R is http://www.rseek.org.
Thanks, worked a treat!
Mike — Thanks for the feedback. Glad to hear it worked for someone else, too!
Has anyone found an amazon AMI with rstudio pre-installed?
Zach — I haven’t found one yet. If anyone else does, I’d love to hear about it.
This is terrifically helpful — thanks!
Do you have any idea how I might load my data into RStudio?
I have tried uploading to S3, but even though I can use “wget” to load files from S3 into the EC2 instance, it won’t let me wget them into the R folder, or copy them from the EC2 ubuntu folder (where wget does work) to the R folder. The RStudio browser tool only seems to have visibility to the R folder and its subfolders. I tried using the command “chmod u+x /home” to make R and its subfolders writable, but chmod did not work (threw an error).
Any ideas?
Matt — Thank you. I haven’t tried using any S3 data yet, so can’t help you out with that one. If I run across anything I’ll post it here.
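One possible approach to Matt’s question, untested against this exact setup: download the file as the ubuntu user, then copy it into the RStudio user’s home directory with sudo and hand ownership to that user. The sketch assumes the account from step 5 (rwebuser) has its home directory at /home/rwebuser and a group of the same name, which is the usual Ubuntu adduser default. The function name copy_to_rstudio_user is hypothetical. Note also that chmod u+x adds execute permission for the owner; it does not grant write access.

```shell
#!/bin/sh
# Sketch only: copy a file into the RStudio user's home and fix its ownership.

copy_to_rstudio_user() {
    src="$1"    # file already on the instance, e.g. /home/ubuntu/data.csv
    user="$2"   # account created in step 5, e.g. rwebuser
    dest="/home/$user/$(basename "$src")"
    sudo cp "$src" "$dest"
    # Assumes a group with the same name as the user exists.
    sudo chown "$user:$user" "$dest"
}

# Example call (the file name is a placeholder):
#   copy_to_rstudio_user /home/ubuntu/data.csv rwebuser
```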
Update: I haven’t tested it myself yet, but according to the documentation here (http://bioconductor.org/help/bioconductor-cloud-ami/), the bioconductor.org AMIs now have RStudio server installed.
All the instructions for finding and running their AMIs are at that link. (You may still find some of the above instructions helpful for running RStudio server.)
It’s a really cool and helpful piece of information. I’m glad that you shared it with us. Please keep us up to date like this. Thanks for sharing.
Just stumbled across this page. I’ve been maintaining AMIs specifically for RStudio Server for a few months now. Details are at http://www.louisaslett.com/RStudio_AMI/
Hope that helps.