Monday, January 7, 2013

Blog Series: WoG, Cesky Krumlov; Day 1: Amazon Cloud

The Amazon Cloud
Dr. Konrad Paszkiewicz
Director, Wellcome Trust Biomedical Bioinformatics Hub

Topic: Amazon Cloud


  • No need to house/maintain servers
  • No need to worry about backing up
  • Only pay for what you use
  • Upgrades are handled by Amazon
  • You can expand and delete storage as you need
  • There are many, many preconfigured virtual machines (VMs) to pick from if you don't want to develop one on your own (QIIME, STACKS and short-read alignment all have their own VMs)
  • You will pay for it: storage (even when you are not using it), time spent using it, and so on. Many researchers are resistant to the idea that they have to 'pay' for computing power and storage, and they are surprised by how much both can cost. Researchers need to be trained to think about computing costs in their grant proposals: the cost of programming and software, the cost of hardware, and the cost of the people who will need to be hired to program and do analysis.
  • Cost of an Amazon VM can run $0.20-$3.00/hr; you also pay for storage by the gigabyte.
  • Data transfer to your VM could be slow
  • If the network is down you won't be able to access your VM
  • Typical cost is $200-$400 per month, depending on how much data you store and how extensively you use the machine. So you have to decide: is it worth it? What are the costs and benefits of buying and maintaining your own server or computing system that's powerful enough to run your analyses and store all your data, versus using Amazon Web Services (AWS)?
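
To make the "is it worth it?" question a bit more concrete, here is a rough back-of-the-envelope sketch in Python. It is only an illustration: the hourly rate comes from the range quoted above, but the storage price per gigabyte and the server purchase price are placeholder numbers I made up, and the break-even figure ignores the maintenance, power and staff costs of running your own machine.

```python
def monthly_cloud_cost(hourly_rate, hours_used, storage_gb, storage_rate_per_gb):
    """Estimate one month's bill: compute time plus storage.

    storage_rate_per_gb is a hypothetical price per gigabyte per month;
    look up the real figure for your own account.
    """
    compute = hourly_rate * hours_used
    storage = storage_gb * storage_rate_per_gb
    return compute + storage


def months_to_break_even(server_price, monthly_cost):
    """Months of cloud use that add up to the purchase price of a server.

    This deliberately ignores maintenance, power and staff costs,
    so treat the result as a lower bound on the comparison.
    """
    return server_price / monthly_cost


# Example: a $0.50/hr VM used 400 hours, with 500 GB stored at an
# assumed (placeholder) $0.10 per GB per month.
cost = monthly_cloud_cost(0.50, 400, 500, 0.10)
print("Estimated monthly cost: $%.2f" % cost)

# Compare against a hypothetical $6000 server.
print("Months to match a $6000 server: %.1f" % months_to_break_even(6000, cost))
```

With these made-up inputs the estimate lands inside the $200-$400 range mentioned above; plug in your own usage and prices to see where you fall.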
Read on if you are interested in the tutorial/exercise we went through today. It will most likely not be applicable to your situation unless you have access to, or are interested in implementing, Amazon cloud services.

For this workshop Amazon sponsored our cloud 'experience', so that we could get acquainted with the cloud and look at how to use it for computational biology.

A tutorial/exercise was provided taking us through the process of logging in and creating and launching our Amazon instances. The PDF can be downloaded (it's in Part 2); however, unless you have an Amazon account the tutorial will not help you much. All the same, here are some things to consider when setting up your instance:
  • We will be using Elastic Compute Cloud (EC2), the service AWS is known for. It enables you to rent Linux and Windows machines by the hour.
  • Once you've logged into the Amazon cloud there will be a services drop-down menu at the top left that will lead you to EC2, where you can launch an 'instance' using the classic wizard.
  • Now, we already had Amazon Machine Images (AMIs) created for the workshop, so all of the above assumes you have an AMI to launch. If not, you will have to create one or use one of the available AMIs.
  • You basically choose an AMI, select how 'big' you want your instance to be, name it, decide how you will connect to it, and hit Launch.
  • Remember to shut your instance down (via Stop) when you are done or you will continue to be charged. If you hit terminate you'll possibly lose everything you've done on your VM.
  • There is a GUI application that will allow you to get into your VM without having to use the terminal, for those of you who prefer GUIs. It's called NoMachine (NX client). Unfortunately this didn't work on my computer (Mac OSX version 10.7.5) or my colleague's computer (Mac OSX 8), so we will attempt to install it directly on our VMs next time we log into AWS, or we will end up SSH-ing in (using the terminal/command-line interface).
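
For anyone who would rather see the choose-an-AMI, pick-a-size, name-it, launch, and stop steps as code, here is a minimal Python sketch. To be clear, this is not what we did in the workshop (we used the web console wizard); it assumes you would use the boto3 library and its EC2 client, and the AMI ID, instance type, key pair name and region in the commented usage are all placeholders.

```python
def launch_instance(ec2, ami_id, instance_type, key_name, name):
    """Launch a single instance from an AMI and return its instance ID.

    ec2 is expected to be a boto3 EC2 client; instance_type is how 'big'
    the machine is, and key_name is the key pair used to connect over SSH.
    """
    response = ec2.run_instances(
        ImageId=ami_id,
        InstanceType=instance_type,
        KeyName=key_name,
        MinCount=1,
        MaxCount=1,
        # Give the instance a human-readable name via a "Name" tag.
        TagSpecifications=[{
            "ResourceType": "instance",
            "Tags": [{"Key": "Name", "Value": name}],
        }],
    )
    return response["Instances"][0]["InstanceId"]


def stop_instance(ec2, instance_id):
    """Stop (not terminate) the instance so the hourly charge ends."""
    ec2.stop_instances(InstanceIds=[instance_id])


# Hypothetical usage (requires boto3 and AWS credentials; all values
# below are placeholders, not the workshop's actual settings):
#
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   vm_id = launch_instance(ec2, "ami-xxxxxxxx", "m1.large",
#                           "my-key-pair", "workshop-vm")
#   ...
#   stop_instance(ec2, vm_id)
```

Note that the sketch stops the instance rather than terminating it, for the reason given above: terminating may lose everything on the VM, while a stopped VM still costs you for its storage.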
The tutorial is a good read; again, it's Part 2 that you'll want to look over. It has pictures showing what the different screens should look like, so you can familiarize yourself and consider whether AWS is for you. I'm sure I'll have more to say on this as we actually get in and use the cloud. Tonight was all about just getting logged in and up and running.

At the end of the presentation Konrad emphasized something that I've heard Tyghe (my programmer husband for those of you just joining us) say about a hundred times as well when I get super frustrated with learning programming...

"When it comes to programming and next generation sequencing analysis, they are skill sets. Skill sets you have to practice. Use them everyday and it will come easier and faster over time"