CompSci How2: Setting Up a Deep Learning Server on Amazon Web Services

This article is part of a series of brief tutorials I’m writing on performing particular tasks with computers. The series may seem random, because I’m writing it partly with selfish intentions: I want to create an easily searched and centrally accessible archive of notes for myself, to inform both my project work and my teaching. I started compiling these tutorials as Word documents on my computer, but that seemed a bit too selfish. So, I will now post these how-to’s on this forum so that others can find them, read them, and use them.

Setting Up a Deep Learning Server on Amazon Web Services

I followed the tutorial at http://course.fast.ai/lessons/aws.html to set up a deep learning server on AWS, but I ran into numerous problems, and the process took me several hours as a result. The goal of this article is to fill in the missing details and to correct the steps that did not work for me as written.

I needed to install Cygwin and, during its installation, select the following packages, which were recommended at http://thecoatlessprofessor.com/programming/installing-amazon-web-services-command-line-interface-aws-cli-for-windows-os-x-and-linux-2/:

  • libuuid-devel
  • binutils
  • openssh, libssh2-devel, libssh2_1
  • curl
  • wget
  • python version 2.7.5

The command line interface for accessing AWS functionality is called awscli. I had to install awscli under Cygwin rather than in the host Windows operating system, which required installing pip in Cygwin. Here is what I typed in Cygwin:

$ curl -O https://bootstrap.pypa.io/get-pip.py

$ python get-pip.py

$ pip install awscli

This installs the AWS command line tools in Cygwin.
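To confirm that the tools are visible from the Cygwin shell, a quick check along the following lines may help. This is only a sketch: the helper name have_tool is made up for illustration, and the list of tools is simply the set of commands used in this article.

```shell
# Hypothetical helper: succeeds only if the named command is on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report where each tool used in this article was found, if at all.
for tool in curl wget python pip aws ssh; do
    if have_tool "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found"
    fi
done
```

In Cygwin, a path that starts with /usr/bin points at a Cygwin copy of the tool, while a path under /cygdrive/c points at one installed on the Windows side.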

I set up a symbolic link called aws in /usr/bin/ that points to /cygdrive/c/Anaconda3/Scripts/aws.cmd (notice the capital S in Scripts). I created the link like so:

ln -s /cygdrive/c/Anaconda3/Scripts/aws.cmd /usr/bin/aws

This allowed me to invoke the aws command by simply typing aws.

I had to create a folder named .ssh under my home directory to hold private key (.pem) files. The private key is what authenticates your machine when it connects to the server at Amazon.
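Creating the folder can be sketched as follows. The SSH_DIR variable is introduced here only for illustration, and mode 700 (accessible to the owner only) is the conventional permission for this folder rather than something the fast.ai tutorial specifies.

```shell
# Assumption: keys live in the default location, ~/.ssh.
SSH_DIR="${SSH_DIR:-$HOME/.ssh}"

# Create the folder if it is missing, then restrict it to the current user.
mkdir -p "$SSH_DIR"
chmod 700 "$SSH_DIR"
```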

I had to create an AWS account at aws.amazon.com. I then requested special access to create a GPU-based server. To do that, I went to aws.amazon.com/contact-us/ec2-request and requested a p2.xlarge instance with an instance limit of 1 in the us-west-2 region (Oregon). I entered fast.ai.MOOC in the comment area, per the instructions in the video at http://course.fast.ai/lessons/aws.html.

I then went to console.aws.amazon.com to create a user and corresponding security keys. Go to the Identity and Access Management (IAM) section as described in the video. That will yield the security credentials, including the access key and secret access key, and a console login link, too. I stored these in an Excel spreadsheet called aws_credentials.xlsx on my local machine, and you should save yours, too, since you have to use them in the next step.

The next step is to type

aws configure

at the Cygwin prompt. You’ll be asked to provide the access key and secret access key you just saved. You should also specify the region as us-west-2 and the output type as text. Note that aws configure only stores these settings (in a .aws folder under your home directory); it does not itself create the public-private key pair you need to communicate with Amazon’s server. That key pair is produced when you run the setup script in the next step. The private key is called aws-key.pem by default, and it is saved in the .ssh folder you created a few steps earlier.
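One way to confirm that the values were stored is to ask the CLI to read them back. The sketch below uses the configure list and configure get subcommands of the AWS CLI; the function name show_aws_config and the AWS override variable are hypothetical conveniences, not part of the CLI.

```shell
# AWS names the CLI executable; override it (e.g. AWS=echo) for a dry run.
AWS="${AWS:-aws}"

# Hypothetical helper: print the settings that `aws configure` stored.
show_aws_config() {
    $AWS configure list          # summary of profile, masked keys, and region
    $AWS configure get region    # us-west-2 if you followed the step above
    $AWS configure get output    # text if you followed the step above
}
```

Calling show_aws_config after aws configure should echo back the region and output format you entered. If either comes back empty, rerun aws configure.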

I then needed to run a setup script to create the instance. The script is called setup_p2.sh. I used wget to download it:

wget "http://files.fast.ai/files/setup_p2.sh"

I put that file in /cygdrive/c/Anaconda3/Scripts/.

I then switched to that directory and ran it:

bash setup_p2.sh

This set up an instance on Amazon’s server and gave me an instruction for connecting to it:

ssh -i /home/username/.ssh/aws-key.pem ubuntu@ec2-aaa-bbb-ccc-ddd.us-west-2.compute.amazonaws.com

where aaa-bbb-ccc-ddd is the IP address Amazon assigned to your instance, written with hyphens in place of the dots.
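The mapping from IP address to host name can be sketched as a pair of small shell functions. The names ec2_host and connect are hypothetical, the region is assumed to be us-west-2 as above, and the key path is the one used in this article. The chmod step is included because ssh refuses private key files that other users can read.

```shell
# Hypothetical helper: turn an IP address into the public DNS name used above.
# Assumes the us-west-2 region, as in the rest of this article.
ec2_host() {
    printf 'ec2-%s.us-west-2.compute.amazonaws.com\n' \
        "$(printf '%s' "$1" | tr . -)"
}

# Hypothetical helper: restrict the key file, then connect as the ubuntu user.
connect() {
    key="$HOME/.ssh/aws-key.pem"
    chmod 400 "$key"    # ssh rejects private keys that other users can read
    ssh -i "$key" "ubuntu@$(ec2_host "$1")"
}
```

With these defined, connect 203.0.113.10 would open a session on the instance at that address (203.0.113.10 is a placeholder, not a real instance).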

After I issued this command, I was running on Amazon’s GPU-based server.

When you are done working, you need to stop the instance before you end your session. Otherwise, you’ll keep being billed hourly. Here’s how to do that from the command line (assuming the instance ID is i-01234567890abcdef01234):

$ aws ec2 stop-instances --instance-ids i-01234567890abcdef01234

Here’s how to restart that instance from the AWS command line interface:

$ aws ec2 start-instances --instance-ids i-01234567890abcdef01234

Note that every time you restart, you will be billed for a full hour.
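Stopping and starting both require the instance ID. If you don't have it handy, the describe-instances subcommand can list your instances along with their current state. The wrappers below are a sketch: the function names and the AWS override variable are hypothetical, while the subcommands and options are those of the AWS CLI.

```shell
# AWS names the CLI executable; override it (e.g. AWS=echo) for a dry run.
AWS="${AWS:-aws}"

# List each instance ID with its current state (running, stopped, ...).
list_instances() {
    $AWS ec2 describe-instances \
        --query 'Reservations[].Instances[].[InstanceId,State.Name]' \
        --output text
}

# Stop or start one instance by ID.
stop_instance()  { $AWS ec2 stop-instances  --instance-ids "$1"; }
start_instance() { $AWS ec2 start-instances --instance-ids "$1"; }
```

Running list_instances after stop_instance is a simple way to confirm that the instance really has moved to the stopped state before you walk away.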

I created an alarm in AWS CloudWatch to stop the instance i-01234567890abcdef01234 when its CPU usage stays below 5% for 6 consecutive 5-minute periods (30 minutes in total). This can help guard against leaving an instance running inadvertently. For example, I forgot to stop my GPU instance, and I ended up being charged over $30 because it was running for over thirty hours before I discovered it.
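An alarm like this can also be created from the command line with the put-metric-alarm subcommand. The following is a sketch rather than the exact alarm described above: the function name and the alarm name are made up for illustration, and the stop action assumes the us-west-2 region.

```shell
# AWS names the CLI executable; override it (e.g. AWS=echo) for a dry run.
AWS="${AWS:-aws}"

# Hypothetical helper: create an alarm that stops the given instance after
# 6 consecutive 5-minute periods (30 minutes) of average CPU below 5%.
# Assumes the us-west-2 region; the alarm name is made up for illustration.
create_idle_alarm() {
    $AWS cloudwatch put-metric-alarm \
        --alarm-name "stop-idle-$1" \
        --namespace AWS/EC2 \
        --metric-name CPUUtilization \
        --dimensions "Name=InstanceId,Value=$1" \
        --statistic Average \
        --period 300 \
        --evaluation-periods 6 \
        --threshold 5 \
        --comparison-operator LessThanThreshold \
        --alarm-actions arn:aws:automate:us-west-2:ec2:stop
}
```

For example, create_idle_alarm i-01234567890abcdef01234 would attach the alarm to the instance used throughout this article.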

Hopefully these instructions will help you avoid the difficulties I encountered and get you up and running on an AWS GPU more quickly than I did.

About Ray Klump

Associate Dean, College of Aviation, Science, and Technology at Lewis University; Director, Master of Science in Information Security, Lewis University. http://online.lewisu.edu/ms-information-security.asp, http://online.lewisu.edu/resource/engineering-technology/articles.asp, http://cs.lewisu.edu. You can find him on Google+.
