This article is part of a series of brief tutorials I’m writing on performing particular tasks with computers. The series may seem random, because I’m writing it partly with selfish intentions: I want an easily searched, centrally accessible archive of notes for myself, to inform both my project work and my teaching. I started compiling these tutorials as Word documents on my computer, but that seemed a bit too selfish. So I will now post these how-tos on this forum so that others can find them, read them, and use them.
Setting Up a Deep Learning Server on Amazon Web Services
I followed the tutorial at http://course.fast.ai/lessons/aws.html to set up a deep learning server on AWS, but I ran into numerous problems, and the process took me several hours as a result. The goal of this article is to fill in the missing details and correct the ones that did not work for me.
I needed to install Cygwin and include the following packages in the installation, as recommended at http://thecoatlessprofessor.com/programming/installing-amazon-web-services-command-line-interface-aws-cli-for-windows-os-x-and-linux-2/:
- openssh, libssh2-devel, libssh2_1
- python version 2.7.5
The command line interface for accessing AWS functionality is called awscli. I had to install awscli inside cygwin rather than in the host Windows operating system. This required installing pip in cygwin first. Here is what I typed in cygwin:
$ curl -O https://bootstrap.pypa.io/get-pip.py
$ python get-pip.py
$ pip install awscli
This installs the aws command line tools in cygwin.
I set up a symbolic link named aws in /usr/bin/ that points to /cygdrive/c/Anaconda3/Scripts/aws.cmd (notice the capital S in Scripts). I set up the symbolic link like so:
$ ln -s /cygdrive/c/Anaconda3/Scripts/aws.cmd /usr/bin/aws
This allowed me to invoke the aws command by simply typing aws.
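A quick way to confirm that the link works is a check like the one below. This is only a sketch: require_cmd is a helper name I made up for illustration, and it does nothing more than ask the shell whether a command can be found on the PATH.

```shell
# Hypothetical helper (not part of awscli): report whether a command is on the PATH.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}
```

Running `require_cmd aws && aws --version` at the cygwin prompt should then print the location of the link followed by the installed awscli version.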
I had to create a folder named .ssh under my home directory to hold private key (pem) files. The private key is how your machine authenticates itself when it connects to your instance on Amazon’s servers.
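Creating the folder can be sketched as follows. The helper name make_ssh_dir is my own invention, and the 700 permission (owner-only access) is the usual convention for .ssh folders rather than something the fast.ai instructions require.

```shell
# Hypothetical helper: create the key folder (default ~/.ssh) with owner-only access.
make_ssh_dir() {
    dir="${1:-$HOME/.ssh}"
    # -p makes this safe to run even if the folder already exists
    mkdir -p "$dir" && chmod 700 "$dir"
}
```

Calling `make_ssh_dir` with no argument creates ~/.ssh; passing a path creates the folder somewhere else.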
I had to create an AWS account at aws.amazon.com. I then requested special access to create a GPU-based server. To do that, I went to aws.amazon.com/contact-us/ec2-request and requested a p2.xlarge server with an instance limit of 1 in the us-west-2 region (Oregon). I entered fast.ai.MOOC in the comment area per the instructions in the http://course.fast.ai/lessons/aws.html recording.
I then went to console.aws.amazon.com to create a user and the corresponding security keys. Go to the Identity and Access Management (IAM) section as described in the video. That will yield the security credentials, including an access key and a secret access key, and a console login link, too. I stored these in an Excel spreadsheet called aws_credentials.xlsx on my local machine, and you should save yours, too, since you have to use them in the next step.
The next step is to type
$ aws configure
at the cygwin prompt. You’ll be asked to provide the access key and secret access key you just saved. You should also specify the region as us-west-2 and the output type as text. This stores your credentials and defaults so that later aws commands can talk to Amazon’s servers. You will also need a public-private key pair to log in to your instance; as far as I can tell, it is generated when the setup script in the next step runs rather than by this command. The private key is called aws-key.pem by default, and it will be saved in the .ssh folder you created a few steps earlier.
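To double-check that the configuration step took effect, something like the following sketch can help. It assumes the default location that awscli uses for its settings, a folder named .aws in your home directory holding two files called credentials and config; check_aws_files is a name I made up for illustration.

```shell
# Hypothetical helper: check that the two awscli settings files exist.
# Assumes the default location ~/.aws unless another folder is given.
check_aws_files() {
    aws_dir="${1:-$HOME/.aws}"
    status=0
    for name in credentials config; do
        if [ -f "$aws_dir/$name" ]; then
            echo "found   $aws_dir/$name"
        else
            echo "missing $aws_dir/$name"
            status=1
        fi
    done
    return "$status"
}
```

If both files are reported as found, `aws configure list` should show the region and output type you entered, with the keys partly masked.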
I then needed to run a setup script to create the instance. The script is called setup_p2.sh. I used wget to download it:
I put that file in /cygdrive/c/Anaconda3/Scripts/
I then switched to that directory and ran it.
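Switching to the directory and running the script can be sketched like this. The helper name run_setup is hypothetical, and running the script with bash is my assumption about how it is meant to be invoked.

```shell
# Hypothetical helper: run a setup script from inside the folder that holds it.
# The script name defaults to setup_p2.sh; bash is an assumed interpreter.
run_setup() {
    script_dir="$1"
    script="${2:-setup_p2.sh}"
    if [ ! -f "$script_dir/$script" ]; then
        echo "$script not found in $script_dir" >&2
        return 1
    fi
    # The subshell keeps the caller's working directory unchanged
    ( cd "$script_dir" && bash "$script" )
}
```

With the file where I put it, the call would be `run_setup /cygdrive/c/Anaconda3/Scripts`.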
This set up an instance on Amazon’s server and gave me an instruction for connecting to it:
$ ssh -i /home/username/.ssh/aws-key.pem ubuntu@ec2-aaa-bbb-ccc-ddd.us-west-2.compute.amazonaws.com
where aaa-bbb-ccc-ddd is the IP address Amazon assigned to your instance.
After I issued this command, I was running on Amazon’s GPU-based server.
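If ssh complains about the key file, the usual cause is that the file is readable by other users. The sketch below tightens the permissions before connecting. The helper name connect_instance is made up, and the default user name ubuntu is an assumption about the machine image used by the course; substitute whatever the setup script printed for you.

```shell
# SSH can be overridden for a dry run; it defaults to the real ssh client.
SSH="${SSH:-ssh}"

# Hypothetical helper: lock down the key file, then connect.
# Arguments: key file, host name, optional user (assumed default: ubuntu).
connect_instance() {
    key="$1"
    host="$2"
    user="${3:-ubuntu}"
    # ssh refuses private keys that other users can read
    chmod 400 "$key" || return 1
    "$SSH" -i "$key" "$user@$host"
}
```

For example, `connect_instance ~/.ssh/aws-key.pem` followed by the host name from the setup script’s output does the same thing as the ssh command above.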
When you are done working, remember to stop the instance before you end your session. Otherwise, you’ll keep being billed by the hour. Here’s how to do that from the command line (assuming the instance ID is i-01234567890abcdef01234):
$ aws ec2 stop-instances --instance-ids i-01234567890abcdef01234
Here’s how to restart that instance from the AWS command line interface:
$ aws ec2 start-instances --instance-ids i-01234567890abcdef01234
Note that every time you restart, you will be billed for a full hour.
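The stop and start commands return right away, before the instance has actually changed state. A sketch of wrappers that wait for the change to finish is below. The function names are my own; the aws ec2 wait and describe-instances subcommands are part of awscli, and the AWS variable exists only so the functions can be tried without touching a real account.

```shell
# AWS can be overridden for a dry run; it defaults to the real aws command.
AWS="${AWS:-aws}"

# Hypothetical wrapper: stop an instance and block until it is fully stopped.
stop_instance() {
    "$AWS" ec2 stop-instances --instance-ids "$1" &&
    "$AWS" ec2 wait instance-stopped --instance-ids "$1"
}

# Hypothetical wrapper: start an instance, wait until it is running,
# then print its public DNS name for use with ssh.
start_instance() {
    "$AWS" ec2 start-instances --instance-ids "$1" &&
    "$AWS" ec2 wait instance-running --instance-ids "$1" &&
    "$AWS" ec2 describe-instances --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].PublicDnsName' --output text
}
```

Usage would be `stop_instance i-01234567890abcdef01234` at the end of a session and `start_instance i-01234567890abcdef01234` at the start of the next one. Printing the DNS name is useful because the address may change between sessions unless a fixed (Elastic) IP is attached to the instance.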
I created an alarm in AWS CloudWatch to stop the instance i-01234567890abcdef01234 when its CPU usage stayed below 5% for 6 consecutive 5-minute periods. This can help guard against leaving an instance running inadvertently. I learned this the hard way: I once forgot to stop my GPU instance, and I ended up being charged over $30 because it ran for over thirty hours before I discovered it.
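An alarm of this kind can probably also be created from the command line. The sketch below is an example under stated assumptions rather than the exact steps I followed: the function name and the alarm name are made up, the Average statistic is my choice, and the region defaults to us-west-2. The period of 300 seconds, 6 evaluation periods, and threshold of 5 correspond to the settings described above.

```shell
# AWS can be overridden for a dry run; it defaults to the real aws command.
AWS="${AWS:-aws}"

# Hypothetical helper: create a CloudWatch alarm that stops an idle instance.
# Arguments: instance ID, optional region (assumed default: us-west-2).
create_idle_alarm() {
    instance_id="$1"
    region="${2:-us-west-2}"
    "$AWS" cloudwatch put-metric-alarm \
        --alarm-name "stop-idle-$instance_id" \
        --namespace AWS/EC2 \
        --metric-name CPUUtilization \
        --dimensions "Name=InstanceId,Value=$instance_id" \
        --statistic Average \
        --period 300 \
        --evaluation-periods 6 \
        --threshold 5 \
        --comparison-operator LessThanThreshold \
        --alarm-actions "arn:aws:automate:$region:ec2:stop" \
        --region "$region"
}
```

Calling `create_idle_alarm i-01234567890abcdef01234` would request the alarm for the example instance. It is worth confirming in the CloudWatch console afterwards that the alarm exists and that its action is set to stop the instance.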
Hopefully these instructions will help you avoid the difficulties I encountered and get you up and running on an AWS GPU more quickly than I did.