Thursday, November 23, 2017

Easy Jupyter notebook setup on AWS GPU EC2 with machine learning AMI

The Amazon machine learning AMI (link may change in the future) is set up for CUDA/GPU support and comes with TensorFlow, Keras, MXNet, Caffe, Caffe2, PyTorch, Theano, CNTK, and Torch preinstalled.

I chose the g2.2xlarge, the least expensive EC2 instance type with a GPU, and used the One Click Launch option (you will need to specify a key pair pem file for the AWS region where you are starting the instance) to have an instance running and available in about a minute. This GPU instance costs $0.65/hour, so remember to either stop it (if you want to reuse it later and don't mind paying a small cost for persistent local storage) or terminate it if you don't want to be charged for the 60GB of SSD storage space associated with the EC2 instance.
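As a rough sketch of why stopping the instance matters, the hourly rate quoted above can be turned into a quick estimate. The function name gpu_cost is made up for illustration, and the rate is the figure from this post, so check current pricing before relying on it:

```shell
# Hourly rate for the g2.2xlarge as quoted in this post (assumption: may be out of date).
RATE_PER_HOUR="0.65"

# Print the estimated cost in dollars of running the instance for N hours.
gpu_cost() {
  awk -v hours="$1" -v rate="$RATE_PER_HOUR" 'BEGIN { printf "%.2f\n", hours * rate }'
}

gpu_cost 8     # one working day
gpu_cost 720   # left running for a 30-day month
```

This covers compute time only; storage for a stopped instance is billed separately.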

I am very comfortable working in SSH shells using Emacs, screen, etc. When an instance boots up, the Actions -> Connect menu shows you the temporary public address which you can use to SSH in:

ssh -i "~/.ssh/key-pair.pem" [email protected]

I keep my pem files in ~/.ssh; you might store them in a different place. If you haven't used EC2 instances before and don't already have a pem access file, follow these directions.
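One thing that often trips people up the first time: ssh refuses to use a private key file that other users can read. A minimal sketch of tightening the permissions, assuming the key lives at the path used in the ssh command above (lock_key is just an illustrative helper name):

```shell
# Restrict a private key file to owner read-only (mode 400).
lock_key() {
  chmod 400 "$1"
}

# Assumed location, matching the ssh command above; adjust to where you keep yours.
KEY="$HOME/.ssh/key-pair.pem"
if [ -f "$KEY" ]; then
  lock_key "$KEY"
  ls -l "$KEY"   # the mode column should now start with -r--------
fi
```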

Anaconda is installed, so Jupyter is also preinstalled and can be started from any directory on your EC2 instance using:

jupyter notebook

After some printout, you will see a local URI to access the Jupyter notebook that will look something like this:

    Copy/paste this URL into your browser when you connect for the first time,
    to login with a token:

In another terminal window, start another SSH session, but this time map local port 8888 to port 8888 on the EC2 instance:

ssh -L 8888: -i ".ssh/key-pair.pem" [email protected]
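For reference, here is a sketch of how a local port forward is usually spelled out in full. The user name, host name, and the tunnel_cmd helper are hypothetical placeholders and not the values from my instance; the -L argument has the form local_port:destination:remote_port, where the destination is resolved on the EC2 side:

```shell
# Hypothetical placeholders: substitute your instance's login user and public address.
REMOTE_USER="ubuntu"
REMOTE_HOST="ec2-host.example.com"
KEY="$HOME/.ssh/key-pair.pem"
PORT=8888

# Build the forwarding command. "localhost" here means the EC2 instance itself,
# and -N tells ssh not to start a remote shell, since only the tunnel is needed.
tunnel_cmd() {
  echo "ssh -N -L ${PORT}:localhost:${PORT} -i ${KEY} ${REMOTE_USER}@${REMOTE_HOST}"
}

tunnel_cmd               # print the command for inspection
# eval "$(tunnel_cmd)"   # uncomment to actually open the tunnel
```

Printing the command first makes it easy to check the pieces before connecting. The eval line assumes the key path contains no spaces.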

Now on your laptop you can attach to the remote Jupyter instance using (your token will be different):

Alternative to using SSH tunnel:

A nice alternative is sshuttle, which you install only on your laptop (no server-side installation is required). Assuming I have a domain name attached to the sometimes-running EC2 instance, I use the following aliases in my bash configuration file:

alias gpu='ssh -i "~/.ssh/key-pair.pem" [email protected]MYDOMAIN.COM'
alias tun="sshuttle -r [email protected]MYDOMAIN.COM 0/0 -e 'ssh -i \".ssh/key-pair.pem\"'"

Note: Keeping an Elastic IP Address attached to an EC2 instance that is usually not running will cost you about $3.40/month, but I find having a "permanent" IP address assigned to a domain name convenient.

Goodies in the AWS machine learning AMI:

There are many examples installed for each of these frameworks. I usually use Keras and I was pleased to see the following examples ready to run:

There are many other examples for the other frameworks: TensorFlow, MXNet, Caffe, Caffe2, PyTorch, Theano, CNTK, and Torch.

Sunday, November 12, 2017

My blockchain side project to 'give something back'

I am very busy with my new machine learning job but I always like to try to split off some of my own free time for occasional side projects that I think will help me learn new technologies. My latest side interest is in blockchain technologies and specifically I am interested in blockchain as a platform and environment for AI agents.

I liked Tim O’Reilly’s call to action for corporations and people to take a longer-term view and work on things of long-term value to society in his recent keynote speech: Our Skynet Moment

While I consider myself a talented practitioner who has been building machine learning and general AI applications since 1982, I don't feel like I work at the level of creating any groundbreaking technologies myself. So, as far as 'giving something back' to society goes, it seems like my best bet is to put some energy into distributed systems that push back against centralized control by corporations and governments, things that empower people.

Although it is really early days, I think that the Hyperledger projects look very promising, and I like how this organization is structured in a similar fashion to the Apache Foundation.

I would like to start slow (I don't have much free time right now!) and will record any open source experiments I do at my new site. I may or may not finish a book project on any open source software that I write: hyperledgerAI: Writing Combined Blockchain and Artificial Intelligence Applications. I will start with small, well-documented sample applications built on Hyperledger.