This post series is a summary of what OpenStack is, and how it can be used as a tool to increase productivity.

I’m writing these posts because I recently built a small internal OpenStack cloud for use in the Planet Labs office. This is the first job I’ve had that isn’t all about the cloud, and it turns out that most people in the world still do not understand how to leverage the cloud in their day-to-day workflows. So my goal is to help everyone in the office, as well as anyone who stumbles across this blog series.

What is OpenStack?

OpenStack is an open source implementation of Infrastructure as a Service (IaaS). This basically means that you have a few physical servers, each of which can host several virtual servers. There are four main components: Compute, Network, Storage, and an Object Store.

What does ‘compute’ mean?

By compute I basically mean hardware virtualization/virtual machines.

When people hear virtual machine they usually think about things like VirtualBox or VMware Fusion. These are common consumer/desktop virtualization applications which behave similarly to how OpenStack works behind the scenes. When you create a virtual machine using one of these applications you are using a component called the hypervisor.

A hypervisor is a piece of software that runs on a host (a physical hardware server) and launches, monitors, and maintains several guests. A guest (a.k.a. instance) is a virtual machine that runs inside of the host.

A few of the most common hypervisors are Xen, KVM, and ESXi. Xen is used by most large public cloud providers such as Amazon EC2, Rackspace Cloud, HP Cloud, and many others. KVM is used more in startups, dev shops, and smaller companies as it is free and open source. ESXi is VMware’s bare-metal server hypervisor; desktop products like VMware Fusion use a separate hosted hypervisor from the same company.

What does ‘network’ mean?

By network I mean software-defined networking. In OpenStack this component is called Neutron, and it allows you to virtualize complex networking architecture on top of a simple flat network. This can be useful when you are trying to replicate, in the cloud, what a physical hardware network looks like in a real datacenter.

[insert example here – note: I suck at networking]

What does ‘storage’ mean?

By storage I mean block storage. Block storage is commonly used on instances which need to maintain portable data for long periods of time. For example, if I launch an instance, make a bunch of changes, and save the data to a block storage device, I should be able to unmount the device from inside the instance, detach it, and attach it to any new instance in the future.

The interface, or how you use block storage, is basically identical to what it looks like to mount and write to a USB Flash Drive.
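
To make the USB drive analogy a bit more concrete, here is a toy sketch in Go. The Volume and Instance types are made up for illustration and are not OpenStack APIs; the only point is that the data belongs to the volume and survives being moved from one instance to another.

```go
package main

import (
	"errors"
	"fmt"
)

// Volume is a hypothetical model of a block device whose data
// outlives any single instance it is attached to.
type Volume struct {
	Data       map[string]string
	AttachedTo string // empty when the volume is detached
}

// Instance is a hypothetical stand-in for a virtual machine.
type Instance struct {
	Name string
}

// Attach plugs the volume into an instance, like plugging in a USB drive.
// A volume can only be attached to one instance at a time in this sketch.
func (v *Volume) Attach(inst *Instance) error {
	if v.AttachedTo != "" {
		return errors.New("volume already attached to " + v.AttachedTo)
	}
	v.AttachedTo = inst.Name
	return nil
}

// Detach unplugs the volume; the data stays on the volume.
func (v *Volume) Detach() {
	v.AttachedTo = ""
}

func main() {
	vol := &Volume{Data: map[string]string{}}

	first := &Instance{Name: "instance-1"}
	if err := vol.Attach(first); err != nil {
		fmt.Println("attach failed:", err)
		return
	}
	vol.Data["notes.txt"] = "written from instance-1"
	vol.Detach()

	second := &Instance{Name: "instance-2"}
	if err := vol.Attach(second); err != nil {
		fmt.Println("attach failed:", err)
		return
	}
	fmt.Println(vol.AttachedTo, "reads:", vol.Data["notes.txt"])
}
```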

What is an ‘object store’?

An object store is basically an ever-growing bucket of files. Typically when using an object store you can upload millions of keys (aka files) to a single bucket. Each key is replicated across several servers, for redundancy, and you can be assured that your data will be both available, and safe in the event of hardware failure.

In OpenStack this is called ‘Swift’ (‘Cinder’ is the block storage component described above). You also may have heard of Amazon’s S3 service.
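
Here is a rough Go sketch of the replication idea. It is a toy, not how any real object store is implemented: the Store type, the hash-based placement, and the node and replica counts are all assumptions I made up for illustration, and it assumes at least one node. Each key is written to several in-memory "servers", and a read still succeeds after some of them are lost.

```go
package main

import (
	"errors"
	"fmt"
	"hash/fnv"
)

// Store is a toy object store: each node is just an in-memory map.
type Store struct {
	nodes    []map[string][]byte
	replicas int
}

// NewStore creates nodeCount empty nodes (nodeCount must be at least 1)
// and keeps up to `replicas` copies of every key.
func NewStore(nodeCount, replicas int) *Store {
	nodes := make([]map[string][]byte, nodeCount)
	for i := range nodes {
		nodes[i] = make(map[string][]byte)
	}
	if replicas > nodeCount {
		replicas = nodeCount
	}
	return &Store{nodes: nodes, replicas: replicas}
}

// placement picks `replicas` consecutive nodes, starting at a hash of the key.
func (s *Store) placement(key string) []int {
	h := fnv.New32a()
	h.Write([]byte(key))
	start := int(h.Sum32() % uint32(len(s.nodes)))
	ids := make([]int, 0, s.replicas)
	for i := 0; i < s.replicas; i++ {
		ids = append(ids, (start+i)%len(s.nodes))
	}
	return ids
}

// Put writes the object to every node chosen for the key.
func (s *Store) Put(key string, data []byte) {
	for _, id := range s.placement(key) {
		s.nodes[id][key] = data
	}
}

// Get returns the object from the first replica that still has it.
func (s *Store) Get(key string) ([]byte, error) {
	for _, id := range s.placement(key) {
		if data, ok := s.nodes[id][key]; ok {
			return data, nil
		}
	}
	return nil, errors.New("key not found on any replica")
}

// FailNode simulates a hardware failure by discarding a node's contents.
func (s *Store) FailNode(id int) {
	s.nodes[id] = make(map[string][]byte)
}

func main() {
	store := NewStore(5, 3)
	store.Put("photos/cat.png", []byte("meow"))

	// Lose the first server holding the key; a surviving replica still answers.
	store.FailNode(store.placement("photos/cat.png")[0])
	data, err := store.Get("photos/cat.png")
	fmt.Println(string(data), err)
}
```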

To my knowledge the most common way of deploying OpenStack’s Horizon Django application is via Apache and mod_wsgi. I think this is primarily because in the early days of devstack the community decided that Apache was (at the time) the most common web server.

But now it’s 2013, and things have changed. NGINX is the new hotness, and gunicorn has been replaced with uWSGI for speed and reliability.

Since I couldn’t find any Horizon-specific instructions, I would like to share some of the configuration I used recently to deploy Horizon using NGINX and uWSGI.

The operating system I used was Ubuntu 13.04 (raring), with the addition of Ubuntu’s cloud archive apt repositories.

Install Dependencies

# tee (not echo) needs root, and apt only reads files ending in .list
echo "deb http://ubuntu-cloud.archive.canonical.com/ubuntu precise-updates/grizzly main" | sudo tee -a /etc/apt/sources.list.d/ubuntu-cloud-archive-grizzly.list

sudo apt-get update
sudo apt-get install -y build-essential python-dev python-pip nginx-extras memcached node-less openstack-dashboard

sudo service apache2 stop # Free port 80 for NGINX
sudo update-rc.d -f apache2 remove # Disable Apache at boot

sudo pip install uwsgi

/etc/nginx/sites-enabled/horizon.conf

server {
    listen 80;
    server_name openstack.foo.com;

    location / { try_files $uri @horizon; }
    location @horizon {
        include uwsgi_params;
        uwsgi_pass unix:/tmp/horizon.sock;
    }
    location /static {
      alias /usr/share/openstack-dashboard/static/;
    }
}

/etc/uwsgi/horizon.ini

[uwsgi]
master = true
processes = 10
threads = 2
chmod-socket = 666

socket = /tmp/horizon.sock
pidfile = /tmp/horizon.pid
log-syslog = '[horizon]'

chdir = /usr/share/openstack-dashboard/
env = DJANGO_SETTINGS_MODULE=openstack_dashboard.settings
module = django.core.handlers.wsgi:WSGIHandler()

/etc/init/horizon.conf

description "OpenStack Horizon App"

start on runlevel [2345]
stop on runlevel [!2345]

respawn

exec uwsgi --ini /etc/uwsgi/horizon.ini

Then you can control horizon via upstart:

sudo service horizon start
sudo service horizon stop
sudo service horizon restart

A long time ago I created a cointosser application that would do something like 10,000 coin flips via random.org, which was sort of slow. I thought it might be cool to instead transmit the data as an image that can be read pixel by pixel: a 100×100 image is 10,000 trials, and the file is pretty small and easy to parse. I wrote a simple prototype of this in Go, and plan on running it on a Raspberry Pi using its hardware random number generator.

package main

import (
  "image"
  "image/color"
  "image/png"
  "log"
  "math/rand"
  "os"
  "time"
)

func main() {
  img := image.NewRGBA(image.Rect(0, 0, 100, 100))
  // Seed a pseudo-random generator from the clock so each run differs
  rng := rand.New(rand.NewSource(time.Now().UnixNano()))

  // For each pixel in `img`, flip a coin: 0 is black, 1 is white
  bounds := img.Bounds()
  for x := bounds.Min.X; x < bounds.Max.X; x++ {
    for y := bounds.Min.Y; y < bounds.Max.Y; y++ {
      if rng.Intn(2) == 0 {
        img.Set(x, y, color.Black)
      } else {
        img.Set(x, y, color.White)
      }
    }
  }

  // Save the image out to a PNG file
  out, err := os.Create("random.png")
  if err != nil {
    log.Fatal(err)
  }
  defer out.Close()

  if err := png.Encode(out, img); err != nil {
    log.Fatal(err)
  }
}

When developing tools that depend on reasonably complete IaaS APIs, it becomes blatantly obvious which providers fail to offer the APIs needed for the fundamental tasks of building cloud tools and applications. It also becomes very frustrating when a provider claims to be something it clearly is not.

Rackspace does not support public key authentication out of the box, which is something so fundamental to the way cloud developers work, that it is difficult to consider Rackspace a serious cloud provider.

If you say you’re running OpenStack, you should really be running OpenStack with full API support…

/rant

p.s. I’m working on cloud agnosticism for CloudEnvy, and my biggest frustration thus far has been Rackspace.

For the last few years I’ve been working on OpenStack, and one of the aspirations many of us in the community have had from the beginning was to actually be able to use OpenStack for our own personal projects. A while ago I started working on a project called CloudEnvy which has potential to change the development patterns of web developers everywhere.

What is CloudEnvy?

CloudEnvy is a tool which allows you to configure and distribute reproducible development environments in the cloud. Basically you create an Envyfile.yml in the root of your project’s git repo, which defines things like the environment name (which translates to the instance hostname), the server image to use (for example Ubuntu or CentOS), what type of instance to launch (m1.tiny, etc.), and any provision scripts required to build out the environment. Right now provision scripts are primarily bash, but realistically they could be Ruby, Python, Perl, or whatever.

Now I know you’re thinking “Wait, isn’t that the same thing as Vagrant?” The answer to that question is YES! Vagrant is amazing, but there are some pretty significant advantages of using CloudEnvy over Vagrant.

Advantages of cloud development environments

  • Datacenter internet (download at 10,000kb/s instead of 100kb/s)
  • Cloud Mentality (If something’s broken, just kill it and spawn a new environment. It only takes 20 seconds.)
  • Local machine performance (I own a MacBook Air with 4 GB of RAM, so launching more than a single tiny VM locally is futile)

Using CloudEnvy

To get started you first need to install CloudEnvy. We (aka Brian Waldon) regularly update and maintain packages on PyPI, so the following should get you the most recent release:

sudo pip install cloudenvy

Next you need to set up your global configuration file, which is located at ~/.cloudenvy.yml. This is where you define your cloud credentials.

cloudenvy:
  clouds:
    cloud01:
      os_username: username
      os_password: password
      os_tenant_name: tenant_name
      os_auth_url: http://keystone.example.com:5000/v2.0/

      # Optional
      #os_region_name: RegionOne

Now that you can actually connect to a cloud, it’s time to get your project set up. As an example of how things work I will outline how to launch Devstack as a cloud environment (Devstack is an OpenStack development environment).
The following would be in the Envyfile.yml in your project’s root directory:

project_config:
  name: devstack
  image: 'Ubuntu 12.04 cloudimg amd64'
  remote_user: ubuntu
  flavor_name: m1.large
  provision_scripts:
    - provision.sh
  sec_groups:
    - icmp, -1, -1, 0.0.0.0/0
    - tcp, 22, 22, 0.0.0.0/0
    - tcp, 80, 80, 0.0.0.0/0
    - tcp, 8770, 8770, 0.0.0.0/0
    - tcp, 8774, 8774, 0.0.0.0/0
    - tcp, 8775, 8775, 0.0.0.0/0
    - tcp, 8776, 8776, 0.0.0.0/0
    - tcp, 9292, 9292, 0.0.0.0/0
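
Each sec_groups entry reads as four comma-separated fields. The Go sketch below parses them under the assumption that the fields are protocol, from port, to port, and CIDR, with -1 standing for "any" on ICMP rules. That reading, and the Rule type and parseRule function, are my own illustration and not CloudEnvy code.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
	"strings"
)

// Rule is a hypothetical in-memory form of one sec_groups entry.
type Rule struct {
	Protocol string
	FromPort int
	ToPort   int
	CIDR     string
}

// parseRule splits "protocol, from, to, cidr" into a Rule and
// validates the ports and the CIDR block.
func parseRule(s string) (Rule, error) {
	parts := strings.Split(s, ",")
	if len(parts) != 4 {
		return Rule{}, fmt.Errorf("expected 4 fields, got %d", len(parts))
	}
	for i := range parts {
		parts[i] = strings.TrimSpace(parts[i])
	}

	from, err := strconv.Atoi(parts[1])
	if err != nil {
		return Rule{}, fmt.Errorf("bad from port: %w", err)
	}
	to, err := strconv.Atoi(parts[2])
	if err != nil {
		return Rule{}, fmt.Errorf("bad to port: %w", err)
	}
	if _, _, err := net.ParseCIDR(parts[3]); err != nil {
		return Rule{}, fmt.Errorf("bad cidr: %w", err)
	}

	return Rule{Protocol: parts[0], FromPort: from, ToPort: to, CIDR: parts[3]}, nil
}

func main() {
	entries := []string{
		"icmp, -1, -1, 0.0.0.0/0",
		"tcp, 22, 22, 0.0.0.0/0",
	}
	for _, entry := range entries {
		rule, err := parseRule(entry)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%+v\n", rule)
	}
}
```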

Currently the best way to provision an environment is by running a bash script. Note that in the Envyfile.yml we have defined a single provision script; now let’s actually flesh it out. The following bash script can live anywhere, as long as the path is correctly defined in the Envyfile.yml. In our example it’s located in the same directory as the Envyfile.yml, at provision.sh:

#!/bin/bash

# Skip ssh new host check.
cat<<EOF | tee ~/.ssh/config
Host *
  StrictHostKeyChecking no
  UserKnownHostsFile /dev/null
  User ubuntu
EOF

sudo apt-get update
sudo apt-get install -y git python-netaddr

git clone https://github.com/openstack-dev/devstack.git #-b stable/folsom

cd devstack/

cat<<LOCALRC | tee localrc
FIXED_RANGE=192.168.2.0/24
MYSQL_PASSWORD=secrete
RABBIT_PASSWORD=secrete
SERVICE_TOKEN=secrete
SERVICE_PASSWORD=secrete
ADMIN_PASSWORD=secrete
LOCALRC

./stack.sh

Once the provision script is written it’s time to actually launch your first cloud environment. For your first launch I recommend using the -v flag to get useful output for debugging. Running the following command will launch an instance using your public key, create a security group for this specific environment, allocate and assign a floating IP, and run the provision.sh script.

envy -v up

You should see output saying what CloudEnvy is doing, along with all of the output from the provision script. When your environment is complete it should return the instance IP address. In case you forget it and need it for something, you can always get it again by running:

envy ip

Having multiple environments makes it kind of difficult to remember the IPs for all of them, so we have a command which will ssh into your current project’s environment:

envy ssh

That’s really all you need to know to get started.

Where is envy going from here?

CloudEnvy is very useful in several different use cases, and I’ve been very happy to see it being used regularly by a whole slew of people from different backgrounds.

Going forward there are a few things on my priority list:

  1. Cloud Agnosticism: Right now CloudEnvy only works with OpenStack, but honestly that’s not enough. CloudEnvy needs to work with Amazon EC2, Rackspace Cloud, HP Cloud, and all other providers with a sane API.
  2. Multi-node Environments: Not sure if this is ever going to make it into CloudEnvy, but it would really be nice if there was a recommended, or at least documented, path for deploying multi-node development environments.
  3. Building a community: Tools mean nothing if there isn’t a community around them that documents what can be done and outlines best practices. I would love to see envyfiles for projects and platforms I have never heard of. Currently there are only a few examples of this, located here: http://github.com/cloudenvy

If you have any questions, or would like to contribute, feel free to hit me up on Twitter @jakedahn.