Sunday, October 5, 2014

Docker: the lightweight virtual machine on the Linux platform

Docker is an open source technology that facilitates the deployment of software in containers by providing operating-system-level virtualization and resource isolation. It is a very convenient tool for creating self-contained software environments, each with its own view of the process tree, memory, network and installed software.

It is very lightweight compared to virtual machines, yet it provides many of the facilities that virtual machines offer. For example, let us assume we are developing a real-time big-data analytics product and we need Elasticsearch, Memcached and Cassandra clusters for our back-end storage. We start prototyping using a single instance of each of these components. Then we start working with real clusters, and developers will generally have them installed on virtual machines (we could create the clusters on a single machine by using a different port for each instance, but that is an inconvenient way to do things). VMs are generally heavyweight. What if we could have some lightweight, isolated process containers to run each instance of Cassandra, Elasticsearch or Memcached? Yes, we can use Docker to solve our problem by running them in a "virtually real" distributed world!


In this post I will explain how to create images for Docker containers and run them on Ubuntu 14.04. The steps can easily be mapped to other Linux distributions or other versions of Ubuntu.

Let us install the Docker software (the docker.io package) first.

$ sudo apt-get update
$ sudo apt-get install docker.io

Now let us create a Docker image based on Ubuntu 14.04 and add memcached to it. In short, we will have a memcached Docker image on Ubuntu 14.04.
Let us create a Docker file (Dockerfile) in a directory, say /home/geet/test, and put the lines below in it:

# Dockerfile content
FROM ubuntu:14.04
MAINTAINER "Put your name"
RUN apt-get update
RUN apt-get install -y memcached
ENTRYPOINT ["/usr/bin/memcached"]
VOLUME ["/home/geet/sws"]

The first line, "FROM ubuntu:14.04", specifies the base image ubuntu with the tag 14.04. Docker build looks up the image on the current host and, if it cannot find it there, downloads it from Docker Hub, the public Docker image registry.
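
To make the name:tag convention concrete, here is a small Python sketch that splits an image reference into its repository and tag. The function name split_image_ref is made up for this post and is not part of Docker. It only covers the simple forms used here (it ignores digest references) and falls back to the tag "latest", which is what Docker assumes when no tag is given.

```python
def split_image_ref(ref, default_tag="latest"):
    """Split 'repository:tag' into (repository, tag).

    Illustrative helper only, not a Docker API.
    """
    name, sep, tag = ref.rpartition(":")
    # No colon at all, or the colon belongs to a registry host:port
    # (the part after it still contains a "/"): there is no explicit tag.
    if not sep or "/" in tag:
        return ref, default_tag
    return name, tag


print(split_image_ref("ubuntu:14.04"))   # ('ubuntu', '14.04')
print(split_image_ref("ubuntu"))         # ('ubuntu', 'latest')
```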

MAINTAINER specifies the name of the maintainer and can be omitted.
The RUN instruction executes a command while the image is being built (here it runs "apt-get update" and "apt-get install -y memcached").

ENTRYPOINT specifies the command to execute when we run a container based on the image. Note that memcached runs in the foreground and not as a daemon, because a Docker container exits as soon as the command specified in ENTRYPOINT exits. Hence we keep the command running in the foreground.

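VOLUME declares one or more mount points inside the container (here /home/geet/sws) whose contents live outside the container's own filesystem. The instruction itself cannot name a host directory; to mount a specific host directory there, pass the -v option to "docker run" (for example, -v /host/dir:/home/geet/sws, where /host/dir is a placeholder). Other containers can share the volume with --volumes-from.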

We create the new Docker image by issuing the commands below:

$ cd /home/geet/test  # this directory has the Dockerfile
$ sudo docker build -t "nipun/memcached:ubuntu14.0"  .

The "-t" option gives the new image a name and, optionally, a tag, in the form name:tag.

After the build is complete we will get a new image ready to run memcached. Let us check the output of the below commands:

$ sudo docker images
REPOSITORY          TAG                 IMAGE ID            CREATED             VIRTUAL SIZE
nipun/memcached     ubuntu14.0          7416281a318d        6 hours ago         217.4 MB
ubuntu              14.04               6b4e8a7373fe        3 days ago          194.9 MB

So we now have two images, and the image nipun/memcached:ubuntu14.0 was created successfully.

Now let us start three containers from the image:

$ sudo docker run -h m1 --name="memcache1" -P nipun/memcached:ubuntu14.0 -u root
#-- Press Ctrl-C after a few seconds
$ sudo docker run -h m2 --name="memcache2" -P nipun/memcached:ubuntu14.0 -u root
#-- Press Ctrl-C after a few seconds
$ sudo docker run -h m3 --name="memcache3" -P nipun/memcached:ubuntu14.0 -u root
#-- Press Ctrl-C after a few seconds

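The -P option publishes every port that the image declares with EXPOSE to a port on the host. Memcached listens on port 11211; our Dockerfile has no EXPOSE instruction, so clients running on the host connect to that port directly on each container's IP address on the Docker bridge network. The trailing "-u root" is not a Docker option: everything after the image name is appended to the ENTRYPOINT command, so memcached starts as "/usr/bin/memcached -u root".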

Now let us run a test to check that our memcached cluster is running fine. We restart the containers we stopped above and install our favourite Python memcached client.

$ sudo docker start memcache1 memcache2 memcache3
$ sudo pip install python-memcached
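
The test script needs the IP address of each container. As a hedged sketch, the addresses can be collected from Python by calling "docker inspect". The helper names container_ip and server_list are hypothetical, the format string assumes the containers sit on Docker's default bridge network, and the sketch assumes the user running it is allowed to call docker (otherwise put "sudo" at the front of the command list).

```python
import subprocess

# Go template understood by "docker inspect"; assumes the default bridge network.
INSPECT_FORMAT = "{{ .NetworkSettings.IPAddress }}"


def container_ip(name, run=subprocess.check_output):
    """Return the IP address of a running container as a string.

    `run` is injectable so the function can be tried without Docker.
    """
    out = run(["docker", "inspect", "--format", INSPECT_FORMAT, name])
    return out.decode("ascii").strip()


def server_list(names, port=11211, run=subprocess.check_output):
    """Build the 'ip:port' strings that memcache.Client expects."""
    return ["%s:%d" % (container_ip(name, run), port) for name in names]


# Usage on the Docker host (not executed here):
#   servers = server_list(["memcache1", "memcache2", "memcache3"])
#   client = memcache.Client(servers)
```

Keeping the command runner as a parameter is only a convenience for trying the helper on a machine without Docker.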

Then we run the script below:

# script memcache_test.py
#
import memcache

# Pass the list of servers to the memcache.Client API.
# These are the containers' IP addresses; yours may differ
# (check with "sudo docker inspect memcache1").
client = memcache.Client(['172.17.0.5:11211', '172.17.0.6:11211', '172.17.0.7:11211'])
client.set('testkey', 'This is value for the testkey')
val = client.get('testkey')
print(val)
if val == 'This is value for the testkey':
    print('Got correct value. Success!!!!!')
client.disconnect_all()


$ python memcache_test.py
The output:
This is value for the testkey
Got correct value. Success!!!!!
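
Note that a single key is stored on only one of the three servers, because the client hashes each key to pick a server, so the test above exercises just one container. A rough way to check every server individually is to send the "version" command of memcached's text protocol to each address. The sketch below uses only the Python standard library. The function names are illustrative, the addresses are the ones assumed in the script above, and it assumes the short reply arrives in a single read.

```python
import socket


def parse_version(reply):
    """Extract 'x.y.z' from a b'VERSION x.y.z\\r\\n' reply, else None."""
    parts = reply.decode("ascii", "replace").split()
    if len(parts) >= 2 and parts[0] == "VERSION":
        return parts[1]
    return None


def memcached_version(host, port=11211, timeout=2.0):
    """Ask one memcached server for its version over the text protocol."""
    sock = socket.create_connection((host, port), timeout=timeout)
    try:
        sock.sendall(b"version\r\n")
        reply = sock.recv(1024)
    finally:
        sock.close()
    return parse_version(reply)


# Usage on the Docker host (not executed here):
#   for ip in ["172.17.0.5", "172.17.0.6", "172.17.0.7"]:
#       print(ip, memcached_version(ip))
```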

So our pseudo-distributed memcached cluster, with each memcached server running in its own Docker container, just worked!