In this tutorial, we will explain what Docker data volumes are and why they are useful. There are several types of volumes; we will show you how to use them and when to use each one. We will also walk through a couple of examples of using Docker volumes with the docker command line tool.
By the end of this tutorial, you should be comfortable creating and using each type of Docker data volume covered here.
Prerequisites
To follow this tutorial, you will need the following:
- Ubuntu 14.04 VPS
- A non-root user with sudo privileges (Initial Server Setup with Ubuntu 14.04 explains how to set this up.)
- Docker installed with the instructions from Step 1 of How To Install and Use Docker Compose on Ubuntu 14.04
An Explanation of Docker Containers
Working with Docker requires an understanding of several Docker-specific concepts, and much of the documentation focuses on explaining how to use Docker’s tool set without much explanation of why you would want to use any of those tools. This can be confusing if you are new to Docker, so we will begin with some basics before moving on to working with Docker containers. If you have worked with Docker before and would just like to know how to get started with data volumes, feel free to skip to the next section.
A Docker container is similar to a virtual machine. It lets you run a pre-packaged ‘Linux box’ inside a container. The main difference between a Docker container and a typical virtual machine is that a Docker container is not as isolated from the surrounding environment as a virtual machine is. A Docker container shares the Linux kernel with the host operating system, which means it does not need to ‘boot’ the way a virtual machine would.
Starting a Docker container is a fast and cheap operation. In many cases you can bring up a full Docker container, the equivalent of a regular virtual machine, in about the time it takes to run a regular command line program. This makes deploying complex systems a much simpler and more modular process, but it is a different paradigm from the usual virtual machine approach and has some unexpected side effects for people coming from the virtualization world.
Learning the Types of Docker Data Volumes
There are three main use cases for Docker data volumes:
- To keep data around when a container is removed
- To share data between the host filesystem and the Docker container
- To share data with other Docker containers
The third case is more advanced, so we will not go into it in this tutorial. The first two, however, are quite common.
In the first and simplest case, you just want the data to stay around even if you remove the container, so it is often easiest to let Docker manage where the data gets stored.
Keeping Data Persistent
Rather than creating a standalone ‘data volume’ directly, in this tutorial we will create a data volume container with a volume attached to it. For any other container that you want to connect to this data volume container, use Docker’s ‘--volumes-from’ option to take the volumes from this container and apply them to the current container. This may seem unusual at first glance, so we will run through a simple example of how to use this approach to make a file stick around even after the container that created it is removed.
We will begin by creating a new data volume container to store our volume.
docker create -v /tmp --name datacontainer ubuntu
This creates a container named ‘datacontainer’ based on the ubuntu image, with a volume at the directory ‘/tmp’.
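To confirm that the volume was attached, you can ask Docker to describe the container. The following is a minimal sketch, assuming a Docker release whose ‘docker inspect’ output includes a ‘Mounts’ field (older releases may report volumes under a different field). The ‘DOCKER’ variable and the ‘show_mounts’ function name are our own additions for illustration; set ‘DOCKER="sudo docker"’ if your user needs sudo.

```shell
# DOCKER is a convenience variable introduced for this sketch, not a Docker feature.
DOCKER="${DOCKER:-docker}"

# show_mounts NAME: print the mount list of container NAME as JSON.
show_mounts() {
  # $DOCKER is left unquoted on purpose so "sudo docker" splits into two words.
  $DOCKER inspect --format '{{ json .Mounts }}' "$1"
}
```

Running ‘show_mounts datacontainer’ should list an entry whose destination is ‘/tmp’.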
Next, if we run a new Ubuntu container with the ‘--volumes-from’ flag and run ‘bash’, anything we write to the ‘/tmp’ directory will be saved to the ‘/tmp’ volume of our ‘datacontainer’ container.
Now start a container from the Ubuntu image.
docker run -t -i --volumes-from datacontainer ubuntu /bin/bash
The ‘-t’ command line flag allocates a terminal inside the container, and the ‘-i’ flag makes the connection interactive.
Once in the ‘bash’ prompt for the Ubuntu container, create a file in ‘/tmp’.
echo "This is an example" > /tmp/helloworld
Now you may use ‘exit’ to return to your host machine’s shell. Afterwards, execute the same command again.
docker run -t -i --volumes-from datacontainer ubuntu /bin/bash
The file ‘helloworld’ should now be there.
cat /tmp/helloworld
You should see:
Output of cat /tmp/helloworld
This is an example
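The same check can be done without an interactive shell. This is a sketch under a few assumptions: the ‘DOCKER’ variable and both function names are hypothetical helpers, and the ‘--rm’ flag is our addition so that each short-lived container is removed when it exits while ‘datacontainer’ and its volume stay in place.

```shell
# DOCKER is a convenience variable introduced for this sketch (try DOCKER="sudo docker").
DOCKER="${DOCKER:-docker}"

# write_example NAME: write the example file into the volume of data container NAME.
write_example() {
  $DOCKER run --rm --volumes-from "$1" ubuntu sh -c 'echo "This is an example" > /tmp/helloworld'
}

# read_example NAME: print the file back from a second, fresh container.
read_example() {
  $DOCKER run --rm --volumes-from "$1" ubuntu cat /tmp/helloworld
}
```

Calling ‘write_example datacontainer’ followed by ‘read_example datacontainer’ should print the same line as the interactive session, since both containers use the volume held by ‘datacontainer’.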
You can add as many ‘--volumes-from’ flags as you like, for example if you want to assemble a container that uses data from several data containers. You can also create as many data volume containers as you like.
The only caveat to this approach is that you can only choose the mount path inside the container (‘/tmp’ in our example) when you create the data volume container.
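Combining several data containers can be sketched as follows. The helper name ‘run_with_volumes’ and the ‘DOCKER’ variable are hypothetical, and any second container name you pass (such as ‘otherdata’) is assumed to be a data volume container you have already created with a different mount path.

```shell
# DOCKER is a convenience variable introduced for this sketch (try DOCKER="sudo docker").
DOCKER="${DOCKER:-docker}"

# run_with_volumes NAME...: start an interactive Ubuntu shell that mounts
# the volumes of every data container named on the command line.
run_with_volumes() {
  flags=""
  for name in "$@"; do
    # One --volumes-from flag per data container.
    flags="$flags --volumes-from $name"
  done
  # $flags is intentionally unquoted so it splits into separate arguments.
  $DOCKER run -t -i $flags ubuntu /bin/bash
}
```

For example, ‘run_with_volumes datacontainer otherdata’ expands to a single ‘docker run’ with two ‘--volumes-from’ flags.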
Sharing Data Between the Host and the Docker Container
Another common use for Docker volumes is sharing files between the host machine and the Docker container. This works differently from the last example. You do not have to create a ‘data-only’ container first. You can simply run a container from any Docker image and replace one of its directories with the contents of a directory on the host system.
Here is a real-world example. Let’s say you want to use the official Docker Nginx image but also want to keep a permanent copy of Nginx’s log files to analyze later. By default, the Nginx Docker image logs to the ‘/var/log/nginx’ directory, but this is ‘/var/log/nginx’ inside the Docker Nginx container, which is not normally reachable from the host filesystem.
Let’s create a folder to store the logs and then run a copy of the Nginx image with a shared volume, so that Nginx writes its logs to the host’s filesystem instead of to ‘/var/log/nginx’ inside the container.
mkdir ~/nginxlogs
Now, start the container.
docker run -d -v ~/nginxlogs:/var/log/nginx -p 5000:80 -i nginx
This run command is a bit different from the ones we have used so far, so let’s break it down into three parts.
‘-v ~/nginxlogs:/var/log/nginx’ – This sets up a volume that links the ‘/var/log/nginx’ directory inside the Nginx container to the ‘~/nginxlogs’ directory on the host machine.
Docker uses a ‘:’ to separate the host’s path from the container’s path, and the host path always comes first.
‘-d’ – This detaches the process and runs it in the background. Otherwise, you would just be watching an empty Nginx prompt and would not be able to use this terminal until you killed Nginx.
‘-p 5000:80’ – This sets up a port forward. The Nginx container listens on port 80 by default, and this maps the container’s port 80 to port 5000 on the host system.
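To make the host-first ordering of ‘-v’ and ‘-p’ concrete, here is a small sketch that assembles the same command from a host directory and a host port. The function name ‘run_nginx_with_logs’ and the ‘DOCKER’ variable are hypothetical. The sketch also resolves the host directory to an absolute path, on the assumption that Docker expects the host side of ‘-v’ to be absolute (the shell already expands ‘~’ for you in the original command).

```shell
# DOCKER is a convenience variable introduced for this sketch (try DOCKER="sudo docker").
DOCKER="${DOCKER:-docker}"

# run_nginx_with_logs HOST_DIR HOST_PORT
# Starts Nginx in the background, writing its logs to HOST_DIR on the host
# and answering on HOST_PORT on the host.
run_nginx_with_logs() {
  host_dir=$1
  host_port=$2
  mkdir -p "$host_dir"
  # Resolve to an absolute path for the host side of the volume.
  abs_dir=$(cd "$host_dir" && pwd)
  # Host values come first in both flags: HOST_DIR:CONTAINER_DIR and HOST_PORT:CONTAINER_PORT.
  $DOCKER run -d -v "$abs_dir:/var/log/nginx" -p "$host_port:80" -i nginx
}
```

Calling ‘run_nginx_with_logs ~/nginxlogs 5000’ reproduces the command used in this tutorial.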
If you have been paying close attention, you may have noticed one other difference from the previous run commands. Until now we have ended each run statement with a command, ‘/bin/bash’, to tell Docker what to run inside the container. Since the Nginx image is an official Docker image, it follows Docker best practices, and the creator of the image configured it to start Nginx automatically. We can drop the usual ‘/bin/bash’ here and let the creators of the image choose which command to run in the container for us.
You now have a copy of Nginx running inside a Docker container on your machine, and your host machine’s port 5000 maps directly to that copy of Nginx’s port 80. Use curl to make a quick test request.
curl localhost:5000
You should get a screenful of HTML back from Nginx, showing that Nginx is up and running. If you look in the ‘~/nginxlogs’ folder on the host machine and open the ‘access.log’ file, you should see a log message from Nginx showing your request.
cat ~/nginxlogs/access.log
You should see something similar to the following:
Output of cat ~/nginxlogs/access.log
172.17.42.1 - - [23/Oct/2015:05:22:51 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.35.0" "-"
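Since the log now lives on the host, ordinary host tools can read it. As a small illustration, the sketch below pulls the client address, request path, and HTTP status out of each line with awk. The function name ‘summarize_requests’ is hypothetical, and the field positions assume the log format shown in the sample output above.

```shell
# Sample line copied from the output above.
sample='172.17.42.1 - - [23/Oct/2015:05:22:51 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.35.0" "-"'

# summarize_requests: read access-log lines on stdin and print the
# client address, request path, and HTTP status for each one.
summarize_requests() {
  awk '{ print $1, $7, $9 }'
}

printf '%s\n' "$sample" | summarize_requests
# prints: 172.17.42.1 / 200
```

To run it against the real file, use ‘summarize_requests < ~/nginxlogs/access.log’.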
If you make any changes to the ‘~/nginxlogs’ folder, you will be able to see them from inside the Docker container in real time as well.
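One way to see this for yourself is to create a file on the host and then list the directory from inside the running container. This is a sketch only: ‘show_shared_file’, the ‘DOCKER’ variable, and the file name ‘from-host.txt’ are hypothetical, and CONTAINER is assumed to be the ID or name that ‘docker ps’ reports for your Nginx container.

```shell
# DOCKER is a convenience variable introduced for this sketch (try DOCKER="sudo docker").
DOCKER="${DOCKER:-docker}"

# show_shared_file CONTAINER [HOST_DIR]
# Creates a marker file in HOST_DIR (default: ~/nginxlogs) and lists
# /var/log/nginx as seen from inside CONTAINER.
show_shared_file() {
  container=$1
  host_dir=${2:-"$HOME/nginxlogs"}
  touch "$host_dir/from-host.txt"
  $DOCKER exec "$container" ls /var/log/nginx
}
```

If the volume is shared as described, the listing should include ‘from-host.txt’ alongside the Nginx log files.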
Conclusion
This tutorial covered how to create data volume containers whose volumes you can use to persist data in other containers, as well as how to share folders between the host filesystem and a Docker container.