Unable to find image 'nginx:latest' locally
latest: Pulling from library/nginx
802b00ed6f79: Downloading [================================================> ] 21.63MB/22.49MB
5291925314b3: Download complete
bd9f53b2c2de: Download complete
The number of layers a container is using is related to the number of RUN and COPY/ADD commands that were used in the Dockerfile during that containers build.
Too many layers and your machines take a large amount of storage and can take a long time to download and extract. Too few layers and the machine takes F O R E V E R to rebuild.
How a Dockerfile builds
The way that a Dockerfile’s build happens is sequentially from the top to the bottom. The build is cached each time, so if you don’t change anything, and then rebuild the same container, the build will happen pretty instantaneously.
A Single RUN Command
If we look at the following Dockerfile, this container has 2 layers that is a smaller implementation of a container for this particular application.
FROM node:alpine
WORKDIR /app
COPY . /app
RUN npm install -q --no-color --no-progress \
&& npm run build
All of the work for this container is being done in a single RUN command. This means that any time that anything changes in the source application, the container is going to have to rerun install, fetch all the dependencies again, and then run our build.
Optimizing Your Dockerfile
Now that we know our Dockerfile is building from top to bottom, and each layered command is cached and only rerun when when something was changed, we can optimize our Dockerfile to make our life much better.
If we look at the optimized Dockerfile, we will notice that we’ve increased the number of layers to 6. The container has a few more layers, but overall the size of the machine has not increased.
FROM node:alpine
WORKDIR /app
COPY package.json /app/package.json
COPY package-lock.json /app/package-lock.json
RUN npm install -q --no-color --no-progress
COPY tsconfig.json /app/tsconfig.json
# The application source is here
COPY src /app/src/
RUN npm run build
CMD ["npm", "run", "serve"]
If we step through the Dockerfile above:
- First we are defining the work directory to
/app
, this is the directory we will copy our application into. - Next we are copying the
package.json
andpackage-lock.json
into the container. These are the pre-install dependencies, any changes to these files will cause the container to rerun all the following commands. - After we copy the dependency list’s into the app (the
package.json
file is a list of dependencies), then we go ahead and run the dependency install command, in the above example it is a node dependency install vianpm install
. The rest of the flags are just to make the install quiet. - After installing our dependencies we copy in our source application code. This way, when our source code changes and our dependencies don’t, we won’t need to rerun our install command.
- After copying in our source code, we need to run the build command, this command is only going to be run if the source code was changed.
- The CMD is the end command our container is actually running.
One More Example
Bundling a Rails app might look like:
FROM ruby
RUN apt-get update -qq && apt-get install -y build-essential libpq-dev nodejs \
&& mkdir /myapp \
&& rm -rf tmp/pids/* tmp/db
WORKDIR /myapp
COPY Gemfile /myapp/Gemfile
COPY Gemfile.lock /myapp/Gemfile.lock
RUN bundle install
COPY . /myapp
CMD ["bundle", "exec", "rails", "s", "-p", "3000", "-b", "0.0.0.0"]
Notice that we are copying our Gemfile
and Gemfile.lock
, installing our bundle dependencies, and then copying in our app and running the Rails engine.
Summary
Optimizing your Dockerfile can make rebuilds a lot more painless, while still keeping the slimness of your container and limiting the number of layers