Quite often, a simple-looking Dockerfile is the reason your final image is much larger than it needs to be. A large image has several disadvantages, the foremost being that it hinders your ability to scale quickly, since every new host has to pull the image before it can start a container.
1. Base Image
Check whether you are using the smallest base image that suits your application/service. A word of caution, though: some of the popular smaller base images (e.g., Alpine Linux) may not be an option for everyone, due to organizational restrictions that require using a hardened image.
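As a sketch of what this choice looks like in practice, many official images on Docker Hub publish several variants of the same version. The Python image is used here purely as an assumed example; check which variants your own base image offers and which ones your organization permits.

```dockerfile
# Default variant: built on a full Debian base with build tools included
# FROM python:3.12

# Slim variant: same Python version on a reduced Debian base
FROM python:3.12-slim

# Alpine variant: smaller still, but uses musl libc instead of glibc
# FROM python:3.12-alpine

CMD ["python", "--version"]
```

Building the same Dockerfile once per variant and comparing the results with docker images is a quick way to see the difference for your own workload.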
2. Packages & Package Manager
It is critical to inspect the packages being installed and to install only those that are essential to run your application. Certain package managers, like ‘apt’, also install ‘good-to-have’ (recommended) packages by default. Look for the option that disables these recommendations.
apt-get install --no-install-recommends package-x
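As a minimal sketch, this is how the flag might look inside a Dockerfile. The base image tag and the two packages (curl, ca-certificates) are illustrative assumptions, not taken from any particular project.

```dockerfile
# Illustrative base image; substitute the one your organization approves
FROM debian:bookworm-slim

# --no-install-recommends skips the optional "recommended" packages,
# so only the named packages and their hard dependencies are installed
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl ca-certificates \
    && rm -rf /var/lib/apt/lists/*
```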
3. Reduce lines of instructions
By design, each instruction in the Dockerfile adds a layer to the image. This design is extremely helpful for caching, but it also inherently adds to the size of the image. Wherever possible, combine multiple commands into a single RUN instruction by chaining them with &&.
RUN apt-get update && apt-get install -y \
    package-x \
    package-y \
    package-z \
    && rm -rf /var/lib/apt/lists/*
NOTE: This applies only to the ADD, COPY and RUN instructions, since these are the ones that create layers that increase the image size.
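To see why the cleanup belongs in the same RUN instruction, consider the sketch below. Files removed by a later instruction still exist in the earlier layer, so the image does not get any smaller. The base image and the package (curl) are assumptions chosen for illustration.

```dockerfile
FROM debian:bookworm-slim

# Anti-pattern (commented out): three instructions, three layers.
# The final rm only hides the apt lists; the layer created by
# "apt-get update" still carries them in the image.
# RUN apt-get update
# RUN apt-get install -y curl
# RUN rm -rf /var/lib/apt/lists/*

# Preferred: one instruction, one layer. The apt lists are removed
# before the layer is committed, so they never add to the image size.
RUN apt-get update && apt-get install -y \
    curl \
    && rm -rf /var/lib/apt/lists/*
```

Running docker history on the resulting image lists the size contributed by each layer, which makes this kind of waste easy to spot.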
4. Multi-stage Builds
A Dockerfile can be divided into stages: an intermediate stage (image) installs packages, runs the build steps and generates an artefact/output, which is then copied over to the final image. This significantly reduces the image size, because the build-time packages and the intermediate build output are left out of the final image.
A sample template:
# your intermediate image
FROM optimal_base_image AS intermediate
COPY files_necessary_for_your_package_manager ./
RUN install_package_build

# your final image
FROM optimal_base_image
COPY --from=intermediate /artefact /artefact
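To make the template more concrete, here is a hedged sketch for a hypothetical Go service. It assumes a project with go.mod and go.sum at the root of the build context; the image tags, the stage name (build) and the output path (/out/app) are illustrative choices, not requirements.

```dockerfile
# Build stage: contains the Go toolchain and the full source tree
FROM golang:1.22 AS build
WORKDIR /src
# Copy the module files first so the dependency download can be cached
COPY go.mod go.sum ./
RUN go mod download
COPY . .
# CGO_ENABLED=0 disables cgo, so the binary does not depend on the
# C libraries of the build image
RUN CGO_ENABLED=0 go build -o /out/app .

# Final stage: only the compiled binary is carried over
FROM debian:bookworm-slim
COPY --from=build /out/app /usr/local/bin/app
ENTRYPOINT ["/usr/local/bin/app"]
```

The Go toolchain, the downloaded modules and the source code stay behind in the build stage, so none of them count toward the size of the final image.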
5. .dockerignore
A .dockerignore file lets you list the files and file patterns that should be excluded from the build context, and therefore never copied to your image. Quite often a COPY step copies whole directories, which may include generated temp files and other build outputs that are not required in your final image. The .dockerignore file is a fail-safe mechanism for keeping those files out during a COPY instruction.
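The sketch below shows how a .dockerignore interacts with a broad COPY. The ignore entries appear as comments because they live in a separate file next to the Dockerfile; the specific patterns, the base image and the /app directory are assumptions for illustration only.

```dockerfile
# Assumed contents of a .dockerignore file placed at the root of the
# build context (a separate file, one pattern per line):
#
#   .git
#   node_modules
#   tmp
#   **/*.log

FROM debian:bookworm-slim
WORKDIR /app
# The broad COPY below brings the whole project directory into the image,
# except for anything matched by the patterns listed above
COPY . .
```

A side benefit is that the excluded files are never sent to the Docker daemon in the first place, which can also shorten the build.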
Conclusion
Apart from the scaling aspects, a smaller image also helps maintain your security posture: image scans are faster, and vulnerabilities are easier to keep in check when only a limited set of packages is installed.
NOTE: If the instructions are ordered correctly, with the least frequently changed instructions at the top, you also get the benefits of layer caching, which results in faster image builds.
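A minimal sketch of that ordering, assuming a hypothetical Python application with a requirements.txt and an app.py entry point (both names are placeholders):

```dockerfile
FROM python:3.12-slim
WORKDIR /app

# Changes rarely: the dependency manifest and the install step.
# These layers are reused from cache until requirements.txt changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Changes often: the application source. Only the layers from here
# down are rebuilt after a code edit.
COPY . .
CMD ["python", "app.py"]
```

If the COPY . . line came before the pip install step, every source change would invalidate the cache and reinstall all dependencies.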