Issue
This Content is from Stack Overflow. Question asked by John Doe
I’ve been recently refactoring a Dockerfile and decided to try ADD over RUN curl to make the file cleaner. To my surprise, this resulted in quite a size difference:
$ docker images | grep test
test curl 3aa809928665 7 minutes ago 746MB
test add da152355bb4d 3 minutes ago 941MB
Even more surprisingly, I tried a few Dockerfiles that do nothing except ADDing or curling things, and their sizes are identical. I also tried with and without BuildKit; the result is the same (although without BuildKit the images are slightly smaller).
Here’s the actual Dockerfile on Pastebin. I don’t understand why this happens with this particular Dockerfile, because essentially I’m doing exactly the same things.
Any ideas?
Solution
You see this because files brought in with ADD do not disappear from earlier image layers even if you remove them in a later instruction. Consider the following Dockerfiles:
# a
FROM alpine:latest
RUN apk add --no-cache curl
ADD https://www.python.org/ftp/python/3.10.7/Python-3.10.7.tar.xz Python.tar.xz
RUN rm Python.tar.xz
# b
FROM alpine:latest
RUN apk add --no-cache curl
RUN curl -o Python.tar.xz https://www.python.org/ftp/python/3.10.7/Python-3.10.7.tar.xz
RUN rm Python.tar.xz
# c
FROM alpine:latest
RUN apk add --no-cache curl
RUN curl -o Python.tar.xz https://www.python.org/ftp/python/3.10.7/Python-3.10.7.tar.xz && \
    rm Python.tar.xz
Building each of them in the same context, I got the following results:
REPOSITORY   TAG      IMAGE ID       CREATED          SIZE
<none>       <none>   cc79832a5ffa   9 seconds ago    27.3MB
<none>       <none>   87ea16448764   13 seconds ago   7.68MB
<none>       <none>   7f794f03b960   18 seconds ago   27.3MB
alpine       latest   9c6f07244728   5 weeks ago      5.54MB
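(Guess which file yields the different result: it is c, where the download and the removal happen in the same RUN instruction, so the archive never gets committed to a layer.)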
If at some point you "finish" a layer that contains files you don’t need in the final image, that space is wasted: each instruction produces its own layer, and a later rm only hides the file without shrinking the layer that stored it. So your single RUN command is the most efficient. To improve readability, you could use a multi-stage build here, so that all curl/ADD and unzip/tar -x commands are isolated in a build stage, and the deploy stage only copies the required binaries from it. I’m not sure, however, that you’ll gain much here.
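A minimal sketch of the multi-stage idea, reusing the Python tarball from the examples above. The stage name `build` and the paths `/tmp/Python.tar.xz` and `/opt/src` are hypothetical, chosen only for illustration; the real Dockerfile would copy whatever binaries it actually needs. It also assumes the `tar` shipped in the base image can unpack `.tar.xz` archives.

```dockerfile
# Build stage: the download and extraction happen here.
# None of these layers are part of the final image.
FROM alpine:latest AS build
# ADD with a remote URL downloads the file but does not unpack it.
ADD https://www.python.org/ftp/python/3.10.7/Python-3.10.7.tar.xz /tmp/Python.tar.xz
# Assumption: the image's tar supports xz (-J); /opt/src is a hypothetical path.
RUN mkdir -p /opt/src && tar -xJf /tmp/Python.tar.xz -C /opt/src

# Deploy stage: starts from a clean base and copies only the extracted tree.
# The tarball in /tmp stays behind in the build stage.
FROM alpine:latest
COPY --from=build /opt/src /opt/src
```

With this layout the ADD line can stay readable, because the layer holding the archive belongs to a stage that is discarded. The final image only pays for what COPY --from brings over.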
This question was asked on Stack Overflow by John Doe and answered by SUTerliakov. It is licensed under the terms of CC BY-SA 2.5 - CC BY-SA 3.0 - CC BY-SA 4.0.