I am trying to create a Docker image for my Python/Django application. Which method would be best to follow?
```
ADD requirements.txt /code/requirements.txt
RUN pip install -r requirements.txt
```
OR
```
RUN pip install package1 package2 package3
RUN pip install package4 package5
```
One good thing I find with the second approach is that the cache can be reused. Each new `pip install` line is only added when a new release is made, i.e., one release adds at most one line of `pip install` packages.
I feel that the first approach will invalidate the cache, since the requirements file changes whenever new packages are added, and then all the packages are reinstalled.
Which method do you prefer?
Please support your answer with reasons.
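For context, a common middle ground (a sketch; the `/code/` path is just illustrative) is to copy only the requirements file before installing, so that changes to application code do not invalidate the install layer — only a change to `requirements.txt` itself triggers a reinstall:

```
# Copy only the requirements file first; this layer and the install
# layer below stay cached until requirements.txt itself changes.
COPY requirements.txt /code/requirements.txt
RUN pip install -r /code/requirements.txt

# Copy the rest of the source afterwards; editing application code
# no longer invalidates the pip install layer above.
COPY . /code/
```

With this ordering, the first approach keeps most of the caching benefit attributed to the second.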
I follow a combination of both. I maintain a requirements file for the reasons you mentioned, and I install packages individually to optimize my build time.
I would go with the requirements file, but not because of Docker.
What if I move to a different platform, like Heroku? My Dockerfile won't be read. What if I want to run some tests on my local machine? Building the image for the sake of one test can take a long time, so why not install into a plain virtualenv? But then again, the Dockerfile is of no use.
I tend to keep portability in mind when developing. Putting requirements in the Dockerfile would kill that.
I do not know much about Python, but I built a few Python images when I was just starting out with Docker. IIRC, I used the requirements.txt file since it's sort of similar to the package.json file in JavaScript. Installing packages manually is going to be a hassle, yes?
Luis Orduz
Software Engineer
You're correct that keeping the packages in the Dockerfile instead of the requirements file might save some time during development. However, I prefer the requirements file because it's clearer and easier to understand, both for other developers and for yourself.
Requirements files also make it easier to use different files for different environments, thanks to their extensibility. In Docker, that could only be approximated with images that depend on one another, or with entirely separate Dockerfiles that all list the same packages (which breaks DRY), and it might eventually require introducing tools such as docker-compose.
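To illustrate the extensibility point (a sketch; the file names and packages are just examples), pip requirements files can include one another with the `-r` directive, so each environment's file only lists its own additions:

```
# base.txt — packages shared by every environment
Django>=4.2,<5.0

# dev.txt — pulls in everything from base.txt, plus dev-only tools
-r base.txt
pytest
```

Then production builds run `pip install -r base.txt` while developers run `pip install -r dev.txt`, with no package listed twice.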
There's also the issue of portability to keep in mind. If the team is already using Python, then Python and pip are a given for all the project's developers, while Docker is more optional.