Node.js Best Practices for Docker
TL;DR
- Use a single `Dockerfile` and `docker-compose.yml` where possible for simplicity.
- Use bind mounts for development to sync changes without rebuilding images.
- Be cautious: host files can override files in Docker (`host > docker`).
- Avoid installing dependencies on the host if using a different OS than the Docker image.
- Use `WORKDIR` instead of `RUN mkdir` to create and switch directories.
- Prefer `COPY` over `ADD` unless extracting tar files or copying from URLs.
- Run your containers as a non-root user (`USER node`) for better security.
Introduction
This article provides best practices for Dockerizing Node.js applications for both development and production.
It assumes basic knowledge of Docker, Dockerfiles, and `docker-compose`. Instead of a step-by-step guide, I'll focus on practical snippets and insights for an optimized setup.
While the concepts here apply primarily to Node.js, many of them can be used for other languages as well. However, every project has different requirements, so consider these best practices as guidelines rather than strict rules.
Using Docker for Development
Before we proceed with the actual best practices, I'd like to talk a bit about using Docker for development first. This is because using Docker for dev will define some of our constraints when writing our Dockerfile for production.
You can actually use multiple `Dockerfile` and `docker-compose.yml` files by providing the `-f` flag.
Here's an example where we run `docker-compose up` using `docker-compose.dev.yml` as the docker-compose file:
```bash
docker-compose -f docker-compose.dev.yml up
```
Here's another example where we build an image using a custom Dockerfile name like `Dockerfile.prod`. This also allows us to specify a Dockerfile in a different path, which can be useful in certain scenarios like monorepos (note the `.` at the end, which sets the build context):

```bash
docker build -t mplibunao/some-app -f ./server/Dockerfile.prod .
```
Here's an example of a docker-compose file that specifies a custom Dockerfile or a custom path to it:

```yaml
version: "3.6"
services:
  server:
    build:
      context: ./
      dockerfile: ./server/Dockerfile
```
For the most part though, the Dockerfile for dev and prod should be similar. In fact, it is best to use one Dockerfile for all of your environments instead of one for each (`dev`, `ci`, `prod`). This is because using multiple files creates too many combinations of flags, commands, and files.
Here is an example of `package.json` scripts which use different `docker-compose.yml` files:

```json
"scripts": {
  "docker:prod": "sudo docker-compose -f docker-compose.prod.yml up -d",
  "docker:dev": "docker-compose -f docker-compose.dev.yml up",
  "docker:test": "docker-compose -f docker-compose.test.yml up --abort-on-container-exit",
  "docker:seed:prod": "sudo docker-compose -f docker-compose.prod.yml run rental-pay-api sh -c 'npm run seed'",
  "docker:seed:dev": "docker-compose -f docker-compose.dev.yml run rental-pay-api sh -c 'npm run seed'",
  "docker:test-watch": "docker-compose -f docker-compose.test.yml run rental-pay-api yarn test-watch"
},
```
This is just an example, as you might not even use `docker-compose` for environments other than dev. But you get the idea. In any case, we will talk in more detail about how to address this issue later.
Why Docker for dev?
I've met a lot of very talented devs who do not like using Docker for development, which is fine as each approach has its own trade-offs.
Pros:
- Eliminates problems with managing software versions (think `postgres`, `node`). Note that you can also handle this using version managers like `asdf`, `nvm`, `rvm`.
- Easy on-boarding for devs. Allows developers to move across projects with ease, especially juniors working remotely.
  - To be fair, for most small or monorepo projects this might not even be a problem.
  - But for more complex architectures with lots of moving parts and services, it makes me feel better knowing that juniors will be able to start contributing much faster.
- Allows you to deploy anywhere for prod (`heroku`, `dokku`, `digital ocean`, `aws`).
- If you're using Docker for prod, you might as well use Docker for dev. This has the added benefit of being as close to production as possible.
Cons:
- For setups which are not fully dockerized (e.g. `postgres` running on Docker but the rest not), it's a bit harder to make the dockerized services talk to the non-dockerized ones.
  - Yes, there are work-arounds like `host.docker.internal`, but they don't work the same across different operating systems.
  - I prefer all services running under `docker-compose`, so everything runs in a single Docker network.
- Docker uses more memory. You might encounter difficulty running apps that use multiple containers (you can adjust the memory usage).
- Learning curve. You need to learn Docker to use Docker. You also encounter lots of small issues, like permission errors when editing files generated by containers.
- Knowing Docker will not make you good at devops. You still need to know about sysadmin or cloud topics like file permissions, load balancers, etc.
Using docker-compose for development
For development, I like to use `docker-compose` since it handles a lot of stuff for me like networking, volumes, and env variables through the `docker-compose.yml`. I will walk through the important things you need to get up and running for dev.
Volumes
Persistent mounts are important when using Docker for dev, as you need to persist the changes you are making to the code.
There are two mount types: volumes and bind mounts.
- Volumes - allow you to persist data outside of the container's UFS (Union File System).
  - Similar to bind mounts, but stored in a part of the host filesystem managed by Docker.
  - On Linux machines, you can navigate to the location returned when you run `docker inspect <volume-name-or-id>` (shown after the compose example below).
  - On Windows/Mac it uses a bit of magic: Docker actually runs a Linux VM, so the data lives inside that VM (and is thus not directly accessible).
  - Volumes can be anonymous or named.
```yaml
# docker-compose.yml with postgres using a named volume
# (I'm not sure what the anonymous volume syntax is for docker-compose.yml,
# but it doesn't really matter since named volumes are better anyway)
version: "3.6"
services:
  postgres:
    image: postgres:13.2
    restart: always
    ports:
      - "5432:5432"
    volumes:
      - db_data:/var/lib/postgresql/data
    env_file:
      - .env
volumes:
  db_data:
```
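To see where Docker stores a named volume on a Linux host, you can inspect it. A quick sketch (the volume name `todo_db_data` is hypothetical; compose prefixes volume names with the project name):

```bash
$ docker volume inspect todo_db_data
[
    {
        "Driver": "local",
        "Mountpoint": "/var/lib/docker/volumes/todo_db_data/_data",
        "Name": "todo_db_data",
        ...
    }
]
```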
- Bind mounts - mount or share a host directory or file into a container.
  - Links a container path to a host path (e.g. `~/projects/todo` on the host machine links to `/app` inside the container).
  - The file or directory does not need to already exist on the Docker host; it is created on demand if it does not exist yet.
  - This can be bad because you are relying on the host machine's filesystem.

E.g. let's say we have the following `docker-compose.yml` for our todo project:
```yaml
version: "3.6"
services:
  web:
    volumes:
      # Creates bind mount between `~/projects/todo/node_modules` and `/app/node_modules`
      - ./node_modules:/app/node_modules
    ports:
      - "3000:3000"
```
The volumes part of the yaml file links `~/projects/todo/node_modules` and `/app/node_modules`.
When you run `yarn install` during the image build process, it creates a `node_modules` folder with dependencies inside the container.
If you just pulled the git repo, you would likely not have a `node_modules` in `~/projects/todo` on your host machine without running `yarn install`.
This becomes an issue when you run `docker-compose up`, as the bind mount is created via the `docker-compose.yml`. Since `~/projects/todo/node_modules` does not exist on the host, Docker creates an empty directory there and mounts it over `/app/node_modules` inside your container, essentially hiding the dependencies that were installed at build time.
The same is true if you have an empty `node_modules` on your host machine; the `node_modules` inside the container will appear empty as well.
Thus you should be careful when using bind mounts, as they can hide files in your container if the host side does not exactly match.
Another possible case is if you are using a different OS like macOS and your container is using a Linux distro like Ubuntu or Alpine.
If you run `yarn install` on your host machine and then `docker-compose up`, you might encounter issues with your dependencies.
This is because some dependencies are built differently on different operating systems (e.g. `argon2`, maybe `bcrypt`, or some filesystem dependency).
So when you `docker-compose up`, the `node_modules` from inside your container gets overwritten by your macOS `node_modules`, which can cause dependency errors.
So if bind mounts are so dangerous, why use them at all?
- Bind mounts allow the container to change the host filesystem. E.g. when you install a node dependency or generate a migration file using Docker, the migration file or dependency files are added to your host filesystem as well.
  - I realized this is not really a big deal, since you can generate your migration file from the host and it will not be an issue, as host files overwrite Docker's.
- More importantly, they allow you to write code without constantly needing to rebuild your Docker image.
- Bind mounts are also very performant, which is perfect for development.
```yaml
version: "3.6"
services:
  web:
    volumes:
      # Creates bind mount between `~/projects/todo` and `/app`
      # Keeps your host and docker filesystems in sync
      - .:/app
    ports:
      - "3000:3000"
```
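One common pattern worth knowing (not from the compose file above, but a widely used workaround for the `node_modules` problem described earlier) is to bind mount the source code while shadowing `node_modules` with an anonymous volume, so the container keeps the dependencies it installed at build time:

```yaml
version: "3.6"
services:
  web:
    volumes:
      # Bind mount the source code for live reloading
      - .:/app
      # The anonymous volume takes precedence over the bind mount for this path,
      # so the container keeps its own Linux-built node_modules
      - /app/node_modules
    ports:
      - "3000:3000"
```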
Note: Similar to volumes, you can also create named bind mounts:
```yaml
version: "3.6"
services:
  web:
    volumes:
      - web:/app
    ports:
      - "3000:3000"
volumes:
  web:
    # bind the named volume to the current working directory
    driver_opts:
      type: none
      device: ${PWD}
      o: bind
```
Env variables for docker-compose
Since we are going to be using `docker-compose` for development, we will be talking more about passing env variables in that context.
Env variables are an important part of developing applications; they allow us to deploy our apps across different environments with ease.
There are a few ways to set env variables:
- Through the `Dockerfile`'s `ENV`

```dockerfile
# Hard code into your dockerfile
ENV NODE_ENV production

# Pass through ARG
ARG NODE_ENV
ENV NODE_ENV $NODE_ENV
```

```bash
$ docker build --build-arg NODE_ENV=development .
```

```dockerfile
# You can also set a default value for ARG
ARG NODE_ENV=production
ENV NODE_ENV $NODE_ENV
```
- Through `environment` in `docker-compose.yml`

```yaml
version: "2"
services:
  rental-pay-api:
    command: yarn dev -- -L
    environment:
      - NODE_ENV=development
      - PORT=3001
      - JWT_EXPIRATION_MINUTES=1440
      - JWT_SECRET=bA2xcjpf8y5aSUFsNB2qN5yymUBSs6es3qHoFpGkec75RCeBb8cpKauGefw5qy4
```
- Through passing an `env` file in `docker-compose.yml`

```yaml
version: "3.6"
services:
  web:
    volumes:
      - .:/app
    ports:
      - "3000:3000"
    # pass ./.env file
    env_file:
      - .env
```
Using `env_file` is my preferred way, since it allows you to manage your env variables through your env file, which is how people normally do it in non-dockerized setups anyway.
It also allows you to use different env files for different setups; if you are using `docker-compose` for prod as well, you can pass an `.env.production` instead.
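For example, a hypothetical `docker-compose.prod.yml` could simply point at a different env file:

```yaml
version: "3.6"
services:
  web:
    ports:
      - "3000:3000"
    # use the production env file instead of ./.env
    env_file:
      - .env.production
```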
Commands for development
In this section, we will mostly be talking about `docker-compose` commands over `docker` commands.
The syntax for the two is actually very similar, with the main difference being which alias or name you use to refer to a particular container.
`docker` commands require using either the container id or name to refer to a container. For containers started using `docker-compose`, the naming scheme is usually `<name-of-directory>_<service-name>_<number>`. You can check the name and id of a container by running `docker ps` or `docker container ls`:
```
$ docker container ls
CONTAINER ID   IMAGE                 COMMAND                  CREATED         STATUS         PORTS                                       NAMES
f097cd0930dc   reddit-clone_web      "docker-entrypoint.s…"   2 minutes ago   Up 2 minutes   0.0.0.0:3000->3000/tcp, :::3000->3000/tcp   reddit-clone_web_1
907cd880f15d   reddit-clone_server   "docker-entrypoint.s…"   2 minutes ago   Up 2 minutes   0.0.0.0:4000->4000/tcp, :::4000->4000/tcp   reddit-clone_server_1
16e9d7a1cf42   redis:6.2.3           "docker-entrypoint.s…"   2 minutes ago   Up 2 minutes   0.0.0.0:6379->6379/tcp, :::6379->6379/tcp   reddit-clone_redis_1
458282ae41d9   postgres:13.2         "docker-entrypoint.s…"   2 minutes ago   Up 2 minutes   0.0.0.0:5432->5432/tcp, :::5432->5432/tcp   reddit-clone_postgres_1

$ docker run --rm reddit-clone_web yarn add typescript
```
In the snippet above, we check the names for the `web` service using `docker container ls`, then start a container from the `reddit-clone_web` image with the command `yarn add typescript` to install TypeScript for our `web` service (note that `docker run` takes an image name, not a container name).
`docker-compose` commands, on the other hand, allow you to refer to a container using its service name:
```yaml
version: "3.6"
services:
  # postgres is the service name I'm referring to
  postgres:
    image: postgres:13.2
    restart: always
    ports:
      - "5432:5432"
    volumes:
      - db_data:/var/lib/postgresql/data
    env_file:
      - .env
volumes:
  db_data:
```

```bash
$ docker-compose run postgres psql -U postgres
```
In the command above, we run the `psql` command in the postgres container.
For running commands inside the container, either approach will generally work, but my personal preference is to use `docker-compose run` or `docker-compose exec` over `docker run` or `docker exec`. This is because using the service name allows me to run commands without first needing to check the name of the container.
Maybe it's just me, but I always end up having to double-check the container name/id before I run commands. Then again, you can set the container name in the `docker-compose.yml` (via `container_name`), which might help with this issue.
Also, I'm not sure if it's a `zsh` plugin, but I'm able to auto-complete the container name, so that should help if you prefer `docker` commands.
Using `docker-compose` also starts the container's dependencies, which can be positive if you need those services running (like your `database` as a dependency for your `backend` when running migrations) or negative if you don't really need the other services running for a particular command (it just takes longer to start).
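If you don't want the dependencies started for a one-off command, `docker-compose run` accepts a `--no-deps` flag (the `web` service and `yarn lint` command here are just hypothetical examples):

```bash
# Run a one-off command without starting linked services like postgres
$ docker-compose run --rm --no-deps web yarn lint
```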
Now I'm going to list some of the important commands I use for development. You can create aliases, bash scripts, or makefiles to make running these commands more convenient.
- Run
  - Allows you to run a container.
  - Also allows you to pass a command to override the `CMD` instruction in your Dockerfile.

```bash
# remember: `docker run` takes an image, `docker-compose run` takes a service name
docker run --rm postgres:13.2 psql -U postgres
docker-compose run --rm postgres psql -U postgres
```
| Flag | Description |
| --- | --- |
| `--rm` | Remove the container after it exits (always include this) |
| `--service-ports` | By default, `docker-compose run` does not publish the service's ports, to avoid collisions. This flag tells it to create the ports as defined in the compose file (useful for long-running containers you need to connect to) |
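For example (assuming a `web` service with ports defined in the compose file):

```bash
# Publish the ports defined in docker-compose.yml for this one-off container
$ docker-compose run --rm --service-ports web
```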
- Exec
  - Allows you to run a command against an existing (already running) container.
  - This is useful in certain situations, like running one-off commands on already running containers without needing to stop them first.

```bash
$ docker exec reddit-clone_web_1 yarn add typescript
$ docker-compose exec web yarn add typescript
```
In the example above, we install a new node dependency using the running container, without needing to stop it first and issue a `run` command with `yarn add typescript`. This is particularly useful when you are using a different OS than your Docker container, as it prevents installing a macOS version of your deps, which can happen if you install on your host machine.
Another use-case is opening a shell inside your container to inspect or run commands (you can also do this with `run`; note that `docker-compose exec` allocates a TTY by default, so it doesn't need `-it`):

```bash
$ docker exec -it reddit-clone_web_1 bash
$ docker-compose exec web bash
```
- Logs
  - Allows you to view the logs of a specific container.

```bash
# the tail flag lets you retrieve only a set number of lines
$ docker logs reddit-clone_web_1 --tail 100
$ docker-compose logs --tail 100 web
```
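You can also stream logs continuously with the `-f`/`--follow` flag:

```bash
# Follow log output as it is produced
$ docker-compose logs -f --tail 100 web
```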
Dockerfile
WORKDIR
`WORKDIR` changes the working directory for all subsequent instructions, similar to what you might attempt with `RUN cd /some/path` (which does not persist across layers).
Instead of using the following to create the directory and make it your working directory:

```dockerfile
RUN mkdir -p /usr/src/app
WORKDIR /usr/src/app
```

You can just use `WORKDIR` directly, as it creates the directory if it doesn't exist yet.
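A minimal sketch (the paths and lockfile are assumptions):

```dockerfile
# Creates /usr/src/app if it doesn't exist, then switches to it
WORKDIR /usr/src/app
COPY package.json yarn.lock ./
RUN yarn install
```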
COPY vs ADD
Dockerfiles give you two ways to copy files from the host into an image: `COPY` and `ADD`.
They are similar, but `ADD` provides additional functionality:
- Allows adding files from URLs instead of a local file or directory
- Allows you to extract `tar` files
Note: In most cases, if you're using a URL, you download a zip file and then use the `RUN` command to extract it. However, you might as well just use `RUN` and `curl` instead of `ADD` here, so you chain everything into one `RUN` command to make a smaller Docker image.
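A sketch of that idea (the URL and paths are hypothetical):

```dockerfile
# Download and extract in a single RUN so the archive never
# persists in an intermediate image layer
RUN curl -fsSL https://example.com/some-tool.tar.gz -o /tmp/some-tool.tar.gz \
    && tar -xzf /tmp/some-tool.tar.gz -C /usr/local/bin \
    && rm /tmp/some-tool.tar.gz
```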
Which one should you use?
For copying files from the host to the Docker image, it is recommended to use `COPY`, since it is more explicit about your intent. Unless you have a specific use-case for `ADD`, it's better to just use `COPY` to avoid surprises.
Non-root User
By default, Docker runs containers as root (uid=0), which can be a security issue. Always try to run containers as a non-root user. The node image provides the user `node` for exactly this purpose:
```dockerfile
# Dockerfile
FROM node:14.16.1-alpine

# Put this at the end
USER node
```
Alternatively, you can also create a non-root user using the following commands:

```dockerfile
# Dockerfile
FROM centos:7

# Add new user john with user id 8877
RUN useradd -u 8877 john

# Switch to user
USER john
```
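Note that `useradd` is not available on Alpine-based images like the node image above; there you would use BusyBox's `addgroup`/`adduser` instead (a sketch, with the same hypothetical user):

```dockerfile
FROM alpine:3.13

# BusyBox equivalents of useradd: create group and user john with id 8877
RUN addgroup -g 8877 john && adduser -u 8877 -G john -D john

USER john
```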
Issues with non-root user
There are a lot of issues I've encountered when trying to run my containers as a non-root user. I will try to cover all of them.
Permission denied when editing files
This can happen even if you are not using a non-root user inside your Docker container, and usually depends on the OS as well.
It normally happens when trying to edit a file generated by your Docker container in your IDE. By default, the Docker container runs as the root user, while on your host machine you are not logged in as root. This results in your IDE failing to write to the file because of a lack of permissions:

```bash
$ docker-compose run web sh
# inside the container
$ id
uid=0(root) gid=0(root)

# on your host machine
$ id
uid=1000(mp) gid=1000(staff)
```
In the snippet above, we run `id` in both the container and the host machine to see that we are using a different `uid` and `gid`.
One solution is to run the following command to change the ownership of the files on your host machine (change `mp` to your user):

```bash
sudo chown -R mp:mp .
```
Another is to make the `uid` inside the container and the host match. By default, the first user in Linux gets the value `1000` for both `uid` and `gid`. Coincidentally, the `USER node` we mentioned above also uses a value of `1000` for `uid` and `gid`, so that already solves our issue.
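You can verify this from the host (a quick check; the output assumes your service runs as the official node image's `node` user):

```bash
# Compare the container user's ids with your host user
$ docker-compose run --rm web id
uid=1000(node) gid=1000(node) groups=1000(node)
```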
Note: For macOS this is not usually an issue, even though it uses a different `uid` and `gid`, because of how Docker for Mac works.
Permission error when starting a container as a non-root user
This happens because Docker builds the image as the root user. However, when we start the container we are using a different user, which results in permission errors similar to the issue above. The difference is that we can't just run `sudo chown node:node .` this time, since the files live inside the image.
The solution is to make the files owned by the non-root user during build time:

```dockerfile
# COPY files as the node user
COPY --chown=node:node . .

# compile typescript to javascript
RUN yarn build

# chown the compiled files during build time
RUN chown -R node:node /app/dist
```
Normally, we wouldn't need the second solution and would only need to use `COPY --chown`. However, there are cases where we need to `RUN chown -R` during build time, like when compiling the source code.
In the example above, we compile the `typescript` code into `javascript` inside the `/app/dist` directory. Since we are running as the root user during build time, `/app/dist` becomes owned by root. Therefore we have to change its ownership to our `node` user.
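Putting the pieces together, here's a minimal sketch of a production Dockerfile that applies the practices above (the paths, scripts, and entry point are assumptions for illustration):

```dockerfile
FROM node:14.16.1-alpine

# WORKDIR creates /app and switches to it
WORKDIR /app

# Install dependencies first to take advantage of layer caching
COPY --chown=node:node package.json yarn.lock ./
RUN yarn install --frozen-lockfile

# COPY the source as the node user, then build
COPY --chown=node:node . .
RUN yarn build && chown -R node:node /app/dist

# Drop privileges at the end
USER node
CMD ["node", "dist/index.js"]
```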