Nick's Blog

ECS Instance – Cannot connect to the Docker daemon at unix:///var/run/docker.sock

If you are seeing both of the following issues at the same time, you may find this article useful.

ECS Instance:   Cannot connect to the Docker daemon at unix:///var/run/docker.sock
CloudFormation: stuck at UPDATE_IN_PROGRESS

Running the following command on the host may produce this error:

docker ps
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?

This means the ECS agent may be unable to connect to the Docker daemon. Run the following command on the host to confirm:

# service docker status
docker dead but pid file exists

There is another way to verify it. Log in to the AWS console and go to ECS > Clusters > [ Target cluster ] > ECS Instances.
For the disconnected host, the Agent Connected column shows false.
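
The same check can probably be done from the command line. The sketch below assumes the AWS CLI is installed and configured with credentials for the account, and that the `agentConnected` attribute of the ECS cluster query language is accepted by `--filter`. The function name `list_disconnected_instances` and the cluster name `my-cluster` are placeholders, not part of the original setup.

```shell
# Hypothetical helper: print the ARNs of container instances in a cluster
# whose ECS agent is reported as disconnected.
# $1 is the cluster name or ARN.
list_disconnected_instances() {
    aws ecs list-container-instances \
        --cluster "$1" \
        --filter "agentConnected==false" \
        --query 'containerInstanceArns' \
        --output text
}
```

Usage: `list_disconnected_instances my-cluster`. Any ARNs it prints should correspond to the hosts that show false in the Agent Connected column.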

If you are using CloudFormation to deploy the ECS stack, you will see the stack status stuck at UPDATE_IN_PROGRESS. This is either because there are no resources left in the host pool, or because the task is stuck on a target host whose agent is disconnected.
Whether resources are lacking, in turn, depends on whether all hosts are in a healthy state.
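
To confirm the stack state without opening the console, something like the following may help. It is a sketch that assumes the AWS CLI is configured for the same account and region. The function name `stack_status` and the stack name `my-ecs-stack` are placeholders.

```shell
# Hypothetical helper: print the current status of a CloudFormation stack,
# for example UPDATE_IN_PROGRESS or UPDATE_COMPLETE.
# $1 is the stack name or stack ID.
stack_status() {
    aws cloudformation describe-stacks \
        --stack-name "$1" \
        --query 'Stacks[0].StackStatus' \
        --output text
}
```

Usage: `stack_status my-ecs-stack`. If it keeps printing UPDATE_IN_PROGRESS long after the deployment started, a disconnected agent is one possible cause worth checking.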

How do you fix it? Run the following commands on the host:

sudo service docker stop && sudo service docker start
# service docker status
docker (pid  7744) is running...
# sudo start ecs
ecs start/running, process 8242
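
The steps above can be wrapped in a small function that only restarts things when the daemon is actually unreachable. This is a minimal sketch, assuming a host where `service docker` and `start ecs` work as shown above. The function name `recover_docker` is made up for illustration. Unlike the one-liner above, it does not stop if `service docker stop` fails, on the assumption that a dead daemon may not stop cleanly.

```shell
# Hypothetical helper: restart Docker and the ECS agent only if the
# Docker daemon cannot be reached.
recover_docker() {
    # `docker info` exits non-zero when the client cannot reach the daemon.
    if docker info >/dev/null 2>&1; then
        echo "docker daemon is reachable; nothing to do"
        return 0
    fi

    echo "docker daemon unreachable; restarting docker and the ECS agent"
    # Assumption: ignore a failed stop, since the daemon may already be dead.
    sudo service docker stop
    # Give up if Docker does not come back; the agent needs it.
    sudo service docker start || return 1
    # Same upstart command as in the manual steps above.
    sudo start ecs
}
```

Usage: paste the function into a shell on the host (or source it from a file) and run `recover_docker`.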

Log back in to the AWS console and you will see the Agent Connected value back to true. Your CloudFormation stack should then complete the update.