Docker for AI
Docker Installation and Setup on Ubuntu 20.04
- Install Docker on Ubuntu following this reference. The commands are copied below in order, with one change in the last step.
$ sudo apt-get update
$ sudo apt-get install \
ca-certificates \
curl \
gnupg \
lsb-release
$ sudo mkdir -p /etc/apt/keyrings
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
$ echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
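If you want to double-check that the repository entry was written correctly, you can print the file that was just created (an optional check; the exact line depends on your architecture and Ubuntu codename):
$ cat /etc/apt/sources.list.d/docker.list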
$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io docker-compose-plugin
$ apt-cache madison docker-ce
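For reference, the original Docker docs take a version string from the apt-cache madison output and pin it during the install; the <VERSION_STRING> placeholders below are not real values, substitute one from the listing:
$ sudo apt-get install docker-ce=<VERSION_STRING> docker-ce-cli=<VERSION_STRING> containerd.io docker-compose-plugin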
In this setup, that command is changed from the original Docker docs as follows:
$ sudo apt-get install docker
$ sudo docker run hello-world
$ sudo docker images
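Optionally, if you would rather run docker without sudo, the standard Docker post-install step is to add your user to the docker group and log out and back in (the rest of this guide keeps sudo for consistency):
$ sudo usermod -aG docker $USER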
1. PyTorch Container in Docker
Now run the following command in the terminal, from this pytorch-docker reference:
$ sudo docker pull pytorch/pytorch
- Also, refer to the official PyTorch GitHub docker-image here.
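As a quick sanity check (not part of the pytorch-docker reference), you can print the PyTorch version bundled in the image; note that GPU access from this image still needs the Nvidia container toolkit installed in the next section:
$ sudo docker run --rm pytorch/pytorch python -c "import torch; print(torch.__version__)"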
2. Nvidia docker
Note that running the PyTorch docker image requires nvidia-docker, so we also need to install that from this reference:
- https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo systemctl restart docker
$ sudo docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
3. Nvidia NGC PyTorch
Nvidia NGC PyTorch docker containers: https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch
$ sudo docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:xx.xx-py3
In place of xx.xx-py3, put the latest tag from the left panel on the above Nvidia PyTorch docker containers website; so in our case, 22.08-py3:
$ sudo docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:22.08-py3
Note: this is an important docker command which you will use every time you pull the PyTorch docker image.
However, the first time you run it, you will find that it does not work. To make it run, we need to set the default-runtime to nvidia, as described in the dusty-nv jetson-containers GitHub repo: https://github.com/dusty-nv/jetson-containers. So enter the following command in the terminal:
$ sudo vim /etc/docker/daemon.json
Then edit the file (press i in vim to enter insert mode) and add the following line, remembering the comma at the end of the preceding entry:
"default-runtime": "nvidia"
After you are done editing the file, press Esc and type :wq to save the file and exit. Note: :q alone only exits the vim editor without saving.
Now that you have set your default-runtime to nvidia, open the file once more in vim to verify its contents.
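The whole file should look roughly like the following; the runtimes entry is written by the nvidia-docker2 package, so the exact contents on your machine may differ slightly:
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}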
Restart the Docker daemon so the change takes effect (sudo systemctl restart docker), and then try downloading the PyTorch Nvidia NGC container again by running the following command.
$ sudo docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:22.08-py3
It should download the ~7 GB container and run it as shown below:
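Once it is running, a quick way to confirm that PyTorch inside the container can see the GPU (an optional check, not part of the original steps) is:
$ sudo docker run --gpus all --rm nvcr.io/nvidia/pytorch:22.08-py3 python -c "import torch; print(torch.cuda.is_available())"
This should print True if the Nvidia runtime is set up correctly.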
Run Jupyter Notebooks inside Docker
- First, run the Nvidia NGC PyTorch docker image.
Navigate to the TensorRT GitHub clone on your local laptop, and run the docker image with the -v flag (to mount the current directory) and 22.08-py3 as the latest tag:
(base) arcs-base@arcs-base:~/Robo_vision/TensorRT$
sudo docker run --gpus=all --rm -it -v $PWD:/Torch-TensorRT --net=host --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/pytorch:22.08-py3 bash
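The -v $PWD:/Torch-TensorRT flag mounts your local TensorRT clone at /Torch-TensorRT inside the container; if you want, you can confirm the mount before moving on (assuming the container drops you in its default /workspace directory):
root@arcs-base:/workspace# ls /Torch-TensorRT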
Then go to the notebooks directory inside the docker container:
cd /workspace/examples/torch_tensorrt/notebooks
Next, once you have entered the notebooks directory, start Jupyter with:
root@arcs-base:/workspace/examples/torch_tensorrt/notebooks# jupyter notebook --allow-root --ip 0.0.0.0 --port 8888
Note: all the above commands are written in the README.md inside the notebooks directory.
At this point, you should be able to see the following output:
[I 07:38:08.359 NotebookApp] jupyter_tensorboard extension loaded.
[I 07:38:09.328 NotebookApp] JupyterLab extension loaded from /opt/conda/lib/python3.8/site-packages/jupyterlab
[I 07:38:09.328 NotebookApp] JupyterLab application directory is /opt/conda/share/jupyter/lab
[I 07:38:09.338 NotebookApp] [Jupytext Server Extension] NotebookApp.contents_manager_class is (a subclass of) jupytext.TextFileContentsManager already - OK
[I 07:38:09.339 NotebookApp] Serving notebooks from local directory: /opt/pytorch/torch_tensorrt/notebooks
[I 07:38:09.340 NotebookApp] Jupyter Notebook 6.4.10 is running at:
[I 07:38:09.340 NotebookApp] http://hostname:8888/?token=99459696da52446c6823daa9a89e7e0efc2b7ff24d0ee9b0
[I 07:38:09.340 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).
[C 07:38:09.352 NotebookApp]
To access the notebook, open this file in a browser:
file:///root/.local/share/jupyter/runtime/nbserver-465-open.html
Or copy and paste this URL:
http://hostname:8888/?token=99459696da52446c6823daa9a89e7e0efc2b7ff24d0ee9b0
Copy-paste the last link into the browser, then replace the hostname:8888 part of the link with localhost:8888.
e.g.: http://localhost:8888/notebooks/Resnet50-example.ipynb
Your Jupyter Notebook should now be up and running.
You can just run the cells in the notebook like you normally do.
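A final note on networking: the docker run command above uses --net=host, which is why localhost:8888 reaches Jupyter directly. If you ever drop --net=host, publish the Jupyter port explicitly instead, for example:
$ sudo docker run --gpus=all --rm -it -v $PWD:/Torch-TensorRT -p 8888:8888 --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 nvcr.io/nvidia/pytorch:22.08-py3 bash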