This page is a work in progress and catalogs my notes from getting started with the NVIDIA Jetson Nano Developer Kit.

For system setup stuff, I also maintain an Ansible playbook for configuring the Jetson Nano post-install.

User Environment

I set up the NVIDIA Jetson Nano with the official Jetson Nano Developer Kit SD Card Image according to the Getting Started docs. NVIDIA's nomenclature is a little confusing; I think this image is called "JetPack" and it includes:

  1. An Ubuntu-based OS with all the NVIDIA drivers called "L4T"
  2. CUDA (although you wouldn't know without digging)
  3. Docker and support for containerized applications from NVIDIA GPU Cloud (NGC)
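
Incidentally, you can check which L4T release your image ships (the revision shown below matches JetPack 4.4; yours may differ) by looking at /etc/nv_tegra_release:

$ head -1 /etc/nv_tegra_release
# R32 (release), REVISION: 4.4, ...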

NVIDIA doesn't seem to provide strong support for Jetson across a lot of its high-profile software suites outside of deep learning and machine vision. For example, I found that many of its analytics tools only support Pascal-generation or newer GPUs. There may be ways to get all of this running by building and installing things by hand, but I was expecting a friendlier experience from a single-board computer.

It looks to me like Jetson is really geared towards machine vision and robotics; it is not meant as a general platform for learning the NVIDIA software ecosystem in other areas like high-performance computing and data analytics.

CUDA Support

The Jetson Nano SD image comes with CUDA preinstalled, but trying to use it on a fresh install throws an error:

glock@jetson:~$ nvcc
-bash: nvcc: command not found

Turns out you have to manually edit your .bashrc and add CUDA paths to your environment:

export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

This struck me as a big gap in an otherwise smooth out-of-box experience, but it's easy enough to fix. I made the following script and stuck it in /etc/profile.d/ for my Ansible setter-upper:

if [ -n "${BASH_VERSION-}" ]; then
    if [[ $PATH != */usr/local/cuda/bin* ]]; then
        export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}
    fi
    if [[ $LD_LIBRARY_PATH != */usr/local/cuda/lib64* ]]; then
        export LD_LIBRARY_PATH=/usr/local/cuda/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
    fi
fi
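
Assuming you saved that script as something like /etc/profile.d/cuda.sh (that filename is my choice, not anything official), a quick smoke test after logging back in:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
...

The CUDA version reported will depend on your JetPack release.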

NVIDIA GPU Cloud - Containerized Applications

You can think of the Jetson Nano's OS environment as a substrate for running containerized environments, which is a big departure from most Raspberry Pi-like single-board computers and traditional HPC environments. Logging into the Jetson Nano itself gives you a lean environment--shells, text editors, and basic Linux stuff are there, but there are no pre-built Python environments, no TensorFlow, and so on.

Instead of installing all your own libraries and tools, though, you can launch application containers that drop you into a system with all of the bells and whistles required to develop and execute applications in a well-defined environment. This is closer to what one would expect in a cloud computing environment: you choose the entire software stack you need as an all-inclusive appliance, press go, and don't fuss with any software dependencies, compilation, or environment-specific configuration. It's quite nice.

This containerized ecosystem is branded as NVIDIA GPU Cloud, or NGC, and anyone can browse its "App Store" equivalent, the NGC Catalog. I set up the CLI client using the instructions in the NGC Overview, which involved the following steps (sketched in shell form below):

  1. Downloading the ngc binary for ARM64
  2. Creating an NGC account using my Google account
  3. Generating an NGC API key
  4. Running ngc config set and punching in my API key
  5. Running ngc diag all to make sure everything worked
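
Concretely, that setup amounted to something like the following; the download URL here is from memory, so grab the current ARM64 link from the NGC Overview if it has moved:

$ wget https://ngc.nvidia.com/downloads/ngccli_arm64.zip
$ unzip ngccli_arm64.zip && chmod u+x ngc
$ ./ngc config set     # paste in the API key from the NGC website
$ ./ngc diag all       # sanity-check the configuration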

Once you've got this set up, you can access NGC without having to click around the NGC website. For example, the NVIDIA DLI Getting Started with AI on Jetson Nano course tells you to retrieve the latest tag for the dli-nano-ai container from the website so you can fetch and run the course's container. Instead, you can do

$ ngc registry image list 'nvidia/dli/dli-nano-ai:*'

to get all the available tags.

Docker on Jetson Nano

All of the following assumes that you have added yourself to the docker group on your Jetson Nano. See the user setup section below for more information.

Finding Images

Forewarning: NGC seems to be quite new, and most of the containers hosted on it are not compatible with ARM or the Jetson Nano. I could only find a couple of containers that actually work on Jetson:

  1. DLI Getting Started with AI on Jetson Nano - the container used for the course that is copackaged with the Nano
  2. CUDA for Arm64 - CUDA, which also ships with Jetson Nano's OS image

NGC has a label system you can use to search for containers matching certain criteria (like "supports ARM64..."), but the labels aren't applied consistently, so you kind of have to wade through a combination of labels and container names to figure out which NGC offerings may work. In addition, you have to read each container's README because many only work with Pascal or newer GPUs.

I've had success looking for the following labels:

  • L4T
  • ARM
  • ARM64 - this has a lot of containers that seem to be meant for non-Jetson systems though

That all said, you can repurpose a lot of these images to create your own containerized development or execution environments on Jetson Nano.

Installing Images

Once you know what image you want to retrieve,

$ ngc registry image pull nvidia/l4t-ml:r32.4.4-py3
Logging in to nvcr.io... login successful.
r32.4.4-py3: Pulling from nvidia/l4t-ml

It's not clear to me what the advantage of using the ngc command to pull images is versus just calling docker directly. It doesn't look like ngc keeps any local information about what images have already been pulled.
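
In fact, since NGC images live in the nvcr.io registry, the equivalent pull straight through Docker is just:

$ docker pull nvcr.io/nvidia/l4t-ml:r32.4.4-py3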

You may also have to explicitly name a tag or else you get an error like this:

$ ngc registry image pull nvidia/l4t-ml
Logging in to nvcr.io... login successful.
Error: manifest for nvcr.io/nvidia/l4t-ml:latest not found: manifest unknown: manifest unknown

I think this is because specific containers only work with specific versions of JetPack. It would be nice if ngc could detect this automatically.

Running Containerized Services

Once you've pulled an image from NGC, you can launch it with a wrapper script like this one (shown here for the DLI course image):

#!/usr/bin/env bash
docker run \
    --runtime nvidia \
    -it \
    --rm \
    --network host \
    --volume "$HOME/nvdli-data:/nvdli-nano/data" \
    --device /dev/video0 \
    nvcr.io/nvidia/dli/dli-nano-ai:TAG   # substitute a tag from "ngc registry image list"


  • -it means run the container interactively
  • --rm means delete the container when it is complete
  • --network host means the container will open ports on the host itself
  • --volume establishes a bind mount between the local host and the container
  • --device passes the USB camera into the container for image capture

The exact image name is just the image name from the previous step (ngc registry image pull ...) with nvcr.io/ prepended.

For what it's worth, the docker run ... command will work even if you don't ngc registry image pull beforehand. So again, I'm not sure what value the NGC pull command adds.

Running Containers as Non-Root

These images are also good for running GPU-accelerated code interactively. To run the NVIDIA DLI container as an interactive environment, you can do something like:

#!/usr/bin/env bash
docker run \
    --runtime nvidia \
    -it \
    --rm \
    --network host \
    --volume "$HOME:$HOME" \
    --volume "/etc/passwd:/etc/passwd:ro" \
    --volume "/etc/group:/etc/group:ro" \
    --volume "/etc/shadow:/etc/shadow:ro" \
    --volume "/etc/gshadow:/etc/gshadow:ro" \
    -u $(id -u ${USER}):$(getent group video | awk -F: '{print $3}') \
    --device /dev/video0 \
    nvcr.io/nvidia/dli/dli-nano-ai:TAG   # same image and tag as in the previous script

Note that we expose users and groups from our Jetson Nano inside the container. This prevents us from accidentally creating root-owned files on our host as we play inside the container and is achieved by

  1. Mounting our home directory into the container
  2. Mounting the passwd and group files into the container
  3. Launching our shell with the UID of our host account and the GID of the video group

Making the container run as the video group is necessary to allow the shell to use the GPU. If you don't do this, you have to first run

$ newgrp video

from inside the container after launch.
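
A quick way to confirm the identity mapping took is to run id inside the container; with the bind mounts above, the numeric IDs resolve to real names (your username will differ from mine):

$ id
uid=1000(glock) gid=44(video) groups=44(video)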

Also note that in the exact example above, you'll get this error:

/bin/bash: /var/log/jupyter.log: Permission denied

This is just a result of Jupyter trying to start up as a non-root user. It can be ignored.

Running Containers with docker-compose

If you want to use docker-compose instead of the big long docker run command, create a docker-compose.yml file that looks like this:

version: "3"
services:
  app:
    image: nvcr.io/nvidia/dli/dli-nano-ai:TAG  # same image and tag as before
    user: "1000:44"
    working_dir: $HOME
    devices:
      - /dev/video0
    volumes:
      - /etc/group:/etc/group:ro
      - /etc/passwd:/etc/passwd:ro
      - /etc/shadow:/etc/shadow:ro
      - /etc/gshadow:/etc/gshadow:ro
      - $HOME:$HOME
    stdin_open: true
    tty: true
    entrypoint: /bin/bash

Then you can simply run

$ docker-compose run --rm app

This also avoids the jupyter.log error because you aren't starting Jupyter at all.

Note that

  1. This assumes you have made the nvidia runtime the system default. See Docker Setup below for how to do this.
  2. Install docker-compose using apt install docker-compose before attempting the above. docker-compose version 1.17.1 is sufficient.

System Setup

User Setup

When you first boot a fresh SD card, the Jetson installer asks you to create a new user, which becomes UID 1000.

To be able to run the docker command without sudo, you have to add this user to the docker group.

sudo usermod -a -G docker glock

Bear in mind that membership in this docker group is equivalent to root access. See the Manage Docker as a non-root user page in the Docker docs for more information, and see how I add the default user to the docker group in Ansible.

You should also make sure your new user is a member of the video group so that you can access the GPU.
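
This is the same usermod dance as before:

sudo usermod -a -G video glock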

Docker Setup

You should set the default Docker runtime system-wide to nvidia so that you don't have to pass --runtime nvidia explicitly every time you run a container. Edit /etc/docker/daemon.json and add a default-runtime key:

{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    },
    "default-runtime": "nvidia"
}

Then restart Docker for the change to take effect:
$ sudo service docker restart
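
You can confirm the new default stuck by checking docker info; you should see something like:

$ docker info 2>/dev/null | grep -i runtime
Runtimes: nvidia runc
Default Runtime: nvidia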


Wifi Setup

My wifi experience wasn't great. I tried both of these USB dongles:

driver      device
rtl8192cu   7392:7811 Edimax Technology Co., Ltd EW-7811Un 802.11n Wireless Adapter [Realtek RTL8188CUS]
rt2800usb   148f:5370 Ralink Technology, Corp. RT5370 Wireless Adapter

They work and can hold a connection, but the packet loss on both is > 15% and the latency is quite variable. These were both cheap dongles with no external antenna, and the connection quality (loss and variability) did improve when the Jetson was adjacent to my wifi router. I do wonder if the Jetson's physical design interferes with cheap USB dongles' small antennae though, as both of these dongles are rock-solid on Raspberry Pi.
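
If you want to quantify flakiness like this on your own setup, a long ping run against your gateway (the address below is a stand-in for mine) prints a packet-loss summary at the end:

$ ping -c 100 192.168.1.1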

Capacity Management

NVIDIA recommends using an SD card with at least 32 GB, and that's no joke--the reliance on container images to provide a software environment not only takes up a lot of space but also imposes constraints on what sort of external storage you can use. This is because Docker relies on extended attributes, which NFS does not support.

The biggest consumer of capacity is /var/lib/docker. Here's what mine looked like after installing the NVIDIA Deep Learning Institute image for the Getting Started with AI on Jetson Nano course:

root@jetson:/var/lib/docker# du -hs *
20K     builder
72K     buildkit
4.0K    containers
11M     image
52K     network
3.9G    overlay2
20K     plugins
4.0K    runtimes
4.0K    swarm
4.0K    tmp
4.0K    trust
28K     volumes

and the overlay2 directory cannot be relocated to NFS due to its dependence on xattr support.
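
You can demonstrate the incompatibility yourself by trying to set an extended attribute on an NFS mount (the mount point below is a stand-in; setfattr comes from the attr package):

$ touch /mnt/nfs/xattr-test
$ setfattr -n user.test -v hello /mnt/nfs/xattr-test
setfattr: /mnt/nfs/xattr-test: Operation not supported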

Relocating the entire docker data directory to an external SSD should be perfectly possible by editing /etc/docker/daemon.json.
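
A minimal sketch of that daemon.json, assuming the SSD is mounted at /mnt/ssd, would add a data-root key alongside the runtime settings from earlier and then restart Docker:

{
    "data-root": "/mnt/ssd/docker",
    "default-runtime": "nvidia",
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}

data-root is the standard dockerd option for relocating /var/lib/docker, so everything, including overlay2, would follow it to the SSD.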