Skip to content

Install on NVIDIA Jetson

Overview

Jetson is NVIDIA’s line of compact, power-efficient modules designed to run AI and deep learning workloads at the edge. They combine a GPU, CPU, and neural accelerators on a single board, making them ideal for robotics, drones, smart cameras, and other embedded applications where you need real-time computer vision or inference without a cloud connection. For more details, see NVIDIA’s official Jetson overview:
https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/

Prerequisites

Disk Space: Allocate at least 10 GB free for the Roboflow Jetson image (8.14 GB).

JetPack Version: Must be running a supported JetPack (4.5, 4.6, 5.x, or 6.x).

Recommended Hardware: For best performance while running Inference, we recommend an NVIDIA Orin NX 16 GB or above.

Docker & NVIDIA Container Toolkit
- Requires Docker + NVIDIA runtime so containers can access the GPU.
- Instead of detailing installation here, follow these instructions:
- Docker install:
https://docs.docker.com/engine/install/ubuntu/
- NVIDIA Container Toolkit:
https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html

We have specialized containers built with support for hardware acceleration on JetPack L4T. To automatically detect your JetPack version and use the right container with good default settings run:

pip install inference-cli
inference server start

Manually Starting the Container

If you want more control of the container settings you can also start it manually. Jetson devices with NVIDIA Jetpack are pre-configured with NVIDIA Container Runtime and will be hardware accelerated out of the box:

sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    roboflow/roboflow-inference-server-jetson-6.2.0:latest
sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    roboflow/roboflow-inference-server-jetson-6.0.0:latest
sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    roboflow/roboflow-inference-server-jetson-5.1.1:latest

Warning

Jetpack 4 is deprecated and will not receive future updates. Please migrate to Jetpack 6.

sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    CPUExecutionProvider]" \
    roboflow/roboflow-inference-server-jetson-4.6.1:latest

Warning

Jetpack 4 is deprecated and will not receive future updates. Please migrate to Jetpack 6.

sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    CPUExecutionProvider]" \
    roboflow/roboflow-inference-server-jetson-4.5.0:latest

TensorRT

You can optionally enable TensorRT, NVIDIA's model optimization runtime that will greatly increase your models' speed at the expense of a heavy compilation and optimization step (sometimes 15+ minutes) the first time you load each model.

Enable TensorRT by adding TensorrtExecutionProvider to the ONNXRUNTIME_EXECUTION_PROVIDERS environment variable.

sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    -e ONNXRUNTIME_EXECUTION_PROVIDERS="[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]" \
    roboflow/roboflow-inference-server-jetson-6.2.0:latest
sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    -e ONNXRUNTIME_EXECUTION_PROVIDERS="[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]" \
    roboflow/roboflow-inference-server-jetson-6.0.0:latest
sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    -e ONNXRUNTIME_EXECUTION_PROVIDERS="[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]" \
    roboflow/roboflow-inference-server-jetson-5.1.1:latest

Warning

Jetpack 4 is deprecated and will not receive future updates. Please migrate to Jetpack 6.

sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    -e ONNXRUNTIME_EXECUTION_PROVIDERS="[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]" \
    roboflow/roboflow-inference-server-jetson-4.6.1:latest

Warning

Jetpack 4 is deprecated and will not receive future updates. Please migrate to Jetpack 6.

sudo docker run -d \
    --name inference-server \
    --runtime nvidia \
    --read-only \
    -p 9001:9001 \
    --volume ~/.inference/cache:/tmp:rw \
    --security-opt="no-new-privileges" \
    --cap-drop="ALL" \
    --cap-add="NET_BIND_SERVICE" \
    -e ONNXRUNTIME_EXECUTION_PROVIDERS="[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]" \
    roboflow/roboflow-inference-server-jetson-4.5.0:latest

Docker Compose

If you are using Docker Compose for your application, the equivalent yaml is:

version: "3.9"

services:
  inference-server:
    container_name: inference-server
    image: roboflow/roboflow-inference-server-jetson-6.2.0:latest

    read_only: true
    ports:
      - "9001:9001"

    volumes:
      - "${HOME}/.inference/cache:/tmp:rw"

    runtime: nvidia

    # Optionally: uncomment the following lines to enable TensorRT:
    # environment:
    #   ONNXRUNTIME_EXECUTION_PROVIDERS: "[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]"

    security_opt:
      - no-new-privileges
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
version: "3.9"

services:
  inference-server:
    container_name: inference-server
    image: roboflow/roboflow-inference-server-jetson-6.0.0:latest

    read_only: true
    ports:
      - "9001:9001"

    volumes:
      - "${HOME}/.inference/cache:/tmp:rw"

    runtime: nvidia

    # Optionally: uncomment the following lines to enable TensorRT:
    # environment:
    #   ONNXRUNTIME_EXECUTION_PROVIDERS: "[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]"

    security_opt:
      - no-new-privileges
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE
version: "3.9"

services:
  inference-server:
    container_name: inference-server
    image: roboflow/roboflow-inference-server-jetson-5.1.1:latest

    read_only: true
    ports:
      - "9001:9001"

    volumes:
      - "${HOME}/.inference/cache:/tmp:rw"

    runtime: nvidia

    # Optionally: uncomment the following lines to enable TensorRT:
    # environment:
    #   ONNXRUNTIME_EXECUTION_PROVIDERS: "[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]"

    security_opt:
      - no-new-privileges
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE

Warning

Jetpack 4 is deprecated and will not receive future updates. Please migrate to Jetpack 6.

version: "3.9"

services:
  inference-server:
    container_name: inference-server
    image: roboflow/roboflow-inference-server-jetson-4.6.1:latest

    read_only: true
    ports:
      - "9001:9001"

    volumes:
      - "${HOME}/.inference/cache:/tmp:rw"

    runtime: nvidia

    # Optionally: uncomment the following lines to enable TensorRT:
    # environment:
    #   ONNXRUNTIME_EXECUTION_PROVIDERS: "[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]"

    security_opt:
      - no-new-privileges
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE

Warning

Jetpack 4 is deprecated and will not receive future updates. Please migrate to Jetpack 6.

version: "3.9"

services:
  inference-server:
    container_name: inference-server
    image: roboflow/roboflow-inference-server-jetson-4.5.0:latest

    read_only: true
    ports:
      - "9001:9001"

    volumes:
      - "${HOME}/.inference/cache:/tmp:rw"

    runtime: nvidia

    # Optionally: uncomment the following lines to enable TensorRT:
    # environment:
    #   ONNXRUNTIME_EXECUTION_PROVIDERS: "[TensorrtExecutionProvider,CUDAExecutionProvider,CPUExecutionProvider]"

    security_opt:
      - no-new-privileges
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE

Using Your New Server

See Using Your New Server for next steps.