Reference Overview

Inference has several components that work together to serve computer vision models. The diagram below shows how they fit together.

inference-sdk - Lightweight Python client for communicating with the Inference Server.
inference-cli - Command-line tool for managing the Inference Server and running common tasks.
Inference Server - HTTP server (Docker) that wraps the inference package as a REST API.
inference - Core Python package for model loading, inference, and Workflows execution.

[Figure: Inference Architecture]
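The layering described in the component list can be illustrated with a toy, standard-library-only sketch: a core function, an HTTP server that wraps it, and a thin client that calls the server. This is not the real Inference code or API. The names `core_infer`, `ServerHandler`, `client_infer`, and `run_demo`, the `/infer` route, and the JSON payload shape are all hypothetical and exist only to show how the pieces relate.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def core_infer(image_name):
    """Hypothetical stand-in for the core package: returns a fake prediction."""
    return {
        "image": image_name,
        "predictions": [{"class": "placeholder", "confidence": 1.0}],
    }


class ServerHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the server layer: exposes core_infer over HTTP."""

    def do_POST(self):
        # Read the JSON request body and hand it to the core layer.
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        body = json.dumps(core_infer(payload.get("image", ""))).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Silence per-request logging for the demo.
        pass


def client_infer(base_url, image_name):
    """Hypothetical stand-in for the client layer: a thin HTTP wrapper."""
    request = urllib.request.Request(
        base_url + "/infer",  # assumed route, not the real server's
        data=json.dumps({"image": image_name}).encode(),
        headers={"Content-Type": "application/json"},
    )
    # Bypass any system proxy so the loopback request stays local.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request) as response:
        return json.loads(response.read())


def run_demo():
    """Start the toy server on a free port, call it once, then shut it down."""
    server = HTTPServer(("127.0.0.1", 0), ServerHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        host, port = server.server_address
        return client_infer(f"http://{host}:{port}", "example.jpg")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```

In the real stack, the client role is played by inference-sdk, the server role by the Inference Server running in Docker, and the core role by the inference package. inference-cli sits alongside as the tool for managing the server.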