What is Inference?
Roboflow Inference is an open-source platform designed to simplify the deployment of computer vision models. It enables developers to perform object detection, classification, instance segmentation, and keypoint detection, and to use foundation models like CLIP, Segment Anything, and YOLO-World, through a Python-native package, a self-hosted inference server, or a fully managed API.
Explore our enterprise options for advanced features like server deployment, active learning, and commercial licenses for YOLOv5 and YOLOv8.
Get started with our "Run your first model" guide.
Here is an example of a model running on a video using Inference; the inference pipeline section below shows how to reproduce it.
install
The Inference package requires Python>=3.8,<=3.11. Click here to learn more about running Inference inside Docker.
pip install inference
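After installing, you can sanity-check the setup from the command line; a quick check, assuming the package exposes a version string:

python -c "import inference; print(inference.__version__)"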
running on a GPU
To enhance model performance in GPU-accelerated environments, install CUDA-compatible dependencies instead:
pip install inference-gpu
advanced models
Inference supports multiple model types for specialized tasks: from Grounding DINO for identifying objects with a text prompt, to DocTR for OCR, to CogVLM for asking questions about images. You can find out more on the Foundation Models page.
Note that the inference and inference-gpu packages install only the minimal shared dependencies. Install model-specific dependencies to ensure code compatibility and license compliance. Learn more about the models supported by Inference.
pip install inference[yolo-world]
quickstart
Use the Inference SDK to run models locally with just a few lines of code. The image input can be a URL, a numpy array, or a PIL image.
from inference import get_model

# load a pre-trained model from the Roboflow hub
model = get_model(model_id="yolov8n-640")

# run inference on an image (URL, numpy array, or PIL image)
results = model.infer("https://media.roboflow.com/inference/people-walking.jpg")
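The call returns a list with one response per input image. As a quick check, you can print the detections; this is a minimal sketch, assuming an object detection response whose predictions expose class_name and confidence attributes:

# inspect the first (and only) response
for prediction in results[0].predictions:
    print(prediction.class_name, prediction.confidence)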
roboflow models
Set up your ROBOFLOW_API_KEY to access thousands of fine-tuned models shared by the Roboflow Universe community, as well as your own custom models. See the keys section to learn more.
from inference import get_model
model = get_model(model_id="soccer-players-5fuqs/1")
results = model.infer(
    image="https://media.roboflow.com/inference/soccer.jpg",
    confidence=0.5,
    iou_threshold=0.5
)
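To work with the detections programmatically, one option is the supervision package; a minimal sketch, assuming supervision is installed and its Detections.from_inference helper accepts this response type:

import supervision as sv

# convert the first response into a supervision Detections object
detections = sv.Detections.from_inference(results[0])
print(f"detected {len(detections)} players")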
foundational models
- CLIP Embeddings - generate text and image embeddings that you can use for zero-shot classification or for assessing image similarity.

from inference.models import Clip

model = Clip()
embeddings_text = model.embed_text("a football match")
embeddings_image = model.embed_image("https://media.roboflow.com/inference/soccer.jpg")
- Segment Anything - generate segmentation masks for objects in an image.

from inference.models import SegmentAnything

model = SegmentAnything()
result = model.segment_image("https://media.roboflow.com/inference/soccer.jpg")
- YOLO-World - run zero-shot object detection against a list of text prompts.

from inference.models import YOLOWorld

model = YOLOWorld(model_id="yolo_world/l")
result = model.infer(
    image="https://media.roboflow.com/inference/dog.jpeg",
    text=["person", "backpack", "dog", "eye", "nose", "ear", "tongue"],
    confidence=0.03
)
inference server
You can also run Inference as a microservice with Docker.
deploy server
The inference server is distributed via Docker. Behind the scenes, inference will download and run the image that is appropriate for your hardware. You can learn more about the supported images here.
inference server start
run client
Consume inference server predictions using the HTTP client available in the Inference SDK.
from inference_sdk import InferenceHTTPClient
client = InferenceHTTPClient(
    api_url="http://localhost:9001",
    api_key=<ROBOFLOW_API_KEY>
)
with client.use_model(model_id="soccer-players-5fuqs/1"):
    predictions = client.infer("https://media.roboflow.com/inference/soccer.jpg")
If you're using the hosted API, change the local API URL to https://detect.roboflow.com. Accessing the hosted inference server and/or using any of the fine-tuned models requires a ROBOFLOW_API_KEY. For further information, visit the keys section.
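For example, the same client pointed at the hosted API; a sketch, with <ROBOFLOW_API_KEY> standing in for your own key:

from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="https://detect.roboflow.com",
    api_key=<ROBOFLOW_API_KEY>
)
predictions = client.infer(
    "https://media.roboflow.com/inference/soccer.jpg",
    model_id="soccer-players-5fuqs/1"
)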
inference pipeline
The inference pipeline is an efficient method for processing static video files and streams. Select a model, define the video source, and set a callback action. You can choose from predefined callbacks that allow you to display results on the screen or save them to a file.
from inference import InferencePipeline
from inference.core.interfaces.stream.sinks import render_boxes
pipeline = InferencePipeline.init(
    model_id="yolov8x-1280",
    video_reference="https://media.roboflow.com/inference/people-walking.mp4",
    on_prediction=render_boxes
)
pipeline.start()
pipeline.join()
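You can also supply your own callback instead of a predefined one. Below is a minimal sketch of a custom sink, assuming the callback receives the prediction dict and the video frame; my_sink is a hypothetical name:

from inference import InferencePipeline
from inference.core.interfaces.camera.entities import VideoFrame

def my_sink(predictions: dict, video_frame: VideoFrame):
    # called once per processed frame; here we just log the detection count
    print(len(predictions.get("predictions", [])), "objects in frame", video_frame.frame_id)

pipeline = InferencePipeline.init(
    model_id="yolov8x-1280",
    video_reference="https://media.roboflow.com/inference/people-walking.mp4",
    on_prediction=my_sink
)
pipeline.start()
pipeline.join()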
keys
Inference enables the deployment of a wide range of pre-trained and foundational models without an API key. To access thousands of fine-tuned models shared by the Roboflow Universe community, configure your API key.
export ROBOFLOW_API_KEY=<YOUR_API_KEY>
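Alternatively, the key can be passed directly in code; a sketch, assuming get_model accepts an api_key argument (the HTTP client shown above takes one as well):

from inference import get_model

model = get_model(model_id="soccer-players-5fuqs/1", api_key="<YOUR_API_KEY>")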
documentation
Visit our documentation to explore comprehensive guides, detailed API references, and a wide array of tutorials designed to help you harness the full potential of the Inference package.
license
The Roboflow Inference code is distributed under the Apache 2.0 license. However, each supported model is subject to its own license. Detailed information on each model's license can be found here.
extras
Below you can find the list of extras available for inference and inference-gpu:
| Name | Description | Notes |
|---|---|---|
| clip | CLIP model | N/A |
| gaze | L2CS-Net model | N/A |
| grounding-dino | Grounding DINO model | N/A |
| sam | SAM and SAM2 models | These extras depend on rasterio, which requires the GDAL library. If the installation fails with a gdal-config command error, run sudo apt-get install libgdal-dev on Linux or follow the official installation guide. |
| yolo-world | YOLO-World model | N/A |
| transformers | Transformers-based models, like Florence-2 | N/A |
Installing extras
To install specific extras, run:
pip install inference[extras-name]
pip install inference-gpu[extras-name]
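Note that some shells (e.g. zsh) interpret square brackets, so the argument may need quoting; for example, for the clip extra:

pip install "inference[clip]"
pip install "inference-gpu[clip]"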