
BoT-SORT Tracker

Class: BoTSORTBlockV1

Source: inference.core.workflows.core_steps.trackers.botsort.v1.BoTSORTBlockV1

Track objects across video frames using the BoT-SORT algorithm from the roboflow/trackers package.

BoT-SORT follows a ByteTrack-style association pipeline (high- and low-confidence detections, Kalman track states) and can apply camera motion compensation (CMC) before association when enabled. CMC estimates a global affine motion between frames so predicted boxes align better when the camera moves.

When to use BoT-SORT:
- Scenes with moving or shaking cameras (enable Camera motion compensation).
- Dense detection noise where ByteTrack-style two-stage matching helps.
- When you want ByteTrack-like behaviour with an optional motion-compensation stage.

When to consider alternatives:
- Fixed camera and you only need speed: ByteTrack or SORT may be simpler.
- Heavy occlusion and erratic object motion without camera motion: OC-SORT.
- Low-texture backgrounds where sparse-feature CMC is unreliable.

Camera motion compensation: When enabled, the block passes the workflow image pixels to the tracker each frame. If the image cannot be decoded to a numpy array, the tracker runs without CMC for that frame (a warning is logged).
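The sketch below illustrates the general idea behind sparse-feature CMC of the kind the sparseOptFlow method uses: track a handful of feature points between consecutive frames, fit a global affine transform, and warp predicted track boxes with it before association. It is a hypothetical, simplified example, not the roboflow/trackers implementation; all function names and parameter values here are illustrative.

```python
# Hypothetical sketch of sparse-optical-flow camera motion compensation.
# Estimates a global affine transform between two grayscale frames and
# applies it to predicted track boxes (xyxy) before association.
import cv2
import numpy as np

IDENTITY = np.array([[1, 0, 0], [0, 1, 0]], dtype=np.float32)

def estimate_camera_motion(prev_gray, curr_gray, downscale=2):
    """Return a 2x3 affine matrix mapping prev-frame coords to curr-frame coords."""
    small_prev = cv2.resize(prev_gray, None, fx=1 / downscale, fy=1 / downscale)
    small_curr = cv2.resize(curr_gray, None, fx=1 / downscale, fy=1 / downscale)
    pts = cv2.goodFeaturesToTrack(small_prev, maxCorners=200, qualityLevel=0.01, minDistance=7)
    if pts is None or len(pts) < 4:
        return IDENTITY  # not enough texture: fall back to no compensation
    next_pts, status, _ = cv2.calcOpticalFlowPyrLK(small_prev, small_curr, pts, None)
    good_prev = pts[status.flatten() == 1]
    good_next = next_pts[status.flatten() == 1]
    matrix, _ = cv2.estimateAffinePartial2D(good_prev, good_next, method=cv2.RANSAC)
    if matrix is None:
        return IDENTITY
    matrix = matrix.astype(np.float32)
    matrix[:, 2] *= downscale  # translation was estimated on the downscaled frames
    return matrix

def warp_boxes(boxes_xyxy, matrix):
    """Apply the affine transform to predicted boxes; assumes mostly translational motion."""
    boxes = np.asarray(boxes_xyxy, dtype=np.float32).reshape(-1, 2, 2)  # (x1, y1), (x2, y2)
    ones = np.ones((boxes.shape[0], 2, 1), dtype=np.float32)
    warped = np.concatenate([boxes, ones], axis=-1) @ matrix.T
    return warped.reshape(-1, 4)
```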

Instant first-frame activation defaults to off so behaviour aligns with other core tracker blocks for new_instances / already_seen_instances. Enable it if you want tracks on frame 1 to receive stable IDs immediately (original BoT-SORT paper-style).

Outputs three detection sets:
- tracked_detections: All confirmed tracked detections with assigned track IDs.
- new_instances: Detections whose track ID appears for the first time.
- already_seen_instances: Detections whose track ID has been seen in a prior frame.
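The new/already-seen split can also be reproduced downstream from tracked_detections alone, since it depends only on whether a track ID has been observed before. A minimal sketch using supervision follows; the seen-ID set here is illustrative and is not the block's internal cache.

```python
# Illustrative split of tracked detections into "new" and "already seen"
# instances, based purely on whether a tracker_id has appeared before.
import numpy as np
import supervision as sv

seen_track_ids = set()

def split_by_novelty(tracked: sv.Detections):
    """Return (new_instances, already_seen_instances) for one frame."""
    if len(tracked) == 0 or tracked.tracker_id is None:
        return tracked, tracked
    is_new = np.array([tid not in seen_track_ids for tid in tracked.tracker_id], dtype=bool)
    seen_track_ids.update(int(tid) for tid in tracked.tracker_id)
    return tracked[is_new], tracked[~is_new]
```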

The block maintains separate tracker state and instance cache per video_identifier, enabling multi-stream tracking within a single workflow.
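A simplified sketch of that per-stream bookkeeping is shown below: each video_identifier maps to its own tracker object and its own FIFO cache of already-seen track IDs, bounded by instances_cache_size. This is only an illustration of the documented behaviour; the block's actual internals may differ.

```python
# Simplified sketch of per-video tracker state: each video_identifier gets
# its own tracker instance and its own FIFO cache of already-seen track IDs.
from collections import OrderedDict
from dataclasses import dataclass, field
from typing import Any, Dict

INSTANCES_CACHE_SIZE = 16384  # mirrors the block's documented default

@dataclass
class VideoState:
    tracker: Any                                                  # e.g. a BoT-SORT tracker object
    seen_ids: OrderedDict = field(default_factory=OrderedDict)    # track_id -> None, insertion order

def remember_track_id(state: VideoState, track_id: int) -> bool:
    """Record a track ID; return True if it was seen for the first time."""
    is_new = track_id not in state.seen_ids
    state.seen_ids[track_id] = None
    if len(state.seen_ids) > INSTANCES_CACHE_SIZE:
        state.seen_ids.popitem(last=False)  # evict the oldest entry (FIFO)
    return is_new

states: Dict[str, VideoState] = {}

def get_state(video_identifier: str, make_tracker) -> VideoState:
    """Fetch or lazily create the state for one video stream."""
    if video_identifier not in states:
        states[video_identifier] = VideoState(tracker=make_tracker())
    return states[video_identifier]
```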

Type identifier

Use the following identifier in the step "type" field: roboflow_core/trackers_botsort@v1 to add the block as a step in your workflow.

Properties

| Name | Type | Description | Refs |
|------|------|-------------|------|
| name | str | Enter a unique identifier for this step. | ❌ |
| minimum_iou_threshold_first_assoc | float | Minimum fused similarity (IoU × confidence) for the first (high-confidence) association step. Default: 0.2. | ✅ |
| minimum_iou_threshold_second_assoc | float | Minimum IoU for the second (low-confidence) association step. Default: 0.5. | ✅ |
| minimum_iou_threshold_unconfirmed_assoc | float | Minimum fused similarity for matching unconfirmed tracks to remaining high-confidence detections. Default: 0.3. | ✅ |
| minimum_consecutive_frames | int | Number of consecutive frames a track must be matched before it is emitted as a confirmed track (tracker_id != -1). Default: 2. | ✅ |
| lost_track_buffer | int | Number of frames to keep a track alive after it loses its matched detection. Higher values improve occlusion recovery. Default: 30. | ✅ |
| track_activation_threshold | float | Minimum detection confidence required to spawn a new track. Detections below this threshold are not used to create new tracks. Default: 0.7. | ✅ |
| high_conf_det_threshold | float | Confidence threshold for high-confidence detections used in association. Default: 0.6. | ✅ |
| enable_cmc | bool | Enable camera motion compensation (uses per-frame image pixels). Recommended for moving cameras. | ✅ |
| cmc_method | str | Camera motion estimator. One of: orb, sift, sparseOptFlow, ecc. Default: 'sparseOptFlow'. | ❌ |
| cmc_downscale | int | Downscale factor applied inside CMC for speed and robustness. Default: 2. | ✅ |
| instant_first_frame_activation | bool | If true, tracks on the first frame receive IDs immediately (paper-style). Default false so new/already-seen outputs match other core trackers. | ✅ |
| instances_cache_size | int | Maximum number of track IDs retained in the instance cache for new/already-seen categorisation. Uses FIFO eviction. Default: 16384. | ❌ |

The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
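For example, a referencable property can be bound to a workflow input instead of a hard-coded value. The fragment below (expressed as a Python dict for consistency with the other sketches) is illustrative; the input name activation_threshold is hypothetical.

```python
# Illustrative workflow fragment: binding track_activation_threshold to a
# workflow input selector instead of a fixed value.
workflow_fragment = {
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        {"type": "WorkflowParameter", "name": "activation_threshold", "default_value": 0.7},
    ],
    "steps": [
        {
            "type": "roboflow_core/trackers_botsort@v1",
            "name": "tracker",
            "image": "$inputs.image",
            "detections": "$steps.object_detection_model.predictions",
            # resolved at runtime from the workflow input rather than hard-coded
            "track_activation_threshold": "$inputs.activation_threshold",
        }
    ],
}
```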

Available Connections

Compatible Blocks

Check what blocks you can connect to BoT-SORT Tracker in version v1.

Input and Output Bindings

The available connections depend on the block's binding kinds. Check what binding kinds BoT-SORT Tracker in version v1 has.

Bindings
  • input

    • image (image): Input image with embedded video metadata (fps and video_identifier). Used to initialise and retrieve per-video tracker state. When camera motion compensation is enabled, frame pixels are read from this image.
    • detections (Union[instance_segmentation_prediction, object_detection_prediction, rle_instance_segmentation_prediction, keypoint_detection_prediction]): Detection predictions for the current frame to track.
    • minimum_iou_threshold_first_assoc (float_zero_to_one): Minimum fused similarity (IoU × confidence) for the first (high-confidence) association step. Default: 0.2.
    • minimum_iou_threshold_second_assoc (float_zero_to_one): Minimum IoU for the second (low-confidence) association step. Default: 0.5.
    • minimum_iou_threshold_unconfirmed_assoc (float_zero_to_one): Minimum fused similarity for matching unconfirmed tracks to remaining high-confidence detections. Default: 0.3.
    • minimum_consecutive_frames (integer): Number of consecutive frames a track must be matched before it is emitted as a confirmed track (tracker_id != -1). Default: 2.
    • lost_track_buffer (integer): Number of frames to keep a track alive after it loses its matched detection. Higher values improve occlusion recovery. Default: 30.
    • track_activation_threshold (float_zero_to_one): Minimum detection confidence required to spawn a new track. Detections below this threshold are not used to create new tracks. Default: 0.7.
    • high_conf_det_threshold (float_zero_to_one): Confidence threshold for high-confidence detections used in association. Default: 0.6.
    • enable_cmc (boolean): Enable camera motion compensation (uses per-frame image pixels). Recommended for moving cameras.
    • cmc_downscale (integer): Downscale factor applied inside CMC for speed and robustness. Default: 2.
    • instant_first_frame_activation (boolean): If true, tracks on the first frame receive IDs immediately (paper-style). Default false so new/already-seen outputs match other core trackers.
  • output

    • tracked_detections (Union[object_detection_prediction, instance_segmentation_prediction, keypoint_detection_prediction, rle_instance_segmentation_prediction]): Prediction returned as an sv.Detections(...) object: bounding boxes (object_detection_prediction), boxes with segmentation masks (instance_segmentation_prediction), boxes with keypoints (keypoint_detection_prediction), or boxes with RLE-encoded masks (rle_instance_segmentation_prediction). Contains all confirmed tracked detections with assigned track IDs.
    • new_instances (Union[object_detection_prediction, instance_segmentation_prediction, keypoint_detection_prediction, rle_instance_segmentation_prediction]): Same formats as tracked_detections; contains only detections whose track ID appears for the first time.
    • already_seen_instances (Union[object_detection_prediction, instance_segmentation_prediction, keypoint_detection_prediction, rle_instance_segmentation_prediction]): Same formats as tracked_detections; contains only detections whose track ID has been seen in a prior frame.
Example JSON definition of step BoT-SORT Tracker in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/trackers_botsort@v1",
    "image": "<block_does_not_provide_example>",
    "detections": "$steps.object_detection_model.predictions",
    "minimum_iou_threshold_first_assoc": 0.2,
    "minimum_iou_threshold_second_assoc": 0.5,
    "minimum_iou_threshold_unconfirmed_assoc": 0.3,
    "minimum_consecutive_frames": 2,
    "lost_track_buffer": 30,
    "track_activation_threshold": 0.7,
    "high_conf_det_threshold": 0.6,
    "enable_cmc": false,
    "cmc_method": "sparseOptFlow",
    "cmc_downscale": 2,
    "instant_first_frame_activation": false,
    "instances_cache_size": "<block_does_not_provide_example>"
}
```
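One way to run a workflow containing this step over a video is the inference package's InferencePipeline. The sketch below assumes an in-line workflow specification with a detection model step named object_detection_model feeding the tracker; the model id, output name, callback, and exact init_with_workflow parameters are assumptions to verify against the inference version you have installed.

```python
# Sketch: running a workflow with the BoT-SORT step over a video file.
# Workflow contents, model id, and parameter names below are illustrative.
from inference import InferencePipeline

WORKFLOW = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "type": "roboflow_core/roboflow_object_detection_model@v2",
            "name": "object_detection_model",
            "image": "$inputs.image",
            "model_id": "yolov8n-640",  # illustrative model id
        },
        {
            "type": "roboflow_core/trackers_botsort@v1",
            "name": "tracker",
            "image": "$inputs.image",
            "detections": "$steps.object_detection_model.predictions",
            "enable_cmc": True,
        },
    ],
    "outputs": [
        {"type": "JsonField", "name": "tracked", "selector": "$steps.tracker.tracked_detections"},
    ],
}

def on_prediction(result, video_frame):
    # "tracked" matches the output name declared above; tracker_id holds the track IDs.
    tracked = result["tracked"]
    print(video_frame.frame_id, tracked.tracker_id)

pipeline = InferencePipeline.init_with_workflow(
    video_reference="video.mp4",
    workflow_specification=WORKFLOW,
    on_prediction=on_prediction,
)
pipeline.start()
pipeline.join()
```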