ByteTrack Tracker

Class: ByteTrackBlockV1

Source: inference.core.workflows.core_steps.trackers.bytetrack.v1.ByteTrackBlockV1

Track objects across video frames using the ByteTrack algorithm from the roboflow/trackers package.

ByteTrack splits detections into high- and low-confidence pools and runs two rounds of IoU-based association. The first round matches high-confidence detections to existing tracks; the second recovers weak detections that overlap unmatched tracks. This makes ByteTrack particularly effective in dense environments where objects are frequently partially occluded and detector confidence fluctuates.
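The two association rounds described above can be sketched in a few lines of Python. This is a simplified illustration, not the roboflow/trackers implementation: it assumes track boxes are already predicted for the current frame (the motion model is omitted), it pairs greedily by IoU rather than solving an optimal assignment, and the helper names `iou`, `greedy_match`, and `two_stage_associate` are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(
    tracks: Dict[int, Box], dets: List[Tuple[int, Box]], min_iou: float
) -> Dict[int, int]:
    """Pair tracks with detections in descending IoU order.

    Returns {track_id: detection_index}. Greedy pairing is an assumption
    made for brevity; it stands in for an optimal assignment step.
    """
    pairs = sorted(
        ((iou(tb, db), tid, di) for tid, tb in tracks.items() for di, db in dets),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, di in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap even less
        if tid in matches or di in used:
            continue
        matches[tid] = di
        used.add(di)
    return matches


def two_stage_associate(
    tracks: Dict[int, Box],
    boxes: List[Box],
    scores: List[float],
    high_conf: float = 0.6,
    min_iou: float = 0.1,
) -> Dict[int, int]:
    """Round 1 uses high-confidence detections; round 2 offers the
    low-confidence detections to tracks left unmatched by round 1."""
    indexed = list(enumerate(zip(boxes, scores)))
    high = [(i, b) for i, (b, s) in indexed if s >= high_conf]
    low = [(i, b) for i, (b, s) in indexed if s < high_conf]

    first = greedy_match(tracks, high, min_iou)
    remaining = {tid: b for tid, b in tracks.items() if tid not in first}
    second = greedy_match(remaining, low, min_iou)
    return {**first, **second}
```

With two tracks at `(0, 0, 10, 10)` and `(20, 20, 30, 30)` and two slightly shifted detections scored 0.9 and 0.3, the first round claims the strong detection and the second round recovers the weak one, so both tracks stay matched.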

When to use ByteTrack:
- General-purpose tracking across diverse scenes.
- Dense or crowded environments with partial occlusions.
- Sports tracking and fast-moving objects (highest benchmark scores on SportsMOT).
- When your detector produces a mix of high- and low-confidence detections that you want to retain.

When to consider alternatives:
- For maximum simplicity and speed with a strong detector, use SORT.
- For scenes with heavy occlusion and non-linear motion, use OC-SORT.

Outputs three detection sets:
- tracked_detections: All confirmed tracked detections with assigned track IDs.
- new_instances: Detections whose track ID appears for the first time.
- already_seen_instances: Detections whose track ID has been seen in a prior frame.

The block maintains separate tracker state and instance cache per video_identifier, enabling multi-stream tracking within a single workflow.
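The split between new_instances and already_seen_instances, and the per-video isolation, can be pictured with a small bounded cache keyed by video_identifier. The sketch below is an assumption about how such a cache might look, not the block's actual code; `InstanceCache`, `caches`, and `categorise` are hypothetical names.

```python
from collections import OrderedDict, defaultdict
from typing import Dict, Iterable, List, Tuple


class InstanceCache:
    """Remembers up to `size` track IDs and evicts the oldest first (FIFO)."""

    def __init__(self, size: int = 16384) -> None:
        self.size = size
        self._seen: "OrderedDict[int, None]" = OrderedDict()

    def split(self, track_ids: Iterable[int]) -> Tuple[List[int], List[int]]:
        """Return (new_ids, already_seen_ids) and record the new ones."""
        new_ids: List[int] = []
        seen_ids: List[int] = []
        for tid in track_ids:
            if tid in self._seen:
                seen_ids.append(tid)
                continue
            new_ids.append(tid)
            self._seen[tid] = None
            if len(self._seen) > self.size:
                # Drop the ID that was inserted first.
                self._seen.popitem(last=False)
        return new_ids, seen_ids


# One cache per video stream, created on first use.
caches: Dict[str, InstanceCache] = defaultdict(InstanceCache)


def categorise(video_identifier: str, track_ids: Iterable[int]) -> Tuple[List[int], List[int]]:
    """Classify track IDs for one stream without touching other streams."""
    return caches[video_identifier].split(track_ids)
```

Because each stream has its own cache, a track ID that has already been seen on one camera is still reported as new the first time it appears on another. An ID evicted from a full cache is likewise reported as new if it reappears.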

Type identifier

Use the following identifier in the step "type" field: roboflow_core/trackers_bytetrack@v1 to add the block as a step in your workflow.

Properties

Name	Type	Description	Refs
`name`	`str`	Enter a unique identifier for this step.	❌
`minimum_iou_threshold`	`float`	Minimum IoU required to associate a detection with an existing track. Default: 0.1.	✅
`minimum_consecutive_frames`	`int`	Number of consecutive frames a track must be matched before it is emitted as a confirmed track (tracker_id != -1). Default: 2.	✅
`lost_track_buffer`	`int`	Number of frames to keep a track alive after it loses its matched detection. Higher values improve occlusion recovery. Default: 30.	✅
`track_activation_threshold`	`float`	Minimum detection confidence required to spawn a new track. Detections below this threshold are not used to create new tracks. Default: 0.7.	✅
`high_conf_det_threshold`	`float`	Confidence threshold for high-confidence detections used in association. Default: 0.6.	✅
`instances_cache_size`	`int`	Maximum number of track IDs retained in the instance cache for new/already-seen categorisation. Uses FIFO eviction. Default: 16384.	❌

The Refs column indicates whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.

Runtime compatibility

soft — runtime hosted_serverless, dedicated_deployment; execution remote; input video: Block keeps per-video state in process memory (keyed by video_metadata.video_identifier). With remote step execution on stateless or multi-replica HTTP runtimes, successive requests may be served by different worker processes, so the state resets between calls and the output is meaningless for tracking / counting / aggregation. Use local step execution in an InferencePipeline for stable cross-frame results.
soft — input image: Block depends on temporal context from video or repeated-frame workflows. With a still image/photo, there is no meaningful history to track, compare, aggregate, or visualize, so the block provides little or no benefit.

Available Connections

Compatible Blocks

Check what blocks you can connect to ByteTrack Tracker in version v1.

Input and Output Bindings

The available connections depend on the block's binding kinds. Check what binding kinds ByteTrack Tracker in version v1 has.

Bindings

input
- image (image): Input image with embedded video metadata (fps and video_identifier). Used to initialise and retrieve per-video tracker state.
- detections (Union[rle_instance_segmentation_prediction, instance_segmentation_prediction, keypoint_detection_prediction, object_detection_prediction]): Detection predictions for the current frame to track.
- minimum_iou_threshold (float_zero_to_one): Minimum IoU required to associate a detection with an existing track. Default: 0.1.
- minimum_consecutive_frames (integer): Number of consecutive frames a track must be matched before it is emitted as a confirmed track (tracker_id != -1). Default: 2.
- lost_track_buffer (integer): Number of frames to keep a track alive after it loses its matched detection. Higher values improve occlusion recovery. Default: 30.
- track_activation_threshold (float_zero_to_one): Minimum detection confidence required to spawn a new track. Detections below this threshold are not used to create new tracks. Default: 0.7.
- high_conf_det_threshold (float_zero_to_one): Confidence threshold for high-confidence detections used in association. Default: 0.6.
output
- tracked_detections (Union[object_detection_prediction, instance_segmentation_prediction, keypoint_detection_prediction, rle_instance_segmentation_prediction]): sv.Detections(...) object whose contents depend on the kind: bounding boxes for object_detection_prediction; bounding boxes and segmentation masks for instance_segmentation_prediction; bounding boxes and keypoints for keypoint_detection_prediction; bounding boxes and RLE-encoded segmentation masks for rle_instance_segmentation_prediction.
- new_instances (Union[object_detection_prediction, instance_segmentation_prediction, keypoint_detection_prediction, rle_instance_segmentation_prediction]): sv.Detections(...) object whose contents depend on the kind: bounding boxes for object_detection_prediction; bounding boxes and segmentation masks for instance_segmentation_prediction; bounding boxes and keypoints for keypoint_detection_prediction; bounding boxes and RLE-encoded segmentation masks for rle_instance_segmentation_prediction.
- already_seen_instances (Union[object_detection_prediction, instance_segmentation_prediction, keypoint_detection_prediction, rle_instance_segmentation_prediction]): sv.Detections(...) object whose contents depend on the kind: bounding boxes for object_detection_prediction; bounding boxes and segmentation masks for instance_segmentation_prediction; bounding boxes and keypoints for keypoint_detection_prediction; bounding boxes and RLE-encoded segmentation masks for rle_instance_segmentation_prediction.

Example JSON definition of step ByteTrack Tracker in version v1

{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/trackers_bytetrack@v1",
    "image": "<block_does_not_provide_example>",
    "detections": "$steps.object_detection_model.predictions",
    "minimum_iou_threshold": 0.1,
    "minimum_consecutive_frames": 2,
    "lost_track_buffer": 30,
    "track_activation_threshold": 0.7,
    "high_conf_det_threshold": 0.6,
    "instances_cache_size": "<block_does_not_provide_example>"
}
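
For context, the step above can be placed into a complete workflow definition. The sketch below builds one as a Python dictionary and serialises it to JSON. The step names `detector` and `tracker`, the output names, the detector block type `roboflow_core/roboflow_object_detection_model@v2`, and the `model_id` value are assumptions chosen for illustration; substitute the detector block and model you actually use.

```python
import json

# Hypothetical end-to-end workflow: detector -> ByteTrack tracker.
WORKFLOW_DEFINITION = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
    ],
    "steps": [
        {
            # Assumed detector block and placeholder model; not part of this page.
            "type": "roboflow_core/roboflow_object_detection_model@v2",
            "name": "detector",
            "images": "$inputs.image",
            "model_id": "yolov8n-640",
        },
        {
            "type": "roboflow_core/trackers_bytetrack@v1",
            "name": "tracker",
            "image": "$inputs.image",
            "detections": "$steps.detector.predictions",
            # Defaults from the Properties table, written out explicitly.
            "minimum_iou_threshold": 0.1,
            "minimum_consecutive_frames": 2,
            "lost_track_buffer": 30,
            "track_activation_threshold": 0.7,
            "high_conf_det_threshold": 0.6,
            "instances_cache_size": 16384,
        },
    ],
    "outputs": [
        {"type": "JsonField", "name": "tracked", "selector": "$steps.tracker.tracked_detections"},
        {"type": "JsonField", "name": "new", "selector": "$steps.tracker.new_instances"},
        {"type": "JsonField", "name": "seen", "selector": "$steps.tracker.already_seen_instances"},
    ],
}

# Serialise for use wherever a JSON workflow specification is expected.
workflow_json = json.dumps(WORKFLOW_DEFINITION, indent=4)
```

Because the tracker keeps state in process memory, a definition like this gives stable track IDs only when the frames of a video are processed by the same process, as noted under Runtime compatibility.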