Detections Stitch
Class: DetectionsStitchBlockV1
Source: inference.core.workflows.core_steps.fusion.detections_stitch.v1.DetectionsStitchBlockV1
Merge detections from multiple image slices or crops back into a single, unified detection result. The block converts coordinates from slice/crop space to original image coordinates, combines all detections, and optionally filters overlapping ones. This supports SAHI workflows, multi-stage detection pipelines, and other coordinate-space merging workflows in which detections from sub-images must be reconstructed as if they had been made on the original image.
How This Block Works
This block merges detections that were made on multiple sub-parts (slices or crops) of the same input image, reconstructing them as a single detection result in the original image coordinate space. The block:
- Receives reference image and slice/crop predictions:
  - Takes the original reference image that was sliced or cropped
  - Receives predictions from detection models that processed each slice/crop
  - Predictions must contain parent coordinate metadata indicating slice/crop position
- Retrieves crop offsets for each detection:
  - Extracts parent coordinates from each detection's metadata
  - Gets the offset (x, y position) indicating where each slice/crop was located in the original image
  - Uses this offset to transform coordinates from slice space to original image space
- Manages crop metadata:
  - Updates image dimensions in detection metadata to match reference image dimensions
  - Validates that detections were not scaled (scaled detections are not supported)
  - Attaches parent coordinate information to detections for proper coordinate transformation
- Transforms coordinates to original image space:
  - Moves bounding box coordinates (xyxy) from slice/crop coordinates to original image coordinates
  - Transforms segmentation masks from slice/crop space to original image space (if present)
  - Applies offset to align detections with their position in the original image
- Merges all transformed detections:
  - Combines all re-aligned detections from all slices/crops into a single detection result
  - Creates unified detection output containing all detections from all sub-images
- Applies overlap filtering (optional):
  - None strategy: Returns all merged detections without filtering (may contain duplicates from overlapping slices)
  - NMS (Non-Maximum Suppression): Removes lower-confidence detections when IoU exceeds threshold, keeping only the highest confidence detection for each overlapping region
  - NMM (Non-Maximum Merge): Combines overlapping detections instead of discarding them, merging detections that exceed IoU threshold
- Returns merged detections:
  - Outputs unified detection result in original image coordinate space
  - Reduces dimensionality by 1 (multiple slice detections → single image detections)
  - All detections are now referenced to the original image dimensions and coordinates
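The offset and merge steps above can be sketched with plain NumPy. This is a minimal illustration under stated assumptions, not the block's actual implementation: the helper names `shift_boxes_to_parent` and `stitch_boxes` are hypothetical, boxes are assumed to be unscaled `xyxy` arrays, and each slice's offset is assumed to be the (x, y) position of its top-left corner in the original image.

```python
import numpy as np


def shift_boxes_to_parent(xyxy: np.ndarray, offset_xy: tuple) -> np.ndarray:
    """Translate slice-space boxes (x_min, y_min, x_max, y_max) into parent-image space."""
    dx, dy = offset_xy
    # The same offset applies to both corners of every box.
    return xyxy + np.array([dx, dy, dx, dy], dtype=xyxy.dtype)


def stitch_boxes(per_slice_boxes: list, per_slice_offsets: list) -> np.ndarray:
    """Shift each slice's boxes by that slice's offset and concatenate the results."""
    shifted = [
        shift_boxes_to_parent(boxes, offset)
        for boxes, offset in zip(per_slice_boxes, per_slice_offsets)
        if len(boxes) > 0  # skip slices with no detections
    ]
    if not shifted:
        return np.empty((0, 4), dtype=np.float64)
    return np.concatenate(shifted, axis=0)


# Hypothetical example: two slices, the second starting 640 px to the right.
slice_a = np.array([[10.0, 20.0, 50.0, 60.0]])
slice_b = np.array([[5.0, 5.0, 25.0, 30.0]])
merged = stitch_boxes([slice_a, slice_b], [(0, 0), (640, 0)])
print(merged)
```

The second box moves from x = 5..25 in slice space to x = 645..665 in the original image, while the first box is unchanged because its slice starts at the origin.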
This block is essential for SAHI (Slicing Aided Hyper Inference) workflows, where an image is sliced, each slice is processed separately, and the results need to be merged back. Overlapping slices can produce duplicate detections for the same object, so overlap filtering (NMS/NMM) helps clean up these duplicates. The coordinate transformation ensures that detection coordinates are positioned relative to the original image, not the slices.
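To make the difference between the NMS and NMM strategies concrete, here is a simplified, class-agnostic sketch. It is an illustration only: the functions `box_iou`, `greedy_nms`, and `greedy_nmm` are hypothetical names, the merge rule (taking the enclosing box of the merged pair) is an assumption, and the block's own filtering may differ in details such as per-class handling.

```python
import numpy as np


def box_iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection over Union of two xyxy boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return float(inter / union) if union > 0 else 0.0


def greedy_nms(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float) -> list:
    """Return indices of kept boxes: drop any box overlapping a higher-scoring kept box."""
    keep = []
    for i in np.argsort(-scores):  # highest confidence first
        if all(box_iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(int(i))
    return keep


def greedy_nmm(boxes: np.ndarray, scores: np.ndarray, iou_threshold: float) -> list:
    """Return merged boxes: overlapping boxes are absorbed into an enclosing box."""
    merged = []
    for i in np.argsort(-scores):
        for k, group in enumerate(merged):
            if box_iou(group, boxes[i]) > iou_threshold:
                # Assumed merge rule: smallest box enclosing both.
                merged[k] = np.array([
                    min(group[0], boxes[i][0]), min(group[1], boxes[i][1]),
                    max(group[2], boxes[i][2]), max(group[3], boxes[i][3]),
                ])
                break
        else:
            merged.append(boxes[i].copy())
    return merged


# Two near-duplicate boxes (as produced by overlapping slices) and one separate box.
boxes = np.array([
    [100.0, 100.0, 200.0, 200.0],
    [110.0, 100.0, 210.0, 200.0],
    [400.0, 400.0, 450.0, 450.0],
])
scores = np.array([0.9, 0.6, 0.8])

kept_indices = greedy_nms(boxes, scores, iou_threshold=0.3)
merged_boxes = greedy_nmm(boxes, scores, iou_threshold=0.3)
print(kept_indices)
print(merged_boxes)
```

The first two boxes have an IoU of roughly 0.82, well above the 0.3 threshold. NMS keeps only the higher-confidence one, whereas NMM replaces the pair with a single box spanning both.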
Common Use Cases
- SAHI Workflows: Complete the SAHI technique by merging detections from image slices back into original image coordinates (e.g., merge slice detections from SAHI processing, reconstruct full-image detections from slices, combine small object detection results)
- Multi-Stage Detection: Merge detections from secondary high-resolution models applied to dynamically cropped regions (e.g., coarse detection → crop → precise detection → merge, two-stage detection pipelines, hierarchical detection workflows)
- Small Object Detection: Combine detection results from sliced images that were processed separately (e.g., merge detections from aerial image slices, combine slice detection results, reconstruct detections from tiled images)
- High-Resolution Processing: Merge detections from high-resolution images processed in smaller chunks (e.g., merge detections from satellite image tiles, combine results from medical image regions, reconstruct detections from large image segments)
- Coordinate Space Unification: Convert detections from multiple coordinate spaces (slice/crop space) to a single coordinate space (original image space) for consistent downstream processing (e.g., unify detection coordinates, standardize detection positions)
- Overlapping Region Handling: Handle duplicate detections from overlapping slices or crops by applying overlap filtering (e.g., remove duplicate detections from overlapping slices, merge overlapping detections)
Connecting to Other Blocks
This block receives slice/crop predictions and reference images, and produces merged detections:
- After detection models in SAHI workflows, following the Image Slicer → Detection Model → Detections Stitch pattern, to merge slice detections (e.g., merge SAHI slice detections, reconstruct full-image detections, combine slice results)
- After secondary detection models in multi-stage pipelines, following the Dynamic Crop → Detection Model → Detections Stitch pattern, to merge cropped detections (e.g., merge cropped region detections, combine two-stage detection results, unify multi-stage outputs)
- Before visualization blocks, to visualize merged detection results on the original image (e.g., display stitched results, show unified detection output)
- Before filtering or analytics blocks, to process merged detection results (e.g., filter merged detections, analyze stitched results)
- Before sink or storage blocks, to store or export merged detection results (e.g., save merged detections, export stitched results)
- In workflow outputs, to provide merged detections as the final workflow output (e.g., return merged detections, output stitched results)
Requirements
This block requires a reference image (the original image that was sliced/cropped) and predictions from detection models that processed the slices/crops. The predictions must contain parent coordinate metadata (PARENT_COORDINATES_KEY) indicating the position of each slice/crop in the original image. The block does not support scaled detections (detections that were resized relative to the parent image). Predictions should come from object detection or instance segmentation models. The block supports three overlap filtering strategies: "none" (no filtering, may include duplicates), "nms" (Non-Maximum Suppression, removes lower-confidence overlapping detections, default), and "nmm" (Non-Maximum Merge, combines overlapping detections). The IoU threshold (default 0.3) determines when detections are considered overlapping for filtering purposes. For more information on the SAHI technique, see: https://ieeexplore.ieee.org/document/9897990.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/detections_stitch@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| overlap_filtering_strategy | str | Strategy for handling overlapping detections when merging results from overlapping slices/crops. 'none': No filtering applied, all detections are kept (may include duplicates from overlapping regions). 'nms' (Non-Maximum Suppression, default): Removes lower-confidence detections when IoU exceeds threshold, keeping only the highest confidence detection for each overlapping region. 'nmm' (Non-Maximum Merge): Combines overlapping detections instead of discarding them, merging detections that exceed IoU threshold. Use 'none' when you want to preserve all detections, 'nms' to remove duplicates (recommended for most cases), or 'nmm' to combine overlapping detections. | ✅ |
| iou_threshold | float | Intersection over Union (IoU) threshold for overlap filtering. Range: 0.0 to 1.0. When overlap filtering strategy is 'nms' or 'nmm', detections with IoU above this threshold are considered overlapping. For NMS: overlapping detections with IoU above threshold result in lower-confidence detection being removed. For NMM: overlapping detections with IoU above threshold are merged. Lower values (e.g., 0.2-0.3) are more aggressive, removing/merging more detections. Higher values (e.g., 0.5-0.7) are more permissive, only handling highly overlapping detections. Default 0.3 works well for most use cases with overlapping slices. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
Available Connections
Compatible Blocks
Check what blocks you can connect to Detections Stitch in version v1.
- inputs:
VLM As Classifier,Line Counter,MoonshotAI Kimi,Stability AI Image Generation,Trace Visualization,Path Deviation,Anthropic Claude,Per-Class Confidence Filter,Icon Visualization,SIFT Comparison,Morphological Transformation,Color Visualization,LMM For Classification,Perspective Correction,Corner Visualization,Roboflow Custom Metadata,Detections Merge,Halo Visualization,Dynamic Zone,Qwen-VL,Email Notification,Halo Visualization,Object Detection Model,Google Gemma,Background Color Visualization,Ellipse Visualization,Email Notification,Twilio SMS/MMS Notification,Text Display,Polygon Visualization,Crop Visualization,Absolute Static Crop,Image Preprocessing,Template Matching,Model Monitoring Inference Aggregator,Relative Static Crop,OpenRouter,OpenAI,VLM As Detector,Florence-2 Model,OCR Model,Heatmap Visualization,Motion Detection,OpenAI,Detections Filter,Blur Visualization,Depth Estimation,Instance Segmentation Model,Stability AI Outpainting,Anthropic Claude,YOLO-World Model,Google Gemini,Clip Comparison,Google Gemini,Background Subtraction,Keypoint Visualization,CSV Formatter,Webhook Sink,Byte Tracker,Stitch Images,Florence-2 Model,Current Time,Detections List Roll-Up,Contrast Equalization,Mask Edge Snap,OpenAI,Moondream2,VLM As Detector,Google Gemini,Triangle Visualization,Slack Notification,Overlap Filter,Time in Zone,Detections Stabilizer,SIFT,Local File Sink,Image Contours,Keypoint Detection Model,GLM-OCR,Roboflow Asset Library Attributes,Image Slicer,Polygon Zone Visualization,Contrast Enhancement,Time in Zone,Google Gemma API,Stitch OCR Detections,Image Threshold,Line Counter Visualization,Camera Calibration,QR Code Generator,Detection Offset,ByteTrack Tracker,Detection Event Log,Detections Transformation,S3 Sink,Microsoft SQL Server Sink,Mask Area Measurement,Google Vision OCR,Twilio SMS Notification,Image Blur,Detections Combine,Morphological Transformation,Camera Focus,Roboflow Vision Events,Stability AI Inpainting,PTZ Tracking (ONVIF),Classification Label Visualization,Bounding Rectangle,SAM2 Video Tracker,Stitch OCR Detections,Event Writer,Grid Visualization,Qwen3.5-VL,Mask Visualization,Byte Tracker,Llama 3.2 Vision,Reference Path Visualization,Image Slicer,Label Visualization,Velocity,Identify Outliers,Byte Tracker,OPC UA Writer Sink,Dot Visualization,Identify Changes,Dynamic Crop,Detections Stitch,Circle Visualization,Path Deviation,BoT-SORT Tracker,SAM3 Video Tracker,Camera Focus,Llama 3.2 Vision,Segment Anything 2 Model,OpenAI-Compatible LLM,MoonshotAI Kimi,Single-Label Classification Model,CogVLM,Object Detection Model,SAM 3 Interactive,Qwen 3.6 API,Detections Consensus,Bounding Box Visualization,Multi-Label Classification Model,LMM,SAM 3,OpenAI,Image Convert Grayscale,Instance Segmentation Model,Roboflow Visual Search,EasyOCR,Roboflow Dataset Upload,SAM 3,Detections Classes Replacement,Instance Segmentation Model,Pixelate Visualization,Instance Segmentation Model,SORT Tracker,Roboflow Dataset Upload,PLC Writer,Track Class Lock,Qwen 3.5 API,Object Detection Model,Anthropic Claude,Time in Zone,MQTT Writer,Polygon Visualization,OC-SORT Tracker,SAM 3,Model Comparison Visualization,Seg Preview
- outputs:
Line Counter,Time in Zone,Stitch OCR Detections,Trace Visualization,Path Deviation,Distance Measurement,Detection Offset,ByteTrack Tracker,Detection Event Log,Per-Class Confidence Filter,Icon Visualization,Detections Transformation,Color Visualization,Perspective Correction,Corner Visualization,Mask Area Measurement,Roboflow Custom Metadata,Detections Merge,Halo Visualization,Dynamic Zone,Detections Combine,Size Measurement,Roboflow Vision Events,Halo Visualization,PTZ Tracking (ONVIF),Stability AI Inpainting,Stitch OCR Detections,SAM2 Video Tracker,Bounding Rectangle,Event Writer,Background Color Visualization,Byte Tracker,Mask Visualization,Ellipse Visualization,Velocity,Label Visualization,Byte Tracker,Dot Visualization,Polygon Visualization,Crop Visualization,Path Deviation,Detections Stitch,Dynamic Crop,Circle Visualization,BoT-SORT Tracker,Model Monitoring Inference Aggregator,Camera Focus,Segment Anything 2 Model,Florence-2 Model,Heatmap Visualization,Detections Filter,Overlap Analysis,Blur Visualization,SAM 3 Interactive,Detections Consensus,Byte Tracker,Bounding Box Visualization,Florence-2 Model,Detections List Roll-Up,Mask Edge Snap,Line Counter,Triangle Visualization,Overlap Filter,Roboflow Dataset Upload,Time in Zone,Detections Classes Replacement,Pixelate Visualization,Detections Stabilizer,SORT Tracker,Roboflow Dataset Upload,Track Class Lock,Time in Zone,Polygon Visualization,OC-SORT Tracker,Model Comparison Visualization
Input and Output Bindings
The available connections depend on the block's binding kinds. The binding kinds of
Detections Stitch in version v1 are listed below.
Bindings
- input
  - reference_image (image): Original reference image that was sliced or cropped to produce the input predictions. This image is used to determine the target coordinate space and image dimensions for the merged detections. All detection coordinates will be transformed to match this reference image's coordinate system. The same image that was provided to Image Slicer or Dynamic Crop blocks should be used here to ensure proper coordinate alignment.
  - predictions (Union[object_detection_prediction, instance_segmentation_prediction]): Model predictions (object detection or instance segmentation) from detection models that processed image slices or crops. These predictions must contain parent coordinate metadata indicating the position of each slice/crop in the original image. Predictions are collected from multiple slices/crops and merged into a single unified detection result. The block converts coordinates from slice/crop space to original image space and combines all detections.
  - overlap_filtering_strategy (string): Strategy for handling overlapping detections when merging results from overlapping slices/crops. 'none': No filtering applied, all detections are kept (may include duplicates from overlapping regions). 'nms' (Non-Maximum Suppression, default): Removes lower-confidence detections when IoU exceeds threshold, keeping only the highest confidence detection for each overlapping region. 'nmm' (Non-Maximum Merge): Combines overlapping detections instead of discarding them, merging detections that exceed IoU threshold. Use 'none' when you want to preserve all detections, 'nms' to remove duplicates (recommended for most cases), or 'nmm' to combine overlapping detections.
  - iou_threshold (float_zero_to_one): Intersection over Union (IoU) threshold for overlap filtering. Range: 0.0 to 1.0. When overlap filtering strategy is 'nms' or 'nmm', detections with IoU above this threshold are considered overlapping. For NMS: overlapping detections with IoU above threshold result in lower-confidence detection being removed. For NMM: overlapping detections with IoU above threshold are merged. Lower values (e.g., 0.2-0.3) are more aggressive, removing/merging more detections. Higher values (e.g., 0.5-0.7) are more permissive, only handling highly overlapping detections. Default 0.3 works well for most use cases with overlapping slices.
- output
  - predictions (Union[object_detection_prediction, instance_segmentation_prediction]): Prediction with detected bounding boxes in form of sv.Detections(...) object if object_detection_prediction, or prediction with detected bounding boxes and segmentation masks in form of sv.Detections(...) object if instance_segmentation_prediction.
Example JSON definition of step Detections Stitch in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/detections_stitch@v1",
    "reference_image": "$inputs.image",
    "predictions": "$steps.object_detection.predictions",
    "overlap_filtering_strategy": "none",
    "iou_threshold": 0.2
}
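For context, the step above could sit at the end of an Image Slicer → Detection Model → Detections Stitch pipeline. The sketch below builds such a workflow definition as a Python dictionary. Only the Detections Stitch fields come from this page. The surrounding structure is an assumption and should be checked against each block's own documentation: the top-level "version", "inputs", and "outputs" layout, the `roboflow_core/image_slicer@v1` and `roboflow_core/roboflow_object_detection_model@v2` identifiers, the `slices` output name, and the `images` and `model_id` fields. The model ID is left as a placeholder.

```python
import json

# Hypothetical SAHI-style workflow definition; everything except the
# detections_stitch step is an assumption for illustration.
workflow_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "type": "roboflow_core/image_slicer@v1",  # assumed identifier
            "name": "image_slicer",
            "image": "$inputs.image",
        },
        {
            "type": "roboflow_core/roboflow_object_detection_model@v2",  # assumed identifier
            "name": "object_detection",
            "images": "$steps.image_slicer.slices",  # assumed output name
            "model_id": "<your_model_id_here>",
        },
        {
            "type": "roboflow_core/detections_stitch@v1",
            "name": "detections_stitch",
            # Must be the same image that was passed to the slicer.
            "reference_image": "$inputs.image",
            "predictions": "$steps.object_detection.predictions",
            "overlap_filtering_strategy": "nms",
            "iou_threshold": 0.3,
        },
    ],
    "outputs": [
        {
            "type": "JsonField",
            "name": "predictions",
            "selector": "$steps.detections_stitch.predictions",
        }
    ],
}

print(json.dumps(workflow_definition, indent=2))
```

Here the stitch step uses the documented defaults ("nms" with an IoU threshold of 0.3) rather than the "none" strategy shown in the example above, since overlapping slices typically produce duplicate detections.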