Stitch OCR Detections¶
Class: StitchOCRDetectionsBlockV1
Combines OCR detection results into a coherent text string by organizing detections spatially. This transformation is perfect for turning individual OCR results into structured, readable text!
How It Works¶
This transformation reconstructs the original text from OCR detection results by:
- 📐 Grouping text detections into rows based on their vertical (y) positions
- 📏 Sorting detections within each row by horizontal (x) position
- 📜 Concatenating the text in reading order (left-to-right, top-to-bottom)
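The block's actual implementation is not reproduced on this page, but the steps above can be sketched in a few lines of Python. In the sketch below the detection format (dicts carrying box-center x/y coordinates and the recognized text) and the function name are assumptions made for illustration, and only the default left-to-right direction is handled.

```python
# Minimal sketch of the stitching logic described above. The input format
# (dicts with "x", "y" box centers and "text") is an assumption for this
# example, not the block's internal representation.
def stitch_left_to_right(detections, tolerance=10, delimiter=" "):
    # Sort by vertical position so rows can be built in a single pass.
    ordered = sorted(detections, key=lambda d: d["y"])

    rows, current_row = [], []
    for det in ordered:
        # Start a new row when the vertical gap exceeds the tolerance.
        if current_row and abs(det["y"] - current_row[-1]["y"]) > tolerance:
            rows.append(current_row)
            current_row = []
        current_row.append(det)
    if current_row:
        rows.append(current_row)

    # Read each row left-to-right, then join rows top-to-bottom.
    lines = [
        delimiter.join(d["text"] for d in sorted(row, key=lambda d: d["x"]))
        for row in rows
    ]
    return "\n".join(lines)


print(stitch_left_to_right([
    {"x": 120, "y": 52, "text": "WORLD"},
    {"x": 40, "y": 50, "text": "HELLO"},
    {"x": 45, "y": 110, "text": "2024"},
]))
# HELLO WORLD
# 2024
```

Conceptually, the other reading directions only change which axis is used for grouping and the sort order within each group; the grouping idea stays the same.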
Parameters¶
- tolerance: Controls how close detections need to be vertically to be considered part of the same line of text. A higher tolerance will group detections that are further apart vertically.
- reading_direction: Determines the order in which text is read. Available options:
    - "left_to_right": Standard left-to-right reading (e.g., English) ➡️
    - "right_to_left": Right-to-left reading (e.g., Arabic) ⬅️
    - "vertical_top_to_bottom": Vertical reading from top to bottom ⬇️
    - "vertical_bottom_to_top": Vertical reading from bottom to top ⬆️
    - "auto": Automatically detects the reading direction based on the spatial arrangement of text elements.
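To make tolerance concrete, here is a short continuation of the sketch above (same assumed detection format): two detections whose box centers sit 8 px apart vertically land on one line with tolerance=10 but on separate lines with tolerance=5.

```python
# Continuation of the stitch_left_to_right sketch from the previous section.
detections = [
    {"x": 10, "y": 100, "text": "TOTAL:"},
    {"x": 90, "y": 108, "text": "$42.00"},  # 8 px lower, e.g. a slightly skewed scan
]

print(stitch_left_to_right(detections, tolerance=10))  # -> "TOTAL: $42.00"
print(stitch_left_to_right(detections, tolerance=5))   # -> "TOTAL:\n$42.00"
```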
Why Use This Transformation?¶
This is especially useful for:
- 📖 Converting individual character/word detections into a readable text block
- 📝 Reconstructing multi-line text from OCR results
- 🔀 Maintaining proper reading order for detected text elements
- 🌏 Supporting different writing systems and text orientations
Example Usage¶
Use this transformation after an OCR model that outputs individual words or characters, so you can reconstruct the original text layout in its intended format.
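As a hedged sketch of that wiring, the fragment below shows only the "steps" portion of a workflow definition, written as a Python literal. The upstream step is a placeholder: its type identifier and remaining inputs depend on which OCR or text-detection block you use, and only the stitch step's fields are taken from this page.

```python
# Sketch of the "steps" fragment of a workflow definition. The first step is
# a hypothetical placeholder for your OCR / text-detection block.
steps = [
    {
        "type": "<your_ocr_detection_block_type_here>",  # placeholder, not a real identifier
        "name": "my_ocr_detection_model",
        # ... plus this block's own inputs (e.g. the image to run OCR on)
    },
    {
        "type": "roboflow_core/stitch_ocr_detections@v1",
        "name": "ocr_stitcher",
        "predictions": "$steps.my_ocr_detection_model.predictions",
        "reading_direction": "left_to_right",
        "tolerance": 10,
        "delimiter": " ",
    },
]
```

Downstream steps (or workflow outputs) can then reference the stitched string via the block's ocr_text output, e.g. "$steps.ocr_stitcher.ocr_text".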
Type identifier¶
Use the following identifier in the step "type" field to add the block as a step in your workflow: `roboflow_core/stitch_ocr_detections@v1`
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| reading_direction | str | The direction of the text in the image. | ❌ |
| tolerance | int | The tolerance for grouping detections into the same line of text. | ✅ |
| delimiter | str | The delimiter to use for stitching text. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Stitch OCR Detections in version v1.
- inputs: VLM as Detector, Byte Tracker, Google Vision OCR, Overlap Filter, Detections Stabilizer, Time in Zone, SIFT Comparison, Image Contours, VLM as Detector, CSV Formatter, OpenAI, Detections Filter, Detections Classes Replacement, Perspective Correction, LMM For Classification, Twilio SMS Notification, VLM as Classifier, Single-Label Classification Model, Roboflow Dataset Upload, Detections Combine, Roboflow Dataset Upload, Stitch OCR Detections, Model Monitoring Inference Aggregator, Template Matching, Moondream2, Velocity, Webhook Sink, OpenAI, OCR Model, Distance Measurement, Florence-2 Model, Detections Transformation, EasyOCR, SIFT Comparison, Multi-Label Classification Model, Florence-2 Model, Time in Zone, Detection Offset, Slack Notification, Clip Comparison, Instance Segmentation Model, Detections Merge, Path Deviation, Line Counter, Byte Tracker, PTZ Tracking (ONVIF), Object Detection Model, OpenAI, Keypoint Detection Model, Google Gemini, Anthropic Claude, LMM, Email Notification, Google Gemini, Byte Tracker, Llama 3.2 Vision, Pixel Color Count, Dynamic Crop, Path Deviation, Detections Consensus, Line Counter, YOLO-World Model, Email Notification, Local File Sink, Time in Zone, CogVLM, Roboflow Custom Metadata, Detections Stitch, Object Detection Model
- outputs: Google Vision OCR, SAM 3, Classification Label Visualization, Circle Visualization, Image Preprocessing, LMM For Classification, Ellipse Visualization, Triangle Visualization, Stability AI Inpainting, QR Code Generator, Background Color Visualization, Model Monitoring Inference Aggregator, Segment Anything 2 Model, Moondream2, Distance Measurement, Dot Visualization, Florence-2 Model, Morphological Transformation, Reference Path Visualization, Halo Visualization, SIFT Comparison, Polygon Visualization, Florence-2 Model, Slack Notification, Clip Comparison, Perception Encoder Embedding Model, Instance Segmentation Model, OpenAI, Color Visualization, Line Counter, PTZ Tracking (ONVIF), Google Gemini, Label Visualization, Email Notification, Llama 3.2 Vision, Trace Visualization, Line Counter, YOLO-World Model, Size Measurement, Email Notification, Corner Visualization, Mask Visualization, Time in Zone, OpenAI, Roboflow Custom Metadata, Stability AI Outpainting, CogVLM, Stitch OCR Detections, Detections Stitch, Cache Set, Time in Zone, Crop Visualization, OpenAI, Detections Classes Replacement, Perspective Correction, Twilio SMS Notification, Seg Preview, Contrast Equalization, Roboflow Dataset Upload, Roboflow Dataset Upload, Polygon Zone Visualization, CLIP Embedding Model, Stability AI Image Generation, Webhook Sink, Bounding Box Visualization, Line Counter Visualization, Instance Segmentation Model, Icon Visualization, Image Blur, Time in Zone, Image Threshold, Path Deviation, Anthropic Claude, LMM, Google Gemini, Pixel Color Count, Dynamic Crop, Path Deviation, Model Comparison Visualization, Cache Get, Local File Sink, Keypoint Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check which binding kinds
Stitch OCR Detections in version v1 has.
Bindings
- input
    - predictions (object_detection_prediction): The output of an OCR detection model.
    - tolerance (integer): The tolerance for grouping detections into the same line of text.
    - delimiter (string): The delimiter to use for stitching text.
- output
    - ocr_text (string): String value.
Example JSON definition of step Stitch OCR Detections in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/stitch_ocr_detections@v1",
    "predictions": "$steps.my_ocr_detection_model.predictions",
    "reading_direction": "right_to_left",
    "tolerance": 10,
    "delimiter": ""
}
```