Stitch OCR Detections
Class: StitchOCRDetectionsBlockV1
Combines OCR detection results into a coherent text string by organizing detections spatially. This transformation is perfect for turning individual OCR results into structured, readable text!
How It Works
This transformation reconstructs the original text from OCR detection results by:
- 📐 Grouping text detections into rows based on their vertical (`y`) positions
- 📏 Sorting detections within each row by horizontal (`x`) position
- 📜 Concatenating the text in reading order (left-to-right, top-to-bottom)
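The three steps above can be sketched in plain Python. This is a minimal illustration, not the block's actual implementation: the `Detection` tuple, the function name `stitch_left_to_right`, the choice to compare each detection against the first detection of the current row, and the newline between rows are all assumptions made for the example.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical stand-in for one OCR detection (box center in pixels)."""
    text: str
    x: float
    y: float


def stitch_left_to_right(
    detections: List[Detection], tolerance: int = 10, delimiter: str = ""
) -> str:
    """Group detections into rows by y, sort each row by x, then join."""
    rows: List[List[Detection]] = []
    # Visit detections top to bottom so rows are created in reading order.
    for det in sorted(detections, key=lambda d: d.y):
        # Assumption: a detection joins the current row when its y is within
        # `tolerance` pixels of the first detection in that row.
        if rows and abs(det.y - rows[-1][0].y) <= tolerance:
            rows[-1].append(det)
        else:
            rows.append([det])

    # Sort each row left to right and join its texts with the delimiter.
    lines = [
        delimiter.join(d.text for d in sorted(row, key=lambda d: d.x))
        for row in rows
    ]
    # Assumption: rows are separated by a newline in the final string.
    return "\n".join(lines)


words = [
    Detection("world", 60, 12),
    Detection("Hello", 10, 10),
    Detection("line", 55, 41),
    Detection("Second", 5, 40),
]
print(stitch_left_to_right(words, tolerance=10, delimiter=" "))
```

With these sample detections, the two words near `y = 10` form the first row and the two near `y = 40` form the second, so the sketch prints `Hello world` followed by `Second line`.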
Parameters
- `tolerance`: Controls how close detections need to be vertically to be considered part of the same line of text. A higher tolerance will group detections that are further apart vertically.
- `reading_direction`: Determines the order in which text is read. Available options:
    - `"left_to_right"`: Standard left-to-right reading (e.g., English) ➡️
    - `"right_to_left"`: Right-to-left reading (e.g., Arabic) ⬅️
    - `"vertical_top_to_bottom"`: Vertical reading from top to bottom ⬇️
    - `"vertical_bottom_to_top"`: Vertical reading from bottom to top ⬆️
    - `"auto"`: Automatically detects the reading direction based on the spatial arrangement of text elements.
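One way to picture how `tolerance` and the four explicit `reading_direction` values might interact is the sketch below. It is an assumption-laden illustration rather than the block's real code: the `Detection` tuple, the `stitch` function, the idea that vertical directions group detections into columns by `x` (ordered left to right), and the newline between groups are all hypothetical. The `"auto"` option is deliberately left out because this page does not describe its heuristic.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    """Hypothetical stand-in for one OCR detection (box center in pixels)."""
    text: str
    x: float
    y: float


# reading_direction -> (group into columns?, reverse order inside a group?)
DIRECTIONS = {
    "left_to_right": (False, False),
    "right_to_left": (False, True),
    "vertical_top_to_bottom": (True, False),
    "vertical_bottom_to_top": (True, True),
}


def stitch(
    detections: List[Detection],
    reading_direction: str = "left_to_right",
    tolerance: int = 10,
    delimiter: str = "",
) -> str:
    if reading_direction not in DIRECTIONS:
        # "auto" is not sketched here; its detection logic is not documented above.
        raise ValueError(f"unsupported reading_direction: {reading_direction!r}")
    vertical, reverse = DIRECTIONS[reading_direction]

    # Horizontal text: group by y, order by x. Vertical text: the opposite.
    group_key = (lambda d: d.x) if vertical else (lambda d: d.y)
    order_key = (lambda d: d.y) if vertical else (lambda d: d.x)

    groups: List[List[Detection]] = []
    for det in sorted(detections, key=group_key):
        # A higher tolerance merges detections that lie further apart.
        if groups and abs(group_key(det) - group_key(groups[-1][0])) <= tolerance:
            groups[-1].append(det)
        else:
            groups.append([det])

    return "\n".join(
        delimiter.join(d.text for d in sorted(g, key=order_key, reverse=reverse))
        for g in groups
    )


grid = [
    Detection("A", 10, 10), Detection("B", 60, 12),
    Detection("C", 10, 50), Detection("D", 60, 52),
]
for direction in DIRECTIONS:
    print(direction, repr(stitch(grid, direction)))
```

For the 2×2 grid above, the horizontal directions read the rows `A B` and `C D` (forwards or backwards), while the vertical directions read the columns `A C` and `B D` (downwards or upwards).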
Why Use This Transformation?
This is especially useful for:
- 📖 Converting individual character/word detections into a readable text block
- 📝 Reconstructing multi-line text from OCR results
- 🔀 Maintaining proper reading order for detected text elements
- 🌏 Supporting different writing systems and text orientations
Example Usage
Use this transformation after an OCR model that outputs individual words or characters, so you can reconstruct the original text layout in its intended format.
Type identifier
To add the block as a step in your workflow, use the following identifier in the step "type" field: `roboflow_core/stitch_ocr_detections@v1`
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `reading_direction` | `str` | The direction of the text in the image. | ❌ |
| `tolerance` | `int` | The tolerance for grouping detections into the same line of text. | ✅ |
| `delimiter` | `str` | The delimiter to use for stitching text. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
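As a rough illustration of what a ✅ in the Refs column allows, the sketch below builds the step as a Python dictionary and binds `tolerance` and `delimiter` to workflow inputs through `$inputs.<name>` selectors, while `reading_direction` stays a literal. The input names, the step name `stitch`, and the `WorkflowParameter` input entries are assumptions chosen for the example, not something this page specifies.

```python
import json

# Hypothetical workflow-level inputs that supply values at runtime.
workflow_inputs = [
    {"type": "WorkflowParameter", "name": "tolerance", "default_value": 10},
    {"type": "WorkflowParameter", "name": "delimiter", "default_value": " "},
]

stitch_step = {
    "name": "stitch",  # illustrative step name
    "type": "roboflow_core/stitch_ocr_detections@v1",
    # Placeholder upstream step name, as used in the example on this page.
    "predictions": "$steps.my_ocr_detection_model.predictions",
    "reading_direction": "left_to_right",  # Refs ❌: literal value only
    "tolerance": "$inputs.tolerance",      # Refs ✅: bound to a runtime input
    "delimiter": "$inputs.delimiter",      # Refs ✅: bound to a runtime input
}

# Sanity check: every $inputs selector points at a declared input.
declared = {item["name"] for item in workflow_inputs}
bound = {
    value.split(".", 1)[1]
    for value in stitch_step.values()
    if isinstance(value, str) and value.startswith("$inputs.")
}
missing = bound - declared

print(json.dumps(stitch_step, indent=2, ensure_ascii=False))
```

If `missing` is non-empty, a selector refers to an input that was never declared, which is a common wiring mistake worth catching before submitting the definition.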
Available Connections
Compatible Blocks
Check what blocks you can connect to Stitch OCR Detections in version v1.
- inputs: Moondream2, Detections Stitch, Velocity, LMM, Instance Segmentation Model, Line Counter, Time in Zone, Dynamic Crop, Multi-Label Classification Model, Webhook Sink, Google Gemini, Single-Label Classification Model, Anthropic Claude, Email Notification, Keypoint Detection Model, EasyOCR, OCR Model, Detections Classes Replacement, Detections Consensus, Llama 3.2 Vision, OpenAI, SIFT Comparison, Roboflow Custom Metadata, Twilio SMS Notification, Florence-2 Model, Perspective Correction, VLM as Detector, CSV Formatter, Detections Transformation, PTZ Tracking (ONVIF), Roboflow Dataset Upload, Stitch OCR Detections, Local File Sink, CogVLM, Detections Merge, Slack Notification, Path Deviation, Detections Combine, Model Monitoring Inference Aggregator, Overlap Filter, Pixel Color Count, Object Detection Model, Template Matching, LMM For Classification, Image Contours, YOLO-World Model, Detections Filter, Distance Measurement, Motion Detection, Byte Tracker, Clip Comparison, Google Vision OCR, VLM as Classifier, Detections Stabilizer, Detection Offset
- outputs: Moondream2, Line Counter Visualization, Detections Stitch, LMM, Instance Segmentation Model, Line Counter, Time in Zone, Dynamic Crop, Circle Visualization, Webhook Sink, Mask Visualization, Google Gemini, Ellipse Visualization, Anthropic Claude, Email Notification, Color Visualization, Image Preprocessing, Detections Classes Replacement, Llama 3.2 Vision, SAM 3, OpenAI, Twilio SMS Notification, Florence-2 Model, Roboflow Custom Metadata, Trace Visualization, Perspective Correction, Perception Encoder Embedding Model, Stability AI Outpainting, Contrast Equalization, QR Code Generator, Polygon Zone Visualization, Reference Path Visualization, PTZ Tracking (ONVIF), Roboflow Dataset Upload, Stability AI Inpainting, Stitch OCR Detections, Bounding Box Visualization, Corner Visualization, Polygon Visualization, Local File Sink, CogVLM, Slack Notification, Path Deviation, Triangle Visualization, Model Monitoring Inference Aggregator, Icon Visualization, Background Color Visualization, Classification Label Visualization, Keypoint Visualization, Size Measurement, Dot Visualization, Pixel Color Count, Crop Visualization, LMM For Classification, YOLO-World Model, Halo Visualization, Model Comparison Visualization, CLIP Embedding Model, SIFT Comparison, Distance Measurement, Clip Comparison, Cache Set, Seg Preview, Morphological Transformation, Google Vision OCR, Segment Anything 2 Model, Stability AI Image Generation, Image Blur, Image Threshold, Cache Get, Label Visualization
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Stitch OCR Detections in version v1 has.
Bindings
- input
    - `predictions` (`object_detection_prediction`): The output of an OCR detection model.
    - `tolerance` (`integer`): The tolerance for grouping detections into the same line of text.
    - `delimiter` (`string`): The delimiter to use for stitching text.
- output
    - `ocr_text` (`string`): String value.
Example JSON definition of step Stitch OCR Detections in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/stitch_ocr_detections@v1",
    "predictions": "$steps.my_ocr_detection_model.predictions",
    "reading_direction": "right_to_left",
    "tolerance": 10,
    "delimiter": ""
}
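To show where such a step definition might sit, the sketch below loads the JSON above in Python and wraps it in a skeleton workflow definition that exposes `ocr_text` as a workflow output. The surrounding keys (`version`, the `WorkflowImage` input, the `JsonField` output), the step name `stitch`, and the output name `text` are assumptions about the wider definition format and are not described on this page. The upstream OCR step is intentionally omitted and would need to be added under the name `my_ocr_detection_model`.

```python
import json

# The step definition from the example above, with an illustrative step name.
step = json.loads("""
{
    "name": "stitch",
    "type": "roboflow_core/stitch_ocr_detections@v1",
    "predictions": "$steps.my_ocr_detection_model.predictions",
    "reading_direction": "right_to_left",
    "tolerance": 10,
    "delimiter": ""
}
""")

# Skeleton definition (assumed layout). An OCR step named
# "my_ocr_detection_model" must be added to "steps" before this can run.
workflow = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [step],
    "outputs": [
        {
            "type": "JsonField",
            "name": "text",  # hypothetical output name
            # The block's single output is addressed as $steps.<name>.ocr_text.
            "selector": f"$steps.{step['name']}.ocr_text",
        }
    ],
}

print(json.dumps(workflow, indent=2))
```

Building the output selector from `step["name"]` keeps the two in sync if the step is later renamed.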