Stitch OCR Detections¶
Class: StitchOCRDetectionsBlockV1
Combines OCR detection results into a single, coherent text string by ordering the detections spatially. Use it to turn individual word or character detections into structured, readable text.
How It Works¶
This transformation reconstructs the original text from OCR detection results by:
- 📐 Grouping text detections into rows based on their vertical (y) positions
- 📏 Sorting detections within each row by horizontal (x) position
- 📜 Concatenating the text in reading order (left-to-right, top-to-bottom)
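The three steps above can be sketched in plain Python. This is a minimal illustration, not the block's actual implementation: the `(text, x, y)` tuple layout, the function name `stitch_left_to_right`, the rule of comparing each detection against the first detection in the current row, and the use of a newline between rows are all assumptions made for the example.

```python
from typing import List, Tuple

# (text, x_center, y_center): a hypothetical layout chosen for this sketch,
# not the block's real input type.
Detection = Tuple[str, float, float]


def stitch_left_to_right(
    detections: List[Detection],
    tolerance: float = 10,
    delimiter: str = "",
) -> str:
    """Group detections into rows by y, sort each row by x, then join the text."""
    # Visit detections from top to bottom so rows are built in reading order.
    ordered = sorted(detections, key=lambda d: d[2])

    rows: List[List[Detection]] = []
    for det in ordered:
        # A detection joins the current row when its y is within `tolerance`
        # of the first detection in that row (an assumed grouping rule).
        if rows and abs(det[2] - rows[-1][0][2]) <= tolerance:
            rows[-1].append(det)
        else:
            rows.append([det])

    lines = []
    for row in rows:
        row.sort(key=lambda d: d[1])  # left to right within the row
        lines.append(delimiter.join(d[0] for d in row))

    # Joining rows with a newline is an assumption of this sketch.
    return "\n".join(lines)


words = [("world", 60, 12), ("Hello", 10, 10), ("again", 10, 50)]
print(stitch_left_to_right(words, tolerance=10, delimiter=" "))
```

Here "Hello" and "world" differ by only 2 pixels vertically, so they share a row, while "again" sits 40 pixels lower and starts a new one.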
Parameters¶
- tolerance: Controls how close detections need to be vertically to be considered part of the same line of text. A higher tolerance will group detections that are further apart vertically.
- reading_direction: Determines the order in which text is read. Available options:
  - "left_to_right": Standard left-to-right reading (e.g., English) ➡️
  - "right_to_left": Right-to-left reading (e.g., Arabic) ⬅️
  - "vertical_top_to_bottom": Vertical reading from top to bottom ⬇️
  - "vertical_bottom_to_top": Vertical reading from bottom to top ⬆️
  - "auto": Automatically detects the reading direction based on the spatial arrangement of text elements.
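To make the interaction between the two parameters concrete, here is a rough sketch of how a reading direction could select the grouping axis and sort order, and how tolerance changes the grouping. The function `order_texts`, the dictionary keys, and the choice to order columns left to right for vertical text are assumptions for illustration only; the "auto" option is deliberately left out because its detection heuristic is not described here.

```python
from typing import Dict, List

SUPPORTED = (
    "left_to_right",
    "right_to_left",
    "vertical_top_to_bottom",
    "vertical_bottom_to_top",
)


def order_texts(
    detections: List[Dict],
    reading_direction: str = "left_to_right",
    tolerance: float = 10,
) -> List[List[str]]:
    """Return detection texts grouped into lines and ordered for reading."""
    if reading_direction not in SUPPORTED:
        raise ValueError(f"unsupported reading_direction: {reading_direction}")

    vertical = reading_direction.startswith("vertical")
    # Horizontal text is grouped into rows by y; vertical text into columns by x.
    group_key, sort_key = ("x", "y") if vertical else ("y", "x")
    # Right-to-left and bottom-to-top text is read in descending coordinate order.
    reverse = reading_direction in ("right_to_left", "vertical_bottom_to_top")

    groups: List[List[Dict]] = []
    for det in sorted(detections, key=lambda d: d[group_key]):
        # Stay in the current group while within `tolerance` of its first member.
        if groups and abs(det[group_key] - groups[-1][0][group_key]) <= tolerance:
            groups[-1].append(det)
        else:
            groups.append([det])

    return [
        [d["text"] for d in sorted(g, key=lambda d: d[sort_key], reverse=reverse)]
        for g in groups
    ]


sample = [
    {"text": "A", "x": 10, "y": 10},
    {"text": "B", "x": 50, "y": 14},
    {"text": "C", "x": 10, "y": 40},
]
print(order_texts(sample, "left_to_right", tolerance=10))
print(order_texts(sample, "right_to_left", tolerance=10))
print(order_texts(sample, "left_to_right", tolerance=2))
```

With a tolerance of 10, "A" and "B" (4 pixels apart vertically) land on the same line; lowering the tolerance to 2 splits them into separate lines.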
Why Use This Transformation?¶
This is especially useful for:
- 📖 Converting individual character/word detections into a readable text block
- 📝 Reconstructing multi-line text from OCR results
- 🔀 Maintaining proper reading order for detected text elements
- 🌏 Supporting different writing systems and text orientations
Example Usage¶
Use this transformation after an OCR model that outputs individual words or characters, so you can reconstruct the original text layout in its intended format.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/stitch_ocr_detections@v1 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
reading_direction | str | The direction of the text in the image. | ❌ |
tolerance | int | The tolerance for grouping detections into the same line of text. | ✅ |
delimiter | str | The delimiter to use for stitching text. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Stitch OCR Detections in version v1.
- inputs: Local File Sink, Byte Tracker, Webhook Sink, Distance Measurement, Time in Zone, LMM For Classification, VLM as Classifier, OpenAI, Detections Stitch, SIFT Comparison, Velocity, Perspective Correction, CSV Formatter, Object Detection Model, Detections Transformation, LMM, Pixel Color Count, Overlap Filter, PTZ Tracking (ONVIF), Florence-2 Model, Roboflow Custom Metadata, Multi-Label Classification Model, Detections Combine, Detection Offset, CogVLM, EasyOCR, Model Monitoring Inference Aggregator, Line Counter, Anthropic Claude, Stitch OCR Detections, VLM as Detector, Image Contours, Path Deviation, Slack Notification, Google Vision OCR, Keypoint Detection Model, Twilio SMS Notification, Roboflow Dataset Upload, Email Notification, Detections Filter, YOLO-World Model, Instance Segmentation Model, Clip Comparison, Template Matching, Detections Classes Replacement, OCR Model, Detections Stabilizer, Llama 3.2 Vision, Google Gemini, Dynamic Crop, Detections Merge, Single-Label Classification Model, Detections Consensus, Moondream2
- outputs: Distance Measurement, Time in Zone, Polygon Zone Visualization, LMM For Classification, Dot Visualization, Morphological Transformation, Size Measurement, Perspective Correction, Corner Visualization, LMM, Pixel Color Count, Florence-2 Model, PTZ Tracking (ONVIF), Image Threshold, Halo Visualization, OpenAI, CogVLM, Line Counter Visualization, Perception Encoder Embedding Model, Stitch OCR Detections, Stability AI Outpainting, Twilio SMS Notification, CLIP Embedding Model, Google Vision OCR, Roboflow Dataset Upload, Email Notification, Instance Segmentation Model, Clip Comparison, Keypoint Visualization, Llama 3.2 Vision, Bounding Box Visualization, Line Counter, Reference Path Visualization, Dynamic Crop, Mask Visualization, Image Preprocessing, Background Color Visualization, Local File Sink, Webhook Sink, QR Code Generator, Detections Stitch, Trace Visualization, Contrast Equalization, Cache Set, Crop Visualization, Stability AI Image Generation, SIFT Comparison, Roboflow Custom Metadata, Cache Get, Model Comparison Visualization, Model Monitoring Inference Aggregator, Anthropic Claude, Polygon Visualization, Slack Notification, Path Deviation, Triangle Visualization, YOLO-World Model, Classification Label Visualization, Detections Classes Replacement, Circle Visualization, Image Blur, Label Visualization, Google Gemini, Stability AI Inpainting, Icon Visualization, Ellipse Visualization, Color Visualization, Moondream2, Segment Anything 2 Model
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Stitch OCR Detections in version v1 has.
Bindings
- input
  - predictions (object_detection_prediction): The output of an OCR detection model.
  - tolerance (integer): The tolerance for grouping detections into the same line of text.
  - delimiter (string): The delimiter to use for stitching text.
- output
  - ocr_text (string): String value.
Example JSON definition of step Stitch OCR Detections in version v1:

    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/stitch_ocr_detections@v1",
        "predictions": "$steps.my_ocr_detection_model.predictions",
        "reading_direction": "right_to_left",
        "tolerance": 10,
        "delimiter": ""
    }
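For context, the step above might sit inside a complete workflow definition roughly like the following Python dictionary. Treat this as a sketch under assumptions: the upstream step type `roboflow_core/roboflow_object_detection_model@v1`, its `images` and `model_id` fields, the `WorkflowImage`, `WorkflowParameter`, and `JsonField` entries, and the step name `stitch` are not taken from this page, and the model ID is a placeholder you would need to fill in. The `tolerance` value is bound to a runtime input to illustrate a property marked ✅ in the Refs column.

```python
import json

# Hypothetical name for the upstream OCR detection step.
OCR_STEP_NAME = "my_ocr_detection_model"

workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        # Runtime parameter bound to `tolerance` below (assumed input type).
        {"type": "WorkflowParameter", "name": "tolerance", "default_value": 10},
    ],
    "steps": [
        {
            # Assumed upstream block that yields object_detection_prediction output.
            "type": "roboflow_core/roboflow_object_detection_model@v1",
            "name": OCR_STEP_NAME,
            "images": "$inputs.image",
            "model_id": "<your_ocr_model_id>",  # placeholder, not a real model
        },
        {
            "type": "roboflow_core/stitch_ocr_detections@v1",
            "name": "stitch",
            "predictions": f"$steps.{OCR_STEP_NAME}.predictions",
            "reading_direction": "left_to_right",
            "tolerance": "$inputs.tolerance",
            "delimiter": "",
        },
    ],
    "outputs": [
        # Expose the stitched string produced by the block.
        {"type": "JsonField", "name": "ocr_text", "selector": "$steps.stitch.ocr_text"},
    ],
}

# Serialise to JSON, the format in which workflow definitions are written.
print(json.dumps(workflow_definition, indent=2))
```

Binding `tolerance` to an input makes it possible to tune line grouping per request without editing the workflow itself.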