OCR Model
Class: OCRModelBlockV1
Source: inference.core.workflows.core_steps.models.foundation.ocr.v1.OCRModelBlockV1
Retrieve the characters in an image using DocTR Optical Character Recognition (OCR).
This block returns the text within an image.
You may want to use this block in combination with a detections-based block (e.g. ObjectDetectionBlock). An object detection model can isolate specific regions of an image (e.g. a shipping container ID in a logistics use case) for further processing. You can then use a DynamicCropBlock to crop the region of interest before running OCR.
Running a detection model and then cropping its detections lets you restrict the analysis to particular regions of an image.
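The detect, crop, then OCR chain described above could be sketched as a workflow definition built in Python. Only the OCR type identifier comes from this page; the object detection and dynamic crop identifiers, their field names (`model_id`, `predictions`, `crops`), the step names, and the surrounding `version`/`inputs`/`outputs` structure are assumptions to check against the documentation of those blocks.

```python
import json


def build_crop_then_ocr_workflow(model_id: str) -> dict:
    """Build a workflow definition: detect regions, crop them, run OCR on the crops."""
    return {
        "version": "1.0",
        "inputs": [{"type": "WorkflowImage", "name": "image"}],
        "steps": [
            {
                # Assumed identifier and fields for the object detection block.
                "type": "roboflow_core/roboflow_object_detection_model@v2",
                "name": "detector",
                "images": "$inputs.image",
                "model_id": model_id,
            },
            {
                # Assumed identifier and fields for the dynamic crop block.
                "type": "roboflow_core/dynamic_crop@v1",
                "name": "crop",
                "images": "$inputs.image",
                "predictions": "$steps.detector.predictions",
            },
            {
                # Identifier documented on this page; the step reads the crops.
                "type": "roboflow_core/ocr_model@v1",
                "name": "ocr",
                "images": "$steps.crop.crops",
            },
        ],
        "outputs": [
            {"type": "JsonField", "name": "text", "selector": "$steps.ocr.result"},
        ],
    }


# "your-project/1" is a placeholder model ID, not a real model.
workflow = build_crop_then_ocr_workflow("your-project/1")
print(json.dumps(workflow, indent=2))
```

Feeding the OCR step from the crop step rather than from `$inputs.image` is what limits recognition to the detected regions.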
Type identifier
Use the following identifier in the step "type" field: roboflow_core/ocr_model@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Unique name of step in workflows. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
Available Connections
Compatible Blocks
Check what blocks you can connect to OCR Model in version v1.
- inputs:
Icon Visualization, Line Counter Visualization, Stability AI Outpainting, Image Contours, Image Slicer, Pixelate Visualization, Image Preprocessing, Polygon Zone Visualization, Color Visualization, Reference Path Visualization, Blur Visualization, Background Subtraction, Text Display, Ellipse Visualization, Polygon Visualization, Relative Static Crop, Stability AI Image Generation, Perspective Correction, Model Comparison Visualization, Bounding Box Visualization, Trace Visualization, Depth Estimation, Camera Focus, Classification Label Visualization, Image Slicer, Absolute Static Crop, Image Blur, Stability AI Inpainting, Polygon Visualization, Image Convert Grayscale, SIFT, Label Visualization, Image Threshold, Corner Visualization, Grid Visualization, Dynamic Crop, Contrast Equalization, Heatmap Visualization, SIFT Comparison, Stitch Images, Triangle Visualization, Morphological Transformation, Keypoint Visualization, QR Code Generator, Halo Visualization, Circle Visualization, Camera Focus, Halo Visualization, Mask Visualization, Crop Visualization, Morphological Transformation, Camera Calibration, Contrast Enhancement, Background Color Visualization, Dot Visualization
- outputs:
Roboflow Dataset Upload, Line Counter Visualization, Distance Measurement, Instance Segmentation Model, Color Visualization, Multi-Label Classification Model, Ellipse Visualization, Polygon Visualization, ByteTrack Tracker, Single-Label Classification Model, Byte Tracker, Detections Consensus, Detections Classes Replacement, Cache Set, Webhook Sink, Trace Visualization, Stitch OCR Detections, Qwen 3.5 API, Camera Focus, OpenAI, SAM 3, Size Measurement, Image Threshold, Heatmap Visualization, SORT Tracker, Florence-2 Model, Halo Visualization, Detections Transformation, Path Deviation, GLM-OCR, Dot Visualization, S3 Sink, Path Deviation, Semantic Segmentation Model, Twilio SMS Notification, Seg Preview, Model Monitoring Inference Aggregator, Google Gemini, Roboflow Dataset Upload, Pixelate Visualization, Line Counter, Twilio SMS/MMS Notification, Polygon Zone Visualization, Blur Visualization, Text Display, Stability AI Image Generation, Detections Merge, Perspective Correction, Anthropic Claude, Line Counter, Bounding Box Visualization, Overlap Filter, Depth Estimation, Velocity, Stability AI Inpainting, Polygon Visualization, Roboflow Vision Events, Google Gemini, Label Visualization, Contrast Equalization, Per-Class Confidence Filter, Triangle Visualization, Halo Visualization, Circle Visualization, Segment Anything 2 Model, Mask Visualization, OpenAI, MoonshotAI Kimi, Llama 3.2 Vision, Email Notification, Slack Notification, CLIP Embedding Model, Detections Stitch, Detections Stabilizer, Email Notification, Google Gemma API, Stability AI Outpainting, Google Vision OCR, Google Gemini, Image Preprocessing, Detections Combine, Object Detection Model, OpenAI, SAM2 Video Tracker, Detection Event Log, Byte Tracker, Anthropic Claude, Time in Zone, Model Comparison Visualization, Roboflow Custom Metadata, YOLO-World Model, Detection Offset, Perception Encoder Embedding Model, Instance Segmentation Model, Detections List Roll-Up, Mask Area Measurement, Qwen 3.6 API, SIFT Comparison, Morphological Transformation, Instance Segmentation Model, CogVLM, Crop Visualization, Florence-2 Model, Time in Zone, OC-SORT Tracker, SAM 3, Local File Sink, Icon Visualization, Detections Filter, Time in Zone, Reference Path Visualization, Anthropic Claude, Clip Comparison, LMM, Pixel Color Count, Classification Label Visualization, Byte Tracker, Image Blur, SAM 3, OpenAI, Corner Visualization, Keypoint Detection Model, Dynamic Crop, Keypoint Visualization, Moondream2, QR Code Generator, LMM For Classification, Morphological Transformation, Background Color Visualization, PTZ Tracking (ONVIF), Stitch OCR Detections, Cache Get
Input and Output Bindings
The available connections depend on the block's binding kinds. The binding kinds of
OCR Model in version v1 are listed below.
Bindings
- input
    - images (image): The image to infer on.
- output
    - result (string): String value.
    - predictions (object_detection_prediction): Prediction with detected bounding boxes in form of sv.Detections(...) object.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - prediction_type (prediction_type): String value with type of prediction.
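As a rough illustration of consuming the `result` output, the sketch below assumes a workflow exposes that output under a hypothetical name, `ocr_text`, and that results arrive as a list with one dictionary per input image. Both the output name and the result shape are assumptions, and the sample records are invented placeholders rather than real model output.

```python
from typing import Any, Dict, List


def collect_ocr_text(
    results: List[Dict[str, Any]], output_name: str = "ocr_text"
) -> List[str]:
    """Return the recognised text of each result record, skipping empty values."""
    texts = []
    for record in results:
        value = record.get(output_name)
        # Keep only non-empty strings; missing or blank results are dropped.
        if isinstance(value, str) and value.strip():
            texts.append(value.strip())
    return texts


# Placeholder records invented for illustration only (not real OCR output).
sample_results = [
    {"ocr_text": " PLACEHOLDER TEXT "},
    {"ocr_text": ""},
    {"other_output": 1},
]
print(collect_ocr_text(sample_results))
```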
Example JSON definition of step OCR Model in version v1
    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/ocr_model@v1",
        "images": "$inputs.image"
    }
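A step definition like this only runs as part of a complete workflow definition. The sketch below wraps it in a minimal one and checks that every `$inputs.*` selector used by a step points at a declared input. The `version`, `inputs`, and `outputs` layout, the `WorkflowImage` and `JsonField` types, the names `ocr` and `ocr_text`, and the helper `undefined_inputs` are all assumptions made for illustration, not part of this page.

```python
import json

# The step definition from the example, with a hypothetical step name.
ocr_step = {
    "name": "ocr",
    "type": "roboflow_core/ocr_model@v1",
    "images": "$inputs.image",
}

# Assumed surrounding structure: one image input and one JSON output.
workflow_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [ocr_step],
    "outputs": [
        {
            "type": "JsonField",
            "name": "ocr_text",
            "selector": f"$steps.{ocr_step['name']}.result",
        },
    ],
}


def undefined_inputs(definition: dict) -> list:
    """Return `$inputs.*` selectors used by steps that no declared input provides."""
    declared = {item["name"] for item in definition["inputs"]}
    missing = []
    for step in definition["steps"]:
        for value in step.values():
            if isinstance(value, str) and value.startswith("$inputs."):
                if value.split(".", 1)[1] not in declared:
                    missing.append(value)
    return missing


print(json.dumps(workflow_definition, indent=2))
print(undefined_inputs(workflow_definition))
```

A local check of this kind is only a convenience for catching typos in selectors before submitting the definition; it does not replace the validation performed by the workflow engine.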