Google Vision OCR
Class: GoogleVisionOCRBlockV1
Source: inference.core.workflows.core_steps.models.foundation.google_vision_ocr.v1.GoogleVisionOCRBlockV1
Detect text in images using Google Vision OCR.
Supported types of text detection:
- text_detection: optimized for areas of text within a larger image.
- ocr_text_detection: optimized for dense text documents.
Provide your Google Vision API key or set the value to rf_key:account (or
rf_key:user:<id>) to proxy requests through Roboflow's API.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/google_vision_ocr@v1 to add the block as
a step in your workflow.
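As a minimal sketch of how the identifier, the OCR type, and the API key fit together, the snippet below builds a step definition as a plain Python dict. The helper name `build_ocr_step` and the step name `ocr` are hypothetical and are not part of the block's API; the field names and the two OCR types are taken from this page.

```python
import json

# Values taken from the "Supported types of text detection" list on this page.
SUPPORTED_OCR_TYPES = ("text_detection", "ocr_text_detection")


def build_ocr_step(name, ocr_type, api_key, language_hints=None):
    """Hypothetical helper: return a step definition dict for this block."""
    if ocr_type not in SUPPORTED_OCR_TYPES:
        raise ValueError(f"unsupported ocr_type: {ocr_type!r}")
    step = {
        "name": name,
        "type": "roboflow_core/google_vision_ocr@v1",
        "image": "$inputs.image",
        "ocr_type": ocr_type,
        "api_key": api_key,
    }
    # language_hints is optional, so it is only emitted when provided.
    if language_hints:
        step["language_hints"] = list(language_hints)
    return step


# Proxy requests through Roboflow's API instead of passing a Google key directly.
proxied_step = build_ocr_step("ocr", "text_detection", "rf_key:account")
print(json.dumps(proxied_step, indent=2))
```

Validating `ocr_type` locally is a design choice of this sketch; the block itself may report invalid values differently.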
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| ocr_type | str | Type of OCR to use. | ❌ |
| api_key | str | Your Google Vision API key. | ✅ |
| language_hints | List[str] | Optional list of language codes to pass to the OCR API. If not provided, the API will attempt to detect the language automatically. If provided, language codes must be supported by the OCR API; see https://cloud.google.com/vision/docs/languages for the list of supported language codes. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
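To illustrate what the Refs column implies, here is a rough sketch that flags properties holding a dynamic reference even though they are not marked ✅. It assumes selectors are strings starting with `$inputs.` or `$steps.`; only `$inputs.image` appears on this page, so the `$steps.` prefix, the input name `google_api_key`, and both helper functions are assumptions made for illustration.

```python
# Properties marked with a check in the Refs column, plus the image input
# used in the example definition on this page.
REF_CAPABLE_FIELDS = {"api_key", "image"}


def is_selector(value):
    """Assumed convention: dynamic references start with $inputs. or $steps."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def fields_with_unsupported_refs(step):
    """Return the names of fields that hold a selector but are not ref-capable."""
    return sorted(
        key
        for key, value in step.items()
        if is_selector(value) and key not in REF_CAPABLE_FIELDS
    )


step = {
    "name": "ocr",
    "type": "roboflow_core/google_vision_ocr@v1",
    "image": "$inputs.image",
    "ocr_type": "text_detection",
    # The key is supplied at runtime through a hypothetical workflow input.
    "api_key": "$inputs.google_api_key",
}
bad_step = dict(step, ocr_type="$inputs.ocr_type")

print(fields_with_unsupported_refs(step))      # []
print(fields_with_unsupported_refs(bad_step))  # ['ocr_type']
```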
Available Connections
Compatible Blocks
Check what blocks you can connect to Google Vision OCR in version v1.
- inputs:
Stitch Images, Image Threshold, Email Notification, Corner Visualization, Image Blur, Ellipse Visualization, OpenAI, Roboflow Dataset Upload, Object Detection Model, Depth Estimation, Stitch OCR Detections, EasyOCR, Absolute Static Crop, CogVLM, Google Gemini, Stability AI Image Generation, Grid Visualization, Dynamic Crop, Image Slicer, Image Preprocessing, Relative Static Crop, SIFT, Morphological Transformation, Instance Segmentation Model, Line Counter Visualization, Trace Visualization, LMM For Classification, Halo Visualization, Dot Visualization, GLM-OCR, Model Monitoring Inference Aggregator, Roboflow Custom Metadata, Pixelate Visualization, Circle Visualization, Image Convert Grayscale, Icon Visualization, QR Code Generator, S3 Sink, Keypoint Detection Model, Twilio SMS Notification, Halo Visualization, Camera Focus, Anthropic Claude, OCR Model, Polygon Visualization, Text Display, Reference Path Visualization, Llama 3.2 Vision, CSV Formatter, Crop Visualization, Roboflow Dataset Upload, Mask Visualization, Heatmap Visualization, Webhook Sink, Label Visualization, Classification Label Visualization, Google Vision OCR, Florence-2 Model, Florence-2 Model, VLM As Detector, Polygon Zone Visualization, Stability AI Inpainting, Google Gemini, Perspective Correction, Camera Calibration, Anthropic Claude, OpenAI, OpenAI, Qwen3.5-VL, Background Color Visualization, Anthropic Claude, Email Notification, Background Subtraction, Contrast Equalization, SIFT Comparison, Multi-Label Classification Model, Keypoint Visualization, Stitch OCR Detections, LMM, Color Visualization, Single-Label Classification Model, OpenAI, Roboflow Vision Events, Local File Sink, VLM As Classifier, Twilio SMS/MMS Notification, Triangle Visualization, Clip Comparison, Blur Visualization, Bounding Box Visualization, Camera Focus, Polygon Visualization, Google Gemini, Image Slicer, Image Contours, Model Comparison Visualization, Stability AI Outpainting, Slack Notification
- outputs:
Image Threshold, Email Notification, Corner Visualization, Roboflow Dataset Upload, Stitch OCR Detections, Stability AI Image Generation, Time in Zone, Dynamic Crop, Instance Segmentation Model, Image Preprocessing, Line Counter Visualization, Detections Combine, Trace Visualization, Halo Visualization, ByteTrack Tracker, Cache Get, Roboflow Custom Metadata, Pixelate Visualization, Circle Visualization, S3 Sink, Detections Classes Replacement, Twilio SMS Notification, Halo Visualization, Anthropic Claude, OC-SORT Tracker, Polygon Visualization, Detections Consensus, Roboflow Dataset Upload, Crop Visualization, Mask Visualization, Detection Offset, CLIP Embedding Model, Heatmap Visualization, Webhook Sink, Cache Set, Google Vision OCR, Detections List Roll-Up, Florence-2 Model, Florence-2 Model, Overlap Filter, Anthropic Claude, OpenAI, OpenAI, PTZ Tracking (ONVIF), Background Color Visualization, Anthropic Claude, SIFT Comparison, Keypoint Visualization, Time in Zone, Detections Filter, Stitch OCR Detections, LMM, Detections Merge, Detections Transformation, Perception Encoder Embedding Model, SAM 3, Seg Preview, Roboflow Vision Events, Detections Stitch, Triangle Visualization, Distance Measurement, Google Gemini, Path Deviation, Model Comparison Visualization, Stability AI Outpainting, Image Blur, Ellipse Visualization, OpenAI, Time in Zone, Depth Estimation, CogVLM, Google Gemini, Velocity, Morphological Transformation, LMM For Classification, Detection Event Log, Dot Visualization, GLM-OCR, Model Monitoring Inference Aggregator, Pixel Color Count, Icon Visualization, QR Code Generator, Detections Stabilizer, Camera Focus, SAM 3, Text Display, Reference Path Visualization, Instance Segmentation Model, Llama 3.2 Vision, SORT Tracker, Byte Tracker, Label Visualization, Classification Label Visualization, Byte Tracker, Segment Anything 2 Model, Polygon Zone Visualization, Stability AI Inpainting, Google Gemini, SAM 3, Perspective Correction, Size Measurement, Email Notification, Contrast Equalization, Line Counter, Path Deviation, Byte Tracker, Line Counter, Color Visualization, OpenAI, Local File Sink, Mask Area Measurement, Twilio SMS/MMS Notification, YOLO-World Model, Clip Comparison, Bounding Box Visualization, Blur Visualization, Polygon Visualization, Moondream2, Slack Notification
Input and Output Bindings
The available connections depend on the block's binding kinds. The binding kinds of
Google Vision OCR in version v1 are listed below.
Bindings
- input
    - image (image): Image to run OCR.
    - api_key (Union[secret, ROBOFLOW_MANAGED_KEY, string]): Your Google Vision API key.
- output
    - text (string): String value.
    - language (string): String value.
    - predictions (object_detection_prediction): Prediction with detected bounding boxes in form of sv.Detections(...) object.
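The three outputs can be exposed from a workflow by pointing at them with step selectors. The sketch below assembles such a definition as a Python dict. The surrounding layout (`version`, `inputs`, `outputs`, the `WorkflowImage` and `JsonField` types, and the `$steps.<name>.<output>` selector form) is an assumption about the Workflows definition format and is not stated on this page; the step name `ocr` is hypothetical.

```python
import json

# Output names taken from the bindings listed on this page.
OCR_OUTPUTS = ("text", "language", "predictions")
STEP_NAME = "ocr"  # hypothetical step name

workflow = {
    "version": "1.0",  # assumed definition version
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "name": STEP_NAME,
            "type": "roboflow_core/google_vision_ocr@v1",
            "image": "$inputs.image",
            "ocr_type": "text_detection",
            "api_key": "rf_key:account",
        }
    ],
    # One workflow output per block output, each referencing the step by name.
    "outputs": [
        {
            "type": "JsonField",
            "name": output,
            "selector": f"$steps.{STEP_NAME}.{output}",
        }
        for output in OCR_OUTPUTS
    ],
}

print(json.dumps(workflow, indent=2))
```

Downstream blocks from the outputs list would typically consume `predictions`, while `text` and `language` are plain strings.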
Example JSON definition of step Google Vision OCR in version v1
    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/google_vision_ocr@v1",
        "image": "$inputs.image",
        "ocr_type": "<block_does_not_provide_example>",
        "api_key": "xxx-xxx",
        "language_hints": [
            "en",
            "fr"
        ]
    }