Google Vision OCR
Class: GoogleVisionOCRBlockV1
Source: inference.core.workflows.core_steps.models.foundation.google_vision_ocr.v1.GoogleVisionOCRBlockV1
Detect text in images using Google Vision OCR.
Supported types of text detection:
- text_detection: optimized for areas of text within a larger image.
- ocr_text_detection: optimized for dense text documents.
Provide your Google Vision API key or set the value to rf_key:account (or
rf_key:user:<id>) to proxy requests through Roboflow's API.
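The two forms of the api_key value can be told apart with a simple string check. The helper below is a hypothetical illustration (the name uses_roboflow_proxy is not part of the block) and is not necessarily how the block itself makes this decision:

```python
ACCOUNT_KEY = "rf_key:account"
USER_KEY_PREFIX = "rf_key:user:"


def uses_roboflow_proxy(api_key: str) -> bool:
    """Return True if the value asks for requests to be proxied through Roboflow.

    Hypothetical helper: it only mirrors the two proxy forms named in the
    description, rf_key:account and rf_key:user:<id>.
    """
    if api_key == ACCOUNT_KEY:
        return True
    # The user form needs a non-empty <id> after the prefix.
    return api_key.startswith(USER_KEY_PREFIX) and len(api_key) > len(USER_KEY_PREFIX)


if __name__ == "__main__":
    for value in ("rf_key:account", "rf_key:user:42", "xxx-xxx"):
        print(value, uses_roboflow_proxy(value))
```

Any other value would be treated as a Google Vision API key supplied directly.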
Type identifier
Use the following identifier in the step "type" field: roboflow_core/google_vision_ocr@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| ocr_type | str | Type of OCR to use. | ❌ |
| api_key | str | Your Google Vision API key. | ✅ |
| language_hints | List[str] | Optional list of language codes to pass to the OCR API. If not provided, the API will attempt to detect the language automatically. If provided, language codes must be supported by the OCR API; visit https://cloud.google.com/vision/docs/languages for the list of supported language codes. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
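As a rough sketch of how these properties fit together, the hypothetical helper below (build_google_vision_ocr_step is an illustrative name, not part of the inference library) assembles a step definition as a plain Python dict. It assumes that ocr_type accepts only the two values listed in the block description and that language_hints can be omitted entirely when no hints are given:

```python
from typing import Any, Dict, List, Optional

BLOCK_TYPE = "roboflow_core/google_vision_ocr@v1"
# Assumption: only the two OCR modes named in the block description are valid.
SUPPORTED_OCR_TYPES = {"text_detection", "ocr_text_detection"}


def build_google_vision_ocr_step(
    name: str,
    ocr_type: str,
    api_key: str,
    image: str = "$inputs.image",
    language_hints: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build a step definition dict for the Google Vision OCR block."""
    if ocr_type not in SUPPORTED_OCR_TYPES:
        raise ValueError(
            f"ocr_type must be one of {sorted(SUPPORTED_OCR_TYPES)}, got {ocr_type!r}"
        )
    step: Dict[str, Any] = {
        "name": name,
        "type": BLOCK_TYPE,
        "image": image,
        "ocr_type": ocr_type,
        # api_key is the only property in the table marked as accepting a
        # reference, so a selector such as "$inputs.api_key" may be used here.
        "api_key": api_key,
    }
    if language_hints:
        # Left out when empty so the API can detect the language itself.
        step["language_hints"] = list(language_hints)
    return step


if __name__ == "__main__":
    print(build_google_vision_ocr_step("ocr", "text_detection", "$inputs.api_key"))
```

Validating ocr_type up front is a design choice of this sketch; it surfaces a typo before the workflow is submitted.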
Available Connections
Compatible Blocks
Check what blocks you can connect to Google Vision OCR in version v1.
- inputs: Stability AI Outpainting, Background Color Visualization, Halo Visualization, Image Contours, Dynamic Crop, Keypoint Detection Model, Florence-2 Model, Single-Label Classification Model, Image Slicer, Stitch Images, Llama 3.2 Vision, Image Slicer, Multi-Label Classification Model, Color Visualization, LMM For Classification, Stitch OCR Detections, Polygon Visualization, Florence-2 Model, Ellipse Visualization, Model Monitoring Inference Aggregator, Trace Visualization, Triangle Visualization, LMM, Model Comparison Visualization, Blur Visualization, Roboflow Dataset Upload, OpenAI, Stability AI Inpainting, Google Gemini, Image Preprocessing, Image Blur, CogVLM, Relative Static Crop, Bounding Box Visualization, Camera Focus, Image Threshold, Roboflow Dataset Upload, Anthropic Claude, OpenAI, Google Gemini, Email Notification, Contrast Equalization, Polygon Zone Visualization, OCR Model, VLM as Detector, OpenAI, Twilio SMS/MMS Notification, Icon Visualization, EasyOCR, Mask Visualization, Object Detection Model, Email Notification, Depth Estimation, Dot Visualization, Webhook Sink, Grid Visualization, Image Convert Grayscale, VLM as Classifier, Reference Path Visualization, SIFT, Morphological Transformation, Stability AI Image Generation, Corner Visualization, CSV Formatter, Keypoint Visualization, Roboflow Custom Metadata, Absolute Static Crop, Background Subtraction, Label Visualization, Text Display, Crop Visualization, Pixelate Visualization, Perspective Correction, Camera Calibration, Anthropic Claude, Local File Sink, Anthropic Claude, Circle Visualization, Google Vision OCR, Line Counter Visualization, Instance Segmentation Model, Classification Label Visualization, OpenAI, QR Code Generator, Clip Comparison, Camera Focus, Slack Notification, Twilio SMS Notification, SIFT Comparison, Google Gemini
- outputs: Size Measurement, SAM 3, Florence-2 Model, Perception Encoder Embedding Model, Distance Measurement, Detections Filter, Detections Classes Replacement, Color Visualization, Model Monitoring Inference Aggregator, Trace Visualization, Seg Preview, Triangle Visualization, Blur Visualization, Roboflow Dataset Upload, Detections Stabilizer, CogVLM, Detections Combine, Velocity, OpenAI, Contrast Equalization, Polygon Zone Visualization, Twilio SMS/MMS Notification, Mask Visualization, Email Notification, Webhook Sink, Dot Visualization, Time in Zone, Pixel Color Count, Detections Consensus, Stability AI Image Generation, Byte Tracker, Roboflow Custom Metadata, Text Display, Pixelate Visualization, SAM 3, Perspective Correction, Local File Sink, Anthropic Claude, Anthropic Claude, Circle Visualization, Google Vision OCR, Line Counter Visualization, Classification Label Visualization, OpenAI, QR Code Generator, Detections Transformation, Twilio SMS Notification, YOLO-World Model, Google Gemini, Stability AI Outpainting, Halo Visualization, Background Color Visualization, Instance Segmentation Model, Dynamic Crop, Byte Tracker, Llama 3.2 Vision, Detections List Roll-Up, LMM For Classification, Stitch OCR Detections, Polygon Visualization, Florence-2 Model, Ellipse Visualization, CLIP Embedding Model, LMM, Model Comparison Visualization, PTZ Tracking (ONVIF), OpenAI, Cache Get, Stability AI Inpainting, Google Gemini, Image Preprocessing, Image Blur, Bounding Box Visualization, Camera Focus, SAM 3, Line Counter, Roboflow Dataset Upload, Moondream2, Image Threshold, Anthropic Claude, Overlap Filter, Path Deviation, Google Gemini, Email Notification, Time in Zone, Detections Merge, OpenAI, Icon Visualization, Byte Tracker, Line Counter, Depth Estimation, Path Deviation, Reference Path Visualization, Morphological Transformation, Detection Offset, Corner Visualization, Keypoint Visualization, Label Visualization, Crop Visualization, Detection Event Log, Time in Zone, Segment Anything 2 Model, Detections Stitch, Cache Set, Instance Segmentation Model, Clip Comparison, Slack Notification, SIFT Comparison
Input and Output Bindings
The available connections depend on the block's binding kinds. The binding kinds of
Google Vision OCR in version v1 are listed below.
Bindings
- input
    - image (image): Image to run OCR.
    - api_key (Union[secret, string, ROBOFLOW_MANAGED_KEY]): Your Google Vision API key.
- output
    - text (string): String value.
    - language (string): String value.
    - predictions (object_detection_prediction): Prediction with detected bounding boxes in form of sv.Detections(...) object.
Example JSON definition of step Google Vision OCR in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/google_vision_ocr@v1",
    "image": "$inputs.image",
    "ocr_type": "<block_does_not_provide_example>",
    "api_key": "xxx-xxx",
    "language_hints": [
        "en",
        "fr"
    ]
}
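To show where a step definition like the one above might sit, here is a minimal sketch of a complete workflow definition written as a Python dict. The surrounding structure (the version field, the WorkflowImage, WorkflowParameter, and JsonField entries) follows the usual Roboflow Workflows definition layout but is not described on this page, so treat it as an assumption. The step name ocr, the input name api_key, and the choice of text_detection are illustrative.

```python
import json

workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        # Passing the key as a runtime parameter keeps it out of the definition.
        {"type": "WorkflowParameter", "name": "api_key"},
    ],
    "steps": [
        {
            "name": "ocr",
            "type": "roboflow_core/google_vision_ocr@v1",
            "image": "$inputs.image",
            "ocr_type": "text_detection",
            "api_key": "$inputs.api_key",
            "language_hints": ["en", "fr"],
        }
    ],
    "outputs": [
        # One workflow output per block output listed in the bindings.
        {"type": "JsonField", "name": "text", "selector": "$steps.ocr.text"},
        {"type": "JsonField", "name": "language", "selector": "$steps.ocr.language"},
        {
            "type": "JsonField",
            "name": "predictions",
            "selector": "$steps.ocr.predictions",
        },
    ],
}

if __name__ == "__main__":
    print(json.dumps(workflow_definition, indent=4))
```

Binding api_key to $inputs.api_key relies on the property being marked as accepting references in the Properties table; a literal key or rf_key:account would also fit in that field.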