Google Vision OCR
Class: GoogleVisionOCRBlockV1
Source: inference.core.workflows.core_steps.models.foundation.google_vision_ocr.v1.GoogleVisionOCRBlockV1
Detect text in images using Google Vision OCR.
Supported types of text detection:
- text_detection: optimized for areas of text within a larger image.
- ocr_text_detection: optimized for dense text documents.
Provide your Google Vision API key or set the value to rf_key:account (or
rf_key:user:<id>) to proxy requests through Roboflow's API.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/google_vision_ocr@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| ocr_type | str | Type of OCR to use. | ❌ |
| api_key | str | Your Google Vision API key. | ✅ |
| language_hints | List[str] | Optional list of language codes to pass to the OCR API. If not provided, the API will attempt to detect the language automatically. If provided, language codes must be supported by the OCR API; visit https://cloud.google.com/vision/docs/languages for the list of supported language codes. | ❌ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
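As an illustration of the Refs column, the sketch below checks that only properties marked ✅ (here, api_key) carry a dynamic value. The `$inputs.` and `$steps.` selector prefixes, the helper names, and the input name `google_vision_api_key` are assumptions made for this example; the image input is left out of the check because the table above does not list it.

```python
# Which properties accept dynamic values, copied from the Refs column above.
ACCEPTS_REFS = {
    "name": False,
    "ocr_type": False,
    "api_key": True,
    "language_hints": False,
}


def is_selector(value):
    """Return True for values that look like workflow selectors (assumed syntax)."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def find_invalid_selectors(step):
    """List properties that hold a selector although Refs marks them as static."""
    return [
        key
        for key, value in step.items()
        if key in ACCEPTS_REFS and not ACCEPTS_REFS[key] and is_selector(value)
    ]


step = {
    "name": "ocr",
    "type": "roboflow_core/google_vision_ocr@v1",
    "image": "$inputs.image",
    "ocr_type": "text_detection",
    "api_key": "$inputs.google_vision_api_key",  # hypothetical input name
    "language_hints": ["en"],
}

print(find_invalid_selectors(step))
```

A step that tried to set ocr_type from a selector would be reported by `find_invalid_selectors`, since that property is marked ❌.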
Runtime compatibility
- requires_internet (air-gapped / offline deployments): This block depends on a service that is not reachable from fully offline / air-gapped deployments.
Available Connections
Compatible Blocks
Check what blocks you can connect to Google Vision OCR in version v1.
- inputs:
Halo Visualization, Stitch OCR Detections, GLM-OCR, Image Threshold, Stitch Images, Morphological Transformation, Classification Label Visualization, Twilio SMS/MMS Notification, Crop Visualization, Icon Visualization, Stability AI Outpainting, Blur Visualization, VLM As Classifier, Reference Path Visualization, MoonshotAI Kimi, OpenAI, Google Gemini, Anthropic Claude, Webhook Sink, Camera Focus, QR Code Generator, Model Comparison Visualization, Florence-2 Model, MQTT Writer, Trace Visualization, Ellipse Visualization, Anthropic Claude, Dot Visualization, Perspective Correction, Label Visualization, Image Convert Grayscale, Florence-2 Model, Text Display, Qwen-VL, Llama 3.2 Vision, Roboflow Dataset Upload, Image Blur, Keypoint Detection Model, Absolute Static Crop, SIFT, CSV Formatter, LMM, Google Gemini, EasyOCR, Qwen 3.5 API, Local File Sink, Qwen 3.6 API, Triangle Visualization, Camera Focus, Contrast Equalization, Polygon Visualization, OpenAI, Heatmap Visualization, Clip Comparison, Google Gemma API, Contrast Enhancement, Google Gemini, Halo Visualization, Color Visualization, Morphological Transformation, MoonshotAI Kimi, Stitch OCR Detections, LMM For Classification, Event Writer, VLM As Detector, Llama 3.2 Vision, Polygon Visualization, Email Notification, Mask Visualization, Anthropic Claude, Stability AI Inpainting, Roboflow Asset Library Attributes, Microsoft SQL Server Sink, Keypoint Visualization, OpenAI, Background Subtraction, Multi-Label Classification Model, Roboflow Vision Events, Twilio SMS Notification, Email Notification, Image Slicer, Image Contours, Line Counter Visualization, CogVLM, Object Detection Model, Image Preprocessing, OPC UA Writer Sink, Dynamic Crop, Depth Estimation, Bounding Box Visualization, Qwen3.5-VL, Current Time, Corner Visualization, Polygon Zone Visualization, Camera Calibration, Roboflow Dataset Upload, Grid Visualization, Stability AI Image Generation, OpenAI, S3 Sink, Circle Visualization, Image Slicer, OCR Model, Single-Label Classification Model, Relative Static Crop, Roboflow Custom Metadata, Instance Segmentation Model, Model Monitoring Inference Aggregator, OpenAI-Compatible LLM, Slack Notification, OpenRouter, SIFT Comparison, Pixelate Visualization, Google Vision OCR, Background Color Visualization, Google Gemma
- outputs:
Overlap Analysis, Morphological Transformation, Classification Label Visualization, Crop Visualization, Stability AI Outpainting, Detections Transformation, Blur Visualization, Reference Path Visualization, OpenAI, YOLO-World Model, Detections Classes Replacement, Anthropic Claude, Track Class Lock, Instance Segmentation Model, Size Measurement, Model Comparison Visualization, Florence-2 Model, Trace Visualization, Label Visualization, Florence-2 Model, Qwen-VL, Text Display, Llama 3.2 Vision, Image Blur, Velocity, Keypoint Detection Model, LMM, OC-SORT Tracker, Qwen 3.5 API, Qwen 3.6 API, Camera Focus, Line Counter, SORT Tracker, Detections Stitch, Clip Comparison, Google Gemma API, Halo Visualization, Stitch OCR Detections, MoonshotAI Kimi, Color Visualization, Morphological Transformation, Event Writer, Stability AI Inpainting, Cache Set, Microsoft SQL Server Sink, Time in Zone, Roboflow Asset Library Attributes, OpenAI, Roboflow Vision Events, Mask Area Measurement, Detection Offset, CogVLM, Detections Consensus, OPC UA Writer Sink, Semantic Segmentation Model, Dynamic Crop, Path Deviation, Byte Tracker, Bounding Box Visualization, Detections Combine, Qwen3.5-VL, SAM 3, Cache Get, OpenAI, Time in Zone, Slack Notification, OpenRouter, Detection Event Log, Google Vision OCR, SIFT Comparison, Pixelate Visualization, SAM3 Video Tracker, Google Gemma, Halo Visualization, CLIP Embedding Model, Stitch OCR Detections, GLM-OCR, Image Threshold, SAM 3 Interactive, Twilio SMS/MMS Notification, Icon Visualization, MoonshotAI Kimi, ByteTrack Tracker, Single-Label Classification Model, Google Gemini, Byte Tracker, Webhook Sink, Instance Segmentation Model, QR Code Generator, Path Deviation, MQTT Writer, Ellipse Visualization, Anthropic Claude, BoT-SORT Tracker, Dot Visualization, Perspective Correction, Instance Segmentation Model, Seg Preview, Per-Class Confidence Filter, Roboflow Dataset Upload, Detections Stabilizer, Detections Merge, Google Gemini, Local File Sink, SAM 3, Triangle Visualization, Contrast Equalization, Time in Zone, Polygon Visualization, OpenAI, SAM2 Video Tracker, Heatmap Visualization, Perception Encoder Embedding Model, Detections List Roll-Up, Google Gemini, LMM For Classification, Llama 3.2 Vision, Multi-Label Classification Model, Polygon Visualization, Email Notification, Mask Visualization, Anthropic Claude, Detections Filter, Distance Measurement, PTZ Tracking (ONVIF), Keypoint Visualization, Overlap Filter, Twilio SMS Notification, Email Notification, Line Counter Visualization, Image Preprocessing, SAM 3, Byte Tracker, Depth Estimation, Pixel Color Count, Current Time, Roboflow Dataset Upload, Polygon Zone Visualization, Moondream2, Segment Anything 2 Model, Corner Visualization, Stability AI Image Generation, S3 Sink, Circle Visualization, Roboflow Custom Metadata, Instance Segmentation Model, Model Monitoring Inference Aggregator, OpenAI-Compatible LLM, Object Detection Model, Background Color Visualization, Line Counter
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Google Vision OCR in version v1 has.
Bindings
- input
    - image (image): Image to run OCR.
    - api_key (Union[string, ROBOFLOW_MANAGED_KEY, secret]): Your Google Vision API key.
- output
    - text (string): String value.
    - language (string): String value.
    - predictions (object_detection_prediction): Prediction with detected bounding boxes in form of sv.Detections(...) object.
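To expose these outputs from a workflow, each one is usually referenced by a selector. The following is a minimal sketch, assuming the `$steps.<step_name>.<output>` selector form and a `JsonField` output entry type in the surrounding workflow definition; `ocr` is a hypothetical step name and `output_fields` a hypothetical helper.

```python
# Output names taken from the bindings listed above.
OCR_OUTPUTS = ("text", "language", "predictions")


def output_fields(step_name):
    """Build workflow output entries that reference each OCR output."""
    return [
        {
            "type": "JsonField",  # assumed output entry type
            "name": f"{step_name}_{output}",
            "selector": f"$steps.{step_name}.{output}",  # assumed selector form
        }
        for output in OCR_OUTPUTS
    ]


for field in output_fields("ocr"):
    print(field["name"], "->", field["selector"])
```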
Example JSON definition of step Google Vision OCR in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/google_vision_ocr@v1",
    "image": "$inputs.image",
    "ocr_type": "<block_does_not_provide_example>",
    "api_key": "xxx-xxx",
    "language_hints": [
        "en",
        "fr"
    ]
}
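The definition above can also be assembled programmatically. The sketch below is one possible way to do so, not part of the block's API: `build_ocr_step` is a hypothetical helper that validates ocr_type against the two detection modes named on this page and emits the same JSON shape.

```python
import json

STEP_TYPE = "roboflow_core/google_vision_ocr@v1"
# The two detection modes described at the top of this page.
OCR_TYPES = ("text_detection", "ocr_text_detection")


def build_ocr_step(name, api_key, ocr_type, image="$inputs.image", language_hints=None):
    """Return a step definition dict; raise ValueError on an unknown ocr_type."""
    if ocr_type not in OCR_TYPES:
        raise ValueError(f"ocr_type must be one of {OCR_TYPES}, got {ocr_type!r}")
    step = {
        "name": name,
        "type": STEP_TYPE,
        "image": image,
        "ocr_type": ocr_type,
        "api_key": api_key,
    }
    if language_hints:  # optional: omit to let the API detect the language
        step["language_hints"] = list(language_hints)
    return step


# "rf_key:account" proxies the request through Roboflow's API.
step = build_ocr_step(
    "ocr",
    api_key="rf_key:account",
    ocr_type="ocr_text_detection",
    language_hints=["en", "fr"],
)
print(json.dumps(step, indent=4))
```

Leaving out `language_hints` drops the key from the definition, which matches the documented behaviour of letting the API detect the language automatically.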