Google Vision OCR
Class: GoogleVisionOCRBlockV1
Source: inference.core.workflows.core_steps.models.foundation.google_vision_ocr.v1.GoogleVisionOCRBlockV1
Detect text in images using Google Vision OCR.
Supported types of text detection:
- text_detection: optimized for areas of text within a larger image.
- ocr_text_detection: optimized for dense text documents.
Provide your Google Vision API key or set the value to rf_key:account (or
rf_key:user:<id>) to proxy requests through Roboflow's API.
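As a minimal sketch of the three credential forms described above, the helper below reports which form a given api_key value takes. The function and the mode names it returns are hypothetical and are not part of the block's API.

```python
def classify_api_key(value: str) -> str:
    """Report which credential form an api_key value uses (hypothetical helper)."""
    user_prefix = "rf_key:user:"
    if value == "rf_key:account":
        # Requests are proxied through Roboflow's API with the account key.
        return "roboflow_proxy_account"
    if value.startswith(user_prefix) and len(value) > len(user_prefix):
        # Requests are proxied through Roboflow's API with a user-scoped key.
        return "roboflow_proxy_user"
    # Anything else is treated as a Google Vision API key supplied directly.
    return "google_api_key"


print(classify_api_key("rf_key:account"))
print(classify_api_key("rf_key:user:<id>"))
print(classify_api_key("xxx-xxx"))
```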
Type identifier
Use the following identifier in the step "type" field: roboflow_core/google_vision_ocr@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| ocr_type | str | Type of OCR to use. | ❌ |
| api_key | str | Your Google Vision API key. | ✅ |
| language_hints | List[str] | Optional list of language codes to pass to the OCR API. If not provided, the API will attempt to detect the language automatically. If provided, language codes must be supported by the OCR API; visit https://cloud.google.com/vision/docs/languages for the list of supported language codes. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
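To illustrate what the Refs column means in practice, here is a rough sketch that flags properties given a dynamic reference even though the table does not mark them as reference-capable. It assumes that dynamic values are written as strings starting with `$` (as in `$inputs.image`); the helper names are hypothetical and not part of the library.

```python
# Properties that may hold a dynamic reference: api_key (marked in the Refs column)
# and image (listed as an input binding for this block).
REF_CAPABLE = {"api_key", "image"}


def is_selector(value) -> bool:
    """Assumption: a dynamic reference is a string that starts with '$'."""
    return isinstance(value, str) and value.startswith("$")


def find_invalid_refs(step: dict) -> list:
    """Return property names that use a reference but are not reference-capable."""
    return sorted(
        key for key, value in step.items()
        if is_selector(value) and key not in REF_CAPABLE
    )


step = {
    "name": "ocr",
    "type": "roboflow_core/google_vision_ocr@v1",
    "image": "$inputs.image",
    "ocr_type": "$inputs.ocr_type",  # not allowed: ocr_type must be a literal
    "api_key": "$inputs.google_api_key",
}
print(find_invalid_refs(step))
```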
Runtime compatibility
- requires_internet (air-gapped / offline deployments): This block depends on a service that is not reachable from fully offline / air-gapped deployments.
Available Connections
Compatible Blocks
Check which blocks you can connect to Google Vision OCR in version v1.
- inputs:
Image Slicer,Polygon Zone Visualization,VLM As Classifier,Contrast Enhancement,Google Gemma API,MoonshotAI Kimi,Stability AI Image Generation,Image Threshold,Line Counter Visualization,Trace Visualization,Stitch OCR Detections,Camera Calibration,QR Code Generator,Anthropic Claude,Icon Visualization,SIFT Comparison,Morphological Transformation,S3 Sink,Color Visualization,LMM For Classification,Perspective Correction,Microsoft SQL Server Sink,Corner Visualization,Roboflow Custom Metadata,Google Vision OCR,Twilio SMS Notification,Halo Visualization,Image Blur,Morphological Transformation,Qwen-VL,Camera Focus,Email Notification,Roboflow Vision Events,Halo Visualization,Stability AI Inpainting,Classification Label Visualization,Stitch OCR Detections,Google Gemma,Event Writer,Grid Visualization,Qwen3.5-VL,Background Color Visualization,Mask Visualization,Llama 3.2 Vision,Ellipse Visualization,Email Notification,Reference Path Visualization,Image Slicer,Label Visualization,Twilio SMS/MMS Notification,Text Display,OPC UA Writer Sink,Dot Visualization,Polygon Visualization,Crop Visualization,Dynamic Crop,Absolute Static Crop,Circle Visualization,Image Preprocessing,Llama 3.2 Vision,Model Monitoring Inference Aggregator,Relative Static Crop,Camera Focus,OpenRouter,OpenAI,Florence-2 Model,OpenAI-Compatible LLM,MoonshotAI Kimi,Heatmap Visualization,Single-Label Classification Model,OpenAI,OCR Model,CogVLM,Blur Visualization,Depth Estimation,Instance Segmentation Model,Stability AI Outpainting,Anthropic Claude,Google Gemini,Qwen 3.6 API,Clip Comparison,Google Gemini,Background Subtraction,Keypoint Visualization,CSV Formatter,Webhook Sink,Bounding Box Visualization,Multi-Label Classification Model,LMM,OpenAI,Stitch Images,Florence-2 Model,Image Convert Grayscale,Current Time,Contrast Equalization,OpenAI,VLM As Detector,Google Gemini,Roboflow Visual Search,Triangle Visualization,Slack Notification,EasyOCR,Roboflow Dataset Upload,Pixelate Visualization,Roboflow Dataset Upload,PLC 
Writer,SIFT,Qwen 3.5 API,Anthropic Claude,Object Detection Model,Local File Sink,MQTT Writer,Image Contours,Polygon Visualization,Keypoint Detection Model,GLM-OCR,Model Comparison Visualization,Roboflow Asset Library Attributes
- outputs:
Line Counter,MoonshotAI Kimi,Stability AI Image Generation,Trace Visualization,Path Deviation,Anthropic Claude,Per-Class Confidence Filter,Icon Visualization,SIFT Comparison,Morphological Transformation,Color Visualization,LMM For Classification,Perspective Correction,Corner Visualization,Roboflow Custom Metadata,Detections Merge,Halo Visualization,Qwen-VL,Keypoint Detection Model,Email Notification,Halo Visualization,Google Gemma,Background Color Visualization,Email Notification,Ellipse Visualization,Twilio SMS/MMS Notification,Text Display,Polygon Visualization,Crop Visualization,Image Preprocessing,Model Monitoring Inference Aggregator,OpenRouter,OpenAI,Florence-2 Model,OpenAI,Heatmap Visualization,Detections Filter,Perception Encoder Embedding Model,Blur Visualization,Depth Estimation,Instance Segmentation Model,Stability AI Outpainting,Anthropic Claude,YOLO-World Model,Google Gemini,Clip Comparison,Google Gemini,Keypoint Visualization,Webhook Sink,Byte Tracker,Florence-2 Model,Current Time,Detections List Roll-Up,Contrast Equalization,OpenAI,Moondream2,Line Counter,Google Gemini,Triangle Visualization,Slack Notification,Overlap Filter,Time in Zone,CLIP Embedding Model,Detections Stabilizer,Multi-Label Classification Model,Local File Sink,Pixel Color Count,GLM-OCR,Roboflow Asset Library Attributes,Polygon Zone Visualization,Google Gemma API,Time in Zone,Stitch OCR Detections,Line Counter Visualization,Semantic Segmentation Model,Distance Measurement,Image Threshold,QR Code Generator,Detection Offset,ByteTrack Tracker,Detection Event Log,Detections Transformation,S3 Sink,Microsoft SQL Server Sink,Mask Area Measurement,Twilio SMS Notification,Google Vision OCR,Image Blur,Detections Combine,Morphological Transformation,Roboflow Vision Events,Size Measurement,PTZ Tracking (ONVIF),Stability AI Inpainting,Classification Label Visualization,Stitch OCR Detections,SAM2 Video Tracker,Event Writer,Qwen3.5-VL,Llama 3.2 Vision,Mask Visualization,Byte Tracker,Reference Path 
Visualization,Velocity,Label Visualization,Byte Tracker,OPC UA Writer Sink,Dot Visualization,Cache Set,Dynamic Crop,Detections Stitch,Circle Visualization,Llama 3.2 Vision,Path Deviation,SAM3 Video Tracker,BoT-SORT Tracker,Camera Focus,Segment Anything 2 Model,OpenAI-Compatible LLM,MoonshotAI Kimi,Overlap Analysis,CogVLM,Object Detection Model,SAM 3 Interactive,Qwen 3.6 API,Detections Consensus,Bounding Box Visualization,LMM,OpenAI,SAM 3,Instance Segmentation Model,Roboflow Visual Search,Roboflow Dataset Upload,SAM 3,Cache Get,Instance Segmentation Model,Detections Classes Replacement,Pixelate Visualization,Instance Segmentation Model,Roboflow Dataset Upload,SORT Tracker,Track Class Lock,Qwen 3.5 API,Anthropic Claude,Time in Zone,MQTT Writer,Polygon Visualization,OC-SORT Tracker,SAM 3,Model Comparison Visualization,Single-Label Classification Model,Seg Preview
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Google Vision OCR in version v1 has.
Bindings
- input
  - image (image): Image to run OCR.
  - api_key (Union[secret, ROBOFLOW_MANAGED_KEY, string]): Your Google Vision API key.
- output
  - text (string): String value.
  - language (string): String value.
  - predictions (object_detection_prediction): Prediction with detected bounding boxes in form of sv.Detections(...) object.
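The three outputs listed above can be exposed from a workflow by pointing at them with selectors. The sketch below builds such entries. It assumes the `$steps.<step_name>.<output_name>` selector form and a `JsonField` output type, neither of which is defined on this page; the step name `ocr` and the helper are hypothetical.

```python
# Output names taken from the bindings list of this block.
OUTPUT_NAMES = ("text", "language", "predictions")


def output_selector(step_name: str, output_name: str) -> str:
    """Build a selector for one block output (assumed '$steps.<step>.<output>' form)."""
    if output_name not in OUTPUT_NAMES:
        raise ValueError(f"unknown output: {output_name!r}")
    return f"$steps.{step_name}.{output_name}"


# One workflow output entry per block output; 'JsonField' is an assumed type name.
workflow_outputs = [
    {"type": "JsonField", "name": f"ocr_{name}", "selector": output_selector("ocr", name)}
    for name in OUTPUT_NAMES
]

for entry in workflow_outputs:
    print(entry["name"], "->", entry["selector"])
```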
Example JSON definition of step Google Vision OCR in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/google_vision_ocr@v1",
    "image": "$inputs.image",
    "ocr_type": "text_detection",
    "api_key": "xxx-xxx",
    "language_hints": [
        "en",
        "fr"
    ]
}
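A step definition like the one above can also be assembled in code. The following is a minimal sketch, not the library's own API: `build_ocr_step` is a hypothetical helper that rejects ocr_type values other than the two listed on this page and omits language_hints when none are given, so that the API detects the language automatically.

```python
import json

# The two OCR types documented for this block.
SUPPORTED_OCR_TYPES = {"text_detection", "ocr_text_detection"}


def build_ocr_step(name, ocr_type, api_key, language_hints=None):
    """Return a step definition dict for the Google Vision OCR block (hypothetical helper)."""
    if ocr_type not in SUPPORTED_OCR_TYPES:
        raise ValueError(f"ocr_type must be one of {sorted(SUPPORTED_OCR_TYPES)}")
    step = {
        "name": name,
        "type": "roboflow_core/google_vision_ocr@v1",
        "image": "$inputs.image",
        "ocr_type": ocr_type,
        "api_key": api_key,
    }
    if language_hints:
        # Only include hints when provided; otherwise language detection is automatic.
        step["language_hints"] = list(language_hints)
    return step


step = build_ocr_step("ocr", "ocr_text_detection", "rf_key:account", ["en", "fr"])
print(json.dumps(step, indent=4))
```

Validating ocr_type before the workflow is submitted surfaces a misspelled value locally rather than at run time.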