OpenAI¶
v4¶
Class: OpenAIBlockV4 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.openai.v4.OpenAIBlockV4
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Ask a question to OpenAI's GPT models with vision capabilities (including GPT-5 and GPT-4o).
You can specify arbitrary text prompts or use predefined ones; the block supports the following prompt types:

- Open Prompt (`unconstrained`) - Use any prompt to generate a raw response
- Text Recognition (OCR) (`ocr`) - Model recognizes text in the image
- Visual Question Answering (`visual-question-answering`) - Model answers the question you submit in the prompt
- Captioning (short) (`caption`) - Model provides a short description of the image
- Captioning (`detailed-caption`) - Model provides a long description of the image
- Single-Label Classification (`classification`) - Model classifies the image content as one of the provided classes
- Multi-Label Classification (`multi-label-classification`) - Model classifies the image content as one or more of the provided classes
- Unprompted Object Detection (`object-detection`) - Model detects and returns the bounding boxes for prominent objects in the image
- Structured Output Generation (`structured-answering`) - Model returns a JSON response with the specified fields

Provide your OpenAI API key, or set the value to `rf_key:account` (or `rf_key:user:<id>`) to proxy requests through Roboflow's API.
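For example, a minimal single-label classification step might look like the sketch below; the step name, input selector, and class list are illustrative placeholders, and `rf_key:account` assumes your Roboflow account has an OpenAI key configured:

{
    "name": "open_ai_classifier",
    "type": "roboflow_core/open_ai@v4",
    "images": "$inputs.image",
    "task_type": "classification",
    "classes": ["cat", "dog"],
    "api_key": "rf_key:account",
    "model_version": "gpt-5.1"
}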
Type identifier¶
Use the following identifier in step "type" field: `roboflow_core/open_ai@v4` to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique identifier for this step. | ❌ |
| `task_type` | `str` | Task type to be performed by the model. The value determines the required parameters and the output response. | ❌ |
| `prompt` | `str` | Text prompt to the OpenAI model. | ✅ |
| `output_structure` | `Dict[str, str]` | Dictionary describing the structure of the expected JSON response. | ❌ |
| `classes` | `List[str]` | List of classes to be used. | ✅ |
| `api_key` | `str` | Your OpenAI API key. | ✅ |
| `model_version` | `str` | Model to be used. | ✅ |
| `reasoning_effort` | `str` | Controls the reasoning effort. Reducing reasoning effort can result in faster responses and fewer tokens used. GPT-5.1 defaults to 'none' (no reasoning) and supports 'none', 'low', 'medium', and 'high'; GPT-5 models default to 'medium' and support 'minimal', 'low', 'medium', and 'high'. | ✅ |
| `image_detail` | `str` | Indicates the image's quality, with 'high' suggesting it is of high resolution and should be processed or displayed with high fidelity. | ✅ |
| `max_tokens` | `int` | Maximum number of tokens the model can generate in its response. If not specified, the model uses its default limit. The minimum value is 16. | ❌ |
| `temperature` | `float` | Temperature to sample from the model, in the range 0.0-2.0; the higher the value, the more random / "creative" the generations. | ✅ |
| `max_concurrent_requests` | `int` | Number of concurrent requests the block can execute when a batch of input images is provided. If not given, the block defaults to the value configured globally in the Workflows Execution Engine. Restrict this if you hit OpenAI limits. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
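For instance, any property marked ✅ above accepts a selector in place of a literal value. The sketch below parametrises the prompt and API key; `$inputs.question` and `$inputs.openai_api_key` are hypothetical workflow inputs you would declare yourself:

{
    "name": "open_ai_vqa",
    "type": "roboflow_core/open_ai@v4",
    "images": "$inputs.image",
    "task_type": "visual-question-answering",
    "prompt": "$inputs.question",
    "api_key": "$inputs.openai_api_key",
    "model_version": "gpt-5.1",
    "reasoning_effort": "low"
}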
Available Connections¶
Compatible Blocks
Check what blocks you can connect to OpenAI in version v4.
- inputs: Google Vision OCR, Label Visualization, LMM For Classification, Blur Visualization, Background Color Visualization, Contrast Equalization, Bounding Box Visualization, Keypoint Visualization, Stability AI Outpainting, Reference Path Visualization, Image Slicer, Pixelate Visualization, Single-Label Classification Model, Clip Comparison, CSV Formatter, Image Preprocessing, Color Visualization, SIFT Comparison, Object Detection Model, Email Notification, Anthropic Claude, Circle Visualization, Image Contours, Polygon Zone Visualization, Ellipse Visualization, Clip Comparison, Email Notification, VLM as Classifier, Model Monitoring Inference Aggregator, OCR Model, Absolute Static Crop, Depth Estimation, LMM, Morphological Transformation, Roboflow Dataset Upload, Gaze Detection, Crop Visualization, OpenAI, Image Convert Grayscale, Florence-2 Model, CogVLM, Roboflow Custom Metadata, VLM as Detector, Classification Label Visualization, Buffer, Stitch OCR Detections, Keypoint Detection Model, Cosine Similarity, Camera Calibration, Polygon Visualization, Icon Visualization, Identify Changes, Triangle Visualization, Roboflow Dataset Upload, Anthropic Claude, Model Comparison Visualization, Corner Visualization, Florence-2 Model, Google Gemini, Google Gemini, EasyOCR, Line Counter Visualization, Grid Visualization, Halo Visualization, Size Measurement, Stability AI Image Generation, QR Code Generator, Dynamic Zone, Twilio SMS Notification, Relative Static Crop, Dot Visualization, Llama 3.2 Vision, Image Blur, Slack Notification, Dimension Collapse, OpenAI, Local File Sink, Multi-Label Classification Model, Image Slicer, OpenAI, Stability AI Inpainting, Dynamic Crop, Camera Focus, Webhook Sink, Image Threshold, Instance Segmentation Model, Perspective Correction, Mask Visualization, Trace Visualization, OpenAI, Stitch Images, SIFT
- outputs: Label Visualization, Background Color Visualization, Contrast Equalization, Reference Path Visualization, Stability AI Outpainting, Clip Comparison, Perception Encoder Embedding Model, Seg Preview, Image Preprocessing, Color Visualization, SIFT Comparison, Email Notification, Cache Set, Circle Visualization, Object Detection Model, Moondream2, VLM as Classifier, Model Monitoring Inference Aggregator, Path Deviation, LMM, Time in Zone, Morphological Transformation, Detections Consensus, Crop Visualization, OpenAI, Florence-2 Model, Classification Label Visualization, Segment Anything 2 Model, Time in Zone, YOLO-World Model, PTZ Tracking (ONVIF), Icon Visualization, Distance Measurement, VLM as Detector, Line Counter Visualization, Grid Visualization, Halo Visualization, Size Measurement, Twilio SMS Notification, Time in Zone, Detections Stitch, Llama 3.2 Vision, Image Blur, Slack Notification, OpenAI, OpenAI, Dynamic Crop, Pixel Color Count, Mask Visualization, Google Vision OCR, LMM For Classification, Keypoint Visualization, Bounding Box Visualization, SAM 3, SAM 3, Object Detection Model, Path Deviation, Anthropic Claude, Polygon Zone Visualization, Ellipse Visualization, Line Counter, Email Notification, Clip Comparison, Roboflow Dataset Upload, SAM 3, CogVLM, Roboflow Custom Metadata, VLM as Detector, Buffer, Stitch OCR Detections, Keypoint Detection Model, Keypoint Detection Model, Line Counter, JSON Parser, Polygon Visualization, CLIP Embedding Model, Detections Classes Replacement, Cache Get, Triangle Visualization, Roboflow Dataset Upload, Anthropic Claude, Model Comparison Visualization, Florence-2 Model, Corner Visualization, Google Gemini, Google Gemini, Stability AI Image Generation, QR Code Generator, Dot Visualization, Local File Sink, Instance Segmentation Model, Stability AI Inpainting, Webhook Sink, Instance Segmentation Model, Image Threshold, OpenAI, Perspective Correction, VLM as Classifier, Trace Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
OpenAI in version v4 has.
Bindings
- input
  - `images` (`image`): The image to infer on.
  - `prompt` (`string`): Text prompt to the OpenAI model.
  - `classes` (`list_of_values`): List of classes to be used.
  - `api_key` (`Union[string, ROBOFLOW_MANAGED_KEY, secret]`): Your OpenAI API key.
  - `model_version` (`string`): Model to be used.
  - `reasoning_effort` (`string`): Controls the reasoning effort. Reducing reasoning effort can result in faster responses and fewer tokens used. GPT-5.1 defaults to 'none' (no reasoning) and supports 'none', 'low', 'medium', and 'high'; GPT-5 models default to 'medium' and support 'minimal', 'low', 'medium', and 'high'.
  - `image_detail` (`string`): Indicates the image's quality, with 'high' suggesting it is of high resolution and should be processed or displayed with high fidelity.
  - `temperature` (`float`): Temperature to sample from the model, in the range 0.0-2.0; the higher the value, the more random / "creative" the generations.
- output
  - `output` (`Union[string, language_model_output]`): String value if `string`, or LLM / VLM output if `language_model_output`.
  - `classes` (`list_of_values`): List of values of any type.
Example JSON definition of step OpenAI in version v4
{
"name": "<your_step_name_here>",
"type": "roboflow_core/open_ai@v4",
"images": "$inputs.image",
"task_type": "<block_does_not_provide_example>",
"prompt": "my prompt",
"output_structure": {
"my_key": "description"
},
"classes": [
"class-a",
"class-b"
],
"api_key": "xxx-xxx",
"model_version": "gpt-5.1",
"reasoning_effort": "<block_does_not_provide_example>",
"image_detail": "auto",
"max_tokens": "<block_does_not_provide_example>",
"temperature": "<block_does_not_provide_example>",
"max_concurrent_requests": "<block_does_not_provide_example>"
}
v3¶
Class: OpenAIBlockV3 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.openai.v3.OpenAIBlockV3
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Ask a question to OpenAI's GPT models with vision capabilities (including GPT-5 and GPT-4o).
You can specify arbitrary text prompts or use predefined ones; the block supports the following prompt types:

- Open Prompt (`unconstrained`) - Use any prompt to generate a raw response
- Text Recognition (OCR) (`ocr`) - Model recognizes text in the image
- Visual Question Answering (`visual-question-answering`) - Model answers the question you submit in the prompt
- Captioning (short) (`caption`) - Model provides a short description of the image
- Captioning (`detailed-caption`) - Model provides a long description of the image
- Single-Label Classification (`classification`) - Model classifies the image content as one of the provided classes
- Multi-Label Classification (`multi-label-classification`) - Model classifies the image content as one or more of the provided classes
- Structured Output Generation (`structured-answering`) - Model returns a JSON response with the specified fields

Provide your OpenAI API key, or set the value to `rf_key:account` (or `rf_key:user:<id>`) to proxy requests through Roboflow's API.
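As an illustration, a structured-answering step might be configured as below; the field names inside `output_structure` are arbitrary example keys mapping each requested field to its description, and the model is expected to return a JSON object with those fields:

{
    "name": "open_ai_structured",
    "type": "roboflow_core/open_ai@v3",
    "images": "$inputs.image",
    "task_type": "structured-answering",
    "output_structure": {
        "vehicle_color": "color of the most prominent vehicle",
        "vehicle_count": "number of vehicles visible in the image"
    },
    "api_key": "rf_key:account",
    "model_version": "gpt-5"
}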
Type identifier¶
Use the following identifier in step "type" field: `roboflow_core/open_ai@v3` to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique identifier for this step. | ❌ |
| `task_type` | `str` | Task type to be performed by the model. The value determines the required parameters and the output response. | ❌ |
| `prompt` | `str` | Text prompt to the OpenAI model. | ✅ |
| `output_structure` | `Dict[str, str]` | Dictionary describing the structure of the expected JSON response. | ❌ |
| `classes` | `List[str]` | List of classes to be used. | ✅ |
| `api_key` | `str` | Your OpenAI API key. | ✅ |
| `model_version` | `str` | Model to be used. | ✅ |
| `image_detail` | `str` | Indicates the image's quality, with 'high' suggesting it is of high resolution and should be processed or displayed with high fidelity. | ✅ |
| `max_tokens` | `int` | Maximum number of tokens the model can generate in its response. | ❌ |
| `temperature` | `float` | Temperature to sample from the model, in the range 0.0-2.0; the higher the value, the more random / "creative" the generations. | ✅ |
| `max_concurrent_requests` | `int` | Number of concurrent requests the block can execute when a batch of input images is provided. If not given, the block defaults to the value configured globally in the Workflows Execution Engine. Restrict this if you hit OpenAI limits. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to OpenAI in version v3.
- inputs: Google Vision OCR, Label Visualization, LMM For Classification, Blur Visualization, Background Color Visualization, Contrast Equalization, Bounding Box Visualization, Keypoint Visualization, Stability AI Outpainting, Reference Path Visualization, Image Slicer, Pixelate Visualization, Single-Label Classification Model, Clip Comparison, CSV Formatter, Image Preprocessing, Color Visualization, SIFT Comparison, Object Detection Model, Email Notification, Anthropic Claude, Circle Visualization, Image Contours, Polygon Zone Visualization, Ellipse Visualization, Clip Comparison, Email Notification, VLM as Classifier, Model Monitoring Inference Aggregator, OCR Model, Absolute Static Crop, Depth Estimation, LMM, Morphological Transformation, Roboflow Dataset Upload, Gaze Detection, Crop Visualization, OpenAI, Image Convert Grayscale, Florence-2 Model, CogVLM, Roboflow Custom Metadata, VLM as Detector, Classification Label Visualization, Buffer, Stitch OCR Detections, Keypoint Detection Model, Cosine Similarity, Camera Calibration, Polygon Visualization, Icon Visualization, Identify Changes, Triangle Visualization, Roboflow Dataset Upload, Anthropic Claude, Model Comparison Visualization, Corner Visualization, Florence-2 Model, Google Gemini, Google Gemini, EasyOCR, Line Counter Visualization, Grid Visualization, Halo Visualization, Size Measurement, Stability AI Image Generation, QR Code Generator, Dynamic Zone, Twilio SMS Notification, Relative Static Crop, Dot Visualization, Llama 3.2 Vision, Image Blur, Slack Notification, Dimension Collapse, OpenAI, Local File Sink, Multi-Label Classification Model, Image Slicer, OpenAI, Stability AI Inpainting, Dynamic Crop, Camera Focus, Webhook Sink, Image Threshold, Instance Segmentation Model, Perspective Correction, Mask Visualization, Trace Visualization, OpenAI, Stitch Images, SIFT
- outputs: Label Visualization, Background Color Visualization, Contrast Equalization, Reference Path Visualization, Stability AI Outpainting, Clip Comparison, Perception Encoder Embedding Model, Seg Preview, Image Preprocessing, Color Visualization, SIFT Comparison, Email Notification, Cache Set, Circle Visualization, Object Detection Model, Moondream2, VLM as Classifier, Model Monitoring Inference Aggregator, Path Deviation, LMM, Time in Zone, Morphological Transformation, Detections Consensus, Crop Visualization, OpenAI, Florence-2 Model, Classification Label Visualization, Segment Anything 2 Model, Time in Zone, YOLO-World Model, PTZ Tracking (ONVIF), Icon Visualization, Distance Measurement, VLM as Detector, Line Counter Visualization, Grid Visualization, Halo Visualization, Size Measurement, Twilio SMS Notification, Time in Zone, Detections Stitch, Llama 3.2 Vision, Image Blur, Slack Notification, OpenAI, OpenAI, Dynamic Crop, Pixel Color Count, Mask Visualization, Google Vision OCR, LMM For Classification, Keypoint Visualization, Bounding Box Visualization, SAM 3, SAM 3, Object Detection Model, Path Deviation, Anthropic Claude, Polygon Zone Visualization, Ellipse Visualization, Line Counter, Email Notification, Clip Comparison, Roboflow Dataset Upload, SAM 3, CogVLM, Roboflow Custom Metadata, VLM as Detector, Buffer, Stitch OCR Detections, Keypoint Detection Model, Keypoint Detection Model, Line Counter, JSON Parser, Polygon Visualization, CLIP Embedding Model, Detections Classes Replacement, Cache Get, Triangle Visualization, Roboflow Dataset Upload, Anthropic Claude, Model Comparison Visualization, Florence-2 Model, Corner Visualization, Google Gemini, Google Gemini, Stability AI Image Generation, QR Code Generator, Dot Visualization, Local File Sink, Instance Segmentation Model, Stability AI Inpainting, Webhook Sink, Instance Segmentation Model, Image Threshold, OpenAI, Perspective Correction, VLM as Classifier, Trace Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
OpenAI in version v3 has.
Bindings
- input
  - `images` (`image`): The image to infer on.
  - `prompt` (`string`): Text prompt to the OpenAI model.
  - `classes` (`list_of_values`): List of classes to be used.
  - `api_key` (`Union[string, ROBOFLOW_MANAGED_KEY, secret]`): Your OpenAI API key.
  - `model_version` (`string`): Model to be used.
  - `image_detail` (`string`): Indicates the image's quality, with 'high' suggesting it is of high resolution and should be processed or displayed with high fidelity.
  - `temperature` (`float`): Temperature to sample from the model, in the range 0.0-2.0; the higher the value, the more random / "creative" the generations.
- output
  - `output` (`Union[string, language_model_output]`): String value if `string`, or LLM / VLM output if `language_model_output`.
  - `classes` (`list_of_values`): List of values of any type.
Example JSON definition of step OpenAI in version v3
{
"name": "<your_step_name_here>",
"type": "roboflow_core/open_ai@v3",
"images": "$inputs.image",
"task_type": "<block_does_not_provide_example>",
"prompt": "my prompt",
"output_structure": {
"my_key": "description"
},
"classes": [
"class-a",
"class-b"
],
"api_key": "xxx-xxx",
"model_version": "gpt-5",
"image_detail": "auto",
"max_tokens": "<block_does_not_provide_example>",
"temperature": "<block_does_not_provide_example>",
"max_concurrent_requests": "<block_does_not_provide_example>"
}
v2¶
Class: OpenAIBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.openai.v2.OpenAIBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Ask a question to OpenAI's GPT models with vision capabilities (including GPT-4o and GPT-5).
You can specify arbitrary text prompts or use predefined ones; the block supports the following prompt types:

- Open Prompt (`unconstrained`) - Use any prompt to generate a raw response
- Text Recognition (OCR) (`ocr`) - Model recognizes text in the image
- Visual Question Answering (`visual-question-answering`) - Model answers the question you submit in the prompt
- Captioning (short) (`caption`) - Model provides a short description of the image
- Captioning (`detailed-caption`) - Model provides a long description of the image
- Single-Label Classification (`classification`) - Model classifies the image content as one of the provided classes
- Multi-Label Classification (`multi-label-classification`) - Model classifies the image content as one or more of the provided classes
- Structured Output Generation (`structured-answering`) - Model returns a JSON response with the specified fields
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
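For example, a multi-label classification step could be sketched as follows; the class names are illustrative, and `$inputs.openai_api_key` is a hypothetical workflow input holding your key:

{
    "name": "open_ai_tagger",
    "type": "roboflow_core/open_ai@v2",
    "images": "$inputs.image",
    "task_type": "multi-label-classification",
    "classes": ["scratch", "dent", "rust"],
    "api_key": "$inputs.openai_api_key",
    "model_version": "gpt-4o"
}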
Type identifier¶
Use the following identifier in step "type" field: `roboflow_core/open_ai@v2` to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique identifier for this step. | ❌ |
| `task_type` | `str` | Task type to be performed by the model. The value determines the required parameters and the output response. | ❌ |
| `prompt` | `str` | Text prompt to the OpenAI model. | ✅ |
| `output_structure` | `Dict[str, str]` | Dictionary describing the structure of the expected JSON response. | ❌ |
| `classes` | `List[str]` | List of classes to be used. | ✅ |
| `api_key` | `str` | Your OpenAI API key. | ✅ |
| `model_version` | `str` | Model to be used. | ✅ |
| `image_detail` | `str` | Indicates the image's quality, with 'high' suggesting it is of high resolution and should be processed or displayed with high fidelity. | ✅ |
| `max_tokens` | `int` | Maximum number of tokens the model can generate in its response. | ❌ |
| `temperature` | `float` | Temperature to sample from the model, in the range 0.0-2.0; the higher the value, the more random / "creative" the generations. | ✅ |
| `max_concurrent_requests` | `int` | Number of concurrent requests the block can execute when a batch of input images is provided. If not given, the block defaults to the value configured globally in the Workflows Execution Engine. Restrict this if you hit OpenAI limits. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to OpenAI in version v2.
- inputs: Google Vision OCR, Label Visualization, LMM For Classification, Blur Visualization, Background Color Visualization, Contrast Equalization, Bounding Box Visualization, Keypoint Visualization, Stability AI Outpainting, Reference Path Visualization, Image Slicer, Pixelate Visualization, Single-Label Classification Model, Clip Comparison, CSV Formatter, Image Preprocessing, Color Visualization, SIFT Comparison, Object Detection Model, Email Notification, Anthropic Claude, Circle Visualization, Image Contours, Polygon Zone Visualization, Ellipse Visualization, Clip Comparison, Email Notification, VLM as Classifier, Model Monitoring Inference Aggregator, OCR Model, Absolute Static Crop, Depth Estimation, LMM, Morphological Transformation, Roboflow Dataset Upload, Gaze Detection, Crop Visualization, OpenAI, Image Convert Grayscale, Florence-2 Model, CogVLM, Roboflow Custom Metadata, VLM as Detector, Classification Label Visualization, Buffer, Stitch OCR Detections, Keypoint Detection Model, Cosine Similarity, Camera Calibration, Polygon Visualization, Icon Visualization, Identify Changes, Triangle Visualization, Roboflow Dataset Upload, Anthropic Claude, Model Comparison Visualization, Corner Visualization, Florence-2 Model, Google Gemini, Google Gemini, EasyOCR, Line Counter Visualization, Grid Visualization, Halo Visualization, Size Measurement, Stability AI Image Generation, QR Code Generator, Dynamic Zone, Twilio SMS Notification, Relative Static Crop, Dot Visualization, Llama 3.2 Vision, Image Blur, Slack Notification, Dimension Collapse, OpenAI, Local File Sink, Multi-Label Classification Model, Image Slicer, OpenAI, Stability AI Inpainting, Dynamic Crop, Camera Focus, Webhook Sink, Image Threshold, Instance Segmentation Model, Perspective Correction, Mask Visualization, Trace Visualization, OpenAI, Stitch Images, SIFT
- outputs: Label Visualization, Background Color Visualization, Contrast Equalization, Reference Path Visualization, Stability AI Outpainting, Clip Comparison, Perception Encoder Embedding Model, Seg Preview, Image Preprocessing, Color Visualization, SIFT Comparison, Email Notification, Cache Set, Circle Visualization, Object Detection Model, Moondream2, VLM as Classifier, Model Monitoring Inference Aggregator, Path Deviation, LMM, Time in Zone, Morphological Transformation, Detections Consensus, Crop Visualization, OpenAI, Florence-2 Model, Classification Label Visualization, Segment Anything 2 Model, Time in Zone, YOLO-World Model, PTZ Tracking (ONVIF), Icon Visualization, Distance Measurement, VLM as Detector, Line Counter Visualization, Grid Visualization, Halo Visualization, Size Measurement, Twilio SMS Notification, Time in Zone, Detections Stitch, Llama 3.2 Vision, Image Blur, Slack Notification, OpenAI, OpenAI, Dynamic Crop, Pixel Color Count, Mask Visualization, Google Vision OCR, LMM For Classification, Keypoint Visualization, Bounding Box Visualization, SAM 3, SAM 3, Object Detection Model, Path Deviation, Anthropic Claude, Polygon Zone Visualization, Ellipse Visualization, Line Counter, Email Notification, Clip Comparison, Roboflow Dataset Upload, SAM 3, CogVLM, Roboflow Custom Metadata, VLM as Detector, Buffer, Stitch OCR Detections, Keypoint Detection Model, Keypoint Detection Model, Line Counter, JSON Parser, Polygon Visualization, CLIP Embedding Model, Detections Classes Replacement, Cache Get, Triangle Visualization, Roboflow Dataset Upload, Anthropic Claude, Model Comparison Visualization, Florence-2 Model, Corner Visualization, Google Gemini, Google Gemini, Stability AI Image Generation, QR Code Generator, Dot Visualization, Local File Sink, Instance Segmentation Model, Stability AI Inpainting, Webhook Sink, Instance Segmentation Model, Image Threshold, OpenAI, Perspective Correction, VLM as Classifier, Trace Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
OpenAI in version v2 has.
Bindings
- input
  - `images` (`image`): The image to infer on.
  - `prompt` (`string`): Text prompt to the OpenAI model.
  - `classes` (`list_of_values`): List of classes to be used.
  - `api_key` (`Union[string, secret]`): Your OpenAI API key.
  - `model_version` (`string`): Model to be used.
  - `image_detail` (`string`): Indicates the image's quality, with 'high' suggesting it is of high resolution and should be processed or displayed with high fidelity.
  - `temperature` (`float`): Temperature to sample from the model, in the range 0.0-2.0; the higher the value, the more random / "creative" the generations.
- output
  - `output` (`Union[string, language_model_output]`): String value if `string`, or LLM / VLM output if `language_model_output`.
  - `classes` (`list_of_values`): List of values of any type.
Example JSON definition of step OpenAI in version v2
{
"name": "<your_step_name_here>",
"type": "roboflow_core/open_ai@v2",
"images": "$inputs.image",
"task_type": "<block_does_not_provide_example>",
"prompt": "my prompt",
"output_structure": {
"my_key": "description"
},
"classes": [
"class-a",
"class-b"
],
"api_key": "xxx-xxx",
"model_version": "gpt-4o",
"image_detail": "auto",
"max_tokens": "<block_does_not_provide_example>",
"temperature": "<block_does_not_provide_example>",
"max_concurrent_requests": "<block_does_not_provide_example>"
}
v1¶
Class: OpenAIBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.openai.v1.OpenAIBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Ask a question to OpenAI's GPT-4 with Vision model.
You can specify arbitrary text prompts to the OpenAIBlock.
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
This model was previously part of the LMM block.
Type identifier¶
Use the following identifier in step "type" field: `roboflow_core/open_ai@v1` to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique identifier for this step. | ❌ |
| `prompt` | `str` | Text prompt to the OpenAI model. | ✅ |
| `openai_api_key` | `str` | Your OpenAI API key. | ✅ |
| `openai_model` | `str` | Model to be used. | ✅ |
| `json_output_format` | `Dict[str, str]` | Dictionary that maps each requested output field name to its description. | ❌ |
| `image_detail` | `str` | Indicates the image's quality, with 'high' suggesting it is of high resolution and should be processed or displayed with high fidelity. | ✅ |
| `max_tokens` | `int` | Maximum number of tokens the model can generate in its response. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
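For instance, `prompt`, `openai_api_key`, and `openai_model` can each be bound to workflow inputs rather than hard-coded. In the sketch below, `$inputs.prompt` and `$inputs.openai_api_key` are hypothetical inputs you would declare yourself:

{
    "name": "open_ai_step",
    "type": "roboflow_core/open_ai@v1",
    "images": "$inputs.image",
    "prompt": "$inputs.prompt",
    "openai_api_key": "$inputs.openai_api_key",
    "openai_model": "gpt-4o",
    "image_detail": "auto"
}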
Available Connections¶
Compatible Blocks
Check what blocks you can connect to OpenAI in version v1.
- inputs: Google Vision OCR, Label Visualization, LMM For Classification, Blur Visualization, Background Color Visualization, Contrast Equalization, Bounding Box Visualization, Keypoint Visualization, Stability AI Outpainting, Reference Path Visualization, Image Slicer, Pixelate Visualization, Single-Label Classification Model, CSV Formatter, Image Preprocessing, Color Visualization, SIFT Comparison, Object Detection Model, Email Notification, Anthropic Claude, Circle Visualization, Image Contours, Polygon Zone Visualization, Ellipse Visualization, Clip Comparison, Email Notification, VLM as Classifier, Model Monitoring Inference Aggregator, OCR Model, Absolute Static Crop, Depth Estimation, LMM, Morphological Transformation, Roboflow Dataset Upload, Crop Visualization, OpenAI, Image Convert Grayscale, Florence-2 Model, CogVLM, Roboflow Custom Metadata, VLM as Detector, Classification Label Visualization, Stitch OCR Detections, Keypoint Detection Model, Camera Calibration, Polygon Visualization, Icon Visualization, Triangle Visualization, Roboflow Dataset Upload, Anthropic Claude, Model Comparison Visualization, Corner Visualization, Florence-2 Model, Google Gemini, Google Gemini, EasyOCR, Line Counter Visualization, Grid Visualization, Halo Visualization, Stability AI Image Generation, QR Code Generator, Twilio SMS Notification, Relative Static Crop, Dot Visualization, Llama 3.2 Vision, Image Blur, Slack Notification, OpenAI, Local File Sink, Multi-Label Classification Model, Image Slicer, OpenAI, Stability AI Inpainting, Dynamic Crop, Camera Focus, Webhook Sink, Image Threshold, Instance Segmentation Model, Perspective Correction, Mask Visualization, Trace Visualization, OpenAI, Stitch Images, SIFT
- outputs: Label Visualization, Blur Visualization, Background Color Visualization, Contrast Equalization, Reference Path Visualization, Detections Filter, Stability AI Outpainting, Image Slicer, Pixelate Visualization, Single-Label Classification Model, Clip Comparison, CSV Formatter, Perception Encoder Embedding Model, Seg Preview, Overlap Filter, Image Preprocessing, Rate Limiter, Color Visualization, SIFT Comparison, Email Notification, Cache Set, Dominant Color, Property Definition, Circle Visualization, Object Detection Model, QR Code Detection, Moondream2, VLM as Classifier, Model Monitoring Inference Aggregator, OCR Model, Absolute Static Crop, Path Deviation, LMM, Time in Zone, Morphological Transformation, Gaze Detection, Detections Consensus, Crop Visualization, OpenAI, Florence-2 Model, Barcode Detection, Classification Label Visualization, Byte Tracker, Segment Anything 2 Model, Cosine Similarity, SIFT Comparison, Time in Zone, YOLO-World Model, PTZ Tracking (ONVIF), Detection Offset, Icon Visualization, Detections Transformation, Distance Measurement, Data Aggregator, VLM as Detector, Line Counter Visualization, Grid Visualization, Halo Visualization, Size Measurement, Dynamic Zone, Twilio SMS Notification, Time in Zone, Detections Stitch, Llama 3.2 Vision, Image Blur, Slack Notification, Velocity, OpenAI, Byte Tracker, First Non Empty Or Default, Multi-Label Classification Model, Image Slicer, OpenAI, Dynamic Crop, Pixel Color Count, Mask Visualization, Detections Merge, Stitch Images, SIFT, Google Vision OCR, LMM For Classification, Keypoint Visualization, Bounding Box Visualization, SAM 3, Byte Tracker, SAM 3, Qwen2.5-VL, Object Detection Model, Path Deviation, Detections Combine, Anthropic Claude, Image Contours, Polygon Zone Visualization, Ellipse Visualization, Line Counter, Email Notification, Clip Comparison, Expression, Depth Estimation, Roboflow Dataset Upload, SAM 3, CogVLM, Roboflow Custom Metadata, VLM as Detector, Multi-Label Classification Model, Image Convert Grayscale, Buffer, Stitch OCR Detections, Keypoint Detection Model, Bounding Rectangle, Keypoint Detection Model, Line Counter, JSON Parser, Polygon Visualization, CLIP Embedding Model, Camera Calibration, Detections Classes Replacement, Cache Get, Identify Changes, Triangle Visualization, Template Matching, Roboflow Dataset Upload, Anthropic Claude, Model Comparison Visualization, Florence-2 Model, Corner Visualization, Google Gemini, Google Gemini, EasyOCR, Delta Filter, SmolVLM2, Stability AI Image Generation, Identify Outliers, QR Code Generator, Dot Visualization, Relative Static Crop, Dimension Collapse, Continue If, Local File Sink, Instance Segmentation Model, Stability AI Inpainting, Single-Label Classification Model, Camera Focus, Detections Stabilizer, Webhook Sink, Instance Segmentation Model, Image Threshold, OpenAI, Perspective Correction, VLM as Classifier, Trace Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
OpenAI in version v1 has.
Bindings
- input
  - `images` (`image`): The image to infer on.
  - `prompt` (`string`): Text prompt to the OpenAI model.
  - `openai_api_key` (`Union[string, secret]`): Your OpenAI API key.
  - `openai_model` (`string`): Model to be used.
  - `image_detail` (`string`): Indicates the image's quality, with 'high' suggesting it is of high resolution and should be processed or displayed with high fidelity.
- output
  - `parent_id` (`parent_id`): Identifier of parent for step output.
  - `root_parent_id` (`parent_id`): Identifier of parent for step output.
  - `image` (`image_metadata`): Dictionary with image metadata required by supervision.
  - `structured_output` (`dictionary`): Dictionary.
  - `raw_output` (`string`): String value.
  - `*` (`*`): Equivalent of any element.
Example JSON definition of step OpenAI in version v1
{
"name": "<your_step_name_here>",
"type": "roboflow_core/open_ai@v1",
"images": "$inputs.image",
"prompt": "my prompt",
"openai_api_key": "xxx-xxx",
"openai_model": "gpt-4o",
"json_output_format": {
"count": "number of cats in the picture"
},
"image_detail": "auto",
"max_tokens": 450
}