Anthropic Claude¶
v2¶
Class: AnthropicClaudeBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.anthropic_claude.v2.AnthropicClaudeBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Ask a question to an Anthropic Claude model with vision capabilities.
You can specify arbitrary text prompts or use predefined ones; the block supports the following prompt types:

- Open Prompt (`unconstrained`) - Use any prompt to generate a raw response
- Text Recognition (OCR) (`ocr`) - Model recognizes text in the image
- Visual Question Answering (`visual-question-answering`) - Model answers the question you submit in the prompt
- Captioning (short) (`caption`) - Model provides a short description of the image
- Captioning (`detailed-caption`) - Model provides a long description of the image
- Single-Label Classification (`classification`) - Model classifies the image content as one of the provided classes
- Multi-Label Classification (`multi-label-classification`) - Model classifies the image content as one or more of the provided classes
- Unprompted Object Detection (`object-detection`) - Model detects and returns the bounding boxes for prominent objects in the image
- Structured Output Generation (`structured-answering`) - Model returns a JSON response with the specified fields
You need to provide your Anthropic API key to use the Claude model.
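As a rough guide, the task types differ in which extra step fields they need. The mapping below is inferred from the task descriptions above, not taken from the block's source code, so treat it as a hypothetical helper:

```python
# Hypothetical mapping of task type -> extra step fields it needs,
# inferred from the task descriptions above (not from the block's source).
REQUIRED_EXTRAS = {
    "unconstrained": {"prompt"},
    "ocr": set(),
    "visual-question-answering": {"prompt"},
    "caption": set(),
    "detailed-caption": set(),
    "classification": {"classes"},
    "multi-label-classification": {"classes"},
    "object-detection": set(),
    "structured-answering": {"output_structure"},
}


def extras_for(task_type: str) -> set:
    """Return the extra step fields a task type is expected to require."""
    return REQUIRED_EXTRAS[task_type]
```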
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/anthropic_claude@v2` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `task_type` | `str` | Task type to be performed by the model. Value determines required parameters and output response. | ❌ |
| `prompt` | `str` | Text prompt to the Claude model. | ✅ |
| `output_structure` | `Dict[str, str]` | Dictionary with structure of expected JSON response. | ❌ |
| `classes` | `List[str]` | List of classes to be used. | ✅ |
| `api_key` | `str` | Your Anthropic API key. | ✅ |
| `model_version` | `str` | Model to be used. | ✅ |
| `extended_thinking` | `bool` | Enable extended thinking for deeper reasoning on complex tasks. Note: temperature cannot be used when extended thinking is enabled. | ❌ |
| `thinking_budget_tokens` | `int` | Maximum number of tokens for internal thinking when extended thinking is enabled. Higher values allow deeper reasoning but increase latency and cost. Must be less than `max_tokens`. Minimum: 1024. | ❌ |
| `max_tokens` | `int` | Maximum number of tokens the model can generate in its response. | ❌ |
| `temperature` | `float` | Temperature to sample from the model - value in range 0.0-1.0; the higher, the more random / "creative" the generations are. Cannot be used when `extended_thinking` is enabled. | ✅ |
| `max_image_size` | `int` | Maximum size of the image - if the input has a larger side, it will be downscaled, keeping aspect ratio. | ✅ |
| `max_concurrent_requests` | `int` | Number of concurrent requests that can be executed by the block when a batch of input images is provided. If not given, the block defaults to the value configured globally in the Workflows Execution Engine. Please restrict if you hit Anthropic API limits. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
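For example, every property marked ✅ can be bound to a workflow input instead of being hard-coded. A sketch of such a definition is below; the input names (`image`, `anthropic_api_key`, `prompt`) and the surrounding specification fields are illustrative, not mandated by the block:

```json
{
  "version": "1.0",
  "inputs": [
    {"type": "WorkflowImage", "name": "image"},
    {"type": "WorkflowParameter", "name": "anthropic_api_key"},
    {"type": "WorkflowParameter", "name": "prompt"}
  ],
  "steps": [
    {
      "type": "roboflow_core/anthropic_claude@v2",
      "name": "claude",
      "images": "$inputs.image",
      "task_type": "visual-question-answering",
      "prompt": "$inputs.prompt",
      "api_key": "$inputs.anthropic_api_key"
    }
  ],
  "outputs": [
    {"type": "JsonField", "name": "answer", "selector": "$steps.claude.output"}
  ]
}
```

With this layout, the API key and prompt are supplied per run rather than stored in the workflow definition.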
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Anthropic Claude in version v2.
- inputs:
  Line Counter,Llama 3.2 Vision,Cosine Similarity,Blur Visualization,Dimension Collapse,Perspective Correction,Polygon Zone Visualization,Bounding Box Visualization,QR Code Generator,Pixelate Visualization,Distance Measurement,Trace Visualization,Roboflow Custom Metadata,Image Threshold,Polygon Visualization,Dynamic Crop,Icon Visualization,Image Slicer,Stability AI Outpainting,Model Comparison Visualization,Dynamic Zone,Clip Comparison,LMM,OpenAI,Classification Label Visualization,Stitch Images,Florence-2 Model,Mask Visualization,Single-Label Classification Model,Size Measurement,Relative Static Crop,Absolute Static Crop,SIFT Comparison,Google Gemini,Circle Visualization,Florence-2 Model,LMM For Classification,Ellipse Visualization,Image Convert Grayscale,Object Detection Model,OCR Model,Image Preprocessing,Color Visualization,Image Blur,Stability AI Image Generation,Anthropic Claude,Google Vision OCR,Keypoint Visualization,Camera Calibration,Local File Sink,EasyOCR,Image Slicer,Email Notification,VLM as Detector,Line Counter,Roboflow Dataset Upload,Gaze Detection,Background Color Visualization,Triangle Visualization,Slack Notification,Keypoint Detection Model,Halo Visualization,Corner Visualization,Google Gemini,Model Monitoring Inference Aggregator,Roboflow Dataset Upload,Dot Visualization,Image Contours,Multi-Label Classification Model,Twilio SMS Notification,Instance Segmentation Model,VLM as Classifier,CSV Formatter,Reference Path Visualization,Morphological Transformation,Motion Detection,OpenAI,Webhook Sink,Contrast Equalization,Camera Focus,Stitch OCR Detections,Stability AI Inpainting,CogVLM,Clip Comparison,Line Counter Visualization,Identify Changes,Template Matching,Email Notification,Crop Visualization,Grid Visualization,OpenAI,Buffer,SIFT,Depth Estimation,Background Subtraction,Label Visualization,Anthropic Claude,Pixel Color Count,SIFT Comparison,OpenAI
- outputs:
  Llama 3.2 Vision,SAM 3,Polygon Zone Visualization,Distance Measurement,Trace Visualization,Roboflow Custom Metadata,Image Threshold,Icon Visualization,Stability AI Outpainting,Model Comparison Visualization,Clip Comparison,Cache Get,Size Measurement,Florence-2 Model,SAM 3,SIFT Comparison,Moondream2,Florence-2 Model,LMM For Classification,Anthropic Claude,Image Blur,Stability AI Image Generation,Local File Sink,VLM as Detector,Keypoint Detection Model,Background Color Visualization,Keypoint Detection Model,Google Gemini,Model Monitoring Inference Aggregator,Roboflow Dataset Upload,Instance Segmentation Model,VLM as Classifier,Morphological Transformation,Motion Detection,OpenAI,YOLO-World Model,JSON Parser,Clip Comparison,CogVLM,Path Deviation,CLIP Embedding Model,Crop Visualization,Grid Visualization,Buffer,SAM 3,Anthropic Claude,Time in Zone,OpenAI,Line Counter,Path Deviation,Detections Consensus,Perception Encoder Embedding Model,Perspective Correction,Bounding Box Visualization,QR Code Generator,Segment Anything 2 Model,Polygon Visualization,Dynamic Crop,LMM,OpenAI,Classification Label Visualization,Mask Visualization,Time in Zone,Google Gemini,Circle Visualization,Time in Zone,Ellipse Visualization,Object Detection Model,Image Preprocessing,Color Visualization,Google Vision OCR,Keypoint Visualization,Line Counter,Email Notification,VLM as Detector,Roboflow Dataset Upload,Triangle Visualization,Slack Notification,Halo Visualization,Object Detection Model,Corner Visualization,Dot Visualization,Twilio SMS Notification,Seg Preview,Reference Path Visualization,Webhook Sink,PTZ Tracking (ONVIF),Detections Classes Replacement,Instance Segmentation Model,Detections Stitch,Contrast Equalization,Stitch OCR Detections,Stability AI Inpainting,Line Counter Visualization,Cache Set,Email Notification,VLM as Classifier,OpenAI,Label Visualization,Pixel Color Count
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Anthropic Claude in version v2 has.
Bindings
- input
    - `images` (`image`): The image to infer on.
    - `prompt` (`string`): Text prompt to the Claude model.
    - `classes` (`list_of_values`): List of classes to be used.
    - `api_key` (`Union[secret, string]`): Your Anthropic API key.
    - `model_version` (`string`): Model to be used.
    - `temperature` (`float`): Temperature to sample from the model - value in range 0.0-1.0; the higher, the more random / "creative" the generations are. Cannot be used when `extended_thinking` is enabled.
    - `max_image_size` (`integer`): Maximum size of the image - if the input has a larger side, it will be downscaled, keeping aspect ratio.
- output
    - `output` (`Union[string, language_model_output]`): String value if `string`, or LLM / VLM output if `language_model_output`.
    - `classes` (`list_of_values`): List of values of any type.
Example JSON definition of step Anthropic Claude in version v2
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/anthropic_claude@v2",
    "images": "$inputs.image",
    "task_type": "<block_does_not_provide_example>",
    "prompt": "my prompt",
    "output_structure": {
        "my_key": "description"
    },
    "classes": [
        "class-a",
        "class-b"
    ],
    "api_key": "xxx-xxx",
    "model_version": "claude-sonnet-4-5",
    "extended_thinking": "<block_does_not_provide_example>",
    "thinking_budget_tokens": "<block_does_not_provide_example>",
    "max_tokens": "<block_does_not_provide_example>",
    "temperature": "<block_does_not_provide_example>",
    "max_image_size": "<block_does_not_provide_example>",
    "max_concurrent_requests": "<block_does_not_provide_example>"
}
```
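The v2 constraints documented above (temperature cannot be combined with extended thinking, `thinking_budget_tokens` must be at least 1024 and less than `max_tokens`, temperature in range 0.0-1.0) can be checked before submitting a workflow. A hypothetical pre-flight helper, not part of the block itself:

```python
def validate_claude_v2_step(step: dict) -> None:
    """Raise ValueError if a step dict violates documented v2 constraints."""
    if step.get("extended_thinking") and "temperature" in step:
        raise ValueError(
            "temperature cannot be used when extended_thinking is enabled"
        )
    budget = step.get("thinking_budget_tokens")
    if budget is not None:
        if budget < 1024:
            raise ValueError("thinking_budget_tokens must be at least 1024")
        max_tokens = step.get("max_tokens")
        if max_tokens is not None and budget >= max_tokens:
            raise ValueError(
                "thinking_budget_tokens must be less than max_tokens"
            )
    temperature = step.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be in range 0.0-1.0")
```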
v1¶
Class: AnthropicClaudeBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.anthropic_claude.v1.AnthropicClaudeBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Ask a question to an Anthropic Claude model with vision capabilities.
You can specify arbitrary text prompts or use predefined ones; the block supports the following prompt types:

- Open Prompt (`unconstrained`) - Use any prompt to generate a raw response
- Text Recognition (OCR) (`ocr`) - Model recognizes text in the image
- Visual Question Answering (`visual-question-answering`) - Model answers the question you submit in the prompt
- Captioning (short) (`caption`) - Model provides a short description of the image
- Captioning (`detailed-caption`) - Model provides a long description of the image
- Single-Label Classification (`classification`) - Model classifies the image content as one of the provided classes
- Multi-Label Classification (`multi-label-classification`) - Model classifies the image content as one or more of the provided classes
- Unprompted Object Detection (`object-detection`) - Model detects and returns the bounding boxes for prominent objects in the image
- Structured Output Generation (`structured-answering`) - Model returns a JSON response with the specified fields
You need to provide your Anthropic API key to use the Claude model.
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/anthropic_claude@v1` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `task_type` | `str` | Task type to be performed by the model. Value determines required parameters and output response. | ❌ |
| `prompt` | `str` | Text prompt to the Claude model. | ✅ |
| `output_structure` | `Dict[str, str]` | Dictionary with structure of expected JSON response. | ❌ |
| `classes` | `List[str]` | List of classes to be used. | ✅ |
| `api_key` | `str` | Your Anthropic API key. | ✅ |
| `model_version` | `str` | Model to be used. | ✅ |
| `max_tokens` | `int` | Maximum number of tokens the model can generate in its response. | ❌ |
| `temperature` | `float` | Temperature to sample from the model - value in range 0.0-2.0; the higher, the more random / "creative" the generations are. | ✅ |
| `max_image_size` | `int` | Maximum size of the image - if the input has a larger side, it will be downscaled, keeping aspect ratio. | ✅ |
| `max_concurrent_requests` | `int` | Number of concurrent requests that can be executed by the block when a batch of input images is provided. If not given, the block defaults to the value configured globally in the Workflows Execution Engine. Please restrict if you hit Anthropic API limits. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Anthropic Claude in version v1.
- inputs:
  Line Counter,Llama 3.2 Vision,Cosine Similarity,Blur Visualization,Dimension Collapse,Perspective Correction,Polygon Zone Visualization,Bounding Box Visualization,QR Code Generator,Pixelate Visualization,Distance Measurement,Trace Visualization,Roboflow Custom Metadata,Image Threshold,Polygon Visualization,Dynamic Crop,Icon Visualization,Image Slicer,Stability AI Outpainting,Model Comparison Visualization,Dynamic Zone,Clip Comparison,LMM,OpenAI,Classification Label Visualization,Stitch Images,Florence-2 Model,Mask Visualization,Single-Label Classification Model,Size Measurement,Relative Static Crop,Absolute Static Crop,SIFT Comparison,Google Gemini,Circle Visualization,Florence-2 Model,LMM For Classification,Ellipse Visualization,Image Convert Grayscale,Object Detection Model,OCR Model,Image Preprocessing,Color Visualization,Image Blur,Stability AI Image Generation,Anthropic Claude,Google Vision OCR,Keypoint Visualization,Camera Calibration,Local File Sink,EasyOCR,Image Slicer,Email Notification,VLM as Detector,Line Counter,Roboflow Dataset Upload,Gaze Detection,Background Color Visualization,Triangle Visualization,Slack Notification,Keypoint Detection Model,Halo Visualization,Corner Visualization,Google Gemini,Model Monitoring Inference Aggregator,Roboflow Dataset Upload,Dot Visualization,Image Contours,Multi-Label Classification Model,Twilio SMS Notification,Instance Segmentation Model,VLM as Classifier,CSV Formatter,Reference Path Visualization,Morphological Transformation,Motion Detection,OpenAI,Webhook Sink,Contrast Equalization,Camera Focus,Stitch OCR Detections,Stability AI Inpainting,CogVLM,Clip Comparison,Line Counter Visualization,Identify Changes,Template Matching,Email Notification,Crop Visualization,Grid Visualization,OpenAI,Buffer,SIFT,Depth Estimation,Background Subtraction,Label Visualization,Anthropic Claude,Pixel Color Count,SIFT Comparison,OpenAI
- outputs:
  Llama 3.2 Vision,SAM 3,Polygon Zone Visualization,Distance Measurement,Trace Visualization,Roboflow Custom Metadata,Image Threshold,Icon Visualization,Stability AI Outpainting,Model Comparison Visualization,Clip Comparison,Cache Get,Size Measurement,Florence-2 Model,SAM 3,SIFT Comparison,Moondream2,Florence-2 Model,LMM For Classification,Anthropic Claude,Image Blur,Stability AI Image Generation,Local File Sink,VLM as Detector,Keypoint Detection Model,Background Color Visualization,Keypoint Detection Model,Google Gemini,Model Monitoring Inference Aggregator,Roboflow Dataset Upload,Instance Segmentation Model,VLM as Classifier,Morphological Transformation,Motion Detection,OpenAI,YOLO-World Model,JSON Parser,Clip Comparison,CogVLM,Path Deviation,CLIP Embedding Model,Crop Visualization,Grid Visualization,Buffer,SAM 3,Anthropic Claude,Time in Zone,OpenAI,Line Counter,Path Deviation,Detections Consensus,Perception Encoder Embedding Model,Perspective Correction,Bounding Box Visualization,QR Code Generator,Segment Anything 2 Model,Polygon Visualization,Dynamic Crop,LMM,OpenAI,Classification Label Visualization,Mask Visualization,Time in Zone,Google Gemini,Circle Visualization,Time in Zone,Ellipse Visualization,Object Detection Model,Image Preprocessing,Color Visualization,Google Vision OCR,Keypoint Visualization,Line Counter,Email Notification,VLM as Detector,Roboflow Dataset Upload,Triangle Visualization,Slack Notification,Halo Visualization,Object Detection Model,Corner Visualization,Dot Visualization,Twilio SMS Notification,Seg Preview,Reference Path Visualization,Webhook Sink,PTZ Tracking (ONVIF),Detections Classes Replacement,Instance Segmentation Model,Detections Stitch,Contrast Equalization,Stitch OCR Detections,Stability AI Inpainting,Line Counter Visualization,Cache Set,Email Notification,VLM as Classifier,OpenAI,Label Visualization,Pixel Color Count
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Anthropic Claude in version v1 has.
Bindings
- input
    - `images` (`image`): The image to infer on.
    - `prompt` (`string`): Text prompt to the Claude model.
    - `classes` (`list_of_values`): List of classes to be used.
    - `api_key` (`Union[secret, string]`): Your Anthropic API key.
    - `model_version` (`string`): Model to be used.
    - `temperature` (`float`): Temperature to sample from the model - value in range 0.0-2.0; the higher, the more random / "creative" the generations are.
    - `max_image_size` (`integer`): Maximum size of the image - if the input has a larger side, it will be downscaled, keeping aspect ratio.
- output
    - `output` (`Union[string, language_model_output]`): String value if `string`, or LLM / VLM output if `language_model_output`.
    - `classes` (`list_of_values`): List of values of any type.
Example JSON definition of step Anthropic Claude in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/anthropic_claude@v1",
    "images": "$inputs.image",
    "task_type": "<block_does_not_provide_example>",
    "prompt": "my prompt",
    "output_structure": {
        "my_key": "description"
    },
    "classes": [
        "class-a",
        "class-b"
    ],
    "api_key": "xxx-xxx",
    "model_version": "claude-sonnet-4",
    "max_tokens": "<block_does_not_provide_example>",
    "temperature": "<block_does_not_provide_example>",
    "max_image_size": "<block_does_not_provide_example>",
    "max_concurrent_requests": "<block_does_not_provide_example>"
}
```