Clip Comparison¶
v2¶
Class: ClipComparisonBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or to check whether an image contains NSFW material.
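Conceptually, the block scores each text prompt against the image and then picks the best and worst matches. The sketch below is illustrative only: it uses made-up embedding vectors rather than a real CLIP model, and simply shows how per-class similarity scores map onto outputs such as `most_similar_class` and `least_similar_class`.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings -- a real pipeline would obtain these from CLIP.
image_embedding = [0.9, 0.1, 0.2]
text_embeddings = {
    "car": [0.8, 0.2, 0.1],
    "truck": [0.1, 0.9, 0.3],
    "motorcycle": [0.2, 0.1, 0.9],
}

# Score every class against the image, then reduce to best/worst matches.
similarities = {
    label: cosine_similarity(image_embedding, vec)
    for label, vec in text_embeddings.items()
}
most_similar_class = max(similarities, key=similarities.get)
least_similar_class = min(similarities, key=similarities.get)

print(most_similar_class)  # "car" for these made-up vectors
```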
Type identifier¶
Use the following identifier in the step `"type"` field: `roboflow_core/clip_comparison@v2` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `classes` | `List[str]` | List of classes to calculate similarity against each input image. | ✅ |
| `version` | `str` | Variant of CLIP model. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v2.
- inputs:
Roboflow Dataset Upload,Line Counter Visualization,Stability AI Outpainting,Google Gemma API,Object Detection Model,Email Notification,Image Slicer,OCR Model,Google Vision OCR,Image Preprocessing,Google Gemini,Instance Segmentation Model,EasyOCR,Color Visualization,OpenAI,Ellipse Visualization,Polygon Visualization,Anthropic Claude,Relative Static Crop,Webhook Sink,Model Comparison Visualization,Trace Visualization,Stitch OCR Detections,Camera Focus,Qwen 3.5 API,Roboflow Custom Metadata,OpenAI,Buffer,Single-Label Classification Model,Detections List Roll-Up,Size Measurement,Image Threshold,VLM As Classifier,Stitch Images,Heatmap Visualization,Qwen 3.6 API,SIFT Comparison,Morphological Transformation,Florence-2 Model,Halo Visualization,CogVLM,Crop Visualization,Camera Calibration,Florence-2 Model,GLM-OCR,Dot Visualization,S3 Sink,Twilio SMS Notification,Icon Visualization,Model Monitoring Inference Aggregator,Google Gemini,Local File Sink,Roboflow Dataset Upload,Dynamic Zone,Clip Comparison,Image Contours,Pixelate Visualization,Twilio SMS/MMS Notification,Polygon Zone Visualization,Reference Path Visualization,Dimension Collapse,Motion Detection,Blur Visualization,Anthropic Claude,Background Subtraction,Text Display,Clip Comparison,CSV Formatter,VLM As Detector,LMM,Stability AI Image Generation,Perspective Correction,Anthropic Claude,Bounding Box Visualization,Depth Estimation,Classification Label Visualization,Image Slicer,Absolute Static Crop,Image Blur,Stability AI Inpainting,Multi-Label Classification Model,Polygon Visualization,Image Convert Grayscale,SIFT,Roboflow Vision Events,OpenAI,Google Gemini,Label Visualization,Corner Visualization,Grid Visualization,Dynamic Crop,Contrast Equalization,Keypoint Visualization,Triangle Visualization,Qwen3.5-VL,QR Code Generator,Halo Visualization,Circle Visualization,Camera Focus,Mask Visualization,LMM For Classification,Morphological Transformation,OpenAI,Contrast Enhancement,MoonshotAI Kimi,Keypoint Detection Model,Llama 
3.2 Vision,Background Color Visualization,Email Notification,Slack Notification,Stitch OCR Detections
- outputs:
Roboflow Dataset Upload,Line Counter Visualization,Image Slicer,Instance Segmentation Model,Distance Measurement,Color Visualization,Multi-Label Classification Model,Ellipse Visualization,Polygon Visualization,ByteTrack Tracker,Single-Label Classification Model,Relative Static Crop,Byte Tracker,Detections Consensus,Detections Classes Replacement,Cache Set,Webhook Sink,Trace Visualization,Object Detection Model,Qwen 3.5 API,Stitch OCR Detections,OpenAI,Buffer,SAM 3,Size Measurement,Image Threshold,Heatmap Visualization,SORT Tracker,Florence-2 Model,Halo Visualization,Path Deviation,GLM-OCR,Dot Visualization,S3 Sink,Path Deviation,Semantic Segmentation Model,Seg Preview,Twilio SMS Notification,Model Monitoring Inference Aggregator,Google Gemini,Roboflow Dataset Upload,Clip Comparison,Dynamic Zone,VLM As Classifier,Line Counter,Twilio SMS/MMS Notification,Polygon Zone Visualization,Motion Detection,Text Display,Stability AI Image Generation,Perspective Correction,Anthropic Claude,Line Counter,Bounding Box Visualization,Depth Estimation,Stability AI Inpainting,Polygon Visualization,Roboflow Vision Events,VLM As Detector,Google Gemini,Label Visualization,Grid Visualization,Per-Class Confidence Filter,Contrast Equalization,Triangle Visualization,Halo Visualization,Circle Visualization,Segment Anything 2 Model,Mask Visualization,OpenAI,MoonshotAI Kimi,Llama 3.2 Vision,Email Notification,Slack Notification,CLIP Embedding Model,Detections Stitch,Detections Stabilizer,Object Detection Model,Email Notification,Google Gemma API,Stability AI Outpainting,Google Vision OCR,Identify Outliers,Google Gemini,Image Preprocessing,Object Detection Model,OpenAI,Byte Tracker,Anthropic Claude,Time in Zone,Model Comparison Visualization,Roboflow Custom Metadata,YOLO-World Model,Instance Segmentation Model,Perception Encoder Embedding Model,Single-Label Classification Model,Detections List Roll-Up,VLM As Classifier,Template Matching,Stitch Images,Qwen 3.6 API,SIFT Comparison,Morphological 
Transformation,Instance Segmentation Model,CogVLM,Crop Visualization,Florence-2 Model,Multi-Label Classification Model,Time in Zone,OC-SORT Tracker,SAM 3,Local File Sink,Icon Visualization,Keypoint Detection Model,Time in Zone,Reference Path Visualization,Anthropic Claude,Clip Comparison,VLM As Detector,LMM,Pixel Color Count,Identify Changes,Classification Label Visualization,Multi-Label Classification Model,Byte Tracker,Image Slicer,Image Blur,SAM 3,Single-Label Classification Model,OpenAI,Corner Visualization,Keypoint Detection Model,Dynamic Crop,Keypoint Visualization,Moondream2,QR Code Generator,LMM For Classification,Morphological Transformation,Keypoint Detection Model,Background Color Visualization,PTZ Tracking (ONVIF),Stitch OCR Detections,Cache Get
Input and Output Bindings¶
The available connections depend on its binding kinds. Check what binding kinds
Clip Comparison in version v2 has.
Bindings
- input:
    - `images` (`image`): The image to infer on.
    - `classes` (`list_of_values`): List of classes to calculate similarity against each input image.
    - `version` (`string`): Variant of CLIP model.
- output:
    - `similarities` (`list_of_values`): List of values of any type.
    - `max_similarity` (`float_zero_to_one`): `float` value in range `[0.0, 1.0]`.
    - `most_similar_class` (`string`): String value.
    - `min_similarity` (`float_zero_to_one`): `float` value in range `[0.0, 1.0]`.
    - `least_similar_class` (`string`): String value.
    - `classification_predictions` (`classification_prediction`): Predictions from classifier.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
Example JSON definition of step Clip Comparison in version v2
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": [
        "a",
        "b",
        "c"
    ],
    "version": "ViT-B-16"
}
```
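In practice this step is embedded in a full workflow specification. The snippet below sketches one plausible shape of such a definition, assuming the common Workflows layout with `inputs`, `steps`, and `outputs` sections and `$steps.<name>.<field>` selectors; everything outside the Clip Comparison step itself is an assumption for illustration, not copied from this page.

```python
import json

# A minimal workflow definition embedding the Clip Comparison v2 step.
# The surrounding "inputs"/"outputs" structure is a sketch of the usual
# Workflows layout and may need adjusting for your deployment.
workflow_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "name": "clip",
            "type": "roboflow_core/clip_comparison@v2",
            "images": "$inputs.image",
            "classes": ["car", "truck", "motorcycle"],
            "version": "ViT-B-16",
        }
    ],
    "outputs": [
        {
            # Expose one of the step's output bindings as a workflow output.
            "type": "JsonField",
            "name": "best_class",
            "selector": "$steps.clip.most_similar_class",
        }
    ],
}

print(json.dumps(workflow_definition, indent=2))
```

Note how the output selector references the step by its `name` (`clip`) and one of the output bindings listed above (`most_similar_class`).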
v1¶
Class: ClipComparisonBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or to check whether an image contains NSFW material.
Type identifier¶
Use the following identifier in the step `"type"` field: `roboflow_core/clip_comparison@v1` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `texts` | `List[str]` | List of texts to calculate similarity against each input image. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v1.
- inputs:
Line Counter Visualization,Stability AI Outpainting,Google Gemma API,Image Slicer,Image Preprocessing,Google Gemini,Color Visualization,OpenAI,Ellipse Visualization,Polygon Visualization,Anthropic Claude,Relative Static Crop,Model Comparison Visualization,Trace Visualization,Camera Focus,Qwen 3.5 API,Buffer,Detections List Roll-Up,Size Measurement,Image Threshold,Stitch Images,Heatmap Visualization,Qwen 3.6 API,SIFT Comparison,Morphological Transformation,Florence-2 Model,Halo Visualization,Crop Visualization,Camera Calibration,Florence-2 Model,Dot Visualization,Icon Visualization,Google Gemini,Dynamic Zone,Clip Comparison,Image Contours,Pixelate Visualization,Polygon Zone Visualization,Reference Path Visualization,Dimension Collapse,Motion Detection,Blur Visualization,Anthropic Claude,Background Subtraction,Text Display,Clip Comparison,Stability AI Image Generation,Perspective Correction,Anthropic Claude,Bounding Box Visualization,Depth Estimation,Classification Label Visualization,Image Slicer,Absolute Static Crop,Image Blur,Stability AI Inpainting,Polygon Visualization,Image Convert Grayscale,SIFT,OpenAI,Google Gemini,Label Visualization,Corner Visualization,Grid Visualization,Dynamic Crop,Contrast Equalization,Keypoint Visualization,Triangle Visualization,QR Code Generator,Halo Visualization,Circle Visualization,Camera Focus,Mask Visualization,Morphological Transformation,OpenAI,Contrast Enhancement,MoonshotAI Kimi,Llama 3.2 Vision,Background Color Visualization
- outputs:
Roboflow Dataset Upload,Line Counter Visualization,Object Detection Model,Email Notification,Google Gemma API,Instance Segmentation Model,Google Gemini,Color Visualization,Object Detection Model,OpenAI,Ellipse Visualization,Polygon Visualization,Anthropic Claude,Detections Consensus,Detections Classes Replacement,Time in Zone,Cache Set,Webhook Sink,Trace Visualization,Object Detection Model,Qwen 3.5 API,YOLO-World Model,Buffer,SAM 3,Instance Segmentation Model,Detections List Roll-Up,Size Measurement,VLM As Classifier,Qwen 3.6 API,Florence-2 Model,Halo Visualization,Instance Segmentation Model,Crop Visualization,Florence-2 Model,Path Deviation,Time in Zone,Dot Visualization,Path Deviation,SAM 3,Seg Preview,Google Gemini,Roboflow Dataset Upload,Clip Comparison,VLM As Classifier,Keypoint Detection Model,Line Counter,Twilio SMS/MMS Notification,Time in Zone,Polygon Zone Visualization,Reference Path Visualization,Motion Detection,Anthropic Claude,Clip Comparison,VLM As Detector,Perspective Correction,Anthropic Claude,Line Counter,Bounding Box Visualization,Classification Label Visualization,Polygon Visualization,SAM 3,VLM As Detector,Google Gemini,Label Visualization,OpenAI,Corner Visualization,Grid Visualization,Keypoint Detection Model,Keypoint Visualization,Triangle Visualization,Halo Visualization,Circle Visualization,Mask Visualization,LMM For Classification,OpenAI,Keypoint Detection Model,MoonshotAI Kimi,Llama 3.2 Vision,Email Notification
Input and Output Bindings¶
The available connections depend on its binding kinds. Check what binding kinds
Clip Comparison in version v1 has.
Bindings
- input:
    - `images` (`image`): The image to infer on.
    - `texts` (`list_of_values`): List of texts to calculate similarity against each input image.
- output:
    - `similarity` (`list_of_values`): List of values of any type.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
    - `prediction_type` (`prediction_type`): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v1",
    "images": "$inputs.image",
    "texts": [
        "a",
        "b",
        "c"
    ]
}
```
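Unlike v2, v1 exposes a single flat `similarity` list aligned with the order of `texts`, so picking the best label happens client-side. The sketch below assumes a made-up result payload with that shape, purely to show the post-processing step.

```python
# Post-processing a v1-style result: `similarity` is assumed to be a list
# of scores in the same order as the `texts` sent to the step.
texts = ["a", "b", "c"]
result = {"similarity": [0.12, 0.81, 0.33]}  # hypothetical scores

scores = result["similarity"]
best_index = max(range(len(scores)), key=scores.__getitem__)
best_label = texts[best_index]

print(best_label)  # "b" for these made-up scores
```

With v2 this bookkeeping is unnecessary, since `most_similar_class` and `max_similarity` are returned directly.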