Clip Comparison¶
v2¶
Class: ClipComparisonBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/clip_comparison@v2 to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Unique name of step in workflows. | ❌ |
| classes | List[str] | List of classes to calculate similarity against each input image. | ✅ |
| version | str | Variant of CLIP model. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v2.
- inputs:
Stitch Images,Image Threshold,Email Notification,Corner Visualization,Image Blur,Ellipse Visualization,OpenAI,Roboflow Dataset Upload,Object Detection Model,Depth Estimation,Stitch OCR Detections,Dimension Collapse,Absolute Static Crop,EasyOCR,CogVLM,Google Gemini,Stability AI Image Generation,Grid Visualization,Dynamic Crop,Image Slicer,Image Preprocessing,Relative Static Crop,SIFT,Morphological Transformation,Instance Segmentation Model,Line Counter Visualization,Trace Visualization,LMM For Classification,Halo Visualization,Dot Visualization,GLM-OCR,Model Monitoring Inference Aggregator,Roboflow Custom Metadata,Pixelate Visualization,Circle Visualization,Image Convert Grayscale,Icon Visualization,QR Code Generator,S3 Sink,Keypoint Detection Model,Twilio SMS Notification,Halo Visualization,Camera Focus,Anthropic Claude,OCR Model,Polygon Visualization,Text Display,Reference Path Visualization,Llama 3.2 Vision,CSV Formatter,Crop Visualization,Roboflow Dataset Upload,Mask Visualization,Heatmap Visualization,Webhook Sink,Label Visualization,Classification Label Visualization,Detections List Roll-Up,Google Vision OCR,Florence-2 Model,Florence-2 Model,VLM As Detector,Polygon Zone Visualization,Stability AI Inpainting,Google Gemini,Perspective Correction,Camera Calibration,Anthropic Claude,OpenAI,OpenAI,Qwen3.5-VL,Background Color Visualization,Anthropic Claude,Size Measurement,Email Notification,Background Subtraction,Contrast Equalization,SIFT Comparison,Multi-Label Classification Model,Keypoint Visualization,Stitch OCR Detections,LMM,Color Visualization,Motion Detection,Dynamic Zone,Single-Label Classification Model,OpenAI,Roboflow Vision Events,Local File Sink,VLM As Classifier,Clip Comparison,Twilio SMS/MMS Notification,Buffer,Triangle Visualization,Clip Comparison,Blur Visualization,Bounding Box Visualization,Camera Focus,Polygon Visualization,Google Gemini,Image Slicer,Image Contours,Model Comparison Visualization,Stability AI Outpainting,Slack Notification
- outputs:
Image Threshold,Email Notification,Corner Visualization,Roboflow Dataset Upload,Object Detection Model,Stitch OCR Detections,Stability AI Image Generation,Time in Zone,Grid Visualization,Dynamic Crop,Instance Segmentation Model,Image Slicer,Image Preprocessing,Line Counter Visualization,Trace Visualization,Halo Visualization,ByteTrack Tracker,Cache Get,Roboflow Custom Metadata,Circle Visualization,S3 Sink,Detections Classes Replacement,Keypoint Detection Model,Twilio SMS Notification,Halo Visualization,Anthropic Claude,OC-SORT Tracker,Detections Consensus,Polygon Visualization,Identify Changes,Crop Visualization,Roboflow Dataset Upload,Mask Visualization,CLIP Embedding Model,Heatmap Visualization,Webhook Sink,Cache Set,Detections List Roll-Up,Google Vision OCR,Florence-2 Model,Florence-2 Model,VLM As Classifier,OpenAI,Anthropic Claude,VLM As Detector,OpenAI,PTZ Tracking (ONVIF),Background Color Visualization,Template Matching,Anthropic Claude,SIFT Comparison,Multi-Label Classification Model,Keypoint Visualization,Time in Zone,Stitch OCR Detections,LMM,Identify Outliers,Perception Encoder Embedding Model,SAM 3,Motion Detection,Seg Preview,Dynamic Zone,Single-Label Classification Model,Object Detection Model,Roboflow Vision Events,VLM As Classifier,Detections Stitch,Triangle Visualization,Distance Measurement,Google Gemini,Path Deviation,Image Slicer,Model Comparison Visualization,Stability AI Outpainting,Stitch Images,Image Blur,Ellipse Visualization,OpenAI,Time in Zone,Depth Estimation,Multi-Label Classification Model,CogVLM,Google Gemini,Relative Static Crop,Morphological Transformation,LMM For Classification,Dot Visualization,GLM-OCR,Model Monitoring Inference Aggregator,Keypoint Detection Model,Pixel Color Count,Icon Visualization,QR Code Generator,Detections Stabilizer,SAM 3,Text Display,Reference Path Visualization,Instance Segmentation Model,Llama 3.2 Vision,SORT Tracker,Byte Tracker,Label Visualization,Classification Label Visualization,Byte Tracker,Segment Anything 2 Model,Polygon Zone Visualization,Stability AI Inpainting,Google Gemini,SAM 3,Perspective Correction,Size Measurement,Email Notification,Contrast Equalization,Line Counter,Path Deviation,Single-Label Classification Model,Byte Tracker,Line Counter,Color Visualization,OpenAI,Local File Sink,Clip Comparison,YOLO-World Model,Buffer,Clip Comparison,Twilio SMS/MMS Notification,Bounding Box Visualization,Polygon Visualization,Moondream2,VLM As Detector,Slack Notification
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
Clip Comparison in version v2 has.
Bindings
- input
  - images (image): The image to infer on.
  - classes (list_of_values): List of classes to calculate similarity against each input image.
  - version (string): Variant of CLIP model.
- output
  - similarities (list_of_values): List of values of any type.
  - max_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - most_similar_class (string): String value.
  - min_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - least_similar_class (string): String value.
  - classification_predictions (classification_prediction): Predictions from classifier.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
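To make the relationship between these outputs concrete, here is a minimal, illustrative sketch (not the block's actual implementation) of how per-class similarities map onto the max/min output fields. It uses toy 3-dimensional embeddings and plain cosine similarity; real CLIP embeddings are much higher-dimensional:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def clip_style_comparison(image_embedding, text_embeddings, classes):
    """Map per-class similarities to output fields shaped like the block's."""
    similarities = [cosine_similarity(image_embedding, t) for t in text_embeddings]
    best = max(range(len(classes)), key=similarities.__getitem__)
    worst = min(range(len(classes)), key=similarities.__getitem__)
    return {
        "similarities": similarities,
        "max_similarity": similarities[best],
        "most_similar_class": classes[best],
        "min_similarity": similarities[worst],
        "least_similar_class": classes[worst],
    }

# Toy embeddings standing in for CLIP image/text encoder outputs:
image_vec = [0.9, 0.1, 0.2]
text_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
result = clip_style_comparison(image_vec, text_vecs, ["car", "truck", "bus"])
# "car" has the highest cosine similarity to the image embedding here.
```

The function and vector values are hypothetical; the point is only that the block's similarities, max_similarity/most_similar_class, and min_similarity/least_similar_class outputs are derived from one similarity score per class.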
Example JSON definition of step Clip Comparison in version v2
{
"name": "<your_step_name_here>",
"type": "roboflow_core/clip_comparison@v2",
"images": "$inputs.image",
"classes": [
"a",
"b",
"c"
],
"version": "ViT-B-16"
}
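A step definition like the one above lives inside a full Workflows specification. The sketch below assembles one in Python; the surrounding fields (version, the InferenceImage input, the JsonField output, and the $steps selector) follow the general Workflows definition format, and names such as my_clip_step are placeholders:

```python
# Minimal Workflows specification embedding a Clip Comparison v2 step.
# Step and field names other than the block's own schema are placeholders.
clip_step = {
    "name": "my_clip_step",                      # unique step name (placeholder)
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",                   # bound to the workflow's image input
    "classes": ["car", "truck", "bus"],
    "version": "ViT-B-16",
}

specification = {
    "version": "1.0",
    "inputs": [{"type": "InferenceImage", "name": "image"}],
    "steps": [clip_step],
    "outputs": [
        {
            "type": "JsonField",
            "name": "most_similar_class",
            # Selector referencing the step output by the step's unique name:
            "selector": "$steps.my_clip_step.most_similar_class",
        }
    ],
}
```

Any of the block outputs listed in the bindings above (similarities, max_similarity, and so on) can be exposed the same way by pointing a selector at $steps.&lt;step_name&gt;.&lt;output_name&gt;.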
v1¶
Class: ClipComparisonBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/clip_comparison@v1 to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Unique name of step in workflows. | ❌ |
| texts | List[str] | List of texts to calculate similarity against each input image. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v1.
- inputs:
Stitch Images,Image Threshold,Stability AI Inpainting,Corner Visualization,Image Blur,Ellipse Visualization,Google Gemini,Perspective Correction,OpenAI,Depth Estimation,Camera Calibration,Anthropic Claude,Absolute Static Crop,OpenAI,Dimension Collapse,OpenAI,Google Gemini,Stability AI Image Generation,Grid Visualization,Background Color Visualization,Dynamic Crop,Anthropic Claude,Image Slicer,Image Preprocessing,Relative Static Crop,Contrast Equalization,Background Subtraction,Morphological Transformation,Line Counter Visualization,SIFT,SIFT Comparison,Trace Visualization,Keypoint Visualization,Halo Visualization,Dot Visualization,Pixelate Visualization,Circle Visualization,Image Convert Grayscale,Icon Visualization,QR Code Generator,Size Measurement,Florence-2 Model,Halo Visualization,Camera Focus,Anthropic Claude,Polygon Visualization,Color Visualization,Text Display,Motion Detection,Reference Path Visualization,Dynamic Zone,Llama 3.2 Vision,Florence-2 Model,Crop Visualization,Mask Visualization,Clip Comparison,Buffer,Heatmap Visualization,Triangle Visualization,Clip Comparison,Blur Visualization,Bounding Box Visualization,Label Visualization,Classification Label Visualization,Detections List Roll-Up,Camera Focus,Polygon Visualization,Google Gemini,Image Slicer,Image Contours,Model Comparison Visualization,Stability AI Outpainting,Polygon Zone Visualization
- outputs:
Email Notification,Corner Visualization,Ellipse Visualization,OpenAI,Time in Zone,Object Detection Model,Roboflow Dataset Upload,Google Gemini,Time in Zone,Grid Visualization,Instance Segmentation Model,Line Counter Visualization,Trace Visualization,LMM For Classification,Halo Visualization,Dot Visualization,Keypoint Detection Model,Circle Visualization,Detections Classes Replacement,Keypoint Detection Model,Halo Visualization,Anthropic Claude,SAM 3,Detections Consensus,Polygon Visualization,Reference Path Visualization,Instance Segmentation Model,Llama 3.2 Vision,Crop Visualization,Roboflow Dataset Upload,Mask Visualization,Webhook Sink,Label Visualization,Cache Set,Classification Label Visualization,Detections List Roll-Up,Florence-2 Model,Florence-2 Model,Polygon Zone Visualization,VLM As Classifier,Google Gemini,SAM 3,Perspective Correction,OpenAI,Anthropic Claude,VLM As Detector,OpenAI,Size Measurement,Anthropic Claude,Email Notification,Keypoint Visualization,Line Counter,Time in Zone,Path Deviation,Line Counter,SAM 3,Motion Detection,Seg Preview,Color Visualization,Object Detection Model,VLM As Classifier,Clip Comparison,YOLO-World Model,Buffer,Triangle Visualization,Clip Comparison,Twilio SMS/MMS Notification,Bounding Box Visualization,Polygon Visualization,Google Gemini,Path Deviation,VLM As Detector
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
Clip Comparison in version v1 has.
Bindings
- input
  - images (image): The image to infer on.
  - texts (list_of_values): List of texts to calculate similarity against each input image.
- output
  - similarity (list_of_values): List of values of any type.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
  - prediction_type (prediction_type): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1
{
"name": "<your_step_name_here>",
"type": "roboflow_core/clip_comparison@v1",
"images": "$inputs.image",
"texts": [
"a",
"b",
"c"
]
}