Clip Comparison¶
v2¶
Class: ClipComparisonBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/clip_comparison@v2` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `classes` | `List[str]` | List of classes to calculate similarity against each input image. | ✅ |
| `version` | `str` | Variant of CLIP model. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v2.
- inputs:
Anthropic Claude,Halo Visualization,Clip Comparison,Image Blur,Email Notification,Camera Focus,Text Display,CSV Formatter,Contrast Equalization,Object Detection Model,Blur Visualization,Dimension Collapse,Corner Visualization,Dynamic Crop,Classification Label Visualization,Relative Static Crop,Roboflow Dataset Upload,Trace Visualization,Multi-Label Classification Model,Camera Focus,Mask Visualization,SIFT,Stitch OCR Detections,Background Color Visualization,Polygon Visualization,Local File Sink,Florence-2 Model,Size Measurement,Stitch OCR Detections,Ellipse Visualization,Pixelate Visualization,SIFT Comparison,Roboflow Custom Metadata,LMM,Detections List Roll-Up,OpenAI,Circle Visualization,Dot Visualization,Twilio SMS Notification,Absolute Static Crop,Morphological Transformation,Crop Visualization,Single-Label Classification Model,Dynamic Zone,Google Gemini,Keypoint Visualization,Polygon Visualization,Google Vision OCR,Icon Visualization,Llama 3.2 Vision,Color Visualization,Image Contours,Stitch Images,OpenAI,VLM As Classifier,LMM For Classification,Email Notification,VLM As Detector,Anthropic Claude,OpenAI,Bounding Box Visualization,Triangle Visualization,Background Subtraction,Grid Visualization,Model Comparison Visualization,Reference Path Visualization,Line Counter Visualization,EasyOCR,Halo Visualization,OCR Model,Instance Segmentation Model,Slack Notification,Stability AI Inpainting,Image Slicer,Google Gemini,Florence-2 Model,Image Convert Grayscale,Stability AI Image Generation,Heatmap Visualization,Polygon Zone Visualization,Image Slicer,Label Visualization,Google Gemini,Depth Estimation,Image Preprocessing,OpenAI,Stability AI Outpainting,Buffer,CogVLM,Image Threshold,Anthropic Claude,Webhook Sink,Camera Calibration,Perspective Correction,QR Code Generator,Motion Detection,Keypoint Detection Model,Model Monitoring Inference Aggregator,Twilio SMS/MMS Notification,Roboflow Dataset Upload,Clip Comparison
- outputs:
Anthropic Claude,CLIP Embedding Model,Text Display,Segment Anything 2 Model,Dynamic Crop,Relative Static Crop,SAM 3,Mask Visualization,VLM As Classifier,Path Deviation,Background Color Visualization,PTZ Tracking (ONVIF),Polygon Visualization,Moondream2,Size Measurement,Stitch OCR Detections,Line Counter,SIFT Comparison,Byte Tracker,Detections List Roll-Up,OpenAI,Identify Outliers,Crop Visualization,Single-Label Classification Model,Keypoint Visualization,Multi-Label Classification Model,Icon Visualization,Llama 3.2 Vision,OpenAI,VLM As Classifier,Stitch Images,Anthropic Claude,Distance Measurement,OpenAI,Triangle Visualization,Detections Classes Replacement,Reference Path Visualization,Instance Segmentation Model,Line Counter Visualization,Slack Notification,Time in Zone,Line Counter,Google Gemini,Stability AI Image Generation,Pixel Color Count,Seg Preview,Time in Zone,OpenAI,Object Detection Model,Stability AI Outpainting,Buffer,Webhook Sink,Image Threshold,Instance Segmentation Model,Anthropic Claude,Perception Encoder Embedding Model,Perspective Correction,Cache Set,SAM 3,Clip Comparison,Clip Comparison,Halo Visualization,Image Blur,Email Notification,Contrast Equalization,Object Detection Model,VLM As Detector,Detections Stabilizer,Roboflow Dataset Upload,Corner Visualization,Classification Label Visualization,Keypoint Detection Model,Trace Visualization,Multi-Label Classification Model,Stitch OCR Detections,YOLO-World Model,Local File Sink,Florence-2 Model,Ellipse Visualization,Roboflow Custom Metadata,Single-Label Classification Model,Identify Changes,LMM,SAM 3,Dot Visualization,Circle Visualization,Twilio SMS Notification,Byte Tracker,Template Matching,Morphological Transformation,Google Gemini,Dynamic Zone,Polygon Visualization,Google Vision OCR,Color Visualization,LMM For Classification,Email Notification,VLM As Detector,Bounding Box Visualization,Model Comparison Visualization,Grid Visualization,Detections Consensus,Halo Visualization,Stability AI Inpainting,Image Slicer,Florence-2 Model,Heatmap Visualization,Polygon Zone Visualization,Image Slicer,Label Visualization,Google Gemini,Depth Estimation,Image Preprocessing,Cache Get,Detections Stitch,Time in Zone,CogVLM,Byte Tracker,Path Deviation,QR Code Generator,Motion Detection,Keypoint Detection Model,Twilio SMS/MMS Notification,Model Monitoring Inference Aggregator,Roboflow Dataset Upload
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check which binding kinds Clip Comparison in version v2 has.
Bindings
- input
    - `images` (`image`): The image to infer on.
    - `classes` (`list_of_values`): List of classes to calculate similarity against each input image.
    - `version` (`string`): Variant of CLIP model.
- output
    - `similarities` (`list_of_values`): List of values of any type.
    - `max_similarity` (`float_zero_to_one`): `float` value in range `[0.0, 1.0]`.
    - `most_similar_class` (`string`): String value.
    - `min_similarity` (`float_zero_to_one`): `float` value in range `[0.0, 1.0]`.
    - `least_similar_class` (`string`): String value.
    - `classification_predictions` (`classification_prediction`): Predictions from classifier.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
Example JSON definition of step Clip Comparison in version v2:

```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": [
        "a",
        "b",
        "c"
    ],
    "version": "ViT-B-16"
}
```
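The block's scalar outputs follow directly from its `similarities` list: the class with the highest score becomes `most_similar_class`, the lowest becomes `least_similar_class`. A minimal sketch of that relationship, using hypothetical similarity scores in place of real CLIP outputs:

```python
# Sketch of how the v2 block's scalar outputs relate to its "similarities"
# output. The scores below are hypothetical stand-ins for the per-class
# similarities the CLIP model would produce.
classes = ["a", "b", "c"]
similarities = [0.21, 0.64, 0.15]  # hypothetical, one score per class

# Pair each class with its score, then pick the extremes.
scored = list(zip(classes, similarities))
most_similar_class, max_similarity = max(scored, key=lambda p: p[1])
least_similar_class, min_similarity = min(scored, key=lambda p: p[1])

print(most_similar_class, max_similarity)    # -> b 0.64
print(least_similar_class, min_similarity)   # -> c 0.15
```

This mirrors the shape of the block's outputs; the real block computes the similarity scores with the selected CLIP model variant.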
v1¶
Class: ClipComparisonBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/clip_comparison@v1` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `texts` | `List[str]` | List of texts to calculate similarity against each input image. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v1.
- inputs:
Polygon Visualization,Anthropic Claude,Halo Visualization,Icon Visualization,Image Blur,Clip Comparison,Llama 3.2 Vision,Camera Focus,Color Visualization,Image Contours,Stitch Images,Text Display,Contrast Equalization,Blur Visualization,Dimension Collapse,Corner Visualization,Dynamic Crop,Classification Label Visualization,Relative Static Crop,Anthropic Claude,Trace Visualization,OpenAI,Bounding Box Visualization,Triangle Visualization,Background Subtraction,Camera Focus,Grid Visualization,Mask Visualization,Line Counter Visualization,Model Comparison Visualization,Halo Visualization,SIFT,Reference Path Visualization,Stability AI Inpainting,Image Slicer,Google Gemini,Florence-2 Model,Background Color Visualization,Image Convert Grayscale,Stability AI Image Generation,Heatmap Visualization,Polygon Visualization,Polygon Zone Visualization,Image Slicer,Florence-2 Model,Label Visualization,Size Measurement,Ellipse Visualization,Google Gemini,Depth Estimation,Image Preprocessing,Pixelate Visualization,SIFT Comparison,OpenAI,Dynamic Zone,Detections List Roll-Up,Stability AI Outpainting,Buffer,OpenAI,Circle Visualization,Dot Visualization,Image Threshold,Anthropic Claude,Camera Calibration,Perspective Correction,Absolute Static Crop,Morphological Transformation,QR Code Generator,Motion Detection,Crop Visualization,Clip Comparison,Google Gemini,Keypoint Visualization
- outputs:
Clip Comparison,Halo Visualization,Anthropic Claude,Email Notification,Object Detection Model,VLM As Detector,Roboflow Dataset Upload,Corner Visualization,Classification Label Visualization,Keypoint Detection Model,Trace Visualization,SAM 3,Mask Visualization,VLM As Classifier,Path Deviation,YOLO-World Model,Polygon Visualization,Florence-2 Model,Size Measurement,Line Counter,Ellipse Visualization,SAM 3,Detections List Roll-Up,OpenAI,Dot Visualization,Circle Visualization,Crop Visualization,Google Gemini,Keypoint Visualization,Polygon Visualization,Llama 3.2 Vision,Color Visualization,VLM As Classifier,LMM For Classification,Email Notification,VLM As Detector,Anthropic Claude,OpenAI,Triangle Visualization,Detections Classes Replacement,Bounding Box Visualization,Grid Visualization,Reference Path Visualization,Instance Segmentation Model,Line Counter Visualization,Detections Consensus,Halo Visualization,Time in Zone,Line Counter,Google Gemini,Florence-2 Model,Seg Preview,Polygon Zone Visualization,Label Visualization,Google Gemini,Time in Zone,OpenAI,Time in Zone,Object Detection Model,Buffer,Webhook Sink,Instance Segmentation Model,Anthropic Claude,Path Deviation,Perspective Correction,Cache Set,SAM 3,Motion Detection,Keypoint Detection Model,Twilio SMS/MMS Notification,Roboflow Dataset Upload,Clip Comparison
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check which binding kinds Clip Comparison in version v1 has.
Bindings
- input
    - `images` (`image`): The image to infer on.
    - `texts` (`list_of_values`): List of texts to calculate similarity against each input image.
- output
    - `similarity` (`list_of_values`): List of values of any type.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
    - `prediction_type` (`prediction_type`): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1:

```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v1",
    "images": "$inputs.image",
    "texts": [
        "a",
        "b",
        "c"
    ]
}
```
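Conceptually, the `similarity` output reflects how close each text prompt's CLIP embedding is to the image embedding, typically measured with cosine similarity. The sketch below illustrates that comparison with tiny made-up vectors; real CLIP embeddings are high-dimensional and come from the model itself:

```python
import math

# Illustrative sketch of the cosine-similarity comparison underlying CLIP
# zero-shot classification: each text prompt's embedding is scored against
# the image embedding. The vectors below are hypothetical examples, not
# real CLIP embeddings.

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

image_embedding = [0.6, 0.8]   # hypothetical image vector
text_embeddings = {            # hypothetical text vectors, one per prompt
    "a": [1.0, 0.0],
    "b": [0.6, 0.8],
    "c": [0.0, 1.0],
}

# One score per text prompt, mirroring the block's "similarity" output.
similarity = [cosine_similarity(image_embedding, v)
              for v in text_embeddings.values()]
print(similarity)  # -> [0.6, 1.0, 0.8]
```

Here prompt "b" scores highest because its (made-up) embedding points in the same direction as the image embedding.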