Clip Comparison¶
v2¶
Class: ClipComparisonBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
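Under the hood, CLIP-style zero-shot classification reduces to comparing an image embedding against one text embedding per prompt and ranking by cosine similarity. A minimal sketch of that ranking step, using made-up toy vectors rather than a real CLIP encoder:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def rank_labels(image_emb, text_embs, labels):
    # Score every text prompt against the image embedding and pick the
    # best match - the same idea a CLIP comparison applies to real embeddings.
    sims = [cosine(image_emb, t) for t in text_embs]
    best = max(range(len(labels)), key=sims.__getitem__)
    return labels[best], sims

# Toy embeddings for illustration; a real pipeline would obtain these
# from CLIP's image and text encoders.
labels = ["car", "truck", "motorcycle"]
image_emb = [0.9, 0.1, 0.2]
text_embs = [[1.0, 0.0, 0.1], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
top, sims = rank_labels(image_emb, text_embs, labels)  # top == "car"
```

The helper names here are our own; they only illustrate the similarity ranking the block performs.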
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/clip_comparison@v2` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `classes` | `List[str]` | List of classes to calculate similarity against each input image. | ✅ |
| `version` | `str` | Variant of CLIP model. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
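As a quick illustration of what the Refs column means, a property marked ✅ (such as `classes`) can be bound to a workflow input selector instead of a hard-coded literal. A sketch, with hypothetical step and input names:

```python
# A step definition with a literal value for "classes".
step_static = {
    "name": "clip",  # hypothetical step name
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": ["car", "truck"],  # literal list baked into the workflow
}

# The same step with "classes" bound to a workflow input,
# so callers supply the class list at runtime.
step_dynamic = {**step_static, "classes": "$inputs.classes"}
```

The `$inputs.classes` selector name is an assumption for illustration; any declared workflow input of a compatible kind could be referenced.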
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v2.
- inputs:
Anthropic Claude,Mask Visualization,Classification Label Visualization,Instance Segmentation Model,Webhook Sink,Multi-Label Classification Model,Dynamic Zone,Email Notification,QR Code Generator,Dynamic Crop,VLM As Detector,Google Gemini,LMM,Image Blur,Corner Visualization,Image Convert Grayscale,Stability AI Outpainting,Halo Visualization,Stability AI Inpainting,Object Detection Model,Image Contours,Trace Visualization,Google Vision OCR,Morphological Transformation,Triangle Visualization,Clip Comparison,Relative Static Crop,CSV Formatter,Text Display,Stitch Images,Google Gemini,Camera Calibration,Grid Visualization,Local File Sink,Slack Notification,VLM As Classifier,Roboflow Dataset Upload,Camera Focus,Color Visualization,Dot Visualization,Image Slicer,Polygon Visualization,Anthropic Claude,Llama 3.2 Vision,Line Counter Visualization,LMM For Classification,Buffer,Keypoint Detection Model,Contrast Equalization,SIFT Comparison,Dimension Collapse,Camera Focus,Background Subtraction,Image Slicer,Circle Visualization,Halo Visualization,Florence-2 Model,Blur Visualization,Label Visualization,Twilio SMS/MMS Notification,Clip Comparison,Email Notification,Ellipse Visualization,OpenAI,SIFT,Image Preprocessing,Model Monitoring Inference Aggregator,Single-Label Classification Model,Detections List Roll-Up,OpenAI,Image Threshold,Background Color Visualization,Model Comparison Visualization,Depth Estimation,Size Measurement,Motion Detection,OpenAI,CogVLM,Absolute Static Crop,Roboflow Custom Metadata,EasyOCR,Stitch OCR Detections,Perspective Correction,Anthropic Claude,Pixelate Visualization,Stability AI Image Generation,Reference Path Visualization,Keypoint Visualization,Polygon Visualization,Twilio SMS Notification,Bounding Box Visualization,Polygon Zone Visualization,OCR Model,Icon Visualization,Crop Visualization,Stitch OCR Detections,Google Gemini,OpenAI,Florence-2 Model,Roboflow Dataset Upload - outputs:
Mask Visualization,Classification Label Visualization,Instance Segmentation Model,Detections Consensus,Webhook Sink,Multi-Label Classification Model,Email Notification,QR Code Generator,VLM As Detector,Multi-Label Classification Model,LMM,SAM 3,Corner Visualization,Stability AI Outpainting,Segment Anything 2 Model,Halo Visualization,Object Detection Model,Single-Label Classification Model,Trace Visualization,Google Vision OCR,Clip Comparison,Instance Segmentation Model,Text Display,Stitch Images,Google Gemini,Slack Notification,Local File Sink,VLM As Classifier,Roboflow Dataset Upload,PTZ Tracking (ONVIF),Color Visualization,Dot Visualization,Polygon Visualization,Object Detection Model,Anthropic Claude,Buffer,Byte Tracker,Contrast Equalization,Identify Changes,Detections Classes Replacement,Perception Encoder Embedding Model,Moondream2,Halo Visualization,Florence-2 Model,Label Visualization,Twilio SMS/MMS Notification,Ellipse Visualization,OpenAI,Model Monitoring Inference Aggregator,Single-Label Classification Model,Detections List Roll-Up,OpenAI,Model Comparison Visualization,Background Color Visualization,Image Threshold,Size Measurement,OpenAI,Keypoint Detection Model,Polygon Visualization,SAM 3,Twilio SMS Notification,Bounding Box Visualization,Time in Zone,Icon Visualization,Google Gemini,Florence-2 Model,Roboflow Dataset Upload,Anthropic Claude,Dynamic Zone,Dynamic Crop,CLIP Embedding Model,VLM As Detector,Google Gemini,Path Deviation,Image Blur,Line Counter,Byte Tracker,Cache Set,Stability AI Inpainting,Template Matching,Path Deviation,Morphological Transformation,Triangle Visualization,Detections Stitch,Relative Static Crop,Grid Visualization,Detections Stabilizer,Image Slicer,Llama 3.2 Vision,LMM For Classification,Keypoint Detection Model,Line Counter Visualization,Distance Measurement,SIFT Comparison,Time in Zone,Image Slicer,Circle Visualization,Seg Preview,Identify Outliers,Clip Comparison,Email Notification,Byte Tracker,Image Preprocessing,SAM 3,Depth Estimation,Cache Get,Time in Zone,Line Counter,CogVLM,Roboflow Custom Metadata,Stitch OCR Detections,Perspective Correction,Anthropic Claude,Reference Path Visualization,Stability AI Image Generation,Keypoint Visualization,VLM As Classifier,Polygon Zone Visualization,YOLO-World Model,Stitch OCR Detections,Crop Visualization,Pixel Color Count,Motion Detection,OpenAI
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v2 has.
Bindings
- input
    - images (image): The image to infer on.
    - classes (list_of_values): List of classes to calculate similarity against each input image.
    - version (string): Variant of CLIP model.
- output
    - similarities (list_of_values): List of values of any type.
    - max_similarity (float_zero_to_one): float value in range [0.0, 1.0].
    - most_similar_class (string): String value.
    - min_similarity (float_zero_to_one): float value in range [0.0, 1.0].
    - least_similar_class (string): String value.
    - classification_predictions (classification_prediction): Predictions from classifier.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
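The summary outputs are straightforward reductions over the per-class similarity scores. A small sketch (our own helper, not the block's source code) showing how they relate to the raw `similarities` list:

```python
def summarise(classes, similarities):
    # Derive the summary outputs listed above from raw per-class scores:
    # the best- and worst-matching class and their scores.
    pairs = sorted(zip(similarities, classes))
    min_similarity, least_similar_class = pairs[0]
    max_similarity, most_similar_class = pairs[-1]
    return {
        "similarities": similarities,
        "max_similarity": max_similarity,
        "most_similar_class": most_similar_class,
        "min_similarity": min_similarity,
        "least_similar_class": least_similar_class,
    }

result = summarise(["cat", "dog", "bird"], [0.12, 0.81, 0.35])
# result["most_similar_class"] == "dog", result["least_similar_class"] == "cat"
```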
Example JSON definition of step Clip Comparison in version v2
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": [
        "a",
        "b",
        "c"
    ],
    "version": "ViT-B-16"
}
```
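If you assemble workflow specifications programmatically, the step above can be built with a small helper. This is a sketch under our own naming; the helper and its validation are not part of the Workflows API, only the resulting dictionary matches the documented schema:

```python
def make_clip_comparison_v2_step(name, classes,
                                 images="$inputs.image",
                                 version="ViT-B-16"):
    # Assemble a v2 Clip Comparison step definition mirroring the
    # JSON example above. Helper name and validation are our own.
    if not classes:
        raise ValueError("classes must be a non-empty list of labels")
    return {
        "name": name,
        "type": "roboflow_core/clip_comparison@v2",
        "images": images,
        "classes": list(classes),
        "version": version,
    }

step = make_clip_comparison_v2_step("my_clip", ["cat", "dog"])
```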
v1¶
Class: ClipComparisonBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/clip_comparison@v1` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `texts` | `List[str]` | List of texts to calculate similarity against each input image. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v1.
- inputs:
Anthropic Claude,Mask Visualization,Circle Visualization,Classification Label Visualization,Halo Visualization,Dynamic Zone,Florence-2 Model,Blur Visualization,QR Code Generator,Dynamic Crop,Google Gemini,Label Visualization,Image Blur,Clip Comparison,Corner Visualization,Image Convert Grayscale,Ellipse Visualization,OpenAI,SIFT,Image Preprocessing,Stability AI Outpainting,Halo Visualization,Stability AI Inpainting,Detections List Roll-Up,OpenAI,Image Threshold,Background Color Visualization,Image Contours,Depth Estimation,Model Comparison Visualization,Trace Visualization,Size Measurement,Morphological Transformation,Triangle Visualization,Clip Comparison,Absolute Static Crop,Relative Static Crop,Text Display,Stitch Images,Google Gemini,Camera Calibration,Grid Visualization,Camera Focus,Perspective Correction,Color Visualization,Anthropic Claude,Dot Visualization,Image Slicer,Pixelate Visualization,Polygon Visualization,Stability AI Image Generation,Reference Path Visualization,Keypoint Visualization,Polygon Visualization,Anthropic Claude,Google Gemini,Llama 3.2 Vision,Line Counter Visualization,Bounding Box Visualization,Buffer,Contrast Equalization,Polygon Zone Visualization,SIFT Comparison,Dimension Collapse,Camera Focus,OpenAI,Icon Visualization,Crop Visualization,Motion Detection,Background Subtraction,Florence-2 Model,Image Slicer - outputs:
Anthropic Claude,Mask Visualization,Classification Label Visualization,Instance Segmentation Model,Detections Consensus,Webhook Sink,Email Notification,VLM As Detector,VLM As Detector,Google Gemini,SAM 3,Path Deviation,Corner Visualization,Line Counter,Cache Set,Halo Visualization,Object Detection Model,Path Deviation,Trace Visualization,Triangle Visualization,Clip Comparison,Instance Segmentation Model,Google Gemini,Grid Visualization,VLM As Classifier,Roboflow Dataset Upload,Color Visualization,Dot Visualization,Polygon Visualization,Object Detection Model,Anthropic Claude,Llama 3.2 Vision,LMM For Classification,Keypoint Detection Model,Buffer,Line Counter Visualization,Time in Zone,Circle Visualization,Seg Preview,Halo Visualization,Florence-2 Model,Label Visualization,Twilio SMS/MMS Notification,Clip Comparison,Email Notification,Ellipse Visualization,OpenAI,SAM 3,Detections List Roll-Up,OpenAI,Size Measurement,Time in Zone,Line Counter,Keypoint Detection Model,Perspective Correction,Anthropic Claude,Google Gemini,Reference Path Visualization,Keypoint Visualization,Polygon Visualization,SAM 3,VLM As Classifier,Bounding Box Visualization,Polygon Zone Visualization,YOLO-World Model,Time in Zone,Crop Visualization,Motion Detection,OpenAI,Florence-2 Model,Roboflow Dataset Upload
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
Bindings
- input
    - images (image): The image to infer on.
    - texts (list_of_values): List of texts to calculate similarity against each input image.
- output
    - similarity (list_of_values): List of values of any type.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - prediction_type (prediction_type): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v1",
    "images": "$inputs.image",
    "texts": [
        "a",
        "b",
        "c"
    ]
}
```
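Comparing the two examples, the main schema difference between versions is the prompt field name: v1 takes `texts`, while v2 takes `classes` (and adds a `version` property for the CLIP variant). A hedged sketch of a helper that targets either version; the function name is our own:

```python
def clip_comparison_step(name, prompts, block_version="v1"):
    # Build a Clip Comparison step for either block version.
    # v1 names the prompt list "texts"; v2 names it "classes".
    step = {
        "name": name,
        "type": f"roboflow_core/clip_comparison@{block_version}",
        "images": "$inputs.image",
    }
    key = "texts" if block_version == "v1" else "classes"
    step[key] = list(prompts)
    return step

v1_step = clip_comparison_step("clip_v1", ["a", "b"], "v1")
v2_step = clip_comparison_step("clip_v2", ["a", "b"], "v2")
```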