Clip Comparison¶
v2¶
Class: ClipComparisonBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
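The scoring CLIP performs can be pictured as cosine similarity between one image embedding and one text embedding per prompt. The sketch below illustrates that idea with toy NumPy vectors standing in for real CLIP embeddings; it is a conceptual aid, not the block's actual implementation.

```python
import numpy as np

def cosine_similarities(image_embedding, text_embeddings):
    """Score one image embedding against several text embeddings."""
    image = image_embedding / np.linalg.norm(image_embedding)
    texts = text_embeddings / np.linalg.norm(text_embeddings, axis=1, keepdims=True)
    return texts @ image  # one similarity score per text prompt

# Toy 4-dimensional vectors standing in for real CLIP embeddings.
image_vec = np.array([1.0, 0.0, 0.5, 0.0])
text_vecs = np.array([
    [1.0, 0.0, 0.4, 0.0],   # prompt "a" - nearly parallel to the image
    [0.0, 1.0, 0.0, 0.0],   # prompt "b" - orthogonal to the image
    [0.2, 0.1, 0.9, 0.1],   # prompt "c" - partially aligned
])
scores = cosine_similarities(image_vec, text_vecs)
```

Each score falls in `[-1.0, 1.0]`; the prompt with the highest score is the zero-shot classification result.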
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/clip_comparison@v2` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `classes` | `List[str]` | List of classes to calculate similarity against each input image. | ✅ |
| `version` | `str` | Variant of CLIP model. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
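A ✅-marked property such as `classes` can be given either as a literal value or as a selector bound to a workflow input and resolved at runtime. A minimal sketch of the two forms follows; the input name `classes` in the selector is an illustrative choice, and only the fields relevant to parametrisation are shown.

```python
# A Refs-enabled property accepts a literal value...
static_step = {
    "name": "clip_static",
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": ["car", "truck", "bus"],  # fixed at definition time
}

# ...or a selector bound to a workflow input (hypothetical input name),
# so the class list can be supplied with each workflow run.
dynamic_step = {
    "name": "clip_dynamic",
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": "$inputs.classes",  # resolved at runtime
}
```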
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v2.
- inputs:
LMM,Background Color Visualization,Stitch Images,Image Slicer,Size Measurement,Dynamic Zone,Model Monitoring Inference Aggregator,Corner Visualization,Camera Calibration,Mask Visualization,Object Detection Model,Local File Sink,Model Comparison Visualization,Email Notification,Pixelate Visualization,Anthropic Claude,Relative Static Crop,Google Gemini,Florence-2 Model,Multi-Label Classification Model,Ellipse Visualization,Triangle Visualization,Camera Focus,OCR Model,QR Code Generator,Label Visualization,Roboflow Custom Metadata,Florence-2 Model,LMM For Classification,Blur Visualization,Dot Visualization,Stability AI Image Generation,Perspective Correction,Google Vision OCR,Llama 3.2 Vision,EasyOCR,Absolute Static Crop,Slack Notification,Morphological Transformation,Image Blur,Image Threshold,Clip Comparison,Stitch OCR Detections,Depth Estimation,Stability AI Outpainting,Halo Visualization,Stability AI Inpainting,Polygon Visualization,OpenAI,Grid Visualization,Roboflow Dataset Upload,CogVLM,Classification Label Visualization,Email Notification,VLM as Detector,Instance Segmentation Model,Bounding Box Visualization,Image Convert Grayscale,Polygon Zone Visualization,OpenAI,Clip Comparison,Keypoint Detection Model,Crop Visualization,Image Slicer,Icon Visualization,Color Visualization,Roboflow Dataset Upload,Dimension Collapse,Keypoint Visualization,Contrast Equalization,Buffer,Image Contours,Circle Visualization,OpenAI,VLM as Classifier,CSV Formatter,Reference Path Visualization,Twilio SMS Notification,Dynamic Crop,Webhook Sink,Single-Label Classification Model,SIFT,Line Counter Visualization,Image Preprocessing,Trace Visualization,SIFT Comparison
- outputs:
Background Color Visualization,Stitch Images,Size Measurement,Image Slicer,VLM as Classifier,Identify Outliers,Corner Visualization,Mask Visualization,CLIP Embedding Model,Line Counter,Local File Sink,Model Comparison Visualization,Email Notification,Time in Zone,Florence-2 Model,Multi-Label Classification Model,Ellipse Visualization,Label Visualization,LMM For Classification,Dot Visualization,Google Vision OCR,Line Counter,Llama 3.2 Vision,Detections Stabilizer,Slack Notification,Image Blur,Stitch OCR Detections,Stability AI Outpainting,Halo Visualization,Stability AI Inpainting,CogVLM,Classification Label Visualization,VLM as Detector,Instance Segmentation Model,Byte Tracker,Polygon Zone Visualization,Perception Encoder Embedding Model,Clip Comparison,Detections Stitch,Crop Visualization,Image Slicer,Cache Get,YOLO-World Model,Multi-Label Classification Model,Seg Preview,Icon Visualization,Color Visualization,Path Deviation,Buffer,Circle Visualization,Time in Zone,Dynamic Crop,Single-Label Classification Model,Line Counter Visualization,PTZ Tracking (ONVIF),Image Preprocessing,Trace Visualization,SIFT Comparison,LMM,Dynamic Zone,Model Monitoring Inference Aggregator,Detections Classes Replacement,Object Detection Model,Keypoint Detection Model,Anthropic Claude,Relative Static Crop,Google Gemini,Triangle Visualization,Segment Anything 2 Model,QR Code Generator,Byte Tracker,Time in Zone,Pixel Color Count,Roboflow Custom Metadata,Cache Set,Florence-2 Model,Identify Changes,Single-Label Classification Model,Stability AI Image Generation,SAM 3,Morphological Transformation,Clip Comparison,Image Threshold,Byte Tracker,Polygon Visualization,OpenAI,Grid Visualization,Roboflow Dataset Upload,Path Deviation,Template Matching,Distance Measurement,Email Notification,Bounding Box Visualization,OpenAI,Keypoint Detection Model,Object Detection Model,Moondream2,Roboflow Dataset Upload,Keypoint Visualization,Contrast Equalization,Instance Segmentation Model,OpenAI,VLM as Classifier,Reference Path Visualization,Twilio SMS Notification,VLM as Detector,Webhook Sink,Detections Consensus,Perspective Correction
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check which binding kinds Clip Comparison in version v2 has.
Bindings
- input:
    - `images` (`image`): The image to infer on.
    - `classes` (`list_of_values`): List of classes to calculate similarity against each input image.
    - `version` (`string`): Variant of CLIP model.
- output:
    - `similarities` (`list_of_values`): List of values of any type.
    - `max_similarity` (`float_zero_to_one`): `float` value in range `[0.0, 1.0]`.
    - `most_similar_class` (`string`): String value.
    - `min_similarity` (`float_zero_to_one`): `float` value in range `[0.0, 1.0]`.
    - `least_similar_class` (`string`): String value.
    - `classification_predictions` (`classification_prediction`): Predictions from classifier.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
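To see how the scalar outputs relate to the `similarities` list, the sketch below derives `max_similarity`, `most_similar_class`, `min_similarity` and `least_similar_class` from a hypothetical set of scores; the class names and values are illustrative, and this is not the block's actual implementation.

```python
# Hypothetical similarity scores for three classes.
classes = ["car", "truck", "bus"]
similarities = [0.91, 0.34, 0.27]

max_idx = similarities.index(max(similarities))
min_idx = similarities.index(min(similarities))

# Mirrors the shape of the block's scalar outputs.
outputs = {
    "similarities": similarities,
    "max_similarity": similarities[max_idx],
    "most_similar_class": classes[max_idx],
    "min_similarity": similarities[min_idx],
    "least_similar_class": classes[min_idx],
}
```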
Example JSON definition of step Clip Comparison in version v2
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": [
        "a",
        "b",
        "c"
    ],
    "version": "ViT-B-16"
}
```
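A step definition like the one above only runs as part of a full workflow specification. Below is a hedged sketch of one possible minimal wrapper: the surrounding field names (`inputs`, `steps`, `outputs`, `InferenceImage`, `JsonField`) reflect a common Workflows definition shape but are not taken from this page, so treat them as assumptions.

```python
import json

# The step from the example above.
step = {
    "name": "clip_comparison",
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": ["a", "b", "c"],
    "version": "ViT-B-16",
}

# Hypothetical minimal workflow wrapper around the step; the output
# selector addresses the step's "similarities" binding by step name.
workflow = {
    "version": "1.0",
    "inputs": [{"type": "InferenceImage", "name": "image"}],
    "steps": [step],
    "outputs": [
        {
            "type": "JsonField",
            "name": "similarities",
            "selector": "$steps.clip_comparison.similarities",
        }
    ],
}

serialized = json.dumps(workflow, indent=2)
```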
v1¶
Class: ClipComparisonBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/clip_comparison@v1` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `texts` | `List[str]` | List of texts to calculate similarity against each input image. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v1.
- inputs:
Polygon Visualization,OpenAI,Grid Visualization,Background Color Visualization,Stitch Images,Image Slicer,Size Measurement,Dynamic Zone,Corner Visualization,Trace Visualization,Classification Label Visualization,Camera Calibration,Mask Visualization,Bounding Box Visualization,Model Comparison Visualization,Image Convert Grayscale,Pixelate Visualization,Polygon Zone Visualization,Anthropic Claude,Relative Static Crop,Google Gemini,Clip Comparison,Florence-2 Model,Crop Visualization,Ellipse Visualization,Triangle Visualization,Image Slicer,Camera Focus,Icon Visualization,QR Code Generator,Color Visualization,Dimension Collapse,Label Visualization,Keypoint Visualization,Contrast Equalization,Buffer,Florence-2 Model,Image Contours,Blur Visualization,Dot Visualization,Circle Visualization,Stability AI Image Generation,OpenAI,Perspective Correction,Llama 3.2 Vision,Absolute Static Crop,Reference Path Visualization,Morphological Transformation,Image Blur,Image Threshold,Clip Comparison,SIFT,Line Counter Visualization,Depth Estimation,Stability AI Outpainting,Halo Visualization,Stability AI Inpainting,Image Preprocessing,Dynamic Crop,SIFT Comparison
- outputs:
Polygon Visualization,OpenAI,Grid Visualization,Roboflow Dataset Upload,Size Measurement,Path Deviation,VLM as Classifier,Corner Visualization,Trace Visualization,Classification Label Visualization,Email Notification,VLM as Detector,Object Detection Model,Instance Segmentation Model,Line Counter,Bounding Box Visualization,Mask Visualization,Email Notification,Keypoint Detection Model,Polygon Zone Visualization,Anthropic Claude,Time in Zone,Google Gemini,Clip Comparison,Florence-2 Model,Keypoint Detection Model,Crop Visualization,Object Detection Model,Ellipse Visualization,Triangle Visualization,YOLO-World Model,Seg Preview,Color Visualization,Roboflow Dataset Upload,Path Deviation,Label Visualization,Time in Zone,Keypoint Visualization,Cache Set,Florence-2 Model,Buffer,LMM For Classification,Instance Segmentation Model,Dot Visualization,Circle Visualization,OpenAI,VLM as Classifier,Line Counter,Llama 3.2 Vision,Time in Zone,Reference Path Visualization,SAM 3,VLM as Detector,Webhook Sink,Detections Consensus,Clip Comparison,Line Counter Visualization,Halo Visualization,Perspective Correction
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check which binding kinds Clip Comparison in version v1 has.
Bindings
- input:
    - `images` (`image`): The image to infer on.
    - `texts` (`list_of_values`): List of texts to calculate similarity against each input image.
- output:
    - `similarity` (`list_of_values`): List of values of any type.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
    - `prediction_type` (`prediction_type`): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v1",
    "images": "$inputs.image",
    "texts": [
        "a",
        "b",
        "c"
    ]
}
```