Clip Comparison¶
v2¶
Class: ClipComparisonBlockV2
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field to add the block as a step in your workflow:
roboflow_core/clip_comparison@v2
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Unique name of step in workflows. | ❌ |
classes | List[str] | List of classes to calculate similarity against each input image. | ✅ |
version | str | Variant of CLIP model. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
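For example, a property marked ✅ such as classes can be bound to a workflow input instead of a literal list, using the same `$inputs.<name>` selector syntax the step examples below use for images (the input name my_classes here is illustrative):

```json
{
  "name": "<your_step_name_here>",
  "type": "roboflow_core/clip_comparison@v2",
  "images": "$inputs.image",
  "classes": "$inputs.my_classes",
  "version": "ViT-B-16"
}
```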
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v2.
- inputs:
Keypoint Visualization
,Google Gemini
,Keypoint Detection Model
,Image Contours
,Circle Visualization
,Image Threshold
,Absolute Static Crop
,Perspective Correction
,Color Visualization
,Instance Segmentation Model
,Reference Path Visualization
,Stitch Images
,Image Blur
,Florence-2 Model
,Blur Visualization
,Local File Sink
,Relative Static Crop
,Clip Comparison
,Halo Visualization
,Stability AI Inpainting
,SIFT Comparison
,Icon Visualization
,Roboflow Custom Metadata
,Dimension Collapse
,Polygon Zone Visualization
,Depth Estimation
,Stability AI Image Generation
,Dynamic Zone
,Dynamic Crop
,Grid Visualization
,Crop Visualization
,Stitch OCR Detections
,Camera Calibration
,VLM as Classifier
,QR Code Generator
,SIFT
,Size Measurement
,Camera Focus
,Model Comparison Visualization
,Twilio SMS Notification
,Llama 3.2 Vision
,Triangle Visualization
,Line Counter Visualization
,Clip Comparison
,Email Notification
,LMM
,Roboflow Dataset Upload
,CSV Formatter
,Image Slicer
,Mask Visualization
,Single-Label Classification Model
,OCR Model
,Pixelate Visualization
,Webhook Sink
,Object Detection Model
,Slack Notification
,Dot Visualization
,Image Slicer
,Roboflow Dataset Upload
,Classification Label Visualization
,OpenAI
,Model Monitoring Inference Aggregator
,Polygon Visualization
,OpenAI
,Buffer
,LMM For Classification
,Stability AI Outpainting
,Trace Visualization
,Bounding Box Visualization
,Image Preprocessing
,Multi-Label Classification Model
,Image Convert Grayscale
,Google Vision OCR
,Label Visualization
,CogVLM
,Corner Visualization
,Background Color Visualization
,Florence-2 Model
,VLM as Detector
,Ellipse Visualization
,OpenAI
,Anthropic Claude
- outputs:
Keypoint Visualization
,Google Gemini
,Detections Stabilizer
,Path Deviation
,Reference Path Visualization
,Stitch Images
,Image Blur
,Florence-2 Model
,Local File Sink
,Relative Static Crop
,Clip Comparison
,Icon Visualization
,Time in Zone
,Polygon Zone Visualization
,Identify Outliers
,Instance Segmentation Model
,Dynamic Zone
,Grid Visualization
,Dynamic Crop
,Detections Consensus
,VLM as Classifier
,Single-Label Classification Model
,Perception Encoder Embedding Model
,VLM as Classifier
,Pixel Color Count
,QR Code Generator
,Llama 3.2 Vision
,Triangle Visualization
,Line Counter Visualization
,Multi-Label Classification Model
,Email Notification
,Roboflow Dataset Upload
,Time in Zone
,Image Slicer
,Byte Tracker
,Single-Label Classification Model
,Byte Tracker
,Object Detection Model
,Dot Visualization
,Image Slicer
,Roboflow Dataset Upload
,OpenAI
,Model Monitoring Inference Aggregator
,VLM as Detector
,Buffer
,Trace Visualization
,Stability AI Outpainting
,Multi-Label Classification Model
,CogVLM
,Corner Visualization
,Background Color Visualization
,Ellipse Visualization
,Halo Visualization
,OpenAI
,Anthropic Claude
,Keypoint Detection Model
,Circle Visualization
,Image Threshold
,Perspective Correction
,Color Visualization
,Instance Segmentation Model
,PTZ Tracking (ONVIF)
,Keypoint Detection Model
,Stability AI Inpainting
,SIFT Comparison
,Cache Get
,Roboflow Custom Metadata
,Template Matching
,Stability AI Image Generation
,Crop Visualization
,Time in Zone
,Segment Anything 2 Model
,Size Measurement
,Object Detection Model
,Model Comparison Visualization
,Twilio SMS Notification
,CLIP Embedding Model
,Clip Comparison
,LMM
,Path Deviation
,Mask Visualization
,Byte Tracker
,Webhook Sink
,Cache Set
,Slack Notification
,Detections Classes Replacement
,YOLO-World Model
,Classification Label Visualization
,Polygon Visualization
,OpenAI
,LMM For Classification
,Line Counter
,Moondream2
,Bounding Box Visualization
,Distance Measurement
,Image Preprocessing
,Google Vision OCR
,Label Visualization
,Line Counter
,Detections Stitch
,Florence-2 Model
,Identify Changes
,VLM as Detector
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v2 has.
Bindings

- input:
  - images (image): The image to infer on.
  - classes (list_of_values): List of classes to calculate similarity against each input image.
  - version (string): Variant of CLIP model.
- output:
  - similarities (list_of_values): List of values of any type.
  - max_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - most_similar_class (string): String value.
  - min_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - least_similar_class (string): String value.
  - classification_predictions (classification_prediction): Predictions from classifier.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
Example JSON definition of step Clip Comparison in version v2:
{
"name": "<your_step_name_here>",
"type": "roboflow_core/clip_comparison@v2",
"images": "$inputs.image",
"classes": [
"a",
"b",
"c"
],
"version": "ViT-B-16"
}
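To make the output bindings concrete, the sketch below reconstructs how the v2 block's similarity outputs relate to one another, using plain cosine similarity over made-up embedding vectors. This is not the block's actual implementation and the vectors are not real CLIP embeddings; note also that the production block reports similarities in [0.0, 1.0], while raw cosine similarity lies in [-1, 1].

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def clip_comparison_outputs(image_embedding, text_embeddings, classes):
    # Mirrors the shape of the v2 block's outputs; any numeric vectors
    # stand in for real CLIP embeddings here.
    sims = [cosine(image_embedding, t) for t in text_embeddings]
    best = max(range(len(sims)), key=sims.__getitem__)
    worst = min(range(len(sims)), key=sims.__getitem__)
    return {
        "similarities": sims,
        "max_similarity": sims[best],
        "most_similar_class": classes[best],
        "min_similarity": sims[worst],
        "least_similar_class": classes[worst],
    }

# Toy vectors: the image embedding points along the first axis,
# so class "a" is closest and class "c" farthest.
out = clip_comparison_outputs(
    image_embedding=[1.0, 0.0],
    text_embeddings=[[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]],
    classes=["a", "b", "c"],
)
# out["most_similar_class"] == "a", out["least_similar_class"] == "c"
```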
v1¶
Class: ClipComparisonBlockV1
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field to add the block as a step in your workflow:
roboflow_core/clip_comparison@v1
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Unique name of step in workflows. | ❌ |
texts | List[str] | List of texts to calculate similarity against each input image. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v1.
- inputs:
Keypoint Visualization
,Google Gemini
,Image Contours
,Circle Visualization
,Image Threshold
,Absolute Static Crop
,Image Slicer
,Perspective Correction
,Color Visualization
,Mask Visualization
,Reference Path Visualization
,Stitch Images
,Image Blur
,Florence-2 Model
,Blur Visualization
,Pixelate Visualization
,Relative Static Crop
,Clip Comparison
,Dot Visualization
,Image Slicer
,Stability AI Inpainting
,SIFT Comparison
,Classification Label Visualization
,Icon Visualization
,Dimension Collapse
,Polygon Zone Visualization
,Depth Estimation
,Anthropic Claude
,Polygon Visualization
,Stability AI Image Generation
,Dynamic Zone
,Dynamic Crop
,Grid Visualization
,Crop Visualization
,OpenAI
,Ellipse Visualization
,Buffer
,Stability AI Outpainting
,Trace Visualization
,Bounding Box Visualization
,Camera Calibration
,Image Preprocessing
,Image Convert Grayscale
,Label Visualization
,Corner Visualization
,SIFT
,QR Code Generator
,Background Color Visualization
,Size Measurement
,Camera Focus
,Model Comparison Visualization
,Triangle Visualization
,Florence-2 Model
,Llama 3.2 Vision
,Halo Visualization
,OpenAI
,Line Counter Visualization
,Clip Comparison
- outputs:
Keypoint Visualization
,Google Gemini
,Keypoint Detection Model
,Email Notification
,Roboflow Dataset Upload
,Circle Visualization
,Path Deviation
,Time in Zone
,Path Deviation
,Perspective Correction
,Color Visualization
,Instance Segmentation Model
,Mask Visualization
,Reference Path Visualization
,Florence-2 Model
,Keypoint Detection Model
,Webhook Sink
,Halo Visualization
,Clip Comparison
,Object Detection Model
,Cache Set
,Dot Visualization
,Roboflow Dataset Upload
,YOLO-World Model
,Classification Label Visualization
,Time in Zone
,Polygon Zone Visualization
,Anthropic Claude
,VLM as Detector
,Instance Segmentation Model
,Polygon Visualization
,Grid Visualization
,Crop Visualization
,OpenAI
,Time in Zone
,LMM For Classification
,Buffer
,Line Counter
,Detections Consensus
,VLM as Classifier
,Bounding Box Visualization
,Trace Visualization
,Label Visualization
,Line Counter
,VLM as Classifier
,Corner Visualization
,Size Measurement
,Florence-2 Model
,Triangle Visualization
,Object Detection Model
,VLM as Detector
,Llama 3.2 Vision
,Ellipse Visualization
,OpenAI
,Line Counter Visualization
,Clip Comparison
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v1 has.
Bindings

- input:
  - images (image): The image to infer on.
  - texts (list_of_values): List of texts to calculate similarity against each input image.
- output:
  - similarity (list_of_values): List of values of any type.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
  - prediction_type (prediction_type): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1:
{
"name": "<your_step_name_here>",
"type": "roboflow_core/clip_comparison@v1",
"images": "$inputs.image",
"texts": [
"a",
"b",
"c"
]
}
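A step definition like the one above lives inside a larger workflow specification. The sketch below assembles one programmatically; the surrounding fields (version, inputs, outputs and their types) follow the general Workflows schema, and all names other than the step type identifier are illustrative.

```python
def build_clip_comparison_workflow(texts):
    # Minimal workflow specification embedding the v1 step.
    # Input/output names here are placeholders, not requirements.
    step_name = "clip_comparison"
    return {
        "version": "1.0",
        "inputs": [{"type": "WorkflowImage", "name": "image"}],
        "steps": [
            {
                "name": step_name,
                "type": "roboflow_core/clip_comparison@v1",
                "images": "$inputs.image",
                "texts": list(texts),
            }
        ],
        "outputs": [
            {
                "type": "JsonField",
                "name": "similarity",
                # Selector wiring the step's `similarity` output binding
                # to a workflow output.
                "selector": f"$steps.{step_name}.similarity",
            }
        ],
    }

spec = build_clip_comparison_workflow(["a", "b", "c"])
```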