# Clip Comparison

## v2

Class: `ClipComparisonBlockV2` (there are multiple versions of this block)

Source: `inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2`

Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or to check whether an image contains NSFW material.
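Conceptually, the block ranks the provided prompts by CLIP image-text similarity. The sketch below illustrates that idea using the Hugging Face `transformers` CLIP implementation; it is not the block's internal code, and the checkpoint name and file path are assumptions:

```python
# Minimal sketch of CLIP zero-shot comparison, NOT this block's implementation.
# Assumes: pip install torch transformers pillow; "car.jpg" is illustrative.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch16")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch16")

classes = ["sedan", "truck", "motorcycle"]
image = Image.open("car.jpg")

inputs = processor(text=classes, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image  # image-to-text similarity scores

# Normalize the scores to see which prompt matches the image best.
probs = logits.softmax(dim=-1)[0]
for label, p in zip(classes, probs.tolist()):
    print(f"{label}: {p:.3f}")
```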
### Type identifier

Use the identifier `roboflow_core/clip_comparison@v2` in the step `"type"` field to add the block as a step in your workflow.
### Properties

| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `classes` | `List[str]` | List of classes to calculate similarity against each input image. | ✅ |
| `version` | `str` | Variant of CLIP model. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
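For example, a property marked ✅ (such as `classes`) can be bound to a workflow input via a selector rather than a literal value. A hedged sketch, assuming a workflow input named `classes`:

```json
{
  "name": "clip",
  "type": "roboflow_core/clip_comparison@v2",
  "images": "$inputs.image",
  "classes": "$inputs.classes",
  "version": "ViT-B-16"
}
```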
### Available Connections

Check what blocks you can connect to Clip Comparison in version v2.
- inputs: OpenAI, VLM as Detector, Keypoint Detection Model, Circle Visualization, Roboflow Dataset Upload, Multi-Label Classification Model, Roboflow Custom Metadata, Depth Estimation, SIFT, Florence-2 Model, Buffer, Dimension Collapse, Grid Visualization, Dynamic Zone, Instance Segmentation Model, Color Visualization, CSV Formatter, Perspective Correction, Model Monitoring Inference Aggregator, Image Slicer, OpenAI, Model Comparison Visualization, Clip Comparison, Stitch Images, Dynamic Crop, Image Contours, Webhook Sink, Pixelate Visualization, Llama 3.2 Vision, Camera Calibration, Reference Path Visualization, Image Blur, Local File Sink, Blur Visualization, OCR Model, Ellipse Visualization, Trace Visualization, Corner Visualization, Camera Focus, Polygon Zone Visualization, Google Gemini, OpenAI, Triangle Visualization, Stability AI Inpainting, Classification Label Visualization, Single-Label Classification Model, Bounding Box Visualization, CogVLM, Image Convert Grayscale, Halo Visualization, LMM, Email Notification, Polygon Visualization, Absolute Static Crop, Object Detection Model, Slack Notification, Dot Visualization, Label Visualization, Stability AI Outpainting, Crop Visualization, Google Vision OCR, Stability AI Image Generation, Image Threshold, Stitch OCR Detections, Image Preprocessing, SIFT Comparison, Mask Visualization, Florence-2 Model, Twilio SMS Notification, Clip Comparison, Roboflow Dataset Upload, Line Counter Visualization, VLM as Classifier, Anthropic Claude, Background Color Visualization, LMM For Classification, Image Slicer, Keypoint Visualization, Size Measurement, Relative Static Crop
- outputs: PTZ Tracking (ONVIF), Perception Encoder Embedding Model, Cache Set, Detections Stitch, Color Visualization, Perspective Correction, Path Deviation, Model Monitoring Inference Aggregator, Image Slicer, Model Comparison Visualization, Clip Comparison, Stitch Images, Dynamic Crop, Moondream2, Webhook Sink, Llama 3.2 Vision, Line Counter, Reference Path Visualization, Time in Zone, Image Blur, Ellipse Visualization, Trace Visualization, Polygon Zone Visualization, Corner Visualization, Single-Label Classification Model, Email Notification, Object Detection Model, Label Visualization, Dot Visualization, Byte Tracker, Image Preprocessing, SIFT Comparison, Mask Visualization, Florence-2 Model, Roboflow Dataset Upload, Byte Tracker, Instance Segmentation Model, Background Color Visualization, Line Counter, Keypoint Visualization, Multi-Label Classification Model, VLM as Detector, YOLO-World Model, VLM as Detector, OpenAI, Keypoint Detection Model, Roboflow Dataset Upload, Circle Visualization, Roboflow Custom Metadata, Florence-2 Model, Buffer, Single-Label Classification Model, Template Matching, Grid Visualization, Dynamic Zone, Instance Segmentation Model, Detections Consensus, Object Detection Model, Identify Outliers, OpenAI, Keypoint Detection Model, Byte Tracker, Local File Sink, Cache Get, CLIP Embedding Model, Google Gemini, Detections Classes Replacement, OpenAI, Triangle Visualization, Stability AI Inpainting, Classification Label Visualization, Bounding Box Visualization, Distance Measurement, Detections Stabilizer, CogVLM, Halo Visualization, LMM, Polygon Visualization, Slack Notification, Stability AI Outpainting, Crop Visualization, Google Vision OCR, Stability AI Image Generation, Image Threshold, Pixel Color Count, VLM as Classifier, Identify Changes, Time in Zone, Twilio SMS Notification, Segment Anything 2 Model, Clip Comparison, Line Counter Visualization, Path Deviation, VLM as Classifier, Anthropic Claude, LMM For Classification, Multi-Label Classification Model, Image Slicer, Size Measurement, Relative Static Crop
### Input and Output Bindings

The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v2 has.
Bindings:

- input:
    - `images` (`image`): The image to infer on.
    - `classes` (`list_of_values`): List of classes to calculate similarity against each input image.
    - `version` (`string`): Variant of CLIP model.
- output:
    - `similarities` (`list_of_values`): List of values of any type.
    - `max_similarity` (`float_zero_to_one`): `float` value in range `[0.0, 1.0]`.
    - `most_similar_class` (`string`): String value.
    - `min_similarity` (`float_zero_to_one`): `float` value in range `[0.0, 1.0]`.
    - `least_similar_class` (`string`): String value.
    - `classification_predictions` (`classification_prediction`): Predictions from classifier.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
Example JSON definition of step Clip Comparison in version v2:

```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": [
        "a",
        "b",
        "c"
    ],
    "version": "ViT-B-16"
}
```
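To see the step in context, below is a hedged sketch of running a minimal workflow containing it through the `inference_sdk` HTTP client. The endpoint, API key, input image, and output wiring are illustrative assumptions, not part of the block's definition:

```python
# Hedged sketch: run the example step above via inference_sdk.
# The api_url/api_key values and the "outputs" wiring are illustrative.
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="https://detect.roboflow.com",
    api_key="<YOUR_API_KEY>",
)

specification = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "name": "clip",
            "type": "roboflow_core/clip_comparison@v2",
            "images": "$inputs.image",
            "classes": ["a", "b", "c"],
            "version": "ViT-B-16",
        }
    ],
    "outputs": [
        {
            "type": "JsonField",
            "name": "most_similar_class",
            "selector": "$steps.clip.most_similar_class",
        }
    ],
}

result = client.run_workflow(
    specification=specification,
    images={"image": "example.jpg"},  # local path; one result dict per image
)
print(result[0]["most_similar_class"])
```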
## v1

Class: `ClipComparisonBlockV1` (there are multiple versions of this block)

Source: `inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1`

Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or to check whether an image contains NSFW material.
### Type identifier

Use the identifier `roboflow_core/clip_comparison@v1` in the step `"type"` field to add the block as a step in your workflow.
### Properties

| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Unique name of step in workflows. | ❌ |
| `texts` | `List[str]` | List of texts to calculate similarity against each input image. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
### Available Connections

Check what blocks you can connect to Clip Comparison in version v1.
- inputs: Corner Visualization, Camera Focus, Polygon Zone Visualization, Google Gemini, OpenAI, Circle Visualization, Triangle Visualization, Stability AI Inpainting, Classification Label Visualization, Bounding Box Visualization, Depth Estimation, SIFT, Size Measurement, Florence-2 Model, Buffer, Image Convert Grayscale, Halo Visualization, Dimension Collapse, Grid Visualization, Dynamic Zone, Polygon Visualization, Absolute Static Crop, Dot Visualization, Color Visualization, Label Visualization, Stability AI Outpainting, Crop Visualization, Perspective Correction, Stability AI Image Generation, Image Slicer, OpenAI, Image Threshold, Image Preprocessing, Model Comparison Visualization, Clip Comparison, SIFT Comparison, Dynamic Crop, Stitch Images, Image Contours, Mask Visualization, Florence-2 Model, Pixelate Visualization, Clip Comparison, Llama 3.2 Vision, Camera Calibration, Reference Path Visualization, Image Blur, Line Counter Visualization, Anthropic Claude, Background Color Visualization, Image Slicer, Keypoint Visualization, Blur Visualization, Ellipse Visualization, Trace Visualization, Relative Static Crop
- outputs: YOLO-World Model, VLM as Detector, Polygon Zone Visualization, Google Gemini, Corner Visualization, Keypoint Detection Model, OpenAI, Trace Visualization, Roboflow Dataset Upload, Circle Visualization, Triangle Visualization, Classification Label Visualization, Bounding Box Visualization, Florence-2 Model, Cache Set, Buffer, Halo Visualization, Grid Visualization, Email Notification, Polygon Visualization, Object Detection Model, Instance Segmentation Model, Label Visualization, Color Visualization, Dot Visualization, Detections Consensus, Crop Visualization, Object Detection Model, Perspective Correction, Path Deviation, OpenAI, Keypoint Detection Model, Clip Comparison, VLM as Classifier, Mask Visualization, Time in Zone, Webhook Sink, Florence-2 Model, Clip Comparison, Roboflow Dataset Upload, Llama 3.2 Vision, Line Counter, Reference Path Visualization, Time in Zone, Line Counter Visualization, Path Deviation, VLM as Classifier, Instance Segmentation Model, Anthropic Claude, LMM For Classification, Line Counter, Keypoint Visualization, Ellipse Visualization, Size Measurement, VLM as Detector
### Input and Output Bindings

The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v1 has.
Bindings:

- input:
    - `images` (`image`): The image to infer on.
    - `texts` (`list_of_values`): List of texts to calculate similarity against each input image.
- output:
    - `similarity` (`list_of_values`): List of values of any type.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
    - `prediction_type` (`prediction_type`): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1:

```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v1",
    "images": "$inputs.image",
    "texts": [
        "a",
        "b",
        "c"
    ]
}
```
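Note that v1 exposes a single `similarity` list ordered like `texts`, rather than v2's named outputs such as `most_similar_class`. A minimal client-side sketch of pairing the two (the scores are placeholders standing in for a real workflow response):

```python
# Hedged sketch: pair v1's "similarity" output with the prompt list.
# In practice the scores would come from a workflow run, as in the v2 sketch above.
texts = ["a", "b", "c"]
similarity = [0.21, 0.55, 0.24]  # placeholder scores, one per entry in `texts`

best_text, best_score = max(zip(texts, similarity), key=lambda pair: pair[1])
print(f"most similar: {best_text} ({best_score:.2f})")
```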