Clip Comparison¶
v2¶
Class: ClipComparisonBlockV2
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or to check whether an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/clip_comparison@v2 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Unique name of step in workflows. | ❌ |
classes | List[str] | List of classes to calculate similarity against each input image. | ✅ |
version | str | Variant of CLIP model. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
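For instance, a property marked ✅ can be wired to a workflow input with a selector instead of a literal value. A minimal sketch in Python (the input name classes_to_check is a hypothetical, chosen for illustration):

```python
# Sketch: a v2 step whose "classes" property is parametrised with a
# workflow-input selector rather than a hard-coded list. The input name
# "classes_to_check" is an assumption for illustration.
step = {
    "name": "clip_comparison",  # "name" cannot be parametrised (Refs: ❌)
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": "$inputs.classes_to_check",  # dynamic binding (Refs: ✅)
    "version": "ViT-B-16",
}

print(step["classes"])
```

At runtime, the value supplied for that input is substituted wherever the selector appears.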
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v2.
- inputs: Grid Visualization, Image Blur, Image Preprocessing, Image Slicer, OpenAI, Dynamic Crop, Absolute Static Crop, Roboflow Dataset Upload, Color Visualization, LMM, Corner Visualization, Google Gemini, Depth Estimation, Stability AI Outpainting, Keypoint Visualization, Trace Visualization, Clip Comparison, Google Vision OCR, Keypoint Detection Model, Dimension Collapse, Single-Label Classification Model, Model Comparison Visualization, Email Notification, Mask Visualization, Image Slicer, Model Monitoring Inference Aggregator, Clip Comparison, Size Measurement, Multi-Label Classification Model, Buffer, Image Threshold, Contrast Equalization, OpenAI, Morphological Transformation, Classification Label Visualization, Relative Static Crop, Camera Calibration, Dynamic Zone, Florence-2 Model, Blur Visualization, Stitch Images, Roboflow Dataset Upload, Triangle Visualization, Perspective Correction, SIFT, Icon Visualization, Label Visualization, Stability AI Image Generation, Stitch OCR Detections, Llama 3.2 Vision, Ellipse Visualization, CogVLM, VLM as Detector, Line Counter Visualization, Florence-2 Model, SIFT Comparison, Local File Sink, Slack Notification, Image Convert Grayscale, Roboflow Custom Metadata, Twilio SMS Notification, Background Color Visualization, VLM as Classifier, QR Code Generator, Polygon Zone Visualization, Anthropic Claude, Polygon Visualization, Camera Focus, Dot Visualization, LMM For Classification, Instance Segmentation Model, Circle Visualization, Bounding Box Visualization, Image Contours, OpenAI, Object Detection Model, OCR Model, Halo Visualization, Reference Path Visualization, CSV Formatter, Pixelate Visualization, Webhook Sink, EasyOCR, Stability AI Inpainting, Crop Visualization
- outputs: OpenAI, Image Slicer, Image Preprocessing, Dynamic Crop, Multi-Label Classification Model, Roboflow Dataset Upload, Moondream2, Corner Visualization, Google Gemini, Keypoint Detection Model, PTZ Tracking (ONVIF), Email Notification, Keypoint Detection Model, Time in Zone, Single-Label Classification Model, Model Comparison Visualization, Mask Visualization, Model Monitoring Inference Aggregator, Line Counter, OpenAI, Morphological Transformation, Classification Label Visualization, Time in Zone, Dynamic Zone, Florence-2 Model, Stitch Images, Cache Set, Triangle Visualization, Stability AI Image Generation, Pixel Color Count, Llama 3.2 Vision, Ellipse Visualization, CogVLM, Detections Stabilizer, Florence-2 Model, Local File Sink, Byte Tracker, Distance Measurement, Background Color Visualization, QR Code Generator, Segment Anything 2 Model, Anthropic Claude, VLM as Detector, Byte Tracker, Polygon Visualization, Instance Segmentation Model, Detections Classes Replacement, Identify Outliers, OpenAI, Object Detection Model, Halo Visualization, VLM as Classifier, Stability AI Inpainting, Grid Visualization, Image Blur, Instance Segmentation Model, Time in Zone, LMM, Color Visualization, Stability AI Outpainting, Keypoint Visualization, Trace Visualization, Clip Comparison, Google Vision OCR, YOLO-World Model, Image Slicer, Clip Comparison, Size Measurement, Multi-Label Classification Model, Buffer, Detections Consensus, Image Threshold, Contrast Equalization, Path Deviation, Relative Static Crop, Path Deviation, Roboflow Dataset Upload, Perspective Correction, Icon Visualization, Object Detection Model, Label Visualization, Stitch OCR Detections, VLM as Detector, Byte Tracker, Single-Label Classification Model, Line Counter Visualization, Line Counter, SIFT Comparison, Slack Notification, Roboflow Custom Metadata, Perception Encoder Embedding Model, Twilio SMS Notification, VLM as Classifier, Identify Changes, Cache Get, Polygon Zone Visualization, Detections Stitch, Dot Visualization, LMM For Classification, Template Matching, CLIP Embedding Model, Circle Visualization, Bounding Box Visualization, Reference Path Visualization, Webhook Sink, Crop Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v2 has.
Bindings
- input
  - images (image): The image to infer on.
  - classes (list_of_values): List of classes to calculate similarity against each input image.
  - version (string): Variant of CLIP model.
- output
  - similarities (list_of_values): List of values of any type.
  - max_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - most_similar_class (string): String value.
  - min_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - least_similar_class (string): String value.
  - classification_predictions (classification_prediction): Predictions from classifier.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
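The scalar outputs are derived from the per-class similarity scores. A small illustration (not the block's source code) of how the fields relate, assuming similarities[i] is the score for classes[i] and the scores shown are made-up values:

```python
# Illustration: how max_similarity / most_similar_class and their
# min_* counterparts relate to the "similarities" list. The class
# names and scores below are invented for the example.
classes = ["car", "truck", "bus"]
similarities = [0.72, 0.18, 0.10]

max_similarity = max(similarities)
most_similar_class = classes[similarities.index(max_similarity)]
min_similarity = min(similarities)
least_similar_class = classes[similarities.index(min_similarity)]

print(most_similar_class, max_similarity)   # highest-scoring class
print(least_similar_class, min_similarity)  # lowest-scoring class
```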
Example JSON definition of step Clip Comparison in version v2:
{
"name": "<your_step_name_here>",
"type": "roboflow_core/clip_comparison@v2",
"images": "$inputs.image",
"classes": [
"a",
"b",
"c"
],
"version": "ViT-B-16"
}
v1¶
Class: ClipComparisonBlockV1
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or to check whether an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/clip_comparison@v1 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Unique name of step in workflows. | ❌ |
texts | List[str] | List of texts to calculate similarity against each input image. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v1.
- inputs: Grid Visualization, Ellipse Visualization, Image Blur, Image Preprocessing, Image Slicer, OpenAI, Llama 3.2 Vision, Dynamic Crop, Absolute Static Crop, Color Visualization, Line Counter Visualization, Corner Visualization, Florence-2 Model, Google Gemini, Depth Estimation, SIFT Comparison, Stability AI Outpainting, Keypoint Visualization, Image Convert Grayscale, Trace Visualization, Clip Comparison, Background Color Visualization, Dimension Collapse, QR Code Generator, Model Comparison Visualization, Mask Visualization, Image Slicer, Polygon Zone Visualization, Clip Comparison, Anthropic Claude, Size Measurement, Buffer, Image Threshold, Camera Focus, Contrast Equalization, Polygon Visualization, Stability AI Inpainting, Dot Visualization, Morphological Transformation, Classification Label Visualization, Relative Static Crop, Circle Visualization, Bounding Box Visualization, Camera Calibration, Dynamic Zone, Florence-2 Model, Blur Visualization, Image Contours, Stitch Images, OpenAI, Halo Visualization, Reference Path Visualization, Triangle Visualization, Pixelate Visualization, Perspective Correction, SIFT, Icon Visualization, Label Visualization, Stability AI Image Generation, Crop Visualization
- outputs: Grid Visualization, Ellipse Visualization, Llama 3.2 Vision, OpenAI, VLM as Detector, Instance Segmentation Model, Time in Zone, Roboflow Dataset Upload, Line Counter Visualization, Color Visualization, Florence-2 Model, Corner Visualization, Google Gemini, Keypoint Detection Model, Line Counter, Keypoint Visualization, Trace Visualization, Clip Comparison, YOLO-World Model, Email Notification, Time in Zone, Keypoint Detection Model, VLM as Classifier, Mask Visualization, Polygon Zone Visualization, Clip Comparison, Anthropic Claude, VLM as Detector, Size Measurement, Buffer, Detections Consensus, Polygon Visualization, Line Counter, Dot Visualization, Path Deviation, Object Detection Model, LMM For Classification, Classification Label Visualization, Time in Zone, Path Deviation, Instance Segmentation Model, Circle Visualization, Bounding Box Visualization, Florence-2 Model, OpenAI, Object Detection Model, Cache Set, Halo Visualization, Reference Path Visualization, VLM as Classifier, Roboflow Dataset Upload, Triangle Visualization, Perspective Correction, Webhook Sink, Label Visualization, Crop Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v1 has.
Bindings
- input
  - images (image): The image to infer on.
  - texts (list_of_values): List of texts to calculate similarity against each input image.
- output
  - similarity (list_of_values): List of values of any type.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
  - prediction_type (prediction_type): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1:
{
"name": "<your_step_name_here>",
"type": "roboflow_core/clip_comparison@v1",
"images": "$inputs.image",
"texts": [
"a",
"b",
"c"
]
}
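Comparing the two property tables: v1's texts property is named classes in v2, and v2 adds a version property. A hedged sketch of upgrading a v1 step definition to v2 under those differences (the helper name and default version choice are assumptions for illustration):

```python
# Sketch: upgrading a v1 Clip Comparison step to v2. Per the tables
# above, v1's "texts" becomes "classes" in v2, and v2 adds "version".
# The helper name and the ViT-B-16 default are illustrative choices.
def upgrade_clip_comparison(step_v1: dict) -> dict:
    step_v2 = dict(step_v1)
    step_v2["type"] = "roboflow_core/clip_comparison@v2"
    step_v2["classes"] = step_v2.pop("texts")
    step_v2.setdefault("version", "ViT-B-16")
    return step_v2

old = {
    "name": "my_clip",
    "type": "roboflow_core/clip_comparison@v1",
    "images": "$inputs.image",
    "texts": ["a", "b", "c"],
}
print(upgrade_clip_comparison(old))
```

Note that the output bindings differ too (v1 exposes similarity, v2 exposes similarities plus the max/min fields), so any downstream selectors referencing the step's outputs would also need updating.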