Clip Comparison¶
v2¶
Class: ClipComparisonBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
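For the vehicle example, the prompt list passed to the block might look like the fragment below (the label strings are illustrative placeholders, not values prescribed by the block):

```json
{
  "classes": ["sedan", "pickup truck", "motorcycle", "bus"]
}
```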
Type identifier¶
Use the following identifier in the step "type" field to add the block as a step in your workflow: roboflow_core/clip_comparison@v2
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Unique name of step in workflows. | ❌ |
classes | List[str] | List of classes to calculate similarity against each input image. | ✅ |
version | str | Variant of CLIP model. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
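For example, the classes list can be supplied at execution time rather than hard-coded. A minimal sketch, assuming a workflow input named classes_to_check has been declared as a runtime parameter:

```json
{
  "name": "clip_comparison",
  "type": "roboflow_core/clip_comparison@v2",
  "images": "$inputs.image",
  "classes": "$inputs.classes_to_check",
  "version": "ViT-B-16"
}
```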
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v2.
- inputs: Blur Visualization, Triangle Visualization, Anthropic Claude, Trace Visualization, Label Visualization, LMM, Model Monitoring Inference Aggregator, Roboflow Dataset Upload, Absolute Static Crop, Image Preprocessing, Relative Static Crop, Image Threshold, Reference Path Visualization, Slack Notification, Stability AI Outpainting, SIFT, Roboflow Dataset Upload, Google Vision OCR, Dimension Collapse, Stability AI Inpainting, Background Color Visualization, CSV Formatter, Circle Visualization, Image Blur, Keypoint Visualization, VLM as Detector, Google Gemini, OpenAI, Image Convert Grayscale, Line Counter Visualization, Model Comparison Visualization, Dynamic Zone, Roboflow Custom Metadata, Image Slicer, Stitch OCR Detections, Crop Visualization, Corner Visualization, Multi-Label Classification Model, Pixelate Visualization, Local File Sink, Image Slicer, Mask Visualization, VLM as Classifier, Clip Comparison, Color Visualization, Polygon Visualization, Email Notification, Keypoint Detection Model, Size Measurement, Perspective Correction, Camera Calibration, OpenAI, Bounding Box Visualization, Buffer, Camera Focus, CogVLM, Instance Segmentation Model, Twilio SMS Notification, OpenAI, Dynamic Crop, Depth Estimation, Halo Visualization, Florence-2 Model, Dot Visualization, Classification Label Visualization, Webhook Sink, SIFT Comparison, Stability AI Image Generation, Florence-2 Model, LMM For Classification, Ellipse Visualization, Image Contours, Llama 3.2 Vision, Clip Comparison, Single-Label Classification Model, Grid Visualization, Stitch Images, OCR Model, Object Detection Model, Polygon Zone Visualization
- outputs: Triangle Visualization, Anthropic Claude, Distance Measurement, Roboflow Dataset Upload, Segment Anything 2 Model, Identify Outliers, Byte Tracker, Detections Stabilizer, Keypoint Visualization, VLM as Detector, PTZ Tracking (ONVIF), Multi-Label Classification Model, Path Deviation, Model Comparison Visualization, Dynamic Zone, Perception Encoder Embedding Model, Roboflow Custom Metadata, Image Slicer, Crop Visualization, Corner Visualization, Multi-Label Classification Model, Mask Visualization, Clip Comparison, Keypoint Detection Model, Bounding Box Visualization, CogVLM, Instance Segmentation Model, Dynamic Crop, Halo Visualization, Florence-2 Model, Time in Zone, Florence-2 Model, VLM as Detector, LMM For Classification, Ellipse Visualization, Llama 3.2 Vision, Line Counter, Clip Comparison, Single-Label Classification Model, Stitch Images, Byte Tracker, Object Detection Model, Trace Visualization, YOLO-World Model, Label Visualization, LMM, Model Monitoring Inference Aggregator, Time in Zone, Pixel Color Count, Image Preprocessing, Keypoint Detection Model, Relative Static Crop, Byte Tracker, Image Threshold, Reference Path Visualization, Slack Notification, Stability AI Outpainting, Instance Segmentation Model, Roboflow Dataset Upload, Google Vision OCR, Stability AI Inpainting, Background Color Visualization, Circle Visualization, Image Blur, Google Gemini, OpenAI, Line Counter Visualization, Line Counter, VLM as Classifier, Local File Sink, Image Slicer, Cache Set, VLM as Classifier, Color Visualization, Polygon Visualization, Email Notification, Size Measurement, Single-Label Classification Model, Perspective Correction, Detections Consensus, Path Deviation, OpenAI, Buffer, Detections Stitch, Twilio SMS Notification, OpenAI, Detections Classes Replacement, Dot Visualization, Template Matching, Classification Label Visualization, Webhook Sink, SIFT Comparison, Stability AI Image Generation, Object Detection Model, Identify Changes, Grid Visualization, CLIP Embedding Model, Cache Get, Polygon Zone Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v2 has.
Bindings
- input:
  - images (image): The image to infer on.
  - classes (list_of_values): List of classes to calculate similarity against each input image.
  - version (string): Variant of CLIP model.
- output:
  - similarities (list_of_values): List of values of any type.
  - max_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - most_similar_class (string): String value.
  - min_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - least_similar_class (string): String value.
  - classification_predictions (classification_prediction): Predictions from classifier.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
Example JSON definition of step Clip Comparison in version v2:

```json
{
  "name": "<your_step_name_here>",
  "type": "roboflow_core/clip_comparison@v2",
  "images": "$inputs.image",
  "classes": [
    "a",
    "b",
    "c"
  ],
  "version": "ViT-B-16"
}
```
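For context, a minimal end-to-end workflow specification embedding this step might look like the sketch below. The envelope (version, inputs, steps, outputs) follows the general Workflows specification format; the names image, clip, and top_class are placeholders of our choosing:

```json
{
  "version": "1.0",
  "inputs": [
    { "type": "WorkflowImage", "name": "image" }
  ],
  "steps": [
    {
      "name": "clip",
      "type": "roboflow_core/clip_comparison@v2",
      "images": "$inputs.image",
      "classes": ["cat", "dog"],
      "version": "ViT-B-16"
    }
  ],
  "outputs": [
    {
      "type": "JsonField",
      "name": "top_class",
      "selector": "$steps.clip.most_similar_class"
    }
  ]
}
```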
v1¶
Class: ClipComparisonBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or if an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field to add the block as a step in your workflow: roboflow_core/clip_comparison@v1
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Unique name of step in workflows. | ❌ |
texts | List[str] | List of texts to calculate similarity against each input image. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v1.
- inputs: Blur Visualization, Polygon Visualization, Triangle Visualization, Anthropic Claude, Trace Visualization, Size Measurement, Label Visualization, Clip Comparison, Perspective Correction, Camera Calibration, OpenAI, Absolute Static Crop, Image Preprocessing, Relative Static Crop, Bounding Box Visualization, Image Threshold, Reference Path Visualization, Buffer, Stability AI Outpainting, SIFT, Camera Focus, OpenAI, Dynamic Crop, Depth Estimation, Halo Visualization, Dimension Collapse, Stability AI Inpainting, Background Color Visualization, Dot Visualization, Florence-2 Model, Classification Label Visualization, SIFT Comparison, Circle Visualization, Image Blur, Keypoint Visualization, Stability AI Image Generation, Florence-2 Model, Google Gemini, Image Convert Grayscale, Line Counter Visualization, Model Comparison Visualization, Ellipse Visualization, Dynamic Zone, Image Contours, Llama 3.2 Vision, Image Slicer, Crop Visualization, Corner Visualization, Clip Comparison, Grid Visualization, Pixelate Visualization, Stitch Images, Image Slicer, Mask Visualization, Color Visualization, Polygon Zone Visualization
- outputs: Polygon Visualization, Triangle Visualization, Anthropic Claude, Email Notification, Trace Visualization, YOLO-World Model, Keypoint Detection Model, Size Measurement, Label Visualization, Perspective Correction, Detections Consensus, Path Deviation, Roboflow Dataset Upload, Time in Zone, OpenAI, Keypoint Detection Model, Bounding Box Visualization, Buffer, Reference Path Visualization, Color Visualization, Instance Segmentation Model, Roboflow Dataset Upload, Instance Segmentation Model, OpenAI, Halo Visualization, Florence-2 Model, Dot Visualization, Classification Label Visualization, Time in Zone, Webhook Sink, Circle Visualization, Keypoint Visualization, Florence-2 Model, VLM as Detector, LMM For Classification, Object Detection Model, VLM as Detector, Google Gemini, Line Counter Visualization, Path Deviation, Ellipse Visualization, Mask Visualization, Llama 3.2 Vision, Line Counter, Line Counter, Clip Comparison, Crop Visualization, Corner Visualization, VLM as Classifier, Grid Visualization, Cache Set, Object Detection Model, VLM as Classifier, Clip Comparison, Polygon Zone Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v1 has.
Bindings
- input:
  - images (image): The image to infer on.
  - texts (list_of_values): List of texts to calculate similarity against each input image.
- output:
  - similarity (list_of_values): List of values of any type.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
  - prediction_type (prediction_type): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1:

```json
{
  "name": "<your_step_name_here>",
  "type": "roboflow_core/clip_comparison@v1",
  "images": "$inputs.image",
  "texts": [
    "a",
    "b",
    "c"
  ]
}
```
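Since v1 exposes a raw similarity list rather than v2's aggregated fields (max_similarity, most_similar_class, and so on), picking the best-matching text happens downstream. A minimal sketch of surfacing the list as a workflow output, following the general Workflows output conventions (the output name is a placeholder):

```json
{
  "type": "JsonField",
  "name": "clip_similarity",
  "selector": "$steps.<your_step_name_here>.similarity"
}
```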