Clip Comparison¶
v2¶
Class: ClipComparisonBlockV2
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v2.ClipComparisonBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or to check whether an image contains NSFW material.
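Under the hood, CLIP embeds the image and each text prompt into a shared vector space and scores each prompt by how close its embedding is to the image's. A minimal sketch of that comparison, using made-up 4-dimensional vectors in place of real CLIP embeddings (all values here are hypothetical):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Pretend these came from CLIP's image and text encoders (hypothetical data).
image_embedding = [0.9, 0.1, 0.2, 0.1]
text_embeddings = {
    "a car": [0.8, 0.2, 0.1, 0.2],
    "a truck": [0.1, 0.9, 0.3, 0.1],
}

similarities = {
    label: cosine_similarity(image_embedding, emb)
    for label, emb in text_embeddings.items()
}
best = max(similarities, key=similarities.get)
print(best)  # "a car" — its embedding points in nearly the same direction as the image's
```

The block performs this comparison for every class you supply, which is why no training step is needed: adding a new class is just adding a new text prompt.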
Type identifier¶
Use the following identifier in the step "type" field to add the block as a step in your workflow: roboflow_core/clip_comparison@v2
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Unique name of step in workflows. | ❌ |
classes | List[str] | List of classes to calculate similarity against each input image. | ✅ |
version | str | Variant of CLIP model. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v2.
- inputs: Relative Static Crop, Background Color Visualization, Line Counter Visualization, OpenAI, Roboflow Dataset Upload, Grid Visualization, LMM For Classification, Object Detection Model, Llama 3.2 Vision, Camera Focus, Color Visualization, Roboflow Dataset Upload, Label Visualization, CSV Formatter, Dot Visualization, Bounding Box Visualization, Single-Label Classification Model, OCR Model, Google Vision OCR, Dynamic Crop, Clip Comparison, Model Comparison Visualization, Anthropic Claude, Depth Estimation, Corner Visualization, SIFT Comparison, Camera Calibration, Keypoint Visualization, Mask Visualization, Image Threshold, Florence-2 Model, Halo Visualization, VLM as Detector, Polygon Zone Visualization, Clip Comparison, Google Gemini, Polygon Visualization, Image Preprocessing, Instance Segmentation Model, SIFT, Dynamic Zone, Twilio SMS Notification, Reference Path Visualization, Roboflow Custom Metadata, Blur Visualization, Image Contours, Pixelate Visualization, Keypoint Detection Model, OpenAI, Ellipse Visualization, Crop Visualization, Stitch Images, Florence-2 Model, Stitch OCR Detections, CogVLM, Image Slicer, LMM, Webhook Sink, Perspective Correction, Size Measurement, Dimension Collapse, Email Notification, Multi-Label Classification Model, Image Blur, Stability AI Image Generation, Image Convert Grayscale, Stability AI Inpainting, Model Monitoring Inference Aggregator, Triangle Visualization, Classification Label Visualization, Trace Visualization, Buffer, VLM as Classifier, Absolute Static Crop, Circle Visualization, Slack Notification, Local File Sink, Image Slicer
- outputs: Identify Changes, Line Counter Visualization, Relative Static Crop, Background Color Visualization, Detections Stabilizer, Object Detection Model, Cache Set, Color Visualization, Roboflow Dataset Upload, Object Detection Model, Bounding Box Visualization, Path Deviation, Clip Comparison, Identify Outliers, Model Comparison Visualization, Anthropic Claude, Corner Visualization, Line Counter, Multi-Label Classification Model, Keypoint Visualization, Mask Visualization, VLM as Detector, Florence-2 Model, Image Threshold, Polygon Zone Visualization, Google Gemini, Image Preprocessing, Twilio SMS Notification, Reference Path Visualization, Roboflow Custom Metadata, Detections Classes Replacement, Keypoint Detection Model, Ellipse Visualization, Byte Tracker, Crop Visualization, Stitch Images, Line Counter, LMM, Webhook Sink, Email Notification, Size Measurement, Image Blur, Stability AI Inpainting, Triangle Visualization, Classification Label Visualization, Keypoint Detection Model, Path Deviation, Time in Zone, VLM as Classifier, Local File Sink, OpenAI, Roboflow Dataset Upload, Grid Visualization, LMM For Classification, Llama 3.2 Vision, Instance Segmentation Model, Pixel Color Count, Label Visualization, YOLO-World Model, Dot Visualization, Byte Tracker, Detections Consensus, Template Matching, Single-Label Classification Model, Google Vision OCR, Dynamic Crop, Byte Tracker, SIFT Comparison, Halo Visualization, Clip Comparison, Polygon Visualization, Instance Segmentation Model, Dynamic Zone, CLIP Embedding Model, VLM as Detector, Detections Stitch, OpenAI, Florence-2 Model, CogVLM, Image Slicer, Cache Get, Perspective Correction, Multi-Label Classification Model, Time in Zone, Stability AI Image Generation, Model Monitoring Inference Aggregator, Trace Visualization, Distance Measurement, Single-Label Classification Model, Buffer, VLM as Classifier, Segment Anything 2 Model, Circle Visualization, Slack Notification, Image Slicer
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v2 has.
Bindings
- input
  - images (image): The image to infer on.
  - classes (list_of_values): List of classes to calculate similarity against each input image.
  - version (string): Variant of CLIP model.
- output
  - similarities (list_of_values): List of values of any type.
  - max_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - most_similar_class (string): String value.
  - min_similarity (float_zero_to_one): float value in range [0.0, 1.0].
  - least_similar_class (string): String value.
  - classification_predictions (classification_prediction): Predictions from classifier.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
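The scalar outputs are simple reductions of the similarities list. Assuming the scores come back in the same order as the classes you supplied, they can be derived as below; this is an illustrative sketch with hypothetical scores, not the block's actual implementation:

```python
# Hypothetical block output: one similarity score per class, in input order.
classes = ["a car", "a truck", "a bicycle"]
similarities = [0.91, 0.42, 0.17]

# max_similarity / most_similar_class: the best-matching prompt.
max_similarity = max(similarities)
most_similar_class = classes[similarities.index(max_similarity)]

# min_similarity / least_similar_class: the worst-matching prompt.
min_similarity = min(similarities)
least_similar_class = classes[similarities.index(min_similarity)]

print(most_similar_class, least_similar_class)  # a car a bicycle
```

In most workflows you will only read `most_similar_class` (as a zero-shot label) or `classification_predictions` (to feed downstream classification-consuming blocks).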
Example JSON definition of step Clip Comparison in version v2:

```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v2",
    "images": "$inputs.image",
    "classes": [
        "a",
        "b",
        "c"
    ],
    "version": "ViT-B-16"
}
```
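A step like the one above lives inside a larger workflow specification that declares the workflow's inputs and which step fields to expose as outputs. The sketch below assembles such a specification in Python; the input name `image`, step name `clip`, and output name are illustrative choices, not required values:

```python
import json

# Minimal workflow specification embedding the clip_comparison@v2 step.
workflow_definition = {
    "version": "1.0",
    "inputs": [
        # Declares a runtime image input; the step references it via "$inputs.image".
        {"type": "InferenceImage", "name": "image"},
    ],
    "steps": [
        {
            "name": "clip",
            "type": "roboflow_core/clip_comparison@v2",
            "images": "$inputs.image",
            "classes": ["a", "b", "c"],
            "version": "ViT-B-16",
        }
    ],
    "outputs": [
        # Exposes one of the step's binding outputs under a workflow-level name.
        {
            "type": "JsonField",
            "name": "most_similar_class",
            "selector": "$steps.clip.most_similar_class",
        }
    ],
}

serialized = json.dumps(workflow_definition, indent=2)
```

Note the `$steps.<step_name>.<output>` selector convention: the step name chosen in `"name"` is what downstream steps and workflow outputs use to reference this block's results.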
v1¶
Class: ClipComparisonBlockV1
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.models.foundation.clip_comparison.v1.ClipComparisonBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
Use the OpenAI CLIP zero-shot classification model to classify images.
This block accepts an image and a list of text prompts. The block then returns the similarity of each text label to the provided image.
This block is useful for classifying images without having to train a fine-tuned classification model. For example, you could use CLIP to classify the type of vehicle in an image, or to check whether an image contains NSFW material.
Type identifier¶
Use the following identifier in the step "type" field to add the block as a step in your workflow: roboflow_core/clip_comparison@v1
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Unique name of step in workflows. | ❌ |
texts | List[str] | List of texts to calculate similarity against each input image. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to Clip Comparison in version v1.
- inputs: Image Preprocessing, Relative Static Crop, Background Color Visualization, Line Counter Visualization, SIFT, Dynamic Zone, OpenAI, Reference Path Visualization, Grid Visualization, Blur Visualization, Image Contours, Pixelate Visualization, Llama 3.2 Vision, Camera Focus, Color Visualization, Ellipse Visualization, Label Visualization, Image Slicer, Crop Visualization, Stitch Images, Polygon Zone Visualization, Dot Visualization, Bounding Box Visualization, Florence-2 Model, Image Slicer, Perspective Correction, Size Measurement, Dimension Collapse, Image Blur, Stability AI Image Generation, Dynamic Crop, Image Convert Grayscale, Stability AI Inpainting, Clip Comparison, Clip Comparison, Model Comparison Visualization, Triangle Visualization, Classification Label Visualization, Trace Visualization, Anthropic Claude, Buffer, Depth Estimation, Corner Visualization, SIFT Comparison, Camera Calibration, Absolute Static Crop, Keypoint Visualization, Mask Visualization, Image Threshold, Florence-2 Model, Halo Visualization, Circle Visualization, Google Gemini, Polygon Visualization
- outputs: Instance Segmentation Model, Line Counter Visualization, OpenAI, Reference Path Visualization, Grid Visualization, VLM as Detector, LMM For Classification, Object Detection Model, Keypoint Detection Model, Cache Set, Llama 3.2 Vision, Color Visualization, Instance Segmentation Model, Ellipse Visualization, Label Visualization, YOLO-World Model, Crop Visualization, Dot Visualization, Line Counter, Object Detection Model, Florence-2 Model, Bounding Box Visualization, Detections Consensus, Webhook Sink, Email Notification, Size Measurement, Perspective Correction, Path Deviation, Time in Zone, Clip Comparison, Clip Comparison, Anthropic Claude, Classification Label Visualization, Triangle Visualization, Keypoint Detection Model, Buffer, Trace Visualization, VLM as Classifier, Corner Visualization, Line Counter, Path Deviation, Time in Zone, Circle Visualization, Keypoint Visualization, Mask Visualization, VLM as Detector, Florence-2 Model, Halo Visualization, VLM as Classifier, Polygon Zone Visualization, Google Gemini, Polygon Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds Clip Comparison in version v1 has.
Bindings
- input
  - images (image): The image to infer on.
  - texts (list_of_values): List of texts to calculate similarity against each input image.
- output
  - similarity (list_of_values): List of values of any type.
  - parent_id (parent_id): Identifier of parent for step output.
  - root_parent_id (parent_id): Identifier of parent for step output.
  - prediction_type (prediction_type): String value with type of prediction.
Example JSON definition of step Clip Comparison in version v1:

```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/clip_comparison@v1",
    "images": "$inputs.image",
    "texts": [
        "a",
        "b",
        "c"
    ]
}
```