VLM as Classifier¶
v2¶
Class: VLMAsClassifierBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v2.VLMAsClassifierBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input of the kind produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example: {"my": "json"}
Details regarding block behavior:

- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first one will be parsed
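As a rough illustration of the parsing rules above, the sketch below shows how such output could be handled in plain Python. This is a hypothetical helper, not the block's actual implementation (which lives in the module listed under Source); the regex, the function name, and the omission of the class-name-to-class-id mapping are assumptions.

```python
import json
import re
from typing import Optional, Tuple

# Matches the content of a Markdown code fence (optionally tagged as json).
MARKDOWN_JSON_BLOCK = re.compile(r"`{3}(?:json)?\s*(.*?)`{3}", re.DOTALL)


def parse_vlm_output(vlm_output: str) -> Tuple[bool, Optional[dict]]:
    """Return (error_status, parsed_document) following the rules above."""
    blocks = MARKDOWN_JSON_BLOCK.findall(vlm_output)
    # Only the first Markdown block with raw JSON content is considered;
    # otherwise the whole string is treated as a JSON document.
    candidate = blocks[0] if blocks else vlm_output
    try:
        return False, json.loads(candidate)
    except json.JSONDecodeError:
        # error_status is set to True whenever parsing cannot be completed.
        return True, None
```

With this sketch, a raw JSON string and the same document wrapped in a Markdown fence both parse to the same dictionary, while malformed input yields error_status set to True.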
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v2 to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v2.
- inputs: Keypoint Visualization, Google Gemini, Image Contours, Circle Visualization, Image Threshold, Absolute Static Crop, Image Slicer, Perspective Correction, Color Visualization, Mask Visualization, Reference Path Visualization, Stitch Images, Image Blur, Florence-2 Model, Blur Visualization, Pixelate Visualization, Relative Static Crop, Clip Comparison, Clip Comparison, Dot Visualization, Image Slicer, Stability AI Inpainting, SIFT Comparison, Classification Label Visualization, Icon Visualization, Dimension Collapse, Polygon Zone Visualization, Depth Estimation, Polygon Visualization, Stability AI Image Generation, Dynamic Zone, Dynamic Crop, Grid Visualization, Crop Visualization, OpenAI, Ellipse Visualization, Buffer, Stability AI Outpainting, Trace Visualization, Bounding Box Visualization, Camera Calibration, Image Preprocessing, Image Convert Grayscale, Label Visualization, Corner Visualization, SIFT, QR Code Generator, Background Color Visualization, Size Measurement, Camera Focus, Model Comparison Visualization, Triangle Visualization, Florence-2 Model, Llama 3.2 Vision, Halo Visualization, OpenAI, Line Counter Visualization, Anthropic Claude
- outputs: Multi-Label Classification Model, Keypoint Detection Model, Keypoint Visualization, Email Notification, Roboflow Dataset Upload, Circle Visualization, Time in Zone, Perspective Correction, Color Visualization, Gaze Detection, Instance Segmentation Model, Mask Visualization, Reference Path Visualization, Single-Label Classification Model, PTZ Tracking (ONVIF), Blur Visualization, Pixelate Visualization, Keypoint Detection Model, Webhook Sink, Halo Visualization, Object Detection Model, Slack Notification, Dot Visualization, Stability AI Inpainting, Roboflow Dataset Upload, Detections Classes Replacement, SIFT Comparison, Classification Label Visualization, Icon Visualization, Time in Zone, Model Monitoring Inference Aggregator, Roboflow Custom Metadata, Polygon Zone Visualization, Instance Segmentation Model, Template Matching, Polygon Visualization, Dynamic Zone, Crop Visualization, Time in Zone, Detections Consensus, Trace Visualization, Single-Label Classification Model, Bounding Box Visualization, Multi-Label Classification Model, Segment Anything 2 Model, Label Visualization, Corner Visualization, Background Color Visualization, Triangle Visualization, Object Detection Model, Twilio SMS Notification, Model Comparison Visualization, Ellipse Visualization, Line Counter Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v2 has.
Bindings
- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (inference_id): Inference identifier.
Example JSON definition of step VLM as Classifier in version v2:
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v2",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}
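As a usage sketch, the step defined above could be embedded in a workflow specification and executed over HTTP with the inference_sdk client. Assumptions: the upstream VLM step named "lmm" is omitted for brevity, the api_url, API key, image path, and class list are placeholders, and the exact client arguments may differ between inference_sdk versions.

```python
from inference_sdk import InferenceHTTPClient

# Minimal workflow specification embedding the v2 step. The upstream VLM step
# named "lmm" (e.g. an OpenAI or Google Gemini block producing `output`) is
# intentionally left out of the "steps" list here.
WORKFLOW_DEFINITION = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        {"type": "WorkflowParameter", "name": "classes"},
    ],
    "steps": [
        # ... upstream VLM step named "lmm" goes here ...
        {
            "name": "parser",
            "type": "roboflow_core/vlm_as_classifier@v2",
            "image": "$inputs.image",
            "vlm_output": "$steps.lmm.output",
            "classes": "$inputs.classes",
        },
    ],
    "outputs": [
        {"type": "JsonField", "name": "predictions", "selector": "$steps.parser.predictions"},
        {"type": "JsonField", "name": "error_status", "selector": "$steps.parser.error_status"},
    ],
}

client = InferenceHTTPClient(api_url="http://localhost:9001", api_key="<YOUR_API_KEY>")
result = client.run_workflow(
    specification=WORKFLOW_DEFINITION,
    images={"image": "path/to/image.jpg"},
    parameters={"classes": ["class_a", "class_b"]},
)
```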
v1¶
Class: VLMAsClassifierBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v1.VLMAsClassifierBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input of the kind produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example: {"my": "json"}
Details regarding block behavior:

- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first one will be parsed
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v1 to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v1.
- inputs: Keypoint Visualization, Google Gemini, Image Contours, Circle Visualization, Image Threshold, Absolute Static Crop, Image Slicer, Perspective Correction, Color Visualization, Mask Visualization, Reference Path Visualization, Stitch Images, Image Blur, Florence-2 Model, Blur Visualization, Pixelate Visualization, Relative Static Crop, Clip Comparison, Clip Comparison, Dot Visualization, Image Slicer, Stability AI Inpainting, SIFT Comparison, Classification Label Visualization, Icon Visualization, Dimension Collapse, Polygon Zone Visualization, Depth Estimation, Polygon Visualization, Stability AI Image Generation, Dynamic Zone, Dynamic Crop, Grid Visualization, Crop Visualization, OpenAI, Ellipse Visualization, Buffer, Stability AI Outpainting, Trace Visualization, Bounding Box Visualization, Camera Calibration, Image Preprocessing, Image Convert Grayscale, Label Visualization, Corner Visualization, SIFT, QR Code Generator, Background Color Visualization, Size Measurement, Camera Focus, Model Comparison Visualization, Triangle Visualization, Florence-2 Model, Llama 3.2 Vision, Halo Visualization, OpenAI, Line Counter Visualization, Anthropic Claude
- outputs: Keypoint Visualization, Google Gemini, Path Deviation, Gaze Detection, Reference Path Visualization, Image Blur, Florence-2 Model, Local File Sink, Clip Comparison, Icon Visualization, Time in Zone, Polygon Zone Visualization, Instance Segmentation Model, Dynamic Zone, Dynamic Crop, Detections Consensus, Single-Label Classification Model, Perception Encoder Embedding Model, Pixel Color Count, QR Code Generator, Llama 3.2 Vision, Triangle Visualization, Line Counter Visualization, Multi-Label Classification Model, Email Notification, Roboflow Dataset Upload, Time in Zone, Single-Label Classification Model, Pixelate Visualization, Object Detection Model, Dot Visualization, Roboflow Dataset Upload, OpenAI, Model Monitoring Inference Aggregator, Trace Visualization, Stability AI Outpainting, Multi-Label Classification Model, CogVLM, Corner Visualization, Background Color Visualization, Halo Visualization, Ellipse Visualization, OpenAI, Anthropic Claude, Keypoint Detection Model, Circle Visualization, Image Threshold, Perspective Correction, Color Visualization, Instance Segmentation Model, PTZ Tracking (ONVIF), Blur Visualization, Keypoint Detection Model, Stability AI Inpainting, SIFT Comparison, Cache Get, Roboflow Custom Metadata, Template Matching, Stability AI Image Generation, Crop Visualization, Time in Zone, Segment Anything 2 Model, Size Measurement, Object Detection Model, Twilio SMS Notification, Model Comparison Visualization, CLIP Embedding Model, LMM, Path Deviation, Mask Visualization, Webhook Sink, Slack Notification, Cache Set, Detections Classes Replacement, YOLO-World Model, Classification Label Visualization, Polygon Visualization, OpenAI, LMM For Classification, Line Counter, Moondream2, Bounding Box Visualization, Distance Measurement, Image Preprocessing, Google Vision OCR, Label Visualization, Line Counter, Detections Stitch, Florence-2 Model
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v1 has.
Bindings
- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (string): String value.
Example JSON definition of step VLM as Classifier in version v1:
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v1",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}
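For completeness, a hedged sketch of consuming this block's outputs after a workflow run is shown below. The "top" and "confidence" field names inside predictions are assumed to follow the usual inference classification format and may differ; treat them as illustrative only.

```python
# Hypothetical post-processing of the v1 block outputs returned from a workflow
# run; field names inside `predictions` are illustrative assumptions.
def handle_vlm_classifier_output(step_output: dict) -> None:
    if step_output.get("error_status"):
        # Parsing failed: the VLM returned something that could not be
        # interpreted as JSON (or no parsable Markdown block was found).
        print("VLM output could not be parsed; ignoring predictions")
        return
    predictions = step_output.get("predictions") or {}
    print(
        "top class:", predictions.get("top"),
        "confidence:", predictions.get("confidence"),
        "inference_id:", step_output.get("inference_id"),
    )
```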