VLM as Classifier¶
v2¶
Class: VLMAsClassifierBlockV2
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v2.VLMAsClassifierBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects a string produced by blocks that expose Large Language Models (LLMs) or Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:
- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)
Example:
{"my": "json"}
Details regarding block behavior:
- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
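The parsing behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the block's actual implementation: the function name `parse_vlm_output` and the regular expression are hypothetical, and the real block additionally converts the parsed document into a classification prediction.

```python
import json
import re
from typing import Any, Optional, Tuple

# Build the Markdown fence marker programmatically (three backticks).
FENCE = "`" * 3
# Match a fenced block, optionally tagged "json", and capture its body.
FENCE_PATTERN = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def parse_vlm_output(raw: str) -> Tuple[bool, Optional[Any]]:
    """Return (error_status, parsed_document) for a raw model response.

    Hypothetical helper: mirrors the documented rules only.
    """
    # re.search returns the first match, so later fenced blocks are ignored.
    match = FENCE_PATTERN.search(raw)
    candidate = match.group(1) if match else raw
    try:
        return False, json.loads(candidate)
    except json.JSONDecodeError:
        # Parsing could not be completed: flag the error.
        return True, None
```

Returning the flag next to the parsed document, instead of raising, matches the documented `error_status` output and lets downstream steps decide how to handle a malformed response.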
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v2 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
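To make the purpose of `classes` concrete, here is a minimal sketch of a mapping between class names and class ids. It assumes that ids follow the order of the list and start at 0; the helper name `build_class_mapping` is hypothetical and the block's real mapping logic may differ.

```python
from typing import Dict, List


def build_class_mapping(classes: List[str]) -> Dict[str, int]:
    """Map each class name to an integer id (hypothetical helper)."""
    # Assumption: ids are assigned by position in the list, starting at 0.
    return {name: idx for idx, name in enumerate(classes)}


mapping = build_class_mapping(["class_a", "class_b"])
```

Under this assumption the order of `classes` matters, so the list should be kept consistent with the classes the model was prompted with.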
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v2.
- inputs: Grid Visualization, Ellipse Visualization, Image Blur, Image Preprocessing, Image Slicer, Llama 3.2 Vision, OpenAI, Dynamic Crop, Absolute Static Crop, Color Visualization, Line Counter Visualization, Corner Visualization, Florence-2 Model, Google Gemini, Depth Estimation, SIFT Comparison, Stability AI Outpainting, Keypoint Visualization, Image Convert Grayscale, Trace Visualization, Clip Comparison, Background Color Visualization, Dimension Collapse, QR Code Generator, Model Comparison Visualization, Mask Visualization, Image Slicer, Polygon Zone Visualization, Clip Comparison, Anthropic Claude, Size Measurement, Buffer, Image Threshold, Camera Focus, Contrast Equalization, Polygon Visualization, Stability AI Inpainting, Dot Visualization, Morphological Transformation, Classification Label Visualization, Relative Static Crop, Circle Visualization, Bounding Box Visualization, Camera Calibration, Dynamic Zone, Florence-2 Model, Blur Visualization, Image Contours, Stitch Images, OpenAI, Halo Visualization, Reference Path Visualization, Triangle Visualization, Pixelate Visualization, Perspective Correction, SIFT, Icon Visualization, Label Visualization, Stability AI Image Generation, Crop Visualization
- outputs: Ellipse Visualization, Instance Segmentation Model, Single-Label Classification Model, Time in Zone, Multi-Label Classification Model, Roboflow Dataset Upload, Line Counter Visualization, Color Visualization, Corner Visualization, Keypoint Detection Model, Slack Notification, Keypoint Visualization, SIFT Comparison, PTZ Tracking (ONVIF), Trace Visualization, Gaze Detection, Roboflow Custom Metadata, Twilio SMS Notification, Background Color Visualization, Keypoint Detection Model, Email Notification, Time in Zone, Model Comparison Visualization, Single-Label Classification Model, Segment Anything 2 Model, Mask Visualization, Model Monitoring Inference Aggregator, Polygon Zone Visualization, Multi-Label Classification Model, Detections Consensus, Polygon Visualization, Stability AI Inpainting, Dot Visualization, Object Detection Model, Template Matching, Classification Label Visualization, Time in Zone, Instance Segmentation Model, Detections Classes Replacement, Circle Visualization, Bounding Box Visualization, Dynamic Zone, Blur Visualization, Object Detection Model, Halo Visualization, Reference Path Visualization, Roboflow Dataset Upload, Triangle Visualization, Pixelate Visualization, Perspective Correction, Webhook Sink, Icon Visualization, Label Visualization, Crop Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v2 has.
Bindings
- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (inference_id): Inference identifier.
Example JSON definition of step VLM as Classifier in version v2
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v2",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}
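The lists in the definition above appear to enumerate alternative example values for each field rather than a literal configuration; that reading is an assumption based on how the values are grouped. Under it, a concrete step would pick one value per field, as in this sketch. The step name `parser` and the upstream step name `lmm` are hypothetical.

```python
import json

# One concrete choice per field; names are placeholders.
step = {
    "name": "parser",  # hypothetical step name
    "type": "roboflow_core/vlm_as_classifier@v2",
    "image": "$inputs.image",
    # Selector pointing at the raw output of an upstream VLM step named "lmm".
    "vlm_output": "$steps.lmm.output",
    # Static class list; a selector such as "$inputs.classes" could be used instead.
    "classes": ["class_a", "class_b"],
}

# Serialize to JSON so it can be pasted into a workflow definition.
step_json = json.dumps(step, indent=2)
```

Passing `classes` as a selector keeps the class list configurable at runtime, which is what the ✅ in the Refs column indicates.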
v1¶
Class: VLMAsClassifierBlockV1
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v1.VLMAsClassifierBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects a string produced by blocks that expose Large Language Models (LLMs) or Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:
- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)
Example:
{"my": "json"}
Details regarding block behavior:
- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v1 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v1.
- inputs: Grid Visualization, Ellipse Visualization, Image Blur, Image Preprocessing, Image Slicer, Llama 3.2 Vision, OpenAI, Dynamic Crop, Absolute Static Crop, Color Visualization, Line Counter Visualization, Corner Visualization, Florence-2 Model, Google Gemini, Depth Estimation, SIFT Comparison, Stability AI Outpainting, Keypoint Visualization, Image Convert Grayscale, Trace Visualization, Clip Comparison, Background Color Visualization, Dimension Collapse, QR Code Generator, Model Comparison Visualization, Mask Visualization, Image Slicer, Polygon Zone Visualization, Clip Comparison, Anthropic Claude, Size Measurement, Buffer, Image Threshold, Camera Focus, Contrast Equalization, Polygon Visualization, Stability AI Inpainting, Dot Visualization, Morphological Transformation, Classification Label Visualization, Relative Static Crop, Circle Visualization, Bounding Box Visualization, Camera Calibration, Dynamic Zone, Florence-2 Model, Blur Visualization, Image Contours, Stitch Images, OpenAI, Halo Visualization, Reference Path Visualization, Triangle Visualization, Pixelate Visualization, Perspective Correction, SIFT, Icon Visualization, Label Visualization, Stability AI Image Generation, Crop Visualization
- outputs: OpenAI, Image Preprocessing, Dynamic Crop, Multi-Label Classification Model, Roboflow Dataset Upload, Moondream2, Corner Visualization, Google Gemini, Keypoint Detection Model, PTZ Tracking (ONVIF), Keypoint Detection Model, Email Notification, Time in Zone, Model Comparison Visualization, Single-Label Classification Model, Mask Visualization, Model Monitoring Inference Aggregator, Line Counter, OpenAI, Morphological Transformation, Classification Label Visualization, Time in Zone, Dynamic Zone, Florence-2 Model, Cache Set, Triangle Visualization, Pixel Color Count, Stability AI Image Generation, Llama 3.2 Vision, Ellipse Visualization, CogVLM, Florence-2 Model, Local File Sink, Distance Measurement, Background Color Visualization, QR Code Generator, Segment Anything 2 Model, Anthropic Claude, Polygon Visualization, Instance Segmentation Model, Detections Classes Replacement, OpenAI, Object Detection Model, Halo Visualization, Stability AI Inpainting, Image Blur, Instance Segmentation Model, Time in Zone, LMM, Color Visualization, Stability AI Outpainting, Keypoint Visualization, Trace Visualization, Clip Comparison, Google Vision OCR, YOLO-World Model, Size Measurement, Multi-Label Classification Model, Detections Consensus, Image Threshold, Contrast Equalization, Path Deviation, Path Deviation, Blur Visualization, Roboflow Dataset Upload, Perspective Correction, Icon Visualization, Object Detection Model, Label Visualization, Stitch OCR Detections, Single-Label Classification Model, Line Counter Visualization, Line Counter, Slack Notification, SIFT Comparison, Roboflow Custom Metadata, Gaze Detection, Perception Encoder Embedding Model, Twilio SMS Notification, Cache Get, Polygon Zone Visualization, Detections Stitch, Dot Visualization, LMM For Classification, Template Matching, CLIP Embedding Model, Circle Visualization, Bounding Box Visualization, Reference Path Visualization, Pixelate Visualization, Webhook Sink, Crop Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v1 has.
Bindings
- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (string): String value.
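Because `error_status` signals that parsing failed, code consuming this block's outputs will usually want to check it before reading `predictions`. The sketch below assumes the step result is available as a plain dictionary keyed by the output names listed above; the helper name `usable_predictions` and the shape of the sample prediction are hypothetical.

```python
from typing import Any, Dict, Optional


def usable_predictions(step_output: Dict[str, Any]) -> Optional[Any]:
    """Return predictions only when parsing succeeded (hypothetical helper)."""
    # Treat a missing flag as a failure so malformed results are never used.
    if step_output.get("error_status", True):
        return None
    return step_output.get("predictions")


# Illustrative result; the real structure of "predictions" may differ.
sample = {
    "error_status": False,
    "predictions": {"top": "class_a"},
    "inference_id": "abc",
}
result = usable_predictions(sample)
```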
Example JSON definition of step VLM as Classifier in version v1
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v1",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}