VLM as Classifier¶
v2¶
Class: VLMAsClassifierBlockV2
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v2.VLMAsClassifierBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input of the kind produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:
- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses), for example: {"my": "json"}
Details regarding block behavior (see the illustrative parsing sketch below):
- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first one will be parsed
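The two behaviors above can be pictured with the sketch below. This is an illustration only, not the block's actual implementation: in particular, the "class" field of the VLM response and the exact shape of the returned predictions dictionary are assumptions made for this example.

```python
import json
import re


def parse_vlm_classification(raw: str, classes: list) -> dict:
    """Illustrative parser: strip an optional Markdown fence, then load the JSON."""
    # If the VLM wrapped its answer in ``` fences, keep only the first block
    # (mirroring the behavior described above).
    fenced = re.findall(r"```(?:json)?\s*(.*?)```", raw, flags=re.DOTALL)
    payload = fenced[0] if fenced else raw
    try:
        parsed = json.loads(payload)
    except json.JSONDecodeError:
        # Parsing could not be completed -> error_status is set to True.
        return {"error_status": True, "predictions": None}
    # Build the class name -> class id mapping from the `classes` property.
    class_to_id = {name: idx for idx, name in enumerate(classes)}
    predicted = parsed.get("class")  # assumed response field, for illustration only
    if predicted not in class_to_id:
        return {"error_status": True, "predictions": None}
    return {
        "error_status": False,
        "predictions": {"class_name": predicted, "class_id": class_to_id[predicted]},
    }


# A GPT-style response wrapped in a Markdown fence:
raw_output = '```json\n{"class": "cat"}\n```'
print(parse_vlm_classification(raw_output, classes=["cat", "dog"]))
```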
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v2 to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v2.
- inputs: Anthropic Claude, Crop Visualization, SIFT, Stability AI Image Generation, Triangle Visualization, Blur Visualization, Background Color Visualization, Relative Static Crop, Color Visualization, Image Contours, Camera Focus, Corner Visualization, Line Counter Visualization, Icon Visualization, Mask Visualization, Image Convert Grayscale, Circle Visualization, Image Blur, Pixelate Visualization, Google Gemini, Absolute Static Crop, Model Comparison Visualization, Llama 3.2 Vision, Dynamic Zone, Image Threshold, Reference Path Visualization, Image Slicer, Stitch Images, Depth Estimation, Trace Visualization, OpenAI, Image Preprocessing, Classification Label Visualization, Polygon Visualization, Stability AI Outpainting, Keypoint Visualization, Dot Visualization, Grid Visualization, Clip Comparison, Bounding Box Visualization, Camera Calibration, Polygon Zone Visualization, Ellipse Visualization, QR Code Generator, Size Measurement, Halo Visualization, OpenAI, Perspective Correction, Florence-2 Model, Stability AI Inpainting, Florence-2 Model, Buffer, Image Slicer, Dimension Collapse, SIFT Comparison, Label Visualization, Clip Comparison, Dynamic Crop
- outputs: Crop Visualization, Segment Anything 2 Model, Keypoint Detection Model, Triangle Visualization, Blur Visualization, PTZ Tracking (ONVIF), Line Counter Visualization, Slack Notification, Background Color Visualization, Color Visualization, Corner Visualization, Multi-Label Classification Model, Mask Visualization, Icon Visualization, Pixelate Visualization, Circle Visualization, Gaze Detection, Model Comparison Visualization, Instance Segmentation Model, Time in Zone, Multi-Label Classification Model, Object Detection Model, Dynamic Zone, Keypoint Detection Model, Reference Path Visualization, Detections Consensus, Roboflow Dataset Upload, Roboflow Dataset Upload, Single-Label Classification Model, Trace Visualization, Classification Label Visualization, Polygon Visualization, Roboflow Custom Metadata, Keypoint Visualization, Time in Zone, Dot Visualization, Email Notification, Object Detection Model, Single-Label Classification Model, Bounding Box Visualization, Polygon Zone Visualization, Detections Classes Replacement, Ellipse Visualization, Label Visualization, Halo Visualization, Perspective Correction, Stability AI Inpainting, Model Monitoring Inference Aggregator, Template Matching, Instance Segmentation Model, Twilio SMS Notification, SIFT Comparison, Webhook Sink
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v2 has.
Bindings
- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (inference_id): Inference identifier.
Example JSON definition of step VLM as Classifier in version v2
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v2",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}
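To see where this step fits in practice, the sketch below wires an OpenAI VLM step into the formatter inside an inline workflow definition and runs it with the inference_sdk HTTP client. The surrounding workflow layout, the OpenAI step's property names, the task_type value, and the server URL are assumptions made for illustration, not part of this block's documented contract.

```python
# A minimal sketch, assuming a locally running inference server, the inference_sdk
# package, and an OpenAI VLM step feeding the formatter.
from inference_sdk import InferenceHTTPClient

WORKFLOW_SPECIFICATION = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        {"type": "WorkflowParameter", "name": "classes", "default_value": ["cat", "dog"]},
    ],
    "steps": [
        {
            "type": "roboflow_core/open_ai@v2",  # assumed identifier of the VLM step
            "name": "lmm",
            "images": "$inputs.image",
            "task_type": "classification",        # assumed task type value
            "classes": "$inputs.classes",
            "openai_api_key": "<YOUR_OPENAI_API_KEY>",  # assumed property name
        },
        {
            "type": "roboflow_core/vlm_as_classifier@v2",
            "name": "parser",
            "image": "$inputs.image",
            "vlm_output": "$steps.lmm.output",
            "classes": "$inputs.classes",
        },
    ],
    "outputs": [
        {"type": "JsonField", "name": "predictions", "selector": "$steps.parser.predictions"},
        {"type": "JsonField", "name": "error_status", "selector": "$steps.parser.error_status"},
    ],
}

client = InferenceHTTPClient(
    api_url="http://localhost:9001",  # assumed local inference server
    api_key="<YOUR_ROBOFLOW_API_KEY>",
)
result = client.run_workflow(
    specification=WORKFLOW_SPECIFICATION,
    images={"image": "path/to/image.jpg"},
)
print(result)
```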
v1¶
Class: VLMAsClassifierBlockV1
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v1.VLMAsClassifierBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input of the kind produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:
- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses), for example: {"my": "json"}
Details regarding block behavior:
- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first one will be parsed
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v1 to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v1.
- inputs: Anthropic Claude, Crop Visualization, SIFT, Stability AI Image Generation, Triangle Visualization, Blur Visualization, Background Color Visualization, Relative Static Crop, Color Visualization, Image Contours, Camera Focus, Corner Visualization, Line Counter Visualization, Icon Visualization, Mask Visualization, Image Convert Grayscale, Circle Visualization, Image Blur, Pixelate Visualization, Google Gemini, Absolute Static Crop, Model Comparison Visualization, Llama 3.2 Vision, Dynamic Zone, Image Threshold, Reference Path Visualization, Image Slicer, Stitch Images, Depth Estimation, Trace Visualization, OpenAI, Image Preprocessing, Classification Label Visualization, Polygon Visualization, Stability AI Outpainting, Keypoint Visualization, Dot Visualization, Grid Visualization, Clip Comparison, Bounding Box Visualization, Camera Calibration, Polygon Zone Visualization, Ellipse Visualization, QR Code Generator, Size Measurement, Halo Visualization, OpenAI, Perspective Correction, Florence-2 Model, Stability AI Inpainting, Florence-2 Model, Buffer, Image Slicer, Dimension Collapse, SIFT Comparison, Label Visualization, Clip Comparison, Dynamic Crop
- outputs: Anthropic Claude, Crop Visualization, Line Counter, Line Counter, LMM For Classification, Blur Visualization, PTZ Tracking (ONVIF), Line Counter Visualization, Color Visualization, Cache Set, Mask Visualization, Circle Visualization, Google Gemini, Object Detection Model, Multi-Label Classification Model, Dynamic Zone, Keypoint Detection Model, Detections Consensus, Trace Visualization, Image Preprocessing, Roboflow Custom Metadata, Object Detection Model, Cache Get, Polygon Zone Visualization, LMM, QR Code Generator, YOLO-World Model, Size Measurement, Halo Visualization, CLIP Embedding Model, Perspective Correction, Florence-2 Model, Stability AI Inpainting, Moondream2, Template Matching, Label Visualization, Webhook Sink, Distance Measurement, Pixel Color Count, Segment Anything 2 Model, Perception Encoder Embedding Model, Stability AI Image Generation, Keypoint Detection Model, Triangle Visualization, Background Color Visualization, Slack Notification, Corner Visualization, Multi-Label Classification Model, Path Deviation, Icon Visualization, Pixelate Visualization, Image Blur, Gaze Detection, Model Comparison Visualization, Llama 3.2 Vision, Instance Segmentation Model, Time in Zone, Image Threshold, Google Vision OCR, Reference Path Visualization, Roboflow Dataset Upload, CogVLM, Roboflow Dataset Upload, Single-Label Classification Model, OpenAI, Classification Label Visualization, Polygon Visualization, Keypoint Visualization, Stability AI Outpainting, Time in Zone, Dot Visualization, Email Notification, Local File Sink, OpenAI, Single-Label Classification Model, Bounding Box Visualization, Detections Classes Replacement, Ellipse Visualization, OpenAI, Florence-2 Model, Path Deviation, Model Monitoring Inference Aggregator, Twilio SMS Notification, Instance Segmentation Model, SIFT Comparison, Detections Stitch, Clip Comparison, Dynamic Crop
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v1 has.
Bindings
- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (string): String value.
Example JSON definition of step VLM as Classifier in version v1
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v1",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}
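As a final illustration, the fragment below extends the example above by feeding the parsed predictions into one of the downstream blocks from the compatibility list (Classification Label Visualization). The visualization block's type identifier and property names are assumptions made for this sketch, not taken from this page.

```python
# Hypothetical step list extending the v1 example above; the visualization step's
# identifier and property names are assumed for illustration only.
downstream_steps = [
    {
        "name": "parser",
        "type": "roboflow_core/vlm_as_classifier@v1",
        "image": "$inputs.image",
        "vlm_output": "$steps.lmm.output",
        "classes": "$inputs.classes",
    },
    {
        "name": "label_visualization",
        "type": "roboflow_core/classification_label_visualization@v1",  # assumed identifier
        "image": "$inputs.image",
        "predictions": "$steps.parser.predictions",  # classification_prediction output of this block
    },
]
```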