VLM as Classifier¶
v2¶
Class: VLMAsClassifierBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v2.VLMAsClassifierBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects a string input produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example:

{"my": "json"}

Details regarding block behavior:

- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first one is parsed (illustrated in the sketch below)
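A minimal, illustrative sketch of that behavior is shown below; it is not the block's actual implementation. The helper name parse_vlm_output and the payload keys class_name and confidence are assumptions made for this example only. The snippet extracts the first Markdown-fenced JSON block from a GPT-style response (falling back to treating the whole string as JSON) and returns an error flag when parsing fails, mirroring the error_status behavior described above.

import json
import re

def parse_vlm_output(raw: str):
    """Illustrative sketch of the documented behavior, not the library code."""
    # Prefer the first Markdown-fenced JSON block; any later blocks are ignored.
    match = re.search(r"```(?:json)?\s*(.*?)```", raw, flags=re.DOTALL)
    candidate = match.group(1) if match else raw
    try:
        return False, json.loads(candidate)  # error_status=False, parsed payload
    except json.JSONDecodeError:
        return True, None                    # error_status=True when parsing fails

# A GPT-style response wrapping JSON in Markdown tags (keys are illustrative):
response = 'Sure! ```json\n{"class_name": "class_a", "confidence": 0.7}\n``` Hope this helps.'
print(parse_vlm_output(response))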
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v2 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |

The Refs column marks the possibility to parametrise the property with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v2.
- inputs: Stability AI Inpainting, Florence-2 Model, Label Visualization, Depth Estimation, Corner Visualization, Triangle Visualization, Florence-2 Model, Background Color Visualization, Image Blur, Polygon Zone Visualization, Model Comparison Visualization, Line Counter Visualization, Camera Focus, Circle Visualization, Perspective Correction, Relative Static Crop, Grid Visualization, Stability AI Image Generation, Trace Visualization, Image Slicer, Clip Comparison, Blur Visualization, Dynamic Zone, Classification Label Visualization, Image Convert Grayscale, Image Preprocessing, Clip Comparison, Dimension Collapse, SIFT Comparison, OpenAI, Stitch Images, Reference Path Visualization, Stability AI Outpainting, Llama 3.2 Vision, Anthropic Claude, Size Measurement, Polygon Visualization, Camera Calibration, Buffer, Mask Visualization, SIFT, Bounding Box Visualization, Image Threshold, Keypoint Visualization, Ellipse Visualization, Crop Visualization, Color Visualization, Pixelate Visualization, Image Slicer, Google Gemini, Dynamic Crop, Image Contours, Absolute Static Crop, OpenAI, Halo Visualization, Dot Visualization
- outputs: Keypoint Detection Model, Stability AI Inpainting, Single-Label Classification Model, Template Matching, Model Monitoring Inference Aggregator, Label Visualization, Corner Visualization, Triangle Visualization, Background Color Visualization, Polygon Zone Visualization, Model Comparison Visualization, Line Counter Visualization, Circle Visualization, Perspective Correction, Trace Visualization, Blur Visualization, Dynamic Zone, Multi-Label Classification Model, Classification Label Visualization, Object Detection Model, Time in Zone, Slack Notification, SIFT Comparison, Detections Consensus, Gaze Detection, Reference Path Visualization, Polygon Visualization, Roboflow Dataset Upload, Webhook Sink, Segment Anything 2 Model, Time in Zone, Mask Visualization, Single-Label Classification Model, Bounding Box Visualization, Roboflow Custom Metadata, Keypoint Visualization, Ellipse Visualization, Crop Visualization, Color Visualization, Twilio SMS Notification, Email Notification, PTZ Tracking (ONVIF), Object Detection Model, Pixelate Visualization, Detections Classes Replacement, Instance Segmentation Model, Halo Visualization, Dot Visualization, Instance Segmentation Model, Roboflow Dataset Upload, Multi-Label Classification Model, Keypoint Detection Model
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v2 has.
Bindings

- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (inference_id): Inference identifier.
Example JSON definition of step VLM as Classifier in version v2
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v2",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}
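The example above enumerates alternative values that the vlm_output and classes fields can reference; in an actual specification each field takes a single value (vlm_output a selector to a language_model_output, classes a selector or a literal list). Below is a minimal sketch of a full workflow specification that wires this formatter behind an upstream VLM step. The upstream step's type identifier and fields (roboflow_core/open_ai@v2, task_type, api_key, ...) are assumptions made for illustration; only the "parser" step follows the schema documented on this page.

# Sketch of a workflow specification chaining an assumed upstream VLM step
# into VLM as Classifier v2. Upstream step fields are illustrative assumptions.
WORKFLOW_SPECIFICATION = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        {"type": "WorkflowParameter", "name": "api_key"},
    ],
    "steps": [
        {
            "type": "roboflow_core/open_ai@v2",  # assumed upstream VLM block
            "name": "lmm",
            "images": "$inputs.image",
            "task_type": "classification",       # assumed field and value
            "classes": ["class_a", "class_b"],
            "api_key": "$inputs.api_key",
        },
        {
            "type": "roboflow_core/vlm_as_classifier@v2",
            "name": "parser",
            "image": "$inputs.image",
            "vlm_output": "$steps.lmm.output",
            "classes": "$steps.lmm.classes",
        },
    ],
    "outputs": [
        {"type": "JsonField", "name": "predictions", "selector": "$steps.parser.predictions"},
        {"type": "JsonField", "name": "error_status", "selector": "$steps.parser.error_status"},
    ],
}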
v1¶
Class: VLMAsClassifierBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v1.VLMAsClassifierBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects a string input produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example:

{"my": "json"}

Details regarding block behavior:

- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first one is parsed
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v1 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |

The Refs column marks the possibility to parametrise the property with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v1.
- inputs: Stability AI Inpainting, Florence-2 Model, Label Visualization, Depth Estimation, Corner Visualization, Triangle Visualization, Florence-2 Model, Background Color Visualization, Image Blur, Polygon Zone Visualization, Model Comparison Visualization, Line Counter Visualization, Camera Focus, Circle Visualization, Perspective Correction, Relative Static Crop, Grid Visualization, Stability AI Image Generation, Trace Visualization, Image Slicer, Clip Comparison, Blur Visualization, Dynamic Zone, Classification Label Visualization, Image Convert Grayscale, Image Preprocessing, Clip Comparison, Dimension Collapse, SIFT Comparison, OpenAI, Stitch Images, Reference Path Visualization, Stability AI Outpainting, Llama 3.2 Vision, Anthropic Claude, Size Measurement, Polygon Visualization, Camera Calibration, Buffer, Mask Visualization, SIFT, Bounding Box Visualization, Image Threshold, Keypoint Visualization, Ellipse Visualization, Crop Visualization, Color Visualization, Pixelate Visualization, Image Slicer, Google Gemini, Dynamic Crop, Image Contours, Absolute Static Crop, OpenAI, Halo Visualization, Dot Visualization
- outputs: Florence-2 Model, Model Monitoring Inference Aggregator, Label Visualization, Florence-2 Model, Triangle Visualization, CogVLM, Image Blur, Model Comparison Visualization, Cache Set, Line Counter Visualization, Circle Visualization, Detections Stitch, Trace Visualization, Multi-Label Classification Model, Object Detection Model, Path Deviation, Detections Consensus, Gaze Detection, Reference Path Visualization, Llama 3.2 Vision, Polygon Visualization, Roboflow Dataset Upload, Segment Anything 2 Model, Time in Zone, Roboflow Custom Metadata, Single-Label Classification Model, Image Threshold, Path Deviation, CLIP Embedding Model, Keypoint Visualization, Ellipse Visualization, Crop Visualization, Color Visualization, Local File Sink, Google Gemini, Dynamic Crop, OpenAI, Instance Segmentation Model, Multi-Label Classification Model, Dot Visualization, Instance Segmentation Model, Roboflow Dataset Upload, Keypoint Detection Model, Keypoint Detection Model, Stability AI Inpainting, Line Counter, Google Vision OCR, Single-Label Classification Model, Template Matching, Corner Visualization, Background Color Visualization, Polygon Zone Visualization, Stability AI Image Generation, Perspective Correction, Cache Get, Line Counter, OpenAI, Clip Comparison, Blur Visualization, Dynamic Zone, Classification Label Visualization, Time in Zone, Image Preprocessing, Slack Notification, SIFT Comparison, OpenAI, Pixel Color Count, YOLO-World Model, Stability AI Outpainting, Perception Encoder Embedding Model, Anthropic Claude, Size Measurement, Webhook Sink, Mask Visualization, Bounding Box Visualization, Distance Measurement, Pixelate Visualization, Twilio SMS Notification, Email Notification, PTZ Tracking (ONVIF), Object Detection Model, Detections Classes Replacement, Halo Visualization, LMM For Classification, LMM
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v1 has.
Bindings

- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (string): String value.
Example JSON definition of step VLM as Classifier in version v1
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v1",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}
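To execute a specification containing this step, one option is the HTTP client shipped in the inference_sdk package. The snippet below is a hedged usage sketch, assuming a locally running inference server and that run_workflow accepts an ad-hoc specification; the server URL, API key, image path, and the SPECIFICATION placeholder are illustrative and not taken from this page.

# Hedged usage sketch: running a workflow containing
# roboflow_core/vlm_as_classifier@v1 (or @v2) against a local inference server.
from inference_sdk import InferenceHTTPClient

# Placeholder: fill with a complete specification, e.g. the v2 sketch above
# (swap the step type to roboflow_core/vlm_as_classifier@v1 if desired).
SPECIFICATION: dict = {}

client = InferenceHTTPClient(
    api_url="http://localhost:9001",  # assumed local inference server address
    api_key="<ROBOFLOW_API_KEY>",
)

result = client.run_workflow(
    specification=SPECIFICATION,
    images={"image": "path/to/image.jpg"},  # placeholder image reference
)
print(result)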