VLM as Classifier¶
v2¶
Class: VLMAsClassifierBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v2.VLMAsClassifierBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects a string input of the kind produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example:

```json
{"my": "json"}
```
Details regarding block behavior:

- `error_status` is set to `True` whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/vlm_as_classifier@v2` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `classes` | `List[str]` | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v2.
- inputs: Label Visualization, Blur Visualization, Background Color Visualization, Contrast Equalization, Bounding Box Visualization, Camera Calibration, Polygon Visualization, Stability AI Outpainting, Image Slicer, Keypoint Visualization, Reference Path Visualization, Pixelate Visualization, Icon Visualization, Clip Comparison, Triangle Visualization, Anthropic Claude, Model Comparison Visualization, Corner Visualization, Florence-2 Model, Image Preprocessing, Google Gemini, Google Gemini, Color Visualization, SIFT Comparison, Line Counter Visualization, Grid Visualization, Stitch Images, Halo Visualization, Size Measurement, Stability AI Image Generation, Anthropic Claude, QR Code Generator, Circle Visualization, Image Contours, Dynamic Zone, Relative Static Crop, Dot Visualization, Polygon Zone Visualization, Ellipse Visualization, Llama 3.2 Vision, Image Blur, Dimension Collapse, Clip Comparison, Absolute Static Crop, Depth Estimation, Image Slicer, OpenAI, Morphological Transformation, Stability AI Inpainting, Dynamic Crop, Camera Focus, Crop Visualization, Image Threshold, Perspective Correction, Image Convert Grayscale, Mask Visualization, Trace Visualization, OpenAI, OpenAI, Florence-2 Model, Classification Label Visualization, Buffer, SIFT
- outputs: Label Visualization, Keypoint Detection Model, Time in Zone, Blur Visualization, Background Color Visualization, Keypoint Visualization, Bounding Box Visualization, Polygon Visualization, Reference Path Visualization, PTZ Tracking (ONVIF), Pixelate Visualization, Detections Classes Replacement, Single-Label Classification Model, Icon Visualization, Triangle Visualization, SAM 3, Template Matching, Roboflow Dataset Upload, Model Comparison Visualization, Corner Visualization, SAM 3, Color Visualization, SIFT Comparison, Object Detection Model, Line Counter Visualization, Email Notification, Halo Visualization, Dynamic Zone, Circle Visualization, Twilio SMS Notification, Time in Zone, Dot Visualization, Object Detection Model, Polygon Zone Visualization, Ellipse Visualization, Email Notification, Slack Notification, Model Monitoring Inference Aggregator, Instance Segmentation Model, Multi-Label Classification Model, Time in Zone, Roboflow Dataset Upload, Stability AI Inpainting, Gaze Detection, Single-Label Classification Model, Detections Consensus, Webhook Sink, Instance Segmentation Model, Crop Visualization, Perspective Correction, Trace Visualization, Mask Visualization, Multi-Label Classification Model, Roboflow Custom Metadata, Classification Label Visualization, Keypoint Detection Model, Segment Anything 2 Model
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v2 has.
Bindings
- input
  - `image` (`image`): The image which was the base to generate VLM prediction.
  - `vlm_output` (`language_model_output`): The string with raw classification prediction to parse.
  - `classes` (`list_of_values`): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - `error_status` (`boolean`): Boolean flag.
  - `predictions` (`classification_prediction`): Predictions from classifier.
  - `inference_id` (`inference_id`): Inference identifier.
Example JSON definition of step VLM as Classifier in version v2:

```json
{
  "name": "<your_step_name_here>",
  "type": "roboflow_core/vlm_as_classifier@v2",
  "image": "$inputs.image",
  "vlm_output": [
    "$steps.lmm.output"
  ],
  "classes": [
    "$steps.lmm.classes",
    "$inputs.classes",
    [
      "class_a",
      "class_b"
    ]
  ]
}
```
v1¶
Class: VLMAsClassifierBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v1.VLMAsClassifierBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects a string input of the kind produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example:

```json
{"my": "json"}
```
Details regarding block behavior:

- `error_status` is set to `True` whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/vlm_as_classifier@v1` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `classes` | `List[str]` | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v1.
- inputs: Label Visualization, Blur Visualization, Background Color Visualization, Contrast Equalization, Bounding Box Visualization, Camera Calibration, Polygon Visualization, Stability AI Outpainting, Image Slicer, Keypoint Visualization, Reference Path Visualization, Pixelate Visualization, Icon Visualization, Clip Comparison, Triangle Visualization, Anthropic Claude, Model Comparison Visualization, Corner Visualization, Florence-2 Model, Image Preprocessing, Google Gemini, Google Gemini, Color Visualization, SIFT Comparison, Line Counter Visualization, Grid Visualization, Stitch Images, Halo Visualization, Size Measurement, Stability AI Image Generation, Anthropic Claude, QR Code Generator, Circle Visualization, Image Contours, Dynamic Zone, Relative Static Crop, Dot Visualization, Polygon Zone Visualization, Ellipse Visualization, Llama 3.2 Vision, Image Blur, Dimension Collapse, Clip Comparison, Absolute Static Crop, Depth Estimation, Image Slicer, OpenAI, Morphological Transformation, Stability AI Inpainting, Dynamic Crop, Camera Focus, Crop Visualization, Image Threshold, Perspective Correction, Image Convert Grayscale, Mask Visualization, Trace Visualization, OpenAI, OpenAI, Florence-2 Model, Classification Label Visualization, Buffer, SIFT
- outputs: Label Visualization, Blur Visualization, Background Color Visualization, Contrast Equalization, Reference Path Visualization, Stability AI Outpainting, Pixelate Visualization, Single-Label Classification Model, Perception Encoder Embedding Model, Seg Preview, Image Preprocessing, Color Visualization, SIFT Comparison, Email Notification, Cache Set, Circle Visualization, Object Detection Model, Moondream2, Model Monitoring Inference Aggregator, Path Deviation, LMM, Time in Zone, Morphological Transformation, Gaze Detection, Detections Consensus, Crop Visualization, OpenAI, Florence-2 Model, Classification Label Visualization, Segment Anything 2 Model, Time in Zone, YOLO-World Model, PTZ Tracking (ONVIF), Icon Visualization, Distance Measurement, Line Counter Visualization, Halo Visualization, Size Measurement, Dynamic Zone, Twilio SMS Notification, Time in Zone, Detections Stitch, Llama 3.2 Vision, Image Blur, Slack Notification, OpenAI, Multi-Label Classification Model, OpenAI, Dynamic Crop, Pixel Color Count, Mask Visualization, Google Vision OCR, LMM For Classification, Keypoint Visualization, Bounding Box Visualization, SAM 3, SAM 3, Object Detection Model, Path Deviation, Anthropic Claude, Polygon Zone Visualization, Ellipse Visualization, Line Counter, Email Notification, Clip Comparison, Roboflow Dataset Upload, SAM 3, CogVLM, Multi-Label Classification Model, Roboflow Custom Metadata, Keypoint Detection Model, Stitch OCR Detections, Keypoint Detection Model, Line Counter, Polygon Visualization, CLIP Embedding Model, Detections Classes Replacement, Cache Get, Triangle Visualization, Template Matching, Roboflow Dataset Upload, Anthropic Claude, Model Comparison Visualization, Corner Visualization, Florence-2 Model, Google Gemini, Google Gemini, Stability AI Image Generation, QR Code Generator, Dot Visualization, Local File Sink, Instance Segmentation Model, Stability AI Inpainting, Single-Label Classification Model, Webhook Sink, Instance Segmentation Model, Image Threshold, Perspective Correction, OpenAI, Trace Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v1 has.
Bindings
- input
  - `image` (`image`): The image which was the base to generate VLM prediction.
  - `vlm_output` (`language_model_output`): The string with raw classification prediction to parse.
  - `classes` (`list_of_values`): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - `error_status` (`boolean`): Boolean flag.
  - `predictions` (`classification_prediction`): Predictions from classifier.
  - `inference_id` (`string`): String value.
Example JSON definition of step VLM as Classifier in version v1:

```json
{
  "name": "<your_step_name_here>",
  "type": "roboflow_core/vlm_as_classifier@v1",
  "image": "$inputs.image",
  "vlm_output": [
    "$steps.lmm.output"
  ],
  "classes": [
    "$steps.lmm.classes",
    "$inputs.classes",
    [
      "class_a",
      "class_b"
    ]
  ]
}
```