VLM as Classifier¶
v2¶
Class: VLMAsClassifierBlockV2 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v2.VLMAsClassifierBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example:

```json
{"my": "json"}
```
Details regarding block behavior:

- `error_status` is set to `True` whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
Type identifier¶
Use the following identifier in the step `"type"` field to add the block as a step in your workflow: `roboflow_core/vlm_as_classifier@v2`.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `classes` | `List[str]` | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v2.
- inputs:
Blur Visualization,Classification Label Visualization,Circle Visualization,Crop Visualization,Image Contours,Relative Static Crop,Grid Visualization,OpenAI,Image Preprocessing,Perspective Correction,Ellipse Visualization,Absolute Static Crop,Clip Comparison,Stitch Images,Triangle Visualization,Contrast Equalization,Stability AI Inpainting,QR Code Generator,Image Slicer,Background Color Visualization,Polygon Zone Visualization,Stability AI Image Generation,Depth Estimation,Dot Visualization,Florence-2 Model,Dimension Collapse,Bounding Box Visualization,Camera Focus,Line Counter Visualization,Morphological Transformation,SIFT,Reference Path Visualization,Halo Visualization,SIFT Comparison,Icon Visualization,Buffer,Image Blur,Image Slicer,Polygon Visualization,Pixelate Visualization,Florence-2 Model,Image Threshold,Image Convert Grayscale,Clip Comparison,OpenAI,Color Visualization,Google Gemini,Label Visualization,Anthropic Claude,Llama 3.2 Vision,Google Gemini,Trace Visualization,Dynamic Zone,Dynamic Crop,Model Comparison Visualization,Size Measurement,Corner Visualization,Camera Calibration,Mask Visualization,Keypoint Visualization,Stability AI Outpainting - outputs:
Blur Visualization,Classification Label Visualization,Time in Zone,Circle Visualization,Crop Visualization,Single-Label Classification Model,Detections Classes Replacement,Perspective Correction,Twilio SMS Notification,Ellipse Visualization,Single-Label Classification Model,Triangle Visualization,Roboflow Dataset Upload,Stability AI Inpainting,Roboflow Dataset Upload,Background Color Visualization,Polygon Zone Visualization,Model Monitoring Inference Aggregator,Segment Anything 2 Model,Template Matching,Webhook Sink,Dot Visualization,Bounding Box Visualization,Line Counter Visualization,Instance Segmentation Model,Reference Path Visualization,Halo Visualization,Gaze Detection,Multi-Label Classification Model,SIFT Comparison,Icon Visualization,Polygon Visualization,Time in Zone,Pixelate Visualization,Slack Notification,Instance Segmentation Model,Color Visualization,PTZ Tracking (ONVIF),Keypoint Detection Model,Keypoint Detection Model,Object Detection Model,Label Visualization,Email Notification,Trace Visualization,Dynamic Zone,Multi-Label Classification Model,Detections Consensus,Model Comparison Visualization,Email Notification,Corner Visualization,Mask Visualization,Time in Zone,Keypoint Visualization,Roboflow Custom Metadata,Object Detection Model
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
VLM as Classifier in version v2 has.
Bindings
- input
    - `image` (`image`): The image which was the base to generate VLM prediction.
    - `vlm_output` (`language_model_output`): The string with raw classification prediction to parse.
    - `classes` (`list_of_values`): List of all classes used by the model, required to generate mapping between class name and class id.
- output
    - `error_status` (`boolean`): Boolean flag.
    - `predictions` (`classification_prediction`): Predictions from classifier.
    - `inference_id` (`inference_id`): Inference identifier.
Example JSON definition of step VLM as Classifier in version v2
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/vlm_as_classifier@v2",
    "image": "$inputs.image",
    "vlm_output": [
        "$steps.lmm.output"
    ],
    "classes": [
        "$steps.lmm.classes",
        "$inputs.classes",
        [
            "class_a",
            "class_b"
        ]
    ]
}
```
v1¶
Class: VLMAsClassifierBlockV1 (there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v1.VLMAsClassifierBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example:

```json
{"my": "json"}
```
Details regarding block behavior:

- `error_status` is set to `True` whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
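Both v1 and v2 rely on the `classes` property to translate class names in the parsed prediction into numeric class ids. As a hedged illustration (this is not the library's actual implementation), such a mapping can be built from the positional order of the configured class list:

```python
def build_class_mapping(classes: list[str]) -> dict[str, int]:
    """Map each class name to its positional class id.

    Illustrative only: sketches why the `classes` property is required
    to generate a mapping between class name and class id.
    """
    return {name: idx for idx, name in enumerate(classes)}


mapping = build_class_mapping(["class_a", "class_b"])
print(mapping)  # class_a -> 0, class_b -> 1
```

A class name returned by the VLM that is absent from this list would have no id, which is one reason the full class list must be supplied to the block.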
Type identifier¶
Use the following identifier in the step `"type"` field to add the block as a step in your workflow: `roboflow_core/vlm_as_classifier@v1`.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `classes` | `List[str]` | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v1.
- inputs:
Blur Visualization,Classification Label Visualization,Circle Visualization,Crop Visualization,Image Contours,Relative Static Crop,Grid Visualization,OpenAI,Image Preprocessing,Perspective Correction,Ellipse Visualization,Absolute Static Crop,Clip Comparison,Stitch Images,Triangle Visualization,Contrast Equalization,Stability AI Inpainting,QR Code Generator,Image Slicer,Background Color Visualization,Polygon Zone Visualization,Stability AI Image Generation,Depth Estimation,Dot Visualization,Florence-2 Model,Dimension Collapse,Bounding Box Visualization,Camera Focus,Line Counter Visualization,Morphological Transformation,SIFT,Reference Path Visualization,Halo Visualization,SIFT Comparison,Icon Visualization,Buffer,Image Blur,Image Slicer,Polygon Visualization,Pixelate Visualization,Florence-2 Model,Image Threshold,Image Convert Grayscale,Clip Comparison,OpenAI,Color Visualization,Google Gemini,Label Visualization,Anthropic Claude,Llama 3.2 Vision,Google Gemini,Trace Visualization,Dynamic Zone,Dynamic Crop,Model Comparison Visualization,Size Measurement,Corner Visualization,Camera Calibration,Mask Visualization,Keypoint Visualization,Stability AI Outpainting - outputs:
Google Vision OCR,SAM 3,Image Preprocessing,LMM For Classification,Ellipse Visualization,Triangle Visualization,QR Code Generator,Background Color Visualization,Model Monitoring Inference Aggregator,Segment Anything 2 Model,Template Matching,Distance Measurement,Dot Visualization,Halo Visualization,Slack Notification,Color Visualization,Llama 3.2 Vision,Line Counter,Size Measurement,Email Notification,Corner Visualization,Mask Visualization,Time in Zone,Roboflow Custom Metadata,Stability AI Outpainting,Time in Zone,Crop Visualization,Perspective Correction,Single-Label Classification Model,Contrast Equalization,Polygon Zone Visualization,CLIP Embedding Model,Bounding Box Visualization,Icon Visualization,Image Blur,Time in Zone,Path Deviation,Anthropic Claude,Multi-Label Classification Model,Dynamic Crop,Path Deviation,Detections Consensus,Model Comparison Visualization,Cache Get,Local File Sink,Classification Label Visualization,Circle Visualization,Stability AI Inpainting,Moondream2,Florence-2 Model,Morphological Transformation,Reference Path Visualization,Gaze Detection,SIFT Comparison,Polygon Visualization,Florence-2 Model,Clip Comparison,Perception Encoder Embedding Model,Instance Segmentation Model,OpenAI,PTZ Tracking (ONVIF),Line Counter,Keypoint Detection Model,Object Detection Model,Google Gemini,Label Visualization,Email Notification,Trace Visualization,Dynamic Zone,YOLO-World Model,OpenAI,CogVLM,Stitch OCR Detections,Detections Stitch,Cache Set,Blur Visualization,Single-Label Classification Model,OpenAI,Detections Classes Replacement,Twilio SMS Notification,Seg Preview,Roboflow Dataset Upload,Roboflow Dataset Upload,Stability AI Image Generation,Webhook Sink,Line Counter Visualization,Instance Segmentation Model,Multi-Label Classification Model,Pixelate Visualization,Image Threshold,Keypoint Detection Model,LMM,Google Gemini,Pixel Color Count,Keypoint Visualization,Object Detection Model
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
VLM as Classifier in version v1 has.
Bindings
- input
    - `image` (`image`): The image which was the base to generate VLM prediction.
    - `vlm_output` (`language_model_output`): The string with raw classification prediction to parse.
    - `classes` (`list_of_values`): List of all classes used by the model, required to generate mapping between class name and class id.
- output
    - `error_status` (`boolean`): Boolean flag.
    - `predictions` (`classification_prediction`): Predictions from classifier.
    - `inference_id` (`string`): String value.
Example JSON definition of step VLM as Classifier in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/vlm_as_classifier@v1",
    "image": "$inputs.image",
    "vlm_output": [
        "$steps.lmm.output"
    ],
    "classes": [
        "$steps.lmm.classes",
        "$inputs.classes",
        [
            "class_a",
            "class_b"
        ]
    ]
}
```