VLM as Classifier¶
v2¶
Class: `VLMAsClassifierBlockV2` (there are multiple versions of this block)
Source: `inference.core.workflows.core_steps.formatters.vlm_as_classifier.v2.VLMAsClassifierBlockV2`
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input of the kind produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example:

```json
{"my": "json"}
```
Details regarding block behavior:

- `error_status` is set `True` whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed (see the sketch below)
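To make these rules concrete, here is a minimal re-implementation of the documented parsing behavior in Python. It is a sketch, not the block's actual source; `parse_vlm_output` and the regex name are illustrative:

```python
import json
import re
from typing import Optional, Tuple

# Captures the body of a Markdown code fence, optionally tagged "json".
JSON_MARKDOWN_BLOCK = re.compile(r"```(?:json)?\s*(.*?)```", re.DOTALL | re.IGNORECASE)


def parse_vlm_output(raw: str) -> Tuple[bool, Optional[dict]]:
    """Return (error_status, parsed_document) for a raw VLM/LLM response."""
    blocks = JSON_MARKDOWN_BLOCK.findall(raw)
    # Only the first Markdown block is considered, mirroring the documented rule.
    candidate = blocks[0] if blocks else raw
    try:
        return False, json.loads(candidate)
    except json.JSONDecodeError:
        # error_status is True whenever parsing cannot be completed.
        return True, None
```

For example, feeding it a GPT-style response whose JSON is wrapped in a Markdown fence yields `(False, {"my": "json"})`, while unparsable input yields `(True, None)`.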
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/vlm_as_classifier@v2` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `classes` | `List[str]` | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v2.
- inputs:
- inputs: Llama 3.2 Vision, Dot Visualization, Stability AI Inpainting, Reference Path Visualization, Clip Comparison, Buffer, SIFT, Halo Visualization, Image Convert Grayscale, Stability AI Outpainting, QR Code Generator, Anthropic Claude, Triangle Visualization, Depth Estimation, Image Contours, Line Counter Visualization, Mask Visualization, Image Slicer, Size Measurement, Ellipse Visualization, Model Comparison Visualization, Dimension Collapse, Polygon Visualization, Background Color Visualization, Polygon Zone Visualization, Corner Visualization, Crop Visualization, Stitch Images, Contrast Equalization, Blur Visualization, Dynamic Crop, Image Slicer, Camera Focus, OpenAI, Dynamic Zone, Google Gemini, Color Visualization, Florence-2 Model, Classification Label Visualization, Label Visualization, OpenAI, Circle Visualization, Image Threshold, SIFT Comparison, Keypoint Visualization, Camera Calibration, Trace Visualization, Image Preprocessing, Morphological Transformation, Icon Visualization, Perspective Correction, Bounding Box Visualization, Clip Comparison, Absolute Static Crop, Grid Visualization, Pixelate Visualization, Image Blur, Relative Static Crop, Florence-2 Model, Stability AI Image Generation
- outputs: PTZ Tracking (ONVIF), Single-Label Classification Model, Dot Visualization, Time in Zone, Stability AI Inpainting, Reference Path Visualization, Multi-Label Classification Model, Halo Visualization, Object Detection Model, Model Monitoring Inference Aggregator, Multi-Label Classification Model, Time in Zone, Triangle Visualization, Keypoint Detection Model, Mask Visualization, Line Counter Visualization, Detections Classes Replacement, Ellipse Visualization, Model Comparison Visualization, Template Matching, Twilio SMS Notification, Time in Zone, Roboflow Custom Metadata, Single-Label Classification Model, Polygon Visualization, Polygon Zone Visualization, Background Color Visualization, Corner Visualization, Crop Visualization, Roboflow Dataset Upload, Blur Visualization, Keypoint Detection Model, Object Detection Model, Instance Segmentation Model, Dynamic Zone, Segment Anything 2 Model, Email Notification, Color Visualization, Classification Label Visualization, Gaze Detection, Label Visualization, Keypoint Visualization, SIFT Comparison, Trace Visualization, Detections Consensus, Circle Visualization, Instance Segmentation Model, Icon Visualization, Roboflow Dataset Upload, Bounding Box Visualization, Pixelate Visualization, Slack Notification, Perspective Correction, Webhook Sink
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v2 has.
Bindings
- input
  - `image` (kind: `image`): The image which was the base to generate VLM prediction.
  - `vlm_output` (kind: `language_model_output`): The string with raw classification prediction to parse.
  - `classes` (kind: `list_of_values`): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - `error_status` (kind: `boolean`): Boolean flag.
  - `predictions` (kind: `classification_prediction`): Predictions from classifier.
  - `inference_id` (kind: `inference_id`): Inference identifier.
Example JSON definition of step VLM as Classifier in version v2
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/vlm_as_classifier@v2",
    "image": "$inputs.image",
    "vlm_output": [
        "$steps.lmm.output"
    ],
    "classes": [
        "$steps.lmm.classes",
        "$inputs.classes",
        [
            "class_a",
            "class_b"
        ]
    ]
}
```
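A saved workflow containing this step can then be executed through the `inference_sdk` HTTP client. The sketch below assumes a hosted workflow (the workspace and workflow identifiers are placeholders) that exposes `classes` as a runtime parameter, as the ✅ in the Refs column above allows:

```python
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="https://detect.roboflow.com",
    api_key="<YOUR_API_KEY>",
)

# "my-workspace" and "vlm-classification" are placeholder identifiers for a
# saved workflow ending with a VLM as Classifier step.
results = client.run_workflow(
    workspace_name="my-workspace",
    workflow_id="vlm-classification",
    images={"image": "path/to/image.jpg"},
    parameters={"classes": ["class_a", "class_b"]},
)
```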
v1¶
Class: `VLMAsClassifierBlockV1` (there are multiple versions of this block)
Source: `inference.core.workflows.core_steps.formatters.vlm_as_classifier.v1.VLMAsClassifierBlockV1`
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input of the kind produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)

Example:

```json
{"my": "json"}
```
Details regarding block behavior:

- `error_status` is set `True` whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
Type identifier¶
Use the following identifier in the step "type" field: `roboflow_core/vlm_as_classifier@v1` to add the block as a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `classes` | `List[str]` | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v1.
- inputs:
- inputs: Llama 3.2 Vision, Dot Visualization, Stability AI Inpainting, Reference Path Visualization, Clip Comparison, Buffer, SIFT, Halo Visualization, Image Convert Grayscale, Stability AI Outpainting, QR Code Generator, Anthropic Claude, Triangle Visualization, Depth Estimation, Image Contours, Line Counter Visualization, Mask Visualization, Image Slicer, Size Measurement, Ellipse Visualization, Model Comparison Visualization, Dimension Collapse, Polygon Visualization, Background Color Visualization, Polygon Zone Visualization, Corner Visualization, Crop Visualization, Stitch Images, Contrast Equalization, Blur Visualization, Dynamic Crop, Image Slicer, Camera Focus, OpenAI, Dynamic Zone, Google Gemini, Color Visualization, Florence-2 Model, Classification Label Visualization, Label Visualization, OpenAI, Circle Visualization, Image Threshold, SIFT Comparison, Keypoint Visualization, Camera Calibration, Trace Visualization, Image Preprocessing, Morphological Transformation, Icon Visualization, Perspective Correction, Bounding Box Visualization, Clip Comparison, Absolute Static Crop, Grid Visualization, Pixelate Visualization, Image Blur, Relative Static Crop, Florence-2 Model, Stability AI Image Generation
- outputs: PTZ Tracking (ONVIF), Local File Sink, Stability AI Inpainting, Distance Measurement, Perception Encoder Embedding Model, QR Code Generator, Path Deviation, Time in Zone, Size Measurement, Polygon Zone Visualization, Contrast Equalization, Object Detection Model, Florence-2 Model, Gaze Detection, Detections Consensus, Roboflow Dataset Upload, Pixelate Visualization, Perspective Correction, Line Counter, Single-Label Classification Model, LMM For Classification, Detections Stitch, LMM, Multi-Label Classification Model, Model Monitoring Inference Aggregator, CogVLM, Model Comparison Visualization, Template Matching, Twilio SMS Notification, Single-Label Classification Model, Polygon Visualization, Keypoint Detection Model, Instance Segmentation Model, Label Visualization, OpenAI, Keypoint Visualization, Circle Visualization, Trace Visualization, Instance Segmentation Model, Stability AI Image Generation, Dot Visualization, Reference Path Visualization, CLIP Embedding Model, Object Detection Model, Stability AI Outpainting, Multi-Label Classification Model, Cache Set, Line Counter Visualization, Detections Classes Replacement, Ellipse Visualization, Roboflow Custom Metadata, Background Color Visualization, Roboflow Dataset Upload, Dynamic Zone, Google Gemini, Google Vision OCR, SIFT Comparison, Image Threshold, Image Preprocessing, Icon Visualization, YOLO-World Model, Image Blur, Florence-2 Model, Pixel Color Count, Llama 3.2 Vision, Line Counter, Time in Zone, Clip Comparison, Halo Visualization, Cache Get, Anthropic Claude, Triangle Visualization, Keypoint Detection Model, Mask Visualization, Stitch OCR Detections, Time in Zone, Moondream2, Corner Visualization, Crop Visualization, Blur Visualization, Dynamic Crop, OpenAI, Segment Anything 2 Model, Email Notification, Color Visualization, Classification Label Visualization, Morphological Transformation, OpenAI, Bounding Box Visualization, Path Deviation, Slack Notification, Webhook Sink
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v1 has.
Bindings
- input
  - `image` (kind: `image`): The image which was the base to generate VLM prediction.
  - `vlm_output` (kind: `language_model_output`): The string with raw classification prediction to parse.
  - `classes` (kind: `list_of_values`): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - `error_status` (kind: `boolean`): Boolean flag.
  - `predictions` (kind: `classification_prediction`): Predictions from classifier.
  - `inference_id` (kind: `string`): String value.
Example JSON definition of step VLM as Classifier in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/vlm_as_classifier@v1",
    "image": "$inputs.image",
    "vlm_output": [
        "$steps.lmm.output"
    ],
    "classes": [
        "$steps.lmm.classes",
        "$inputs.classes",
        [
            "class_a",
            "class_b"
        ]
    ]
}
```
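Continuing the client sketch from the v2 section, downstream code might inspect the block's outputs as below. The field names assume the workflow exposes `error_status` and `predictions` as output JSON fields, and the `top`/`confidence` keys assume the single-label classification shape used elsewhere in inference; both are assumptions of this sketch, not guarantees:

```python
# `results` comes from the client.run_workflow(...) call sketched earlier.
for result in results:
    if result["error_status"]:
        # Parsing failed; there is no usable prediction for this image.
        print("VLM output could not be parsed into a classification prediction")
        continue
    prediction = result["predictions"]
    # "top" and "confidence" are assumed keys (single-label classification shape).
    print(f"top class: {prediction['top']} ({prediction['confidence']:.2f})")
```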