VLM as Classifier¶
v2¶
Class: VLMAsClassifierBlockV2
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v2.VLMAsClassifierBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input of the kind produced by blocks that expose Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction, which is returned as the block output.
Accepted formats:
- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)
Example:
{"my": "json"}
Details regarding block behavior:
- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
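The parsing rules above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the block's actual implementation: the function name parse_vlm_output and the regular expression are hypothetical, and the sketch stops at JSON decoding without building a classification prediction.

```python
import json
import re
from typing import Any, Optional, Tuple

# Hypothetical pattern: a Markdown fence, optionally tagged "json", capturing its body.
# The fence is built from single backticks so this snippet can itself sit in a fenced block.
FENCE = "`" * 3
JSON_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)


def parse_vlm_output(raw: str) -> Tuple[bool, Optional[Any]]:
    """Return (error_status, parsed_document) for a raw model response."""
    # search() returns the first fenced block only; later blocks are ignored.
    match = JSON_BLOCK.search(raw)
    # Without a fence, treat the whole response as a plain JSON string.
    candidate = match.group(1) if match else raw
    try:
        return False, json.loads(candidate)
    except json.JSONDecodeError:
        # Parsing could not be completed, so the error flag is raised.
        return True, None
```

Because the search returns the first fenced block it finds, any later blocks in the same response are ignored, which mirrors the behavior described above.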
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v2 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
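The mapping between class name and class id mentioned for classes could look like the following sketch. It assumes that ids are simply the positions of the names in the list; the helper name build_class_index is hypothetical and the block's real mapping may differ.

```python
from typing import Dict, List


def build_class_index(classes: List[str]) -> Dict[str, int]:
    """Map each class name to an id; here the id is the list position (an assumption)."""
    return {name: idx for idx, name in enumerate(classes)}


class_to_id = build_class_index(["class_a", "class_b"])
print(class_to_id)  # {'class_a': 0, 'class_b': 1}
```

Under this assumption the order of classes matters, so the same list should be passed consistently across runs.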
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v2.
- inputs: Circle Visualization, Background Color Visualization, Corner Visualization, Bounding Box Visualization, Line Counter Visualization, Image Preprocessing, Trace Visualization, Label Visualization, Clip Comparison, Polygon Zone Visualization, Camera Focus, Image Slicer, Image Slicer, Image Blur, Anthropic Claude, Crop Visualization, Dot Visualization, Google Gemini, Relative Static Crop, Model Comparison Visualization, Stability AI Inpainting, Dimension Collapse, Pixelate Visualization, Perspective Correction, OpenAI, Image Convert Grayscale, Absolute Static Crop, Mask Visualization, Stability AI Image Generation, Color Visualization, Image Threshold, Clip Comparison, Dynamic Crop, Halo Visualization, Polygon Visualization, Florence-2 Model, Image Contours, Dynamic Zone, Buffer, Camera Calibration, SIFT, Reference Path Visualization, Florence-2 Model, Classification Label Visualization, Triangle Visualization, SIFT Comparison, Llama 3.2 Vision, Keypoint Visualization, Grid Visualization, Ellipse Visualization, Stitch Images, Size Measurement, Blur Visualization
- outputs: Circle Visualization, Background Color Visualization, Corner Visualization, Twilio SMS Notification, Bounding Box Visualization, Object Detection Model, Slack Notification, Line Counter Visualization, Keypoint Detection Model, Trace Visualization, Label Visualization, Polygon Zone Visualization, Crop Visualization, Dot Visualization, Roboflow Dataset Upload, Model Comparison Visualization, Single-Label Classification Model, Pixelate Visualization, Perspective Correction, Detections Consensus, Gaze Detection, Mask Visualization, Time in Zone, Webhook Sink, Color Visualization, Time in Zone, Halo Visualization, Polygon Visualization, Template Matching, Detections Classes Replacement, Instance Segmentation Model, Keypoint Detection Model, Instance Segmentation Model, Email Notification, Object Detection Model, Reference Path Visualization, Multi-Label Classification Model, Triangle Visualization, Model Monitoring Inference Aggregator, Classification Label Visualization, Single-Label Classification Model, SIFT Comparison, Roboflow Dataset Upload, Keypoint Visualization, Multi-Label Classification Model, Roboflow Custom Metadata, Segment Anything 2 Model, Ellipse Visualization, Blur Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v2 has.
Bindings
- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (inference_id): Inference identifier.
Example JSON definition of step VLM as Classifier in version v2
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/vlm_as_classifier@v2",
    "image": "$inputs.image",
    "vlm_output": [
        "$steps.lmm.output"
    ],
    "classes": [
        "$steps.lmm.classes",
        "$inputs.classes",
        [
            "class_a",
            "class_b"
        ]
    ]
}
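The nested lists in the example above look like enumerations of alternative values for vlm_output and classes rather than one literal configuration; that reading is an assumption. Under it, a concrete step could be assembled in Python as sketched below, where the step name vlm_classifier and the upstream step name lmm are placeholders.

```python
import json

# Assumption: each field takes ONE of the alternatives listed in the example.
step = {
    "name": "vlm_classifier",  # hypothetical step name
    "type": "roboflow_core/vlm_as_classifier@v2",
    "image": "$inputs.image",
    # Selector pointing at the output of an upstream LLM/VLM step named "lmm".
    "vlm_output": "$steps.lmm.output",
    # Static class list; a selector such as "$inputs.classes" would be the dynamic alternative.
    "classes": ["class_a", "class_b"],
}

# Serialize the step so it can be pasted into a workflow definition.
print(json.dumps(step, indent=2))
```

Choosing a static list keeps the step self-contained, while a selector lets the class list be supplied when the workflow runs.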
v1¶
Class: VLMAsClassifierBlockV1
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v1.VLMAsClassifierBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input of the kind produced by blocks that expose Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction, which is returned as the block output.
Accepted formats:
- valid JSON strings
- JSON documents wrapped with Markdown tags (very common for GPT responses)
Example:
{"my": "json"}
Details regarding block behavior:
- error_status is set to True whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/vlm_as_classifier@v1 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v1.
- inputs: Circle Visualization, Background Color Visualization, Corner Visualization, Bounding Box Visualization, Line Counter Visualization, Image Preprocessing, Trace Visualization, Label Visualization, Clip Comparison, Polygon Zone Visualization, Camera Focus, Image Slicer, Image Slicer, Image Blur, Anthropic Claude, Crop Visualization, Dot Visualization, Google Gemini, Relative Static Crop, Model Comparison Visualization, Stability AI Inpainting, Dimension Collapse, Pixelate Visualization, Perspective Correction, OpenAI, Image Convert Grayscale, Absolute Static Crop, Mask Visualization, Stability AI Image Generation, Color Visualization, Image Threshold, Clip Comparison, Dynamic Crop, Halo Visualization, Polygon Visualization, Florence-2 Model, Image Contours, Dynamic Zone, Buffer, Camera Calibration, SIFT, Reference Path Visualization, Florence-2 Model, Classification Label Visualization, Triangle Visualization, SIFT Comparison, Llama 3.2 Vision, Keypoint Visualization, Grid Visualization, Ellipse Visualization, Stitch Images, Size Measurement, Blur Visualization
- outputs: Circle Visualization, Background Color Visualization, Corner Visualization, Twilio SMS Notification, Slack Notification, LMM, Polygon Zone Visualization, Image Blur, Cache Set, Dot Visualization, Path Deviation, Google Gemini, Roboflow Dataset Upload, Single-Label Classification Model, Stability AI Inpainting, Pixelate Visualization, Line Counter, OpenAI, Detections Consensus, Gaze Detection, Distance Measurement, Stability AI Image Generation, Webhook Sink, Color Visualization, Image Threshold, Halo Visualization, Polygon Visualization, Detections Classes Replacement, Instance Segmentation Model, CogVLM, Email Notification, Object Detection Model, Classification Label Visualization, Single-Label Classification Model, Llama 3.2 Vision, Google Vision OCR, Roboflow Dataset Upload, Ellipse Visualization, Size Measurement, Pixel Color Count, Cache Get, Bounding Box Visualization, Object Detection Model, Line Counter Visualization, Image Preprocessing, Keypoint Detection Model, Trace Visualization, Label Visualization, Local File Sink, Anthropic Claude, Crop Visualization, Detections Stitch, YOLO-World Model, Model Comparison Visualization, Perspective Correction, OpenAI, Path Deviation, Mask Visualization, Time in Zone, Clip Comparison, Time in Zone, Dynamic Crop, Template Matching, Florence-2 Model, Instance Segmentation Model, Keypoint Detection Model, Reference Path Visualization, Multi-Label Classification Model, Florence-2 Model, Triangle Visualization, Model Monitoring Inference Aggregator, CLIP Embedding Model, SIFT Comparison, Keypoint Visualization, Multi-Label Classification Model, LMM For Classification, Roboflow Custom Metadata, Line Counter, Segment Anything 2 Model, Blur Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v1 has.
Bindings
- input
  - image (image): The image which was the base to generate VLM prediction.
  - vlm_output (language_model_output): The string with raw classification prediction to parse.
  - classes (list_of_values): List of all classes used by the model, required to generate mapping between class name and class id.
- output
  - error_status (boolean): Boolean flag.
  - predictions (classification_prediction): Predictions from classifier.
  - inference_id (string): String value.
Example JSON definition of step VLM as Classifier in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/vlm_as_classifier@v1",
    "image": "$inputs.image",
    "vlm_output": [
        "$steps.lmm.output"
    ],
    "classes": [
        "$steps.lmm.classes",
        "$inputs.classes",
        [
            "class_a",
            "class_b"
        ]
    ]
}