VLM as Classifier¶
v2¶
Class: VLMAsClassifierBlockV2
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v2.VLMAsClassifierBlockV2
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped in Markdown code blocks (very common for GPT responses)

Example:

{"my": "json"}
Details regarding block behavior:

- `error_status` is set `True` whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
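The parsing behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: the `parse_vlm_output` helper and its return shape are hypothetical, chosen only to mirror the documented rules (plain JSON or Markdown-fenced JSON accepted, only the first fenced block parsed, `error_status` set `True` on failure, and a class-name-to-id mapping built from `classes`).

```python
import json
import re


def parse_vlm_output(raw: str, classes: list[str]) -> dict:
    """Hypothetical sketch of the block's parsing logic."""
    # Prefer the first Markdown-fenced block, if any (common in GPT responses);
    # any further fenced blocks are ignored, matching the documented behavior.
    match = re.search(r"```(?:json)?\s*(.*?)```", raw, re.DOTALL)
    candidate = match.group(1) if match else raw
    try:
        payload = json.loads(candidate)
    except json.JSONDecodeError:
        # Parsing cannot be completed: signal it via error_status
        return {"error_status": True, "predictions": None}
    # Build the class-name -> class-id mapping required by the `classes` property
    class_id_mapping = {name: idx for idx, name in enumerate(classes)}
    return {
        "error_status": False,
        "predictions": payload,
        "class_id_mapping": class_id_mapping,
    }


result = parse_vlm_output(
    '```json\n{"class_name": "cat", "confidence": 0.9}\n```',
    ["cat", "dog"],
)
```

For instance, the call above succeeds (`error_status` is `False`), while feeding the helper a non-JSON string would return `error_status` set `True`.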
Type identifier¶
Use the following identifier in the step "type" field to add the block as a step in your workflow: `roboflow_core/vlm_as_classifier@v2`
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v2.
- inputs: OpenAI, Color Visualization, SIFT Comparison, Image Blur, Triangle Visualization, Llama 3.2 Vision, Stitch Images, Image Slicer, Dimension Collapse, Pixelate Visualization, Image Threshold, SIFT, Circle Visualization, Dot Visualization, Label Visualization, Corner Visualization, Line Counter Visualization, Perspective Correction, Camera Calibration, Image Convert Grayscale, Reference Path Visualization, Image Contours, Mask Visualization, Polygon Zone Visualization, Florence-2 Model, Polygon Visualization, Classification Label Visualization, Image Slicer, Stability AI Image Generation, Grid Visualization, Google Gemini, OpenAI, Bounding Box Visualization, Buffer, Background Color Visualization, Stability AI Inpainting, Camera Focus, Clip Comparison, Model Comparison Visualization, Clip Comparison, Anthropic Claude, Blur Visualization, Image Preprocessing, Florence-2 Model, Size Measurement, Dynamic Zone, Ellipse Visualization, Crop Visualization, Relative Static Crop, Absolute Static Crop, Halo Visualization, Keypoint Visualization, Trace Visualization, Dynamic Crop, Depth Estimation
- outputs: Color Visualization, Multi-Label Classification Model, SIFT Comparison, Triangle Visualization, Keypoint Detection Model, Object Detection Model, Instance Segmentation Model, Detections Classes Replacement, Pixelate Visualization, Webhook Sink, Circle Visualization, Label Visualization, Time in Zone, Dot Visualization, Twilio SMS Notification, Corner Visualization, Perspective Correction, Slack Notification, Line Counter Visualization, Reference Path Visualization, Multi-Label Classification Model, Mask Visualization, Polygon Zone Visualization, Polygon Visualization, Roboflow Custom Metadata, Classification Label Visualization, Roboflow Dataset Upload, Detections Consensus, Object Detection Model, Bounding Box Visualization, Email Notification, Roboflow Dataset Upload, Template Matching, Background Color Visualization, Instance Segmentation Model, Single-Label Classification Model, Model Comparison Visualization, Blur Visualization, Model Monitoring Inference Aggregator, Segment Anything 2 Model, Single-Label Classification Model, Gaze Detection, Dynamic Zone, Crop Visualization, Keypoint Visualization, Keypoint Detection Model, Trace Visualization, Halo Visualization, Ellipse Visualization, Time in Zone
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v2 has.
Bindings

- input
    - `image` (`image`): The image which was the base to generate VLM prediction.
    - `vlm_output` (`language_model_output`): The string with raw classification prediction to parse.
    - `classes` (`list_of_values`): List of all classes used by the model, required to generate mapping between class name and class id.
- output
    - `error_status` (`boolean`): Boolean flag.
    - `predictions` (`classification_prediction`): Predictions from classifier.
    - `inference_id` (`inference_id`): Inference identifier.
Example JSON definition of step VLM as Classifier in version v2:
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v2",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}
v1¶
Class: VLMAsClassifierBlockV1
(there are multiple versions of this block)
Source: inference.core.workflows.core_steps.formatters.vlm_as_classifier.v1.VLMAsClassifierBlockV1
Warning: This block has multiple versions. Please refer to the specific version for details. You can learn more about how versions work here: Versioning
The block expects string input produced by blocks exposing Large Language Models (LLMs) and Visual Language Models (VLMs). The input is parsed into a classification prediction and returned as the block output.
Accepted formats:

- valid JSON strings
- JSON documents wrapped in Markdown code blocks (very common for GPT responses)

Example:

{"my": "json"}
Details regarding block behavior:

- `error_status` is set `True` whenever parsing cannot be completed
- in case of multiple Markdown blocks with raw JSON content, only the first will be parsed
Type identifier¶
Use the following identifier in the step "type" field to add the block as a step in your workflow: `roboflow_core/vlm_as_classifier@v1`
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
classes | List[str] | List of all classes used by the model, required to generate mapping between class name and class id. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to VLM as Classifier in version v1.
- inputs: OpenAI, Color Visualization, SIFT Comparison, Image Blur, Triangle Visualization, Llama 3.2 Vision, Stitch Images, Image Slicer, Dimension Collapse, Pixelate Visualization, Image Threshold, SIFT, Circle Visualization, Dot Visualization, Label Visualization, Corner Visualization, Line Counter Visualization, Perspective Correction, Camera Calibration, Image Convert Grayscale, Reference Path Visualization, Image Contours, Mask Visualization, Polygon Zone Visualization, Florence-2 Model, Polygon Visualization, Classification Label Visualization, Image Slicer, Stability AI Image Generation, Grid Visualization, Google Gemini, OpenAI, Bounding Box Visualization, Buffer, Background Color Visualization, Stability AI Inpainting, Camera Focus, Clip Comparison, Model Comparison Visualization, Clip Comparison, Anthropic Claude, Blur Visualization, Image Preprocessing, Florence-2 Model, Size Measurement, Dynamic Zone, Ellipse Visualization, Crop Visualization, Relative Static Crop, Absolute Static Crop, Halo Visualization, Keypoint Visualization, Trace Visualization, Dynamic Crop, Depth Estimation
- outputs: Color Visualization, Multi-Label Classification Model, SIFT Comparison, Instance Segmentation Model, OpenAI, LMM, Circle Visualization, Slack Notification, Reference Path Visualization, Multi-Label Classification Model, Local File Sink, Cache Get, Florence-2 Model, Polygon Visualization, Distance Measurement, Roboflow Dataset Upload, Stability AI Image Generation, OpenAI, Path Deviation, Bounding Box Visualization, Email Notification, Roboflow Dataset Upload, Instance Segmentation Model, Single-Label Classification Model, Clip Comparison, Model Comparison Visualization, Blur Visualization, Model Monitoring Inference Aggregator, Path Deviation, Florence-2 Model, Segment Anything 2 Model, Size Measurement, Single-Label Classification Model, Gaze Detection, Dynamic Zone, Crop Visualization, Keypoint Visualization, Ellipse Visualization, Trace Visualization, Cache Set, Dynamic Crop, Time in Zone, OpenAI, Triangle Visualization, Image Blur, Keypoint Detection Model, Llama 3.2 Vision, Object Detection Model, Line Counter, CLIP Embedding Model, Detections Classes Replacement, Pixelate Visualization, Webhook Sink, Image Threshold, Line Counter Visualization, Label Visualization, Time in Zone, Dot Visualization, Twilio SMS Notification, Corner Visualization, Perspective Correction, Mask Visualization, Polygon Zone Visualization, Roboflow Custom Metadata, Classification Label Visualization, Line Counter, CogVLM, LMM For Classification, Google Gemini, Detections Consensus, Object Detection Model, Detections Stitch, Background Color Visualization, Template Matching, Stability AI Inpainting, Anthropic Claude, Google Vision OCR, Image Preprocessing, YOLO-World Model, Keypoint Detection Model, Halo Visualization, Pixel Color Count
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds VLM as Classifier in version v1 has.
Bindings

- input
    - `image` (`image`): The image which was the base to generate VLM prediction.
    - `vlm_output` (`language_model_output`): The string with raw classification prediction to parse.
    - `classes` (`list_of_values`): List of all classes used by the model, required to generate mapping between class name and class id.
- output
    - `error_status` (`boolean`): Boolean flag.
    - `predictions` (`classification_prediction`): Predictions from classifier.
    - `inference_id` (`string`): String value.
Example JSON definition of step VLM as Classifier in version v1:
{
"name": "<your_step_name_here>",
"type": "roboflow_core/vlm_as_classifier@v1",
"image": "$inputs.image",
"vlm_output": [
"$steps.lmm.output"
],
"classes": [
"$steps.lmm.classes",
"$inputs.classes",
[
"class_a",
"class_b"
]
]
}