LMM For Classification¶
Deprecated
This block is deprecated and may be removed in a future release.
Class: LMMForClassificationBlockV1
Source: inference.core.workflows.core_steps.models.foundation.lmm_classifier.v1.LMMForClassificationBlockV1
Classify an image into one or more categories using a Large Multimodal Model (LMM).
You can pass an arbitrary list of classes to the block.
The block supports the following LMM:
- OpenAI's GPT-4 with Vision.
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/lmm_for_classification@v1 to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| lmm_type | str | Type of LMM to be used. | ✅ |
| classes | List[str] | List of classes that the LMM shall classify against. | ✅ |
| lmm_config | LMMConfig | Configuration of the LMM. | ❌ |
| remote_api_key | str | Holds the API key required to call the LMM. In the current state of development, an OpenAI key is required when lmm_type=gpt_4v. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
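As a rough illustration of the Refs column, the sketch below builds a step definition in Python in which the ✅ properties are bound to workflow inputs instead of literal values. The `$inputs.<name>` form mirrors the `$inputs.image` selector used in the example on this page; the input names `lmm_type`, `classes`, and `openai_api_key`, and the helpers `is_selector` and `build_step`, are hypothetical and not part of the library.

```python
import json

# Properties marked with a check in the Refs column, plus the image input.
PARAMETRISABLE = {"images", "lmm_type", "classes", "remote_api_key"}


def is_selector(value):
    """Return True when a value looks like a runtime selector such as "$inputs.classes"."""
    return isinstance(value, str) and value.startswith("$")


def build_step(name, **properties):
    """Build a step dict, rejecting selectors on properties that are not parametrisable."""
    for key, value in properties.items():
        if is_selector(value) and key not in PARAMETRISABLE:
            raise ValueError(f"{key!r} does not accept dynamic values")
    return {
        "name": name,
        "type": "roboflow_core/lmm_for_classification@v1",
        **properties,
    }


step = build_step(
    "classifier",                              # hypothetical step name
    images="$inputs.image",
    lmm_type="$inputs.lmm_type",               # hypothetical input name
    classes="$inputs.classes",                 # hypothetical input name
    remote_api_key="$inputs.openai_api_key",   # hypothetical input name
    lmm_config={"max_tokens": 200},            # static: marked with a cross in Refs
)
print(json.dumps(step, indent=2))
```

Passing a selector for `lmm_config` or `name` makes this helper raise `ValueError`, which matches the ❌ entries in the table.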
Available Connections¶
Compatible Blocks
Check what blocks you can connect to LMM For Classification in version v1.
- inputs:
S3 Sink,Email Notification,Clip Comparison,Morphological Transformation,VLM As Detector,Qwen-VL,Twilio SMS/MMS Notification,MoonshotAI Kimi,Polygon Zone Visualization,Stitch OCR Detections,OpenAI-Compatible LLM,OpenAI,Heatmap Visualization,Keypoint Visualization,Email Notification,Llama 3.2 Vision,Anthropic Claude,Stability AI Image Generation,Google Vision OCR,Camera Focus,Label Visualization,Instance Segmentation Model,Local File Sink,Google Gemini,Motion Detection,Background Color Visualization,Qwen 3.5 API,Google Gemini,Polygon Visualization,SIFT Comparison,Grid Visualization,Florence-2 Model,OCR Model,VLM As Classifier,LMM For Classification,Keypoint Detection Model,Image Preprocessing,SIFT,Roboflow Dataset Upload,Dynamic Zone,Corner Visualization,Stability AI Outpainting,Halo Visualization,Multi-Label Classification Model,Qwen3.5-VL,Detections List Roll-Up,Blur Visualization,Morphological Transformation,Trace Visualization,Stitch OCR Detections,Reference Path Visualization,Halo Visualization,Model Comparison Visualization,Dot Visualization,Background Subtraction,Text Display,Absolute Static Crop,CSV Formatter,Florence-2 Model,Icon Visualization,Perspective Correction,Stability AI Inpainting,Image Convert Grayscale,QR Code Generator,OpenRouter,Model Monitoring Inference Aggregator,OpenAI,Llama 3.2 Vision,Image Threshold,Anthropic Claude,Dynamic Crop,Size Measurement,Clip Comparison,Contrast Enhancement,Bounding Box Visualization,Depth Estimation,Image Contours,EasyOCR,Relative Static Crop,Polygon Visualization,Google Gemma API,Qwen 3.6 API,Image Blur,Anthropic Claude,Triangle Visualization,Object Detection Model,Roboflow Custom Metadata,OpenAI,Slack Notification,Image Stack,Pixelate Visualization,Stitch Images,Single-Label Classification Model,OpenAI,Buffer,Image Slicer,Line Counter Visualization,Image Slicer,LMM,Roboflow Dataset Upload,Color Visualization,Google Gemini,Classification Label Visualization,Camera Focus,Camera Calibration,Ellipse 
Visualization,Mask Visualization,GLM-OCR,Crop Visualization,Circle Visualization,CogVLM,Dimension Collapse,Contrast Equalization,Roboflow Vision Events,Webhook Sink,Twilio SMS Notification,MoonshotAI Kimi,Google Gemma
- outputs:
Object Detection Model,Perspective Correction,SAM 3,S3 Sink,Stability AI Inpainting,Email Notification,Keypoint Detection Model,Morphological Transformation,Path Deviation,Qwen-VL,Clip Comparison,SAM 3,Line Counter,Twilio SMS/MMS Notification,QR Code Generator,OpenRouter,YOLO-World Model,Model Monitoring Inference Aggregator,OpenAI,Llama 3.2 Vision,Line Counter,Time in Zone,MoonshotAI Kimi,Stitch OCR Detections,Polygon Zone Visualization,Image Threshold,Anthropic Claude,OpenAI-Compatible LLM,OpenAI,Dynamic Crop,Size Measurement,Heatmap Visualization,Email Notification,Keypoint Visualization,Llama 3.2 Vision,Anthropic Claude,Stability AI Image Generation,Seg Preview,Google Vision OCR,Cache Set,Label Visualization,SAM 3,Instance Segmentation Model,Path Deviation,Bounding Box Visualization,Local File Sink,Depth Estimation,Google Gemini,CLIP Embedding Model,Multi-Label Classification Model,Polygon Visualization,Google Gemma API,Background Color Visualization,Qwen 3.6 API,Instance Segmentation Model,Qwen 3.5 API,Google Gemini,Polygon Visualization,Image Blur,Moondream2,SIFT Comparison,Anthropic Claude,Florence-2 Model,Triangle Visualization,Time in Zone,Single-Label Classification Model,Roboflow Custom Metadata,OpenAI,Slack Notification,OpenAI,Instance Segmentation Model,LMM For Classification,Image Preprocessing,Roboflow Dataset Upload,Line Counter Visualization,Detections Classes Replacement,Segment Anything 2 Model,Stability AI Outpainting,Corner Visualization,Cache Get,Halo Visualization,LMM,Roboflow Dataset Upload,Time in Zone,Semantic Segmentation Model,Color Visualization,Google Gemini,Classification Label Visualization,Perception Encoder Embedding Model,Distance Measurement,Morphological Transformation,Trace Visualization,Detections Stitch,Stitch OCR Detections,Reference Path Visualization,Halo Visualization,Ellipse Visualization,Model Comparison Visualization,Dot Visualization,PTZ Tracking (ONVIF),Mask Visualization,Pixel Color Count,GLM-OCR,Crop 
Visualization,CogVLM,Circle Visualization,Text Display,Florence-2 Model,Contrast Equalization,Roboflow Vision Events,Webhook Sink,Icon Visualization,Twilio SMS Notification,MoonshotAI Kimi,Google Gemma
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check which binding kinds
LMM For Classification in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - lmm_type (string): Type of LMM to be used.
    - classes (list_of_values): List of classes that the LMM shall classify against.
    - remote_api_key (Union[string, secret]): Holds the API key required to call the LMM. In the current state of development, an OpenAI key is required when lmm_type=gpt_4v.
- output
    - raw_output (string): String value.
    - top (top_class): String value representing top class predicted by classification model.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - prediction_type (prediction_type): String value with type of prediction.
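To make the output bindings more concrete, here is a small Python sketch that builds selector strings for the outputs listed above and uses them in a workflow `outputs` section. The `$steps.<step>.<output>` selector form, the `JsonField` output type, and the step name `classifier` are assumptions that do not appear on this page; `output_selector` is a hypothetical helper.

```python
# Output names listed under the "output" bindings of this block.
OUTPUTS = (
    "raw_output",
    "top",
    "parent_id",
    "root_parent_id",
    "image",
    "prediction_type",
)


def output_selector(step_name, output_name):
    """Return a selector string for one of the block's documented outputs.

    The "$steps.<step>.<output>" form is an assumption, not taken from this page.
    """
    if output_name not in OUTPUTS:
        raise KeyError(f"unknown output: {output_name}")
    return f"$steps.{step_name}.{output_name}"


# Hypothetical workflow outputs exposing the predicted class and the raw LMM text.
workflow_outputs = [
    {"type": "JsonField", "name": "top_class", "selector": output_selector("classifier", "top")},
    {"type": "JsonField", "name": "raw", "selector": output_selector("classifier", "raw_output")},
]
print(workflow_outputs)
```

Restricting the helper to the documented output names catches typos such as `top_class` (a kind, not an output name) before the definition is submitted.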
Example JSON definition of step LMM For Classification in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": [
        "a",
        "b"
    ],
    "lmm_config": {
        "gpt_image_detail": "low",
        "gpt_model_version": "gpt-4o",
        "max_tokens": 200
    },
    "remote_api_key": "xxx-xxx"
}
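The JSON above hard-codes a placeholder key. One possible way to produce the same definition from Python while keeping the key out of source files is sketched below. The function name `lmm_classification_step`, the step name `classifier`, and the environment variable `OPENAI_API_KEY` are assumptions made for illustration; the field values are copied from the example.

```python
import json
import os


def lmm_classification_step(name, classes, api_key):
    """Return a step definition mirroring the JSON example on this page."""
    return {
        "name": name,
        "type": "roboflow_core/lmm_for_classification@v1",
        "images": "$inputs.image",
        "lmm_type": "gpt_4v",
        "classes": list(classes),
        "lmm_config": {
            "gpt_image_detail": "low",
            "gpt_model_version": "gpt-4o",
            "max_tokens": 200,
        },
        "remote_api_key": api_key,
    }


# OPENAI_API_KEY is an assumed variable name; the fallback is the placeholder from the example.
api_key = os.environ.get("OPENAI_API_KEY", "xxx-xxx")
step = lmm_classification_step("classifier", ["a", "b"], api_key)
print(json.dumps(step, indent=4))
```

Because `remote_api_key` is marked ✅ in the Refs column, the key could instead be supplied as a workflow input at runtime rather than embedded in the definition.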