LMM For Classification¶
Deprecated
This block is deprecated and may be removed in a future release.
Class: LMMForClassificationBlockV1
Source: inference.core.workflows.core_steps.models.foundation.lmm_classifier.v1.LMMForClassificationBlockV1
Classify an image into one or more categories using a Large Multimodal Model (LMM).
You can pass an arbitrary list of classes to this block.
The block supports the following LMM:
- OpenAI's GPT-4 with Vision.
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
Type identifier¶
Use the following identifier in step "type" field: roboflow_core/lmm_for_classification@v1 to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| lmm_type | str | Type of LMM to be used. | ✅ |
| classes | List[str] | List of classes that LMM shall classify against. | ✅ |
| lmm_config | LMMConfig | Configuration of LMM. | ❌ |
| remote_api_key | str | Holds API key required to call LMM model - in current state of development, we require OpenAI key when lmm_type=gpt_4v. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
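As a rough illustration of what the Refs column means in practice, the sketch below replaces literal property values with $inputs.<name> selectors, but only for the properties marked ✅ in the table above. The helper bind_to_inputs, the step name and the input names are hypothetical and chosen for this example; they are not part of the inference library.

```python
import copy

# Properties marked with a check mark in the Refs column of the table above.
PARAMETRISABLE = {"lmm_type", "classes", "remote_api_key"}


def bind_to_inputs(step: dict, bindings: dict) -> dict:
    """Return a copy of `step` with selected properties replaced by selectors.

    `bindings` maps a property name to the name of a workflow input.
    Both the helper and the input names are assumptions made for this sketch.
    """
    bound = copy.deepcopy(step)
    for prop, input_name in bindings.items():
        if prop not in PARAMETRISABLE:
            raise ValueError(f"{prop!r} does not accept dynamic values")
        bound[prop] = f"$inputs.{input_name}"
    return bound


step = {
    "name": "classifier",  # hypothetical step name
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": ["a", "b"],
    "remote_api_key": "xxx-xxx",
}

# Supply the class list and the API key at runtime instead of hard-coding them.
bound_step = bind_to_inputs(
    step, {"classes": "classes", "remote_api_key": "openai_api_key"}
)
```

Binding remote_api_key to a runtime input keeps the key out of the stored definition, which is usually preferable to a literal value.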
Runtime compatibility¶
- requires_internet (air-gapped / offline deployments): This block depends on a service that is not reachable from fully offline / air-gapped deployments.
- hard (runtime hosted_serverless; execution remote): LMM_ENABLED=False on Roboflow Hosted Serverless: the /llm_v1 endpoint is not registered, so run_remotely() returns 404.
Available Connections¶
Compatible Blocks
Check what blocks you can connect to LMM For Classification in version v1.
- inputs:
Roboflow Asset Library Attributes,MoonshotAI Kimi,Image Blur,Reference Path Visualization,Event Writer,Slack Notification,Halo Visualization,VLM As Classifier,Image Stack,Google Gemma,Qwen 3.6 API,Clip Comparison,Dot Visualization,Label Visualization,Background Color Visualization,Llama 3.2 Vision,Email Notification,Pixelate Visualization,OpenAI-Compatible LLM,Google Gemini,Anthropic Claude,OpenAI,Trace Visualization,Llama 3.2 Vision,Clip Comparison,Camera Focus,OpenAI,GLM-OCR,PLC ModbusTCP,Buffer,MQTT Writer,SIFT Comparison,CSV Formatter,Webhook Sink,Image Contours,Local File Sink,Motion Detection,Google Gemini,MoonshotAI Kimi,Polygon Visualization,Dimension Collapse,SIFT,Classification Label Visualization,Multi-Label Classification Model,Keypoint Detection Model,Keypoint Visualization,Icon Visualization,Dynamic Crop,Stability AI Inpainting,Bounding Box Visualization,Polygon Zone Visualization,Stability AI Outpainting,Crop Visualization,Image Convert Grayscale,Mask Visualization,Halo Visualization,PLC EthernetIP,Text Display,Morphological Transformation,Anthropic Claude,Roboflow Dataset Upload,Object Detection Model,Ellipse Visualization,Size Measurement,Circle Visualization,Twilio SMS Notification,Email Notification,S3 Sink,Camera Focus,Image Slicer,LMM For Classification,OCR Model,Heatmap Visualization,OpenAI,Google Gemma API,Stitch Images,Morphological Transformation,EasyOCR,Current Time,Blur Visualization,Stitch OCR Detections,Detections List Roll-Up,Florence-2 Model,Google Gemini,Corner Visualization,OpenRouter,Model Comparison Visualization,Model Monitoring Inference Aggregator,Google Vision OCR,Image Threshold,LMM,Single-Label Classification Model,Polygon Visualization,Stability AI Image Generation,Line Counter Visualization,CogVLM,Relative Static Crop,Qwen3.5-VL,Grid Visualization,Image Preprocessing,Stitch OCR Detections,Anthropic Claude,OPC UA Writer Sink,Color Visualization,Dynamic Zone,Triangle Visualization,QR Code Generator,Contrast 
Enhancement,Roboflow Dataset Upload,Absolute Static Crop,Qwen 3.5 API,Background Subtraction,OpenAI,Image Slicer,Qwen-VL,Florence-2 Model,Perspective Correction,Twilio SMS/MMS Notification,Roboflow Vision Events,Microsoft SQL Server Sink,Instance Segmentation Model,Depth Estimation,Roboflow Custom Metadata,Contrast Equalization,Camera Calibration,VLM As Detector
- outputs:
Cache Set,SAM 3,Size Measurement,MoonshotAI Kimi,Roboflow Asset Library Attributes,Circle Visualization,Semantic Segmentation Model,Path Deviation,Path Deviation,Twilio SMS Notification,Image Blur,Email Notification,S3 Sink,Reference Path Visualization,PTZ Tracking (ONVIF),Event Writer,Slack Notification,Halo Visualization,CLIP Embedding Model,Google Gemma,Qwen 3.6 API,Dot Visualization,SAM 3,Label Visualization,Background Color Visualization,Llama 3.2 Vision,Email Notification,LMM For Classification,OpenAI-Compatible LLM,Heatmap Visualization,Google Gemini,Anthropic Claude,Google Gemma API,OpenAI,Cache Get,Time in Zone,Morphological Transformation,Single-Label Classification Model,OpenAI,YOLO-World Model,Current Time,Trace Visualization,Llama 3.2 Vision,Stitch OCR Detections,Moondream2,OpenAI,Clip Comparison,GLM-OCR,Florence-2 Model,Google Gemini,Corner Visualization,OpenRouter,Pixel Color Count,Model Comparison Visualization,MQTT Writer,Webhook Sink,SIFT Comparison,SAM 3,Model Monitoring Inference Aggregator,Google Vision OCR,Image Threshold,Local File Sink,Instance Segmentation Model,Google Gemini,MoonshotAI Kimi,LMM,Polygon Visualization,Polygon Visualization,Segment Anything 2 Model,Time in Zone,Stability AI Image Generation,Classification Label Visualization,Line Counter Visualization,Line Counter,CogVLM,Qwen3.5-VL,Instance Segmentation Model,Keypoint Visualization,Image Preprocessing,Stitch OCR Detections,Anthropic Claude,OPC UA Writer Sink,Instance Segmentation Model,Icon Visualization,Color Visualization,Seg Preview,Triangle Visualization,QR Code Generator,Qwen 3.5 API,Roboflow Dataset Upload,Object Detection Model,Dynamic Crop,Stability AI Inpainting,Bounding Box Visualization,Polygon Zone Visualization,Stability AI Outpainting,Multi-Label Classification Model,Crop Visualization,OpenAI,Mask Visualization,Halo Visualization,Detections Stitch,Distance Measurement,Qwen-VL,Florence-2 Model,Perspective Correction,Roboflow Vision Events,Anthropic 
Claude,Morphological Transformation,Text Display,Microsoft SQL Server Sink,Twilio SMS/MMS Notification,Perception Encoder Embedding Model,Instance Segmentation Model,Roboflow Dataset Upload,Depth Estimation,Roboflow Custom Metadata,Contrast Equalization,Ellipse Visualization,Detections Classes Replacement,Keypoint Detection Model,SAM3 Video Tracker,Line Counter,Time in Zone
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check which binding kinds
LMM For Classification in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - lmm_type (string): Type of LMM to be used.
    - classes (list_of_values): List of classes that LMM shall classify against.
    - remote_api_key (Union[secret, string]): Holds API key required to call LMM model - in current state of development, we require OpenAI key when lmm_type=gpt_4v.
- output
    - raw_output (string): String value.
    - top (top_class): String value representing top class predicted by classification model.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - prediction_type (prediction_type): String value with type of prediction.
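To show how the output bindings above might be referenced from elsewhere in a workflow, here is a minimal sketch that builds $steps.<step_name>.<output> selectors for the listed outputs and wraps them as workflow output entries. The selector form and the JsonField entry type reflect the Workflows definition format as commonly documented, but treat them as assumptions here; the helper names and the step name are hypothetical.

```python
# Output names taken from the bindings list above.
OUTPUT_FIELDS = [
    "raw_output",
    "top",
    "parent_id",
    "root_parent_id",
    "image",
    "prediction_type",
]


def output_selectors(step_name: str) -> dict:
    """Map each documented output name to a selector string (assumed syntax)."""
    return {field: f"$steps.{step_name}.{field}" for field in OUTPUT_FIELDS}


def as_workflow_outputs(step_name: str, fields: list) -> list:
    """Build workflow output entries for the chosen fields.

    The "JsonField" type is an assumption about the definition format.
    """
    selectors = output_selectors(step_name)
    return [
        {"type": "JsonField", "name": field, "selector": selectors[field]}
        for field in fields
    ]


# Expose only the predicted class and the raw model response.
outputs = as_workflow_outputs("classifier", ["top", "raw_output"])
```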
Example JSON definition of step LMM For Classification in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": [
        "a",
        "b"
    ],
    "lmm_config": {
        "gpt_image_detail": "low",
        "gpt_model_version": "gpt-4o",
        "max_tokens": 200
    },
    "remote_api_key": "xxx-xxx"
}
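The step definition above is only one entry in a workflow. The sketch below places it inside a complete definition, assuming the usual top-level layout of version, inputs, steps and outputs, and the WorkflowImage, WorkflowParameter and JsonField entry types. Those names, the step name classifier, and the helper undeclared_inputs are assumptions for illustration and should be checked against the Workflows documentation for your installed version.

```python
import json

STEP_NAME = "classifier"  # hypothetical step name

workflow_definition = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        # The API key is passed at runtime rather than stored in the definition.
        {"type": "WorkflowParameter", "name": "openai_api_key"},
    ],
    "steps": [
        {
            "name": STEP_NAME,
            "type": "roboflow_core/lmm_for_classification@v1",
            "images": "$inputs.image",
            "lmm_type": "gpt_4v",
            "classes": ["a", "b"],
            "lmm_config": {
                "gpt_image_detail": "low",
                "gpt_model_version": "gpt-4o",
                "max_tokens": 200,
            },
            "remote_api_key": "$inputs.openai_api_key",
        }
    ],
    "outputs": [
        {
            "type": "JsonField",
            "name": "top_class",
            "selector": f"$steps.{STEP_NAME}.top",
        }
    ],
}


def undeclared_inputs(definition: dict) -> set:
    """Return input names referenced by steps but missing from "inputs"."""
    declared = {item["name"] for item in definition["inputs"]}
    prefix = "$inputs."
    missing = set()
    for step in definition["steps"]:
        for value in step.values():
            if isinstance(value, str) and value.startswith(prefix):
                name = value[len(prefix):]
                if name not in declared:
                    missing.add(name)
    return missing


# Serialise to JSON, for example to paste into a workflow editor or send to a server.
serialized = json.dumps(workflow_definition, indent=2)
```

The undeclared_inputs check is a cheap local sanity test; it does not replace the validation the workflow engine performs when the definition is compiled.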