LMM For Classification¶
Class: LMMForClassificationBlockV1
Source: inference.core.workflows.core_steps.models.foundation.lmm_classifier.v1.LMMForClassificationBlockV1
Classify an image into one or more categories using a Large Multimodal Model (LMM).
You can specify arbitrary classes for the LMM to classify against.
The block supports the following LMM:
- OpenAI's GPT-4 with Vision. You need to provide your OpenAI API key to use this model.
Type identifier¶
Use the identifier `roboflow_core/lmm_for_classification@v1` in the step "type" field to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `lmm_type` | `str` | Type of LMM to be used. | ✅ |
| `classes` | `List[str]` | List of classes that the LMM shall classify against. | ✅ |
| `lmm_config` | `LMMConfig` | Configuration of the LMM. | ❌ |
| `remote_api_key` | `str` | Holds the API key required to call the LMM model; in the current state of development, an OpenAI key is required when `lmm_type=gpt_4v`. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
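For example, a property marked ✅ in the Refs column (such as `remote_api_key`) can be given either a literal value or a selector string that is resolved from workflow inputs at runtime. A minimal sketch of both forms, where the workflow input name `api_key` is an illustrative assumption:

```python
# Two ways to set a parametrisable (✅) property on this step.
# "$inputs.api_key" refers to a hypothetical workflow input named "api_key".

# Static form: the value is baked into the workflow definition.
static_step = {
    "name": "classifier",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": ["cat", "dog"],
    "remote_api_key": "xxx-xxx",  # literal key; avoid in shared definitions
}

# Dynamic form: a selector resolved from workflow inputs at runtime.
dynamic_step = {**static_step, "remote_api_key": "$inputs.api_key"}

# Selectors are plain strings starting with "$", so both forms stay valid JSON.
print(dynamic_step["remote_api_key"])  # -> $inputs.api_key
```

Non-parametrisable (❌) properties such as `name` and `lmm_config` must always be given literal values.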
Available Connections¶
Compatible Blocks
Check what blocks you can connect to LMM For Classification in version v1.
- inputs: Dynamic Crop, OCR Model, Motion Detection, Email Notification, Image Blur, Background Subtraction, SIFT Comparison, OpenAI, Google Vision OCR, Google Gemini, Image Preprocessing, Instance Segmentation Model, Local File Sink, Single-Label Classification Model, Bounding Box Visualization, Model Monitoring Inference Aggregator, Anthropic Claude, Multi-Label Classification Model, Keypoint Detection Model, Slack Notification, Camera Focus, Twilio SMS/MMS Notification, Dot Visualization, Florence-2 Model, Roboflow Dataset Upload, CSV Formatter, Stitch OCR Detections, Depth Estimation, Polygon Visualization, Perspective Correction, Camera Calibration, Corner Visualization, Icon Visualization, Image Slicer, Qwen3.5-VL, Line Counter Visualization, Heatmap Visualization, Morphological Transformation, Stability AI Image Generation, Keypoint Visualization, VLM As Detector, Halo Visualization, Background Color Visualization, Label Visualization, Pixelate Visualization, LMM, CogVLM, Contrast Equalization, Triangle Visualization, Stability AI Outpainting, Dimension Collapse, Mask Visualization, VLM As Classifier, Color Visualization, Text Display, Relative Static Crop, Reference Path Visualization, Llama 3.2 Vision, Image Threshold, Clip Comparison, Classification Label Visualization, Webhook Sink, Circle Visualization, Polygon Zone Visualization, Image Contours, Image Convert Grayscale, Grid Visualization, Buffer, Roboflow Custom Metadata, LMM For Classification, Dynamic Zone, SIFT, Object Detection Model, Detections List Roll-Up, Model Comparison Visualization, Blur Visualization, QR Code Generator, EasyOCR, Absolute Static Crop, S3 Sink, Stability AI Inpainting, Ellipse Visualization, Crop Visualization, Trace Visualization, Twilio SMS Notification, Stitch Images, Size Measurement
- outputs: Dynamic Crop, Time in Zone, Email Notification, Image Blur, OpenAI, Google Vision OCR, SIFT Comparison, Google Gemini, Seg Preview, Image Preprocessing, Instance Segmentation Model, Local File Sink, Bounding Box Visualization, Model Monitoring Inference Aggregator, Anthropic Claude, Slack Notification, Twilio SMS/MMS Notification, Detections Stitch, Dot Visualization, Florence-2 Model, Roboflow Dataset Upload, Cache Set, SAM 3, Depth Estimation, Stitch OCR Detections, Polygon Visualization, Perspective Correction, PTZ Tracking (ONVIF), Line Counter, Moondream2, Icon Visualization, Corner Visualization, Line Counter Visualization, Heatmap Visualization, Stability AI Image Generation, Morphological Transformation, Distance Measurement, Keypoint Visualization, Halo Visualization, Background Color Visualization, Label Visualization, LMM, CogVLM, Contrast Equalization, Triangle Visualization, Stability AI Outpainting, Mask Visualization, Color Visualization, Instance Segmentation Model, Detections Classes Replacement, Text Display, Reference Path Visualization, Llama 3.2 Vision, Image Threshold, Clip Comparison, Classification Label Visualization, Webhook Sink, Circle Visualization, Polygon Zone Visualization, Roboflow Custom Metadata, Perception Encoder Embedding Model, LMM For Classification, Cache Get, YOLO-World Model, Model Comparison Visualization, QR Code Generator, Path Deviation, S3 Sink, CLIP Embedding Model, Stability AI Inpainting, Ellipse Visualization, Crop Visualization, Trace Visualization, Segment Anything 2 Model, Twilio SMS Notification, Size Measurement, Pixel Color Count
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
LMM For Classification in version v1 has.
Bindings
- input
    - `images` (`image`): The image to infer on.
    - `lmm_type` (`string`): Type of LMM to be used.
    - `classes` (`list_of_values`): List of classes that the LMM shall classify against.
    - `remote_api_key` (`Union[string, secret]`): Holds the API key required to call the LMM model; in the current state of development, an OpenAI key is required when `lmm_type=gpt_4v`.
- output
    - `raw_output` (`string`): String value.
    - `top` (`top_class`): String value representing the top class predicted by the classification model.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
    - `image` (`image_metadata`): Dictionary with image metadata required by supervision.
    - `prediction_type` (`prediction_type`): String value with type of prediction.
Example JSON definition of step LMM For Classification in version v1:

```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": [
        "a",
        "b"
    ],
    "lmm_config": {
        "gpt_image_detail": "low",
        "gpt_model_version": "gpt-4o",
        "max_tokens": 200
    },
    "remote_api_key": "xxx-xxx"
}
```
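To exercise the step end to end, the definition above can be embedded in a full workflow specification and sent to an inference server. A minimal sketch of such a specification, where the input names (`image`, `api_key`) and the output name (`classification`) are illustrative assumptions:

```python
import json

# Minimal workflow specification wrapping the step documented above.
# Input names ("image", "api_key") and the output name are illustrative.
specification = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        {"type": "WorkflowParameter", "name": "api_key"},
    ],
    "steps": [
        {
            "name": "lmm_classifier",
            "type": "roboflow_core/lmm_for_classification@v1",
            "images": "$inputs.image",
            "lmm_type": "gpt_4v",
            "classes": ["a", "b"],
            "lmm_config": {
                "gpt_image_detail": "low",
                "gpt_model_version": "gpt-4o",
                "max_tokens": 200,
            },
            "remote_api_key": "$inputs.api_key",  # bound at runtime, not hard-coded
        }
    ],
    "outputs": [
        {
            "type": "JsonField",
            "name": "classification",
            "selector": "$steps.lmm_classifier.top",  # the top-class output binding
        }
    ],
}

# The specification must remain serializable as JSON.
serialized = json.dumps(specification)
```

With a running inference server, this specification could then be executed with the `inference-sdk` client, roughly `InferenceHTTPClient(api_url=...).run_workflow(specification=specification, images={"image": ...})`; treat that call shape as an assumption to verify against the current SDK documentation.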