LMM For Classification
Deprecated
This block is deprecated and may be removed in a future release.
Class: LMMForClassificationBlockV1
Source: inference.core.workflows.core_steps.models.foundation.lmm_classifier.v1.LMMForClassificationBlockV1
Classify an image into one or more categories using a Large Multimodal Model (LMM).
You can specify arbitrary classes for the LMM to classify against.
The block supports the following LMM:
- OpenAI's GPT-4 with Vision.
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
Type identifier
Use the following identifier in the step "type" field: `roboflow_core/lmm_for_classification@v1` to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| lmm_type | str | Type of LMM to be used. | ✅ |
| classes | List[str] | List of classes that the LMM shall classify against. | ✅ |
| lmm_config | LMMConfig | Configuration of the LMM. | ❌ |
| remote_api_key | str | API key required to call the LMM. In the current state of development, an OpenAI key is required when lmm_type=gpt_4v. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
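As a rough illustration of the Refs column, the sketch below builds a step whose ✅ properties are bound to runtime selectors and checks that no literal-only property holds one. The helper names (`REF_CAPABLE`, `is_selector`, `misplaced_selectors`), the step name `classifier`, and the input names `classes` and `openai_api_key` are hypothetical and not part of the `inference` package; `images` is included because it is listed as an input binding for this block.

```python
# Sketch: which fields of an lmm_for_classification@v1 step may hold selectors.
# REF_CAPABLE mirrors the ✅ rows of the properties table plus the `images`
# input binding. All helper names here are hypothetical.
REF_CAPABLE = {"images", "lmm_type", "classes", "remote_api_key"}


def is_selector(value):
    """Return True for strings that look like workflow selectors."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def misplaced_selectors(step):
    """List fields that hold a selector although they only accept literals."""
    return sorted(
        key for key, value in step.items()
        if is_selector(value) and key not in REF_CAPABLE
    )


step = {
    "name": "classifier",                        # literal only (❌)
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",                        # could also be a selector (✅)
    "classes": "$inputs.classes",                # resolved at workflow runtime
    "lmm_config": {"max_tokens": 200},           # literal only (❌)
    "remote_api_key": "$inputs.openai_api_key",  # keeps the key out of the definition
}

print(misplaced_selectors(step))  # prints []
```

Binding `remote_api_key` to an input rather than writing the key into the definition means the same workflow can be shared without exposing the secret.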
Available Connections
Compatible Blocks
Check what blocks you can connect to LMM For Classification in version v1.
- inputs:
Stitch Images,Image Threshold,Email Notification,Corner Visualization,Image Blur,Ellipse Visualization,OpenAI,Roboflow Dataset Upload,Object Detection Model,Depth Estimation,Stitch OCR Detections,EasyOCR,Absolute Static Crop,Dimension Collapse,CogVLM,Google Gemini,Stability AI Image Generation,Grid Visualization,Dynamic Crop,Image Slicer,Image Preprocessing,Relative Static Crop,SIFT,Morphological Transformation,Instance Segmentation Model,Line Counter Visualization,Trace Visualization,LMM For Classification,Halo Visualization,Dot Visualization,GLM-OCR,Model Monitoring Inference Aggregator,Roboflow Custom Metadata,Pixelate Visualization,Circle Visualization,Image Convert Grayscale,Icon Visualization,QR Code Generator,S3 Sink,Keypoint Detection Model,Twilio SMS Notification,Halo Visualization,Camera Focus,Anthropic Claude,OCR Model,Polygon Visualization,Text Display,Reference Path Visualization,Llama 3.2 Vision,CSV Formatter,Crop Visualization,Roboflow Dataset Upload,Mask Visualization,Heatmap Visualization,Webhook Sink,Label Visualization,Classification Label Visualization,Google Vision OCR,Detections List Roll-Up,Florence-2 Model,Florence-2 Model,VLM As Detector,Polygon Zone Visualization,Stability AI Inpainting,Google Gemini,Perspective Correction,Camera Calibration,Anthropic Claude,OpenAI,OpenAI,Qwen3.5-VL,Background Color Visualization,Anthropic Claude,Size Measurement,Email Notification,Background Subtraction,Contrast Equalization,SIFT Comparison,Multi-Label Classification Model,Keypoint Visualization,Stitch OCR Detections,LMM,Color Visualization,Single-Label Classification Model,Motion Detection,Dynamic Zone,OpenAI,Roboflow Vision Events,Local File Sink,VLM As Classifier,Twilio SMS/MMS Notification,Clip Comparison,Buffer,Triangle Visualization,Clip Comparison,Blur Visualization,Bounding Box Visualization,Camera Focus,Polygon Visualization,Google Gemini,Image Slicer,Image Contours,Model Comparison Visualization,Stability AI Outpainting,Slack Notification
- outputs:
Image Threshold,Email Notification,Corner Visualization,Image Blur,Ellipse Visualization,OpenAI,Roboflow Dataset Upload,Time in Zone,Stitch OCR Detections,Depth Estimation,CogVLM,Google Gemini,Stability AI Image Generation,Time in Zone,Dynamic Crop,Instance Segmentation Model,Image Preprocessing,Morphological Transformation,Line Counter Visualization,Trace Visualization,LMM For Classification,Halo Visualization,Dot Visualization,GLM-OCR,Polygon Visualization,Model Monitoring Inference Aggregator,Cache Get,Roboflow Custom Metadata,Pixel Color Count,Circle Visualization,Icon Visualization,QR Code Generator,S3 Sink,Detections Classes Replacement,Twilio SMS Notification,Halo Visualization,Anthropic Claude,SAM 3,Polygon Visualization,Text Display,Reference Path Visualization,Instance Segmentation Model,Llama 3.2 Vision,Roboflow Dataset Upload,Crop Visualization,Mask Visualization,CLIP Embedding Model,Heatmap Visualization,Webhook Sink,Cache Set,Google Vision OCR,Label Visualization,Classification Label Visualization,Florence-2 Model,Segment Anything 2 Model,Florence-2 Model,Polygon Zone Visualization,Stability AI Inpainting,Google Gemini,SAM 3,Perspective Correction,Anthropic Claude,OpenAI,OpenAI,PTZ Tracking (ONVIF),Background Color Visualization,Anthropic Claude,Size Measurement,Email Notification,SIFT Comparison,Contrast Equalization,Keypoint Visualization,Time in Zone,Line Counter,Path Deviation,Stitch OCR Detections,LMM,Perception Encoder Embedding Model,Line Counter,SAM 3,Seg Preview,Color Visualization,OpenAI,Roboflow Vision Events,Local File Sink,Detections Stitch,Twilio SMS/MMS Notification,YOLO-World Model,Triangle Visualization,Clip Comparison,Bounding Box Visualization,Distance Measurement,Google Gemini,Path Deviation,Moondream2,Model Comparison Visualization,Stability AI Outpainting,Slack Notification
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
LMM For Classification in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - lmm_type (string): Type of LMM to be used.
    - classes (list_of_values): List of classes that LMM shall classify against.
    - remote_api_key (Union[secret, string]): Holds API key required to call LMM model - in current state of development, we require OpenAI key when lmm_type=gpt_4v.
- output
    - raw_output (string): String value.
    - top (top_class): String value representing top class predicted by classification model.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - prediction_type (prediction_type): String value with type of prediction.
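The output bindings listed above are what later steps and workflow outputs refer to. The following is a minimal sketch of building `$steps.<step_name>.<output>` selectors for this block and exposing two of them as workflow outputs. It assumes the usual Workflows output entry shape (`JsonField` with `name` and `selector`); the helper `output_selector` and the step name `classifier` are hypothetical.

```python
# Output names copied from the bindings list for this block.
STEP_OUTPUTS = (
    "raw_output", "top", "parent_id", "root_parent_id", "image", "prediction_type",
)


def output_selector(step_name, output_name):
    """Build a `$steps.<step>.<output>` selector, rejecting unknown outputs."""
    if output_name not in STEP_OUTPUTS:
        raise ValueError(f"unknown output: {output_name!r}")
    return f"$steps.{step_name}.{output_name}"


# Workflow-level outputs exposing the predicted class and the raw LMM reply.
# "classifier" is a hypothetical step name.
workflow_outputs = [
    {"type": "JsonField", "name": name, "selector": output_selector("classifier", name)}
    for name in ("top", "raw_output")
]

for entry in workflow_outputs:
    print(entry["name"], "->", entry["selector"])
```

Validating the output name up front catches typos when the definition is built instead of when the workflow is compiled.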
Example JSON definition of step LMM For Classification in version v1
    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/lmm_for_classification@v1",
        "images": "$inputs.image",
        "lmm_type": "gpt_4v",
        "classes": [
            "a",
            "b"
        ],
        "lmm_config": {
            "gpt_image_detail": "low",
            "gpt_model_version": "gpt-4o",
            "max_tokens": 200
        },
        "remote_api_key": "xxx-xxx"
    }
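The step definition above is only one entry of a full workflow. As a hedged sketch, the snippet below embeds it in a complete definition with one image input, one parameter for the API key, and one output. The surrounding structure (`version`, `WorkflowImage`, `WorkflowParameter`, `JsonField`) follows the general Workflows definition format as an assumption; the function `build_workflow`, the step name `classifier`, the parameter name `openai_api_key`, and the `OPENAI_API_KEY` environment variable are hypothetical choices for illustration.

```python
import json
import os


def build_workflow(classes, step_name="classifier"):
    """Return a workflow definition with a single LMM For Classification step."""
    return {
        "version": "1.0",
        "inputs": [
            {"type": "WorkflowImage", "name": "image"},
            {"type": "WorkflowParameter", "name": "openai_api_key"},
        ],
        "steps": [
            {
                "name": step_name,
                "type": "roboflow_core/lmm_for_classification@v1",
                "images": "$inputs.image",
                "lmm_type": "gpt_4v",
                "classes": list(classes),
                "lmm_config": {
                    "gpt_image_detail": "low",
                    "gpt_model_version": "gpt-4o",
                    "max_tokens": 200,
                },
                # Bound to a runtime parameter instead of a hard-coded key.
                "remote_api_key": "$inputs.openai_api_key",
            }
        ],
        "outputs": [
            {
                "type": "JsonField",
                "name": "top_class",
                "selector": f"$steps.{step_name}.top",
            },
        ],
    }


workflow = build_workflow(["a", "b"])
# Runtime parameters; OPENAI_API_KEY is an assumed environment variable name.
parameters = {"openai_api_key": os.environ.get("OPENAI_API_KEY", "")}
print(json.dumps(workflow, indent=2))
```

The definition and the `parameters` dictionary would then be passed to whichever execution route you use. Since the block is deprecated, treat this as a reference for existing workflows, not a starting point for new ones.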