LMM For Classification¶
Class: LMMForClassificationBlockV1
Source: inference.core.workflows.core_steps.models.foundation.lmm_classifier.v1.LMMForClassificationBlockV1
Classify an image into one or more categories using a Large Multimodal Model (LMM).
You can pass an arbitrary list of classes to the LMMBlock.
The LMMBlock supports the following LMMs:
- OpenAI's GPT-4 with Vision.
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/lmm_for_classification@v1 to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| lmm_type | str | Type of LMM to be used. | ✅ |
| classes | List[str] | List of classes that LMM shall classify against. | ✅ |
| lmm_config | LMMConfig | Configuration of LMM. | ❌ |
| remote_api_key | str | Holds API key required to call LMM model - in current state of development, we require OpenAI key when lmm_type=gpt_4v. | ✅ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
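As a rough illustration of what the Refs column means in practice, the sketch below checks a step definition for selectors placed on properties that only accept literal values. The helper names (`is_selector`, `check_step`) and the input names (`$inputs.lmm_type`, `$inputs.classes`, `$inputs.openai_api_key`) are hypothetical, and the assumption that selectors begin with `$inputs.` or `$steps.` is inferred from the `$inputs.image` reference used in this page's example; this is not the library's own validation logic.

```python
from typing import Any, Dict, List

# Properties marked with a check mark in the Refs column of the table above.
PARAMETRISABLE = {"lmm_type", "classes", "remote_api_key"}
# Properties marked with a cross: these are expected to hold literal values.
STATIC_ONLY = {"name", "lmm_config"}


def is_selector(value: Any) -> bool:
    """Return True for strings that look like workflow selectors.

    Assumption: selectors start with "$inputs." or "$steps.".
    """
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def check_step(step: Dict[str, Any]) -> List[str]:
    """Return the static-only properties that were given a selector."""
    return sorted(key for key in STATIC_ONLY if is_selector(step.get(key)))


# Hypothetical step: every parametrisable property is bound to a runtime input.
step = {
    "name": "classifier",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "$inputs.lmm_type",
    "classes": "$inputs.classes",
    "lmm_config": {"max_tokens": 200},
    "remote_api_key": "$inputs.openai_api_key",
}

problems = check_step(step)
dynamic = sorted(key for key in PARAMETRISABLE if is_selector(step.get(key)))
```

Here `problems` stays empty because `name` and `lmm_config` hold literal values, while `dynamic` lists the properties that will be resolved at runtime.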
Available Connections¶
Compatible Blocks
Check what blocks you can connect to LMM For Classification in version v1.
- inputs:
Clip Comparison, Halo Visualization, Anthropic Claude, Image Blur, Email Notification, Camera Focus, Text Display, CSV Formatter, Contrast Equalization, Object Detection Model, Blur Visualization, Roboflow Dataset Upload, Corner Visualization, Dynamic Crop, Classification Label Visualization, Relative Static Crop, Dimension Collapse, Trace Visualization, Multi-Label Classification Model, Camera Focus, Mask Visualization, SIFT, Stitch OCR Detections, Background Color Visualization, Polygon Visualization, Local File Sink, Florence-2 Model, Stitch OCR Detections, Size Measurement, Ellipse Visualization, Pixelate Visualization, SIFT Comparison, Roboflow Custom Metadata, LMM, Detections List Roll-Up, OpenAI, Circle Visualization, Dot Visualization, Twilio SMS Notification, Absolute Static Crop, Morphological Transformation, Crop Visualization, Single-Label Classification Model, Google Gemini, Dynamic Zone, Keypoint Visualization, Polygon Visualization, Google Vision OCR, Icon Visualization, Llama 3.2 Vision, Color Visualization, Image Contours, Stitch Images, OpenAI, VLM As Classifier, LMM For Classification, Email Notification, VLM As Detector, Anthropic Claude, OpenAI, Bounding Box Visualization, Triangle Visualization, Background Subtraction, Grid Visualization, Model Comparison Visualization, Reference Path Visualization, Line Counter Visualization, EasyOCR, Halo Visualization, OCR Model, Instance Segmentation Model, Slack Notification, Stability AI Inpainting, Image Slicer, Google Gemini, Florence-2 Model, Image Convert Grayscale, Stability AI Image Generation, Heatmap Visualization, Polygon Zone Visualization, Image Slicer, Label Visualization, Google Gemini, Depth Estimation, Image Preprocessing, OpenAI, Stability AI Outpainting, CogVLM, Webhook Sink, Image Threshold, Anthropic Claude, Buffer, Camera Calibration, Perspective Correction, QR Code Generator, Motion Detection, Keypoint Detection Model, Model Monitoring Inference Aggregator, Twilio SMS/MMS Notification, Roboflow Dataset Upload, Clip Comparison
- outputs:
Anthropic Claude, Clip Comparison, Halo Visualization, Image Blur, Email Notification, CLIP Embedding Model, Text Display, Contrast Equalization, Segment Anything 2 Model, Roboflow Dataset Upload, Dynamic Crop, Corner Visualization, Classification Label Visualization, Trace Visualization, SAM 3, Mask Visualization, Stitch OCR Detections, Path Deviation, YOLO-World Model, Background Color Visualization, PTZ Tracking (ONVIF), Polygon Visualization, Moondream2, Local File Sink, Florence-2 Model, Stitch OCR Detections, Size Measurement, Line Counter, Ellipse Visualization, SIFT Comparison, Roboflow Custom Metadata, LMM, SAM 3, OpenAI, Dot Visualization, Circle Visualization, Twilio SMS Notification, Morphological Transformation, Crop Visualization, Google Gemini, Keypoint Visualization, Polygon Visualization, Google Vision OCR, Icon Visualization, Llama 3.2 Vision, OpenAI, Color Visualization, LMM For Classification, Email Notification, Anthropic Claude, Distance Measurement, OpenAI, Triangle Visualization, Detections Classes Replacement, Bounding Box Visualization, Model Comparison Visualization, Reference Path Visualization, Instance Segmentation Model, Line Counter Visualization, Halo Visualization, Slack Notification, Time in Zone, Stability AI Inpainting, Line Counter, Google Gemini, Florence-2 Model, Stability AI Image Generation, Heatmap Visualization, Pixel Color Count, Seg Preview, Polygon Zone Visualization, Label Visualization, Google Gemini, Depth Estimation, Image Preprocessing, Time in Zone, Cache Get, Detections Stitch, OpenAI, Time in Zone, Stability AI Outpainting, CogVLM, Webhook Sink, Image Threshold, Anthropic Claude, Instance Segmentation Model, Perception Encoder Embedding Model, Path Deviation, Cache Set, Perspective Correction, SAM 3, QR Code Generator, Model Monitoring Inference Aggregator, Twilio SMS/MMS Notification, Roboflow Dataset Upload
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
LMM For Classification in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - lmm_type (string): Type of LMM to be used.
    - classes (list_of_values): List of classes that LMM shall classify against.
    - remote_api_key (Union[string, secret]): Holds API key required to call LMM model - in current state of development, we require OpenAI key when lmm_type=gpt_4v.
- output
    - raw_output (string): String value.
    - top (top_class): String value representing top class predicted by classification model.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - prediction_type (prediction_type): String value with type of prediction.
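To show how the output bindings might be consumed, here is a minimal sketch that turns output names into workflow output entries. It assumes outputs are referenced with selectors of the form `$steps.<step_name>.<output_name>` and exposed through entries of type `JsonField`; both are assumptions about the surrounding workflow format, not something this page states. The helper `output_fields` and the step name `classifier` are hypothetical.

```python
from typing import Dict, List

# Output names copied from the bindings list above.
STEP_OUTPUTS: List[str] = [
    "raw_output",
    "top",
    "parent_id",
    "root_parent_id",
    "image",
    "prediction_type",
]


def output_fields(step_name: str, wanted: List[str]) -> List[Dict[str, str]]:
    """Build workflow output entries for the requested step outputs.

    Assumption: outputs are declared as "JsonField" entries whose selector
    has the form "$steps.<step_name>.<output_name>".
    """
    unknown = [name for name in wanted if name not in STEP_OUTPUTS]
    if unknown:
        raise ValueError(f"Unknown outputs for this block: {unknown}")
    return [
        {
            "type": "JsonField",
            "name": name,
            "selector": f"$steps.{step_name}.{name}",
        }
        for name in wanted
    ]


# Expose the predicted class and the raw LMM response of a hypothetical step.
fields = output_fields("classifier", ["top", "raw_output"])
```

Rejecting unknown names early keeps a typo such as `top_class` (the kind, not the output name) from reaching the workflow definition.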
Example JSON definition of step LMM For Classification in version v1
    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/lmm_for_classification@v1",
        "images": "$inputs.image",
        "lmm_type": "gpt_4v",
        "classes": [
            "a",
            "b"
        ],
        "lmm_config": {
            "gpt_image_detail": "low",
            "gpt_model_version": "gpt-4o",
            "max_tokens": 200
        },
        "remote_api_key": "xxx-xxx"
    }
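The step definition above can be embedded in a complete workflow. The sketch below is one possible arrangement, assuming a workflow document with `version`, `inputs`, `steps` and `outputs` keys and input types named `WorkflowImage` and `WorkflowParameter`; that surrounding structure, the function `build_workflow`, the step name `classifier`, and the input name `openai_api_key` are assumptions for illustration. It passes the API key as a runtime input instead of hard-coding it, which the Refs column permits for remote_api_key.

```python
import json
from typing import Any, Dict


def build_workflow(api_key_input: str = "openai_api_key") -> Dict[str, Any]:
    """Assemble a workflow definition around the classification step.

    Assumption: the top-level keys and the input/output type names below
    follow the workflow definition format; only the step body comes from
    the example on this page.
    """
    return {
        "version": "1.0",
        "inputs": [
            {"type": "WorkflowImage", "name": "image"},
            {"type": "WorkflowParameter", "name": api_key_input},
        ],
        "steps": [
            {
                "name": "classifier",
                "type": "roboflow_core/lmm_for_classification@v1",
                "images": "$inputs.image",
                "lmm_type": "gpt_4v",
                "classes": ["a", "b"],
                "lmm_config": {
                    "gpt_image_detail": "low",
                    "gpt_model_version": "gpt-4o",
                    "max_tokens": 200,
                },
                # The key is supplied at runtime rather than stored here.
                "remote_api_key": f"$inputs.{api_key_input}",
            }
        ],
        "outputs": [
            {
                "type": "JsonField",
                "name": "top_class",
                "selector": "$steps.classifier.top",
            }
        ],
    }


workflow = build_workflow()
serialized = json.dumps(workflow, indent=2)
```

Keeping the key out of the definition means the serialized JSON can be stored or shared without exposing the credential.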