LMM For Classification
Class: LMMForClassificationBlockV1
Source: inference.core.workflows.core_steps.models.foundation.lmm_classifier.v1.LMMForClassificationBlockV1
Classify an image into one or more categories using a Large Multimodal Model (LMM).
You can pass an arbitrary list of classes to the block.
Supported LMMs:
- OpenAI's GPT-4 with Vision.
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/lmm_for_classification@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| lmm_type | str | Type of LMM to be used. | ✅ |
| classes | List[str] | List of classes that LMM shall classify against. | ✅ |
| lmm_config | LMMConfig | Configuration of LMM. | ❌ |
| remote_api_key | str | Holds API key required to call LMM model - in current state of development, we require OpenAI key when lmm_type=gpt_4v. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
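As a rough illustration of what the Refs column implies, the sketch below flags selectors placed in properties that the table marks ❌. The helper names `is_selector` and `find_invalid_refs` are hypothetical, and the check that a selector is a string starting with `$inputs.` or `$steps.` is an assumption based on the example on this page; the Workflows engine performs its own validation.

```python
# Properties marked with ❌ in the Refs column: they must hold literal values.
NON_PARAMETRISABLE = {"name", "lmm_config"}


def is_selector(value):
    """Return True if value looks like a workflow selector (assumed prefix convention)."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def find_invalid_refs(step):
    """Return the sorted names of ❌ properties that hold a selector."""
    return sorted(key for key in NON_PARAMETRISABLE if is_selector(step.get(key)))


# Hypothetical step definition: only ✅ properties are bound to runtime inputs.
step = {
    "name": "classifier",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": "$inputs.classes",
    "remote_api_key": "$inputs.open_ai_key",
}

print(find_invalid_refs(step))  # []
```

A step that set `name` to something like `$inputs.step_name` would be reported by this check, since `name` is not parametrisable.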
Available Connections
Compatible Blocks
Check what blocks you can connect to LMM For Classification in version v1.
- inputs:
Llama 3.2 Vision, Blur Visualization, Dimension Collapse, Perspective Correction, Polygon Zone Visualization, Bounding Box Visualization, QR Code Generator, Pixelate Visualization, Trace Visualization, Roboflow Custom Metadata, Image Threshold, Polygon Visualization, Dynamic Crop, Icon Visualization, Image Slicer, Stability AI Outpainting, Model Comparison Visualization, Dynamic Zone, Clip Comparison, LMM, OpenAI, Classification Label Visualization, Stitch Images, Florence-2 Model, Mask Visualization, Single-Label Classification Model, Size Measurement, Relative Static Crop, Absolute Static Crop, SIFT Comparison, Google Gemini, Circle Visualization, Florence-2 Model, LMM For Classification, Ellipse Visualization, Image Convert Grayscale, Object Detection Model, OCR Model, Image Preprocessing, Color Visualization, Image Blur, Stability AI Image Generation, Anthropic Claude, Google Vision OCR, Keypoint Visualization, Camera Calibration, Local File Sink, EasyOCR, Image Slicer, Email Notification, VLM as Detector, Roboflow Dataset Upload, Background Color Visualization, Triangle Visualization, Slack Notification, Keypoint Detection Model, Halo Visualization, Corner Visualization, Google Gemini, Model Monitoring Inference Aggregator, Roboflow Dataset Upload, Dot Visualization, Image Contours, Multi-Label Classification Model, Twilio SMS Notification, Instance Segmentation Model, VLM as Classifier, CSV Formatter, Reference Path Visualization, Morphological Transformation, Motion Detection, OpenAI, Webhook Sink, Contrast Equalization, Camera Focus, Stitch OCR Detections, Stability AI Inpainting, CogVLM, Clip Comparison, Line Counter Visualization, Email Notification, Crop Visualization, Grid Visualization, OpenAI, Buffer, SIFT, Depth Estimation, Background Subtraction, Label Visualization, Anthropic Claude, OpenAI
- outputs:
Line Counter, Path Deviation, Llama 3.2 Vision, Perception Encoder Embedding Model, SAM 3, Perspective Correction, Polygon Zone Visualization, Bounding Box Visualization, QR Code Generator, Distance Measurement, Trace Visualization, Roboflow Custom Metadata, Segment Anything 2 Model, Image Threshold, Polygon Visualization, Icon Visualization, Dynamic Crop, Stability AI Outpainting, Model Comparison Visualization, LMM, OpenAI, Cache Get, Classification Label Visualization, Size Measurement, Florence-2 Model, Mask Visualization, SAM 3, SIFT Comparison, Time in Zone, Moondream2, Google Gemini, Circle Visualization, Florence-2 Model, LMM For Classification, Time in Zone, Ellipse Visualization, Anthropic Claude, Image Preprocessing, Image Blur, Stability AI Image Generation, Color Visualization, Google Vision OCR, Keypoint Visualization, Local File Sink, Line Counter, Email Notification, Roboflow Dataset Upload, Background Color Visualization, Triangle Visualization, Slack Notification, Halo Visualization, Corner Visualization, Google Gemini, Model Monitoring Inference Aggregator, Roboflow Dataset Upload, Dot Visualization, Twilio SMS Notification, Instance Segmentation Model, Seg Preview, Reference Path Visualization, Morphological Transformation, OpenAI, Webhook Sink, PTZ Tracking (ONVIF), Detections Classes Replacement, Instance Segmentation Model, Detections Stitch, Contrast Equalization, YOLO-World Model, Stitch OCR Detections, Stability AI Inpainting, Clip Comparison, CogVLM, Line Counter Visualization, Cache Set, Path Deviation, CLIP Embedding Model, Email Notification, Crop Visualization, OpenAI, SAM 3, Anthropic Claude, Label Visualization, Pixel Color Count, Time in Zone, OpenAI
Input and Output Bindings
The available connections depend on the block's binding kinds. Check what binding kinds
LMM For Classification in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - lmm_type (string): Type of LMM to be used.
    - classes (list_of_values): List of classes that LMM shall classify against.
    - remote_api_key (Union[secret, string]): Holds API key required to call LMM model - in current state of development, we require OpenAI key when lmm_type=gpt_4v.
- output
    - raw_output (string): String value.
    - top (top_class): String value representing top class predicted by classification model.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - prediction_type (prediction_type): String value with type of prediction.
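To show how these bindings might be wired together, here is a minimal sketch of a complete workflow definition built in Python. The top-level layout (`version`, `inputs`, `steps`, `outputs`) and the `WorkflowImage`, `WorkflowParameter` and `JsonField` type names are assumptions drawn from general Workflows conventions and are not stated on this page. The step name `classifier`, the input names and the helper `build_outputs` are hypothetical.

```python
import json

STEP_NAME = "classifier"  # hypothetical step name

# Output names copied from the bindings list for this block.
OUTPUT_NAMES = [
    "raw_output", "top", "parent_id", "root_parent_id", "image", "prediction_type",
]


def build_outputs(step_name, names):
    """Create one workflow output entry per requested step output."""
    return [
        {"type": "JsonField", "name": name, "selector": f"$steps.{step_name}.{name}"}
        for name in names
    ]


workflow = {
    "version": "1.0",  # assumed schema version
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        {"type": "WorkflowParameter", "name": "classes"},
        {"type": "WorkflowParameter", "name": "open_ai_key"},
    ],
    "steps": [
        {
            "name": STEP_NAME,
            "type": "roboflow_core/lmm_for_classification@v1",
            "images": "$inputs.image",
            "lmm_type": "gpt_4v",
            "classes": "$inputs.classes",
            "remote_api_key": "$inputs.open_ai_key",
        }
    ],
    # Expose only the predicted class and the raw model response.
    "outputs": build_outputs(STEP_NAME, ["top", "raw_output"]),
}

print(json.dumps(workflow["outputs"], indent=2))
```

Passing `OUTPUT_NAMES` to `build_outputs` instead would expose every output listed for the block.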
Example JSON definition of step LMM For Classification in version v1

    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/lmm_for_classification@v1",
        "images": "$inputs.image",
        "lmm_type": "gpt_4v",
        "classes": [
            "a",
            "b"
        ],
        "lmm_config": {
            "gpt_image_detail": "low",
            "gpt_model_version": "gpt-4o",
            "max_tokens": 200
        },
        "remote_api_key": "xxx-xxx"
    }
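The same step definition can be assembled in Python rather than written by hand, which avoids hard-coding the API key. The following is a sketch under stated assumptions: the helper `build_step` and the environment variable name `OPENAI_API_KEY` are hypothetical choices for illustration, and the `lmm_config` values are simply copied from the JSON example.

```python
import json
import os


def build_step(name, classes, api_key_env="OPENAI_API_KEY"):
    """Return a step definition mirroring the JSON example, with the key read from the environment."""
    api_key = os.environ.get(api_key_env)
    if not api_key:
        raise RuntimeError(f"Set {api_key_env} before building the step")
    if not classes:
        raise ValueError("classes must contain at least one label")
    return {
        "name": name,
        "type": "roboflow_core/lmm_for_classification@v1",
        "images": "$inputs.image",
        "lmm_type": "gpt_4v",
        "classes": list(classes),
        "lmm_config": {
            "gpt_image_detail": "low",
            "gpt_model_version": "gpt-4o",
            "max_tokens": 200,
        },
        "remote_api_key": api_key,
    }


if __name__ == "__main__":
    # Placeholder value for the demo only; use a real key in practice.
    os.environ.setdefault("OPENAI_API_KEY", "xxx-xxx")
    print(json.dumps(build_step("my_classifier", ["a", "b"]), indent=4))
```

Since `remote_api_key` is marked ✅ in the Refs column, binding it to a workflow input at runtime is an alternative to embedding the value in the definition.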