LMM For Classification
Class: LMMForClassificationBlockV1
Source: inference.core.workflows.core_steps.models.foundation.lmm_classifier.v1.LMMForClassificationBlockV1
Classify an image into one or more categories using a Large Multimodal Model (LMM).
You can specify arbitrary classes for the block to classify against.
The block supports the following LMM:
- OpenAI's GPT-4 with Vision.
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
Type identifier
Use the following identifier in the step "type" field: `roboflow_core/lmm_for_classification@v1` to add the block
as a step in your workflow.
Properties

| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `lmm_type` | `str` | Type of LMM to be used. | ✅ |
| `classes` | `List[str]` | List of classes that the LMM shall classify against. | ✅ |
| `lmm_config` | `LMMConfig` | Configuration of the LMM. | ❌ |
| `remote_api_key` | `str` | Holds the API key required to call the LMM model; in the current state of development, an OpenAI key is required when `lmm_type=gpt_4v`. | ✅ |
The Refs column indicates whether the property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
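As an illustration of such parametrisation, the sketch below binds the ✅-marked properties to workflow inputs instead of hard-coding them. It is a minimal example expressed as a Python dict; the input names (`chosen_lmm`, `target_classes`, `open_ai_key`) are placeholders, not names defined by this block.

```python
# Minimal sketch of the step with its parametrisable (✅) properties bound to
# workflow inputs. The $inputs.* names are hypothetical placeholders.
lmm_classification_step = {
    "name": "lmm_classifier",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "$inputs.chosen_lmm",         # e.g. "gpt_4v"
    "classes": "$inputs.target_classes",      # e.g. ["cat", "dog"]
    "remote_api_key": "$inputs.open_ai_key",  # OpenAI key when lmm_type=gpt_4v
}
```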
Available Connections
Compatible Blocks
Check what blocks you can connect to LMM For Classification in version v1.
- inputs:
LMM,Background Color Visualization,Stitch Images,Image Slicer,Size Measurement,Dynamic Zone,Model Monitoring Inference Aggregator,Corner Visualization,Camera Calibration,Mask Visualization,Object Detection Model,Local File Sink,Model Comparison Visualization,Email Notification,Pixelate Visualization,Anthropic Claude,Relative Static Crop,Google Gemini,Florence-2 Model,Multi-Label Classification Model,Ellipse Visualization,Triangle Visualization,Camera Focus,OCR Model,QR Code Generator,Label Visualization,Roboflow Custom Metadata,Florence-2 Model,LMM For Classification,Blur Visualization,Dot Visualization,Stability AI Image Generation,Perspective Correction,Google Vision OCR,Llama 3.2 Vision,EasyOCR,Absolute Static Crop,Slack Notification,Morphological Transformation,Image Blur,Image Threshold,Clip Comparison,Stitch OCR Detections,Depth Estimation,Stability AI Outpainting,Halo Visualization,Stability AI Inpainting,Polygon Visualization,OpenAI,Grid Visualization,Roboflow Dataset Upload,CogVLM,Classification Label Visualization,Email Notification,VLM as Detector,Instance Segmentation Model,Bounding Box Visualization,Image Convert Grayscale,Polygon Zone Visualization,OpenAI,Clip Comparison,Keypoint Detection Model,Crop Visualization,Image Slicer,Icon Visualization,Color Visualization,Roboflow Dataset Upload,Dimension Collapse,Keypoint Visualization,Contrast Equalization,Buffer,Image Contours,Circle Visualization,OpenAI,VLM as Classifier,CSV Formatter,Reference Path Visualization,Twilio SMS Notification,Dynamic Crop,Webhook Sink,Single-Label Classification Model,SIFT,Line Counter Visualization,Image Preprocessing,Trace Visualization,SIFT Comparison
- outputs:
LMM,Background Color Visualization,Size Measurement,Model Monitoring Inference Aggregator,Corner Visualization,Detections Classes Replacement,Mask Visualization,CLIP Embedding Model,Line Counter,Local File Sink,Model Comparison Visualization,Email Notification,Time in Zone,Anthropic Claude,Google Gemini,Florence-2 Model,Ellipse Visualization,Triangle Visualization,Segment Anything 2 Model,QR Code Generator,Label Visualization,Time in Zone,Pixel Color Count,Roboflow Custom Metadata,Cache Set,Florence-2 Model,LMM For Classification,Dot Visualization,Stability AI Image Generation,Perspective Correction,Google Vision OCR,Llama 3.2 Vision,Line Counter,Slack Notification,SAM 3,Morphological Transformation,Image Blur,Clip Comparison,Image Threshold,Stitch OCR Detections,Stability AI Outpainting,Halo Visualization,Stability AI Inpainting,Polygon Visualization,OpenAI,Roboflow Dataset Upload,Path Deviation,Distance Measurement,CogVLM,Classification Label Visualization,Trace Visualization,Email Notification,Instance Segmentation Model,Bounding Box Visualization,Polygon Zone Visualization,Perception Encoder Embedding Model,OpenAI,Detections Stitch,Crop Visualization,Cache Get,YOLO-World Model,Moondream2,Icon Visualization,Seg Preview,Color Visualization,Path Deviation,Roboflow Dataset Upload,Keypoint Visualization,Contrast Equalization,Instance Segmentation Model,Circle Visualization,OpenAI,Time in Zone,Reference Path Visualization,Twilio SMS Notification,Webhook Sink,Line Counter Visualization,PTZ Tracking (ONVIF),Image Preprocessing,Dynamic Crop,SIFT Comparison
Input and Output Bindings
The available connections depend on the block's binding kinds. Check what binding kinds
LMM For Classification in version v1 has.
Bindings
- input
    - `images` (`image`): The image to infer on.
    - `lmm_type` (`string`): Type of LMM to be used.
    - `classes` (`list_of_values`): List of classes that the LMM shall classify against.
    - `remote_api_key` (`Union[secret, string]`): Holds the API key required to call the LMM model; in the current state of development, an OpenAI key is required when `lmm_type=gpt_4v`.
- output
    - `raw_output` (`string`): String value.
    - `top` (`top_class`): String value representing the top class predicted by the classification model.
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
    - `image` (`image_metadata`): Dictionary with image metadata required by supervision.
    - `prediction_type` (`prediction_type`): String value with type of prediction.
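As a sketch of how these output bindings are typically consumed, the snippet below exposes the block's `top` and `raw_output` fields as workflow outputs. The step name `lmm_classifier` and the `JsonField` output entries are assumptions drawn from common workflow definitions rather than requirements of this block.

```python
# Hypothetical "outputs" section of a workflow specification consuming the
# step's output bindings; assumes the step was named "lmm_classifier".
workflow_outputs = [
    # Expose the top predicted class in the workflow response.
    {"type": "JsonField", "name": "predicted_class", "selector": "$steps.lmm_classifier.top"},
    # Expose the raw LMM response as well, e.g. for debugging.
    {"type": "JsonField", "name": "raw_lmm_response", "selector": "$steps.lmm_classifier.raw_output"},
]
```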
Example JSON definition of step LMM For Classification in version v1
```json
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": [
        "a",
        "b"
    ],
    "lmm_config": {
        "gpt_image_detail": "low",
        "gpt_model_version": "gpt-4o",
        "max_tokens": 200
    },
    "remote_api_key": "xxx-xxx"
}
```
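To show where such a step definition fits in a runnable workflow, here is a minimal sketch that embeds it in a complete specification and executes it with the `inference_sdk` HTTP client. The input and output names, the server URL, and the use of `run_workflow` with an ad-hoc `specification` are illustrative assumptions, not part of this block's documentation.

```python
# A minimal sketch, assuming a locally running inference server and the
# inference_sdk HTTP client; names and URLs below are illustrative.
from inference_sdk import InferenceHTTPClient

WORKFLOW_SPECIFICATION = {
    "version": "1.0",
    "inputs": [
        {"type": "WorkflowImage", "name": "image"},
        {"type": "WorkflowParameter", "name": "open_ai_key"},
    ],
    "steps": [
        {
            "name": "lmm_classifier",
            "type": "roboflow_core/lmm_for_classification@v1",
            "images": "$inputs.image",
            "lmm_type": "gpt_4v",
            "classes": ["a", "b"],
            "remote_api_key": "$inputs.open_ai_key",
        }
    ],
    "outputs": [
        {"type": "JsonField", "name": "top_class", "selector": "$steps.lmm_classifier.top"},
    ],
}

client = InferenceHTTPClient(
    api_url="http://localhost:9001",  # assumed local inference server
    api_key="<ROBOFLOW_API_KEY>",
)
# run_workflow is assumed to return one result dict per input image.
results = client.run_workflow(
    specification=WORKFLOW_SPECIFICATION,
    images={"image": "path/to/image.jpg"},
    parameters={"open_ai_key": "<OPENAI_API_KEY>"},
)
print(results[0]["top_class"])
```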