LMM For Classification¶
Class: LMMForClassificationBlockV1
Source: inference.core.workflows.core_steps.models.foundation.lmm_classifier.v1.LMMForClassificationBlockV1
Classify an image into one or more categories using a Large Multimodal Model (LMM).
You can specify arbitrary classes for the block to classify against.
The block supports the following LMM:
- OpenAI's GPT-4 with Vision.
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/lmm_for_classification@v1 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
lmm_type | str | Type of LMM to be used. | ✅ |
classes | List[str] | List of classes that LMM shall classify against. | ✅ |
lmm_config | LMMConfig | Configuration of LMM. | ❌ |
remote_api_key | str | Holds API key required to call LMM model - in current state of development, we require OpenAI key when lmm_type=gpt_4v. | ✅ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
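As a rough illustration of what the Refs column means in practice, the sketch below binds the ✅ properties to selectors instead of literal values and checks that no ❌ property holds one. The input names (`lmm_type`, `classes`, `openai_api_key`) and the helper functions are hypothetical, and `images` is included because the Bindings section lists it as an input binding; the check is a convenience for this example, not part of the library.

```python
# Properties that may hold a selector: those marked ✅ in the table above,
# plus "images", which the Bindings section lists as an input binding.
REF_CAPABLE = {"images", "lmm_type", "classes", "remote_api_key"}


def is_selector(value):
    """Return True for selector strings such as "$inputs.image"."""
    return isinstance(value, str) and value.startswith("$")


def misplaced_selectors(step):
    """Return the names of properties that hold a selector but are marked ❌."""
    return sorted(
        key
        for key, value in step.items()
        if is_selector(value) and key not in REF_CAPABLE
    )


# Hypothetical step definition: every ✅ property is bound at runtime.
step = {
    "name": "classifier",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "$inputs.lmm_type",              # assumed input name
    "classes": "$inputs.classes",                # assumed input name
    "remote_api_key": "$inputs.openai_api_key",  # assumed input name
}

problems = misplaced_selectors(step)
print(problems)
```

Binding `remote_api_key` to a runtime input keeps the key out of the stored workflow definition, which is the main reason to prefer a selector there.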
Available Connections¶
Compatible Blocks
Check what blocks you can connect to LMM For Classification in version v1.
- inputs: Image Contours, Stability AI Inpainting, Corner Visualization, CSV Formatter, Google Gemini, Line Counter Visualization, Reference Path Visualization, Keypoint Detection Model, Model Monitoring Inference Aggregator, Florence-2 Model, Circle Visualization, OCR Model, Llama 3.2 Vision, Relative Static Crop, Dimension Collapse, Roboflow Dataset Upload, Dynamic Zone, Image Convert Grayscale, Pixelate Visualization, Model Comparison Visualization, Trace Visualization, LMM, Twilio SMS Notification, Roboflow Dataset Upload, Depth Estimation, Label Visualization, Classification Label Visualization, Blur Visualization, OpenAI, Color Visualization, Bounding Box Visualization, Anthropic Claude, Ellipse Visualization, Instance Segmentation Model, Polygon Zone Visualization, Object Detection Model, Roboflow Custom Metadata, Image Slicer, Image Slicer, Crop Visualization, Perspective Correction, Halo Visualization, Dot Visualization, Mask Visualization, Keypoint Visualization, Local File Sink, Absolute Static Crop, Stitch OCR Detections, Clip Comparison, Image Blur, OpenAI, VLM as Classifier, Clip Comparison, Triangle Visualization, Background Color Visualization, SIFT Comparison, Florence-2 Model, Camera Calibration, Google Vision OCR, Image Threshold, Single-Label Classification Model, Buffer, Image Preprocessing, OpenAI, CogVLM, Slack Notification, VLM as Detector, Stability AI Image Generation, SIFT, Grid Visualization, Camera Focus, Stitch Images, Stability AI Outpainting, Size Measurement, Polygon Visualization, Multi-Label Classification Model, Webhook Sink, Dynamic Crop, Email Notification, LMM For Classification
- outputs: Line Counter, Stability AI Inpainting, Corner Visualization, Time in Zone, Google Gemini, Line Counter Visualization, Reference Path Visualization, Model Monitoring Inference Aggregator, Florence-2 Model, YOLO-World Model, Circle Visualization, Llama 3.2 Vision, Roboflow Dataset Upload, PTZ Tracking (ONVIF), Detections Classes Replacement, Model Comparison Visualization, Trace Visualization, Detections Stitch, Path Deviation, Twilio SMS Notification, LMM, Roboflow Dataset Upload, Label Visualization, Classification Label Visualization, Perception Encoder Embedding Model, OpenAI, Color Visualization, Bounding Box Visualization, Anthropic Claude, Instance Segmentation Model, Ellipse Visualization, Pixel Color Count, Polygon Zone Visualization, Roboflow Custom Metadata, Cache Get, Crop Visualization, Halo Visualization, Perspective Correction, Dot Visualization, Mask Visualization, Cache Set, Keypoint Visualization, Local File Sink, Line Counter, Image Blur, OpenAI, Path Deviation, Clip Comparison, Instance Segmentation Model, Distance Measurement, Triangle Visualization, Segment Anything 2 Model, Background Color Visualization, CLIP Embedding Model, SIFT Comparison, Google Vision OCR, Florence-2 Model, Image Threshold, Image Preprocessing, OpenAI, CogVLM, Slack Notification, Stability AI Image Generation, Stability AI Outpainting, Size Measurement, Polygon Visualization, Webhook Sink, Dynamic Crop, Time in Zone, Email Notification, LMM For Classification
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds LMM For Classification in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - lmm_type (string): Type of LMM to be used.
    - classes (list_of_values): List of classes that LMM shall classify against.
    - remote_api_key (Union[secret, string]): Holds API key required to call LMM model - in current state of development, we require OpenAI key when lmm_type=gpt_4v.
- output
    - raw_output (string): String value.
    - top (top_class): String value representing top class predicted by classification model.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - prediction_type (prediction_type): String value with type of prediction.
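To refer to one of these outputs from a later step or from the workflow outputs, you would normally use a selector. The sketch below builds such selectors for every output listed above. It assumes the `$steps.<step_name>.<output_name>` form, by analogy with the `$inputs.image` selector used elsewhere on this page; the step name `classifier` and the helper function are hypothetical.

```python
# Output names as listed in the Bindings section for this block.
OUTPUT_NAMES = [
    "raw_output",
    "top",
    "parent_id",
    "root_parent_id",
    "image",
    "prediction_type",
]


def output_selectors(step_name, names=OUTPUT_NAMES):
    """Map each output name to an assumed "$steps.<step>.<output>" selector."""
    return {name: f"$steps.{step_name}.{name}" for name in names}


# "classifier" is a hypothetical value of the step's "name" property.
selectors = output_selectors("classifier")
for name, selector in selectors.items():
    print(f"{name}: {selector}")
```

In most cases `top` is the output to wire downstream, while `raw_output` is useful for debugging what the LMM actually returned.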
Example JSON definition of step LMM For Classification in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": [
        "a",
        "b"
    ],
    "lmm_config": {
        "gpt_image_detail": "low",
        "gpt_model_version": "gpt-4o",
        "max_tokens": 200
    },
    "remote_api_key": "xxx-xxx"
}
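The step definition above is only one entry in a workflow. As a minimal sketch, the Python below wraps it in a complete definition and reads the OpenAI key from the environment instead of hard-coding it. The top-level layout (`version`, `inputs`, `steps`, `outputs`) and the `WorkflowImage` and `JsonField` type names are assumptions about the general Workflows format and do not come from this page; the step name `classifier`, the output name `top_class`, and the `OPENAI_API_KEY` variable name are hypothetical.

```python
import json
import os


def build_workflow(classes, api_key):
    """Wrap the LMM For Classification step in an assumed workflow layout."""
    step = {
        "name": "classifier",  # hypothetical step name
        "type": "roboflow_core/lmm_for_classification@v1",
        "images": "$inputs.image",
        "lmm_type": "gpt_4v",
        "classes": list(classes),
        "lmm_config": {
            "gpt_image_detail": "low",
            "gpt_model_version": "gpt-4o",
            "max_tokens": 200,
        },
        "remote_api_key": api_key,
    }
    return {
        "version": "1.0",  # assumed
        "inputs": [{"type": "WorkflowImage", "name": "image"}],  # assumed
        "steps": [step],
        "outputs": [  # assumed output declaration
            {
                "type": "JsonField",
                "name": "top_class",
                "selector": "$steps.classifier.top",
            }
        ],
    }


# Fall back to the placeholder from the JSON example when the variable is unset.
api_key = os.environ.get("OPENAI_API_KEY", "xxx-xxx")
workflow = build_workflow(["a", "b"], api_key)

# The definition is plain data, so it serialises directly to JSON.
serialized = json.dumps(workflow, indent=4)
```

Building the definition in code makes it easy to vary `classes` per deployment while keeping the rest of the step identical to the JSON example.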