LMM For Classification¶
Class: LMMForClassificationBlockV1
Source: inference.core.workflows.core_steps.models.foundation.lmm_classifier.v1.LMMForClassificationBlockV1
Classify an image into one or more categories using a Large Multimodal Model (LMM).
You can pass arbitrary classes to the LMM block to classify against.
The LMM block supports two LMMs:
- OpenAI's GPT-4 with Vision,
- CogVLM.
You need to provide your OpenAI API key to use the GPT-4 with Vision model.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/lmm_for_classification@v1 to add the block as
a step in your workflow.
Properties¶
| Name | Type | Description | Refs | 
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ | 
| lmm_type | str | Type of LMM to be used. | ✅ | 
| classes | List[str] | List of classes that LMM shall classify against. | ✅ | 
| lmm_config | LMMConfig | Configuration of LMM. | ❌ | 
| remote_api_key | str | Holds the API key required to call the LMM model - in the current state of development, an OpenAI key is required when lmm_type=gpt_4v. | ✅ | 
The Refs column marks the possibility to parametrise the property with dynamic values available 
at workflow runtime, as sketched in the example below. See Bindings for more info.
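For illustration, a property marked ✅ in the Refs column can take a selector instead of a literal value, so it is resolved at runtime from workflow inputs. A minimal sketch, assuming workflow inputs named classes and open_ai_key (placeholder names, not part of the block definition):

```json
{
    "name": "lmm_classifier",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": "$inputs.classes",
    "remote_api_key": "$inputs.open_ai_key"
}
```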
Available Connections¶
Compatible Blocks
Check what blocks you can connect to LMM For Classification in version v1.
- inputs: Grid Visualization,Circle Visualization,Model Monitoring Inference Aggregator,Roboflow Dataset Upload,QR Code Generator,Image Slicer,Dot Visualization,Single-Label Classification Model,Blur Visualization,Slack Notification,Clip Comparison,Perspective Correction,Roboflow Dataset Upload,Anthropic Claude,Background Color Visualization,OpenAI,Florence-2 Model,Object Detection Model,Llama 3.2 Vision,Dynamic Crop,Crop Visualization,OCR Model,EasyOCR,Trace Visualization,Keypoint Detection Model,Image Threshold,Triangle Visualization,Reference Path Visualization,Model Comparison Visualization,Dimension Collapse,Polygon Visualization,Corner Visualization,Image Slicer,Florence-2 Model,Image Blur,Size Measurement,SIFT Comparison,Bounding Box Visualization,Dynamic Zone,Stitch OCR Detections,Buffer,Keypoint Visualization,Image Convert Grayscale,Clip Comparison,Line Counter Visualization,SIFT,Icon Visualization,Stability AI Inpainting,VLM as Detector,Google Vision OCR,Polygon Zone Visualization,OpenAI,Webhook Sink,Camera Calibration,Instance Segmentation Model,CogVLM,Mask Visualization,Camera Focus,Twilio SMS Notification,Stability AI Outpainting,Classification Label Visualization,Multi-Label Classification Model,LMM,Image Preprocessing,Color Visualization,Morphological Transformation,Depth Estimation,LMM For Classification,Ellipse Visualization,Stability AI Image Generation,Email Notification,Halo Visualization,Stitch Images,Local File Sink,Roboflow Custom Metadata,Absolute Static Crop,Google Gemini,VLM as Classifier,Image Contours,Pixelate Visualization,CSV Formatter,Label Visualization,OpenAI,Contrast Equalization,Relative Static Crop
- outputs: CLIP Embedding Model,Detections Stitch,Detections Classes Replacement,Circle Visualization,Time in Zone,Model Monitoring Inference Aggregator,Path Deviation,Roboflow Dataset Upload,QR Code Generator,Dot Visualization,Perception Encoder Embedding Model,Slack Notification,Perspective Correction,Roboflow Dataset Upload,Anthropic Claude,OpenAI,Background Color Visualization,Florence-2 Model,Llama 3.2 Vision,Dynamic Crop,Crop Visualization,Trace Visualization,Image Threshold,Triangle Visualization,Reference Path Visualization,Model Comparison Visualization,Cache Set,Polygon Visualization,Corner Visualization,Florence-2 Model,Image Blur,Size Measurement,SIFT Comparison,Moondream2,Bounding Box Visualization,Line Counter,Stitch OCR Detections,Keypoint Visualization,YOLO-World Model,Clip Comparison,Line Counter Visualization,Distance Measurement,Icon Visualization,Stability AI Inpainting,Google Vision OCR,Polygon Zone Visualization,OpenAI,Line Counter,Webhook Sink,Instance Segmentation Model,CogVLM,Time in Zone,Cache Get,Mask Visualization,Twilio SMS Notification,Stability AI Outpainting,Classification Label Visualization,LMM,Image Preprocessing,Time in Zone,Morphological Transformation,Color Visualization,LMM For Classification,Instance Segmentation Model,Ellipse Visualization,Stability AI Image Generation,Segment Anything 2 Model,Email Notification,Halo Visualization,Local File Sink,Roboflow Custom Metadata,Google Gemini,PTZ Tracking (ONVIF),Pixel Color Count,Path Deviation,Label Visualization,OpenAI,Contrast Equalization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds 
LMM For Classification in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - lmm_type (string): Type of LMM to be used.
    - classes (list_of_values): List of classes that LMM shall classify against.
    - remote_api_key (Union[string, secret]): Holds the API key required to call the LMM model - in the current state of development, an OpenAI key is required when lmm_type=gpt_4v.
- output
    - raw_output (string): String value.
    - top (top_class): String value representing top class predicted by classification model.
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - prediction_type (prediction_type): String value with type of prediction.
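As a sketch of how these output bindings are consumed, a workflow outputs entry can select the top field of this step; the step name lmm_classifier below is a placeholder:

```json
{
    "type": "JsonField",
    "name": "classification",
    "selector": "$steps.lmm_classifier.top"
}
```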
Example JSON definition of step LMM For Classification in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/lmm_for_classification@v1",
    "images": "$inputs.image",
    "lmm_type": "gpt_4v",
    "classes": [
        "a",
        "b"
    ],
    "lmm_config": {
        "gpt_image_detail": "low",
        "gpt_model_version": "gpt-4o",
        "max_tokens": 200
    },
    "remote_api_key": "xxx-xxx"
}
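One way to execute a workflow containing this step is the Python inference-sdk HTTP client. The sketch below is an illustrative assumption, not the only way to run it: the workspace name, workflow id, and the open_ai_key parameter are placeholders, and it assumes the workflow was registered with matching input names.

```python
# Minimal sketch (placeholders noted above) of running a workflow that
# contains the roboflow_core/lmm_for_classification@v1 step.
from inference_sdk import InferenceHTTPClient

client = InferenceHTTPClient(
    api_url="https://detect.roboflow.com",  # or the URL of a self-hosted inference server
    api_key="<ROBOFLOW_API_KEY>",
)

result = client.run_workflow(
    workspace_name="<your_workspace>",               # placeholder
    workflow_id="<your_workflow_id>",                # placeholder
    images={"image": "path/to/image.jpg"},           # bound to the block's images input
    parameters={"open_ai_key": "<OPENAI_API_KEY>"},  # assumed non-image workflow input
)

# Each entry of the result mirrors the workflow outputs, e.g. a "classification"
# output selecting $steps.<step>.top would hold the predicted class.
print(result)
```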