CogVLM¶
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Dependency conflicts with newer models, together with security vulnerabilities in the transformers
library that were patched only in versions incompatible with this model, led us to announce End Of Life for CogVLM
support in inference, effective since release 0.38.0.
We are keeping this block in the ecosystem until release 0.42.0 so that clients can learn about the change.
As of now, every Workflow that uses the block no longer works: a runtime error is raised.
After inference release 0.42.0, the block will be removed and the Execution Engine will raise a compilation
error when it finds the block in a Workflow definition.
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices, and is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/cog_vlm@v1 to add the block as
a step in your workflow.
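Because the block now fails at runtime and will later fail compilation, it can help to find every step that uses this identifier before upgrading inference. The sketch below assumes a Workflow definition is a dictionary with a top-level `steps` list whose entries carry `type` and `name` fields, as in the example step on this page. `find_cogvlm_steps` and the second step type are hypothetical and not part of inference.

```python
COGVLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cogvlm_steps(workflow_definition: dict) -> list:
    """Return the names of steps whose "type" is the CogVLM v1 identifier.

    Assumes steps live under a top-level "steps" list (an assumption about
    the definition layout, not something stated on this page).
    """
    steps = workflow_definition.get("steps", [])
    return [
        step.get("name", "<unnamed>")
        for step in steps
        if step.get("type") == COGVLM_TYPE
    ]


example_definition = {
    "steps": [
        {
            "type": "roboflow_core/cog_vlm@v1",
            "name": "ask_cog",
            "images": "$inputs.image",
            "prompt": "my prompt",
        },
        # Hypothetical unrelated step, included only to show it is ignored.
        {"type": "example/other_block@v1", "name": "other"},
    ]
}

affected_steps = find_cogvlm_steps(example_definition)
```

Any name returned by such a check points at a step that needs to be replaced or removed.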
Properties¶
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| prompt | str | Text prompt to the CogVLM model. | ✅ |
| json_output_format | Dict[str, str] | Holds dictionary that maps name of requested output field into its description. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
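As a rough illustration of the Refs column, the sketch below checks that only `prompt` holds a dynamic reference. It assumes dynamic values are written as selector strings starting with `$inputs.` or `$steps.`, by analogy with `$inputs.image` in the example step on this page. `is_selector`, `misplaced_selectors` and the `$inputs.prompt` input are hypothetical names used for illustration only.

```python
# Whether each property accepts a dynamic reference, copied from the Refs column.
ACCEPTS_REFERENCE = {
    "name": False,
    "prompt": True,
    "json_output_format": False,
}

# Assumed selector prefixes; only "$inputs." appears on this page.
SELECTOR_PREFIXES = ("$inputs.", "$steps.")


def is_selector(value) -> bool:
    """Return True if the value looks like a runtime selector string."""
    return isinstance(value, str) and value.startswith(SELECTOR_PREFIXES)


def misplaced_selectors(step: dict) -> list:
    """List properties that hold a selector although Refs marks them as static."""
    return [
        prop
        for prop, allowed in ACCEPTS_REFERENCE.items()
        if not allowed and is_selector(step.get(prop))
    ]


parametrised_step = {
    "name": "ask_cog",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "$inputs.prompt",  # allowed: prompt is marked with a check in Refs
    "json_output_format": {"count": "number of cats in the picture"},
}
```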
Available Connections¶
Compatible Blocks
Check what blocks you can connect to CogVLM in version v1.
- inputs:
Dynamic Crop,OCR Model,Email Notification,Image Blur,Background Subtraction,SIFT Comparison,OpenAI,Google Vision OCR,Google Gemini,Image Preprocessing,Instance Segmentation Model,Local File Sink,Single-Label Classification Model,Bounding Box Visualization,Model Monitoring Inference Aggregator,Anthropic Claude,Multi-Label Classification Model,Keypoint Detection Model,Email Notification,Slack Notification,Camera Focus,Twilio SMS/MMS Notification,Dot Visualization,Florence-2 Model,Roboflow Dataset Upload,CSV Formatter,Camera Focus,Stitch OCR Detections,Depth Estimation,Polygon Visualization,Perspective Correction,OpenAI,Camera Calibration,Corner Visualization,Icon Visualization,Image Slicer,Qwen3.5-VL,Line Counter Visualization,Heatmap Visualization,Morphological Transformation,Stability AI Image Generation,Google Gemini,Keypoint Visualization,VLM As Detector,Halo Visualization,Background Color Visualization,Label Visualization,Polygon Visualization,Pixelate Visualization,LMM,CogVLM,Contrast Equalization,Triangle Visualization,Stability AI Outpainting,Mask Visualization,VLM As Classifier,Color Visualization,Text Display,Relative Static Crop,Reference Path Visualization,Stitch OCR Detections,Llama 3.2 Vision,OpenAI,Image Threshold,Clip Comparison,Classification Label Visualization,Webhook Sink,Circle Visualization,Polygon Zone Visualization,Image Contours,Image Convert Grayscale,Grid Visualization,Florence-2 Model,Roboflow Custom Metadata,LMM For Classification,SIFT,Halo Visualization,Object Detection Model,Anthropic Claude,Google Gemini,Model Comparison Visualization,Blur Visualization,QR Code Generator,EasyOCR,Absolute Static Crop,Image Slicer,S3 Sink,Anthropic Claude,Stability AI Inpainting,Ellipse Visualization,Crop Visualization,Trace Visualization,Twilio SMS Notification,Stitch Images,OpenAI,Roboflow Dataset Upload
- outputs:
OCR Model,Image Blur,Google Gemini,Local File Sink,Single-Label Classification Model,Keypoint Detection Model,Gaze Detection,Depth Estimation,Polygon Visualization,Detections List Roll-Up,Image Slicer,Line Counter Visualization,Morphological Transformation,Distance Measurement,Keypoint Visualization,Keypoint Detection Model,Background Color Visualization,Label Visualization,QR Code Detection,Polygon Visualization,LMM,Single-Label Classification Model,Triangle Visualization,Stability AI Outpainting,Reference Path Visualization,OpenAI,Clip Comparison,Delta Filter,VLM As Classifier,LMM For Classification,Dynamic Zone,Velocity,SAM 3,Detections Transformation,Ellipse Visualization,Crop Visualization,SIFT Comparison,Size Measurement,Time in Zone,Motion Detection,Email Notification,SIFT Comparison,Seg Preview,Instance Segmentation Model,Anthropic Claude,Email Notification,Twilio SMS/MMS Notification,Camera Focus,Stitch OCR Detections,Moondream2,Expression,Qwen3.5-VL,Corner Visualization,Halo Visualization,Detection Event Log,Pixelate Visualization,Dimension Collapse,VLM As Classifier,Data Aggregator,Detections Classes Replacement,Relative Static Crop,Circle Visualization,Grid Visualization,Mask Area Measurement,Florence-2 Model,SmolVLM2,Perception Encoder Embedding Model,YOLO-World Model,Object Detection Model,Byte Tracker,Template Matching,Anthropic Claude,Google Gemini,Model Comparison Visualization,QR Code Generator,Image Slicer,S3 Sink,CLIP Embedding Model,Stability AI Inpainting,Segment Anything 2 Model,Detections Filter,Roboflow Dataset Upload,Dynamic Crop,Barcode Detection,Background Subtraction,Google Vision OCR,Image Preprocessing,Object Detection Model,Bounding Box Visualization,Model Monitoring Inference Aggregator,Multi-Label Classification Model,Identify Outliers,Camera Focus,Dot Visualization,Florence-2 Model,Roboflow Dataset Upload,CSV Formatter,OpenAI,Line Counter,Rate Limiter,Heatmap Visualization,Google Gemini,Stability AI Image Generation,CogVLM,Time in Zone,Mask Visualization,Color Visualization,Detections Combine,Dominant Color,Text Display,Bounding Rectangle,Llama 3.2 Vision,Image Threshold,Clip Comparison,Classification Label Visualization,Polygon Zone Visualization,Image Contours,Continue If,Roboflow Custom Metadata,Halo Visualization,Semantic Segmentation Model,Blur Visualization,Path Deviation,Absolute Static Crop,Anthropic Claude,Cosine Similarity,Identify Changes,Path Deviation,Trace Visualization,Twilio SMS Notification,Stitch Images,Detections Stabilizer,Detections Merge,OpenAI,Qwen2.5-VL,Time in Zone,Multi-Label Classification Model,Detections Stitch,Slack Notification,VLM As Detector,Cache Set,SAM 3,Perspective Correction,PTZ Tracking (ONVIF),Icon Visualization,Camera Calibration,Overlap Filter,Byte Tracker,VLM As Detector,JSON Parser,Qwen3-VL,Contrast Equalization,Instance Segmentation Model,Line Counter,Stitch OCR Detections,Webhook Sink,First Non Empty Or Default,Image Convert Grayscale,Byte Tracker,Buffer,SAM 3,Cache Get,SIFT,Detections Consensus,Property Definition,Detection Offset,EasyOCR,OpenAI,Pixel Color Count
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds
CogVLM in version v1 has.
Bindings
- input
- output
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - structured_output (dictionary): Dictionary.
    - raw_output (string): String value.
    - * (*): Equivalent of any element.
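To make the output bindings concrete, the sketch below records the kind of each named output listed above and builds the reference string a downstream step might use. The `$steps.<step_name>.<output_name>` selector form is an assumption modelled on the `$inputs.image` selector in the example step. `OUTPUT_KINDS` and `output_selector` are hypothetical helpers, and the wildcard output is left out.

```python
# Output name -> kind, copied from the output bindings listed above.
OUTPUT_KINDS = {
    "parent_id": "parent_id",
    "root_parent_id": "parent_id",
    "image": "image_metadata",
    "structured_output": "dictionary",
    "raw_output": "string",
}


def output_selector(step_name: str, output_name: str) -> str:
    """Build a selector for one named output of a CogVLM step.

    The "$steps.<step>.<output>" shape is an assumption, not taken from this page.
    """
    if output_name not in OUTPUT_KINDS:
        raise KeyError(f"unknown CogVLM output: {output_name!r}")
    return f"$steps.{step_name}.{output_name}"


raw_text_reference = output_selector("ask_cog", "raw_output")
```

A block that accepts the `string` kind could then take `raw_text_reference` as one of its inputs, while `structured_output` would suit blocks that accept the `dictionary` kind.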
Example JSON definition of step CogVLM in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {
        "count": "number of cats in the picture"
    }
}
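For context, the sketch below shows how a step definition like the one above might sit inside a complete Workflow definition. The surrounding layout (`version`, `inputs`, `steps`, `outputs`) and the `WorkflowImage` and `JsonField` type names are assumptions not stated on this page, and `wrap_step_in_workflow` is a hypothetical helper. Given the End Of Life notice, such a definition is shown for its structure only; running it is expected to raise an error.

```python
import json


def wrap_step_in_workflow(step: dict) -> dict:
    """Place a single step inside an assumed Workflow definition layout."""
    return {
        "version": "1.0",  # assumed definition version
        "inputs": [{"type": "WorkflowImage", "name": "image"}],  # assumed input type
        "steps": [step],
        "outputs": [
            {
                "type": "JsonField",  # assumed output type
                "name": "answer",
                "selector": f"$steps.{step['name']}.structured_output",
            }
        ],
    }


cog_vlm_step = {
    "name": "ask_cog",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {"count": "number of cats in the picture"},
}

workflow_definition = wrap_step_in_workflow(cog_vlm_step)
serialized = json.dumps(workflow_definition, indent=2)
```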