CogVLM
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Due to dependency conflicts with newer models, and because security vulnerabilities discovered in the transformers
library were patched only in versions of the library that are incompatible with the model, we announced End Of Life
for CogVLM support in inference, effective since release 0.38.0.
We are leaving this block in the ecosystem until release 0.42.0 so that clients can learn about the change.
As of now, all Workflows using the block are no longer functional (a runtime error is raised).
After inference release 0.42.0, the block will be removed and the Execution Engine will raise a compilation
error when it encounters the block in a Workflow definition.
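Since affected Workflows fail at runtime now and are expected to fail at compilation once the block is removed, it may be useful to scan saved definitions for the block ahead of time. The following is a minimal sketch, assuming a Workflow definition is a dictionary with a top-level `steps` list whose entries carry `name` and `type` fields; `find_cog_vlm_steps` is a hypothetical helper for illustration and is not part of inference.

```python
from typing import Any, Dict, List

# Type identifier of the block, as documented on this page.
COG_VLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cog_vlm_steps(workflow_definition: Dict[str, Any]) -> List[str]:
    """Return the names of steps that use the CogVLM block.

    Assumption: the definition holds a top-level "steps" list of
    dictionaries, each with "type" and "name" keys.
    """
    steps = workflow_definition.get("steps", [])
    return [
        step.get("name", "<unnamed>")
        for step in steps
        if step.get("type") == COG_VLM_TYPE
    ]


if __name__ == "__main__":
    definition = {
        "steps": [
            {"name": "ask_cog", "type": "roboflow_core/cog_vlm@v1"},
            {"name": "other", "type": "some_other_block@v1"},  # placeholder type
        ]
    }
    for step_name in find_cog_vlm_steps(definition):
        print(f"Step '{step_name}' uses the End Of Life CogVLM block")
```

Any step reported by such a check would need to be replaced with another block before the Workflow can run again.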
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices; it is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier
Use the following identifier in the step "type" field: `roboflow_core/cog_vlm@v1` to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| `name` | `str` | Enter a unique identifier for this step. | ❌ |
| `prompt` | `str` | Text prompt to the CogVLM model. | ✅ |
| `json_output_format` | `Dict[str, str]` | Holds dictionary that maps name of requested output field into its description. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
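To illustrate the Refs column: `prompt` may be given either as a literal string or as a reference to a runtime value, while `name` and `json_output_format` must be literals. The sketch below assumes that references are strings beginning with `$inputs.` or `$steps.` (only `$inputs.image` appears on this page, so the `$steps.` prefix and the `$inputs.question` name are assumptions); `is_selector` and `build_cog_vlm_step` are hypothetical helpers, not part of inference.

```python
from typing import Any, Dict, Optional


def is_selector(value: Any) -> bool:
    """Assumption: runtime references are strings starting with '$inputs.' or '$steps.'."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def build_cog_vlm_step(
    name: str,
    prompt: str,
    images: str = "$inputs.image",
    json_output_format: Optional[Dict[str, str]] = None,
) -> Dict[str, Any]:
    """Build a step dictionary, enforcing the Refs column of the table above."""
    # `name` is marked with a cross in the Refs column: literals only.
    if is_selector(name):
        raise ValueError("name must be a literal string, not a reference")
    # `json_output_format` is also literal-only and typed Dict[str, str].
    if json_output_format is not None:
        if not isinstance(json_output_format, dict) or not all(
            isinstance(k, str) and isinstance(v, str)
            for k, v in json_output_format.items()
        ):
            raise ValueError("json_output_format must be a Dict[str, str]")
    step: Dict[str, Any] = {
        "name": name,
        "type": "roboflow_core/cog_vlm@v1",
        "images": images,
        # `prompt` is marked with a tick: a literal or a reference is accepted.
        "prompt": prompt,
    }
    if json_output_format is not None:
        step["json_output_format"] = json_output_format
    return step


if __name__ == "__main__":
    # Prompt supplied at runtime through a hypothetical "question" input.
    print(build_cog_vlm_step("ask_cog", "$inputs.question"))
```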
Available Connections
Compatible Blocks
Check what blocks you can connect to CogVLM in version v1.
- inputs:
Clip Comparison, Florence-2 Model, Morphological Transformation, Google Gemini, LMM, Instance Segmentation Model, Polygon Zone Visualization, Email Notification, Keypoint Visualization, Roboflow Custom Metadata, Camera Focus, Anthropic Claude, Multi-Label Classification Model, Image Threshold, LMM For Classification, Keypoint Detection Model, Anthropic Claude, Email Notification, Reference Path Visualization, Stitch OCR Detections, Camera Focus, Image Slicer, Stability AI Image Generation, Stability AI Outpainting, Stitch Images, Blur Visualization, OpenAI, Roboflow Dataset Upload, Depth Estimation, Google Gemini, CogVLM, Image Preprocessing, Local File Sink, Florence-2 Model, Image Convert Grayscale, Dynamic Crop, Dot Visualization, Triangle Visualization, OCR Model, Crop Visualization, Twilio SMS Notification, Perspective Correction, Twilio SMS/MMS Notification, EasyOCR, Grid Visualization, Google Gemini, Trace Visualization, QR Code Generator, Pixelate Visualization, OpenAI, Camera Calibration, Roboflow Dataset Upload, Webhook Sink, Single-Label Classification Model, Object Detection Model, VLM as Detector, Background Subtraction, Bounding Box Visualization, Contrast Equalization, Halo Visualization, Model Comparison Visualization, Label Visualization, Slack Notification, OpenAI, Circle Visualization, Image Contours, Background Color Visualization, Image Blur, Mask Visualization, VLM as Classifier, Google Vision OCR, Llama 3.2 Vision, Color Visualization, Corner Visualization, Classification Label Visualization, OpenAI, Line Counter Visualization, Ellipse Visualization, Icon Visualization, Model Monitoring Inference Aggregator, Image Slicer, Absolute Static Crop, Polygon Visualization, SIFT Comparison, Stability AI Inpainting, Relative Static Crop, SIFT, CSV Formatter, Text Display
- outputs:
Clip Comparison, Morphological Transformation, Email Notification, Motion Detection, Detections Stitch, Anthropic Claude, Pixel Color Count, Detections Merge, Keypoint Detection Model, Reference Path Visualization, Stitch OCR Detections, Camera Focus, Expression, Stability AI Image Generation, Stitch Images, Stability AI Outpainting, Time in Zone, Rate Limiter, Bounding Rectangle, Roboflow Dataset Upload, Depth Estimation, Detections Transformation, CogVLM, Local File Sink, Identify Outliers, JSON Parser, SAM 3, Dynamic Crop, Time in Zone, Perception Encoder Embedding Model, Moondream2, Dot Visualization, Triangle Visualization, Cosine Similarity, Crop Visualization, PTZ Tracking (ONVIF), Twilio SMS Notification, Twilio SMS/MMS Notification, Perspective Correction, EasyOCR, Dimension Collapse, First Non Empty Or Default, Pixelate Visualization, Detections Consensus, OpenAI, Roboflow Dataset Upload, Buffer, Single-Label Classification Model, Object Detection Model, Barcode Detection, SIFT Comparison, Cache Set, Contrast Equalization, Byte Tracker, Halo Visualization, Model Comparison Visualization, Slack Notification, Byte Tracker, Dynamic Zone, Cache Get, Qwen2.5-VL, Image Contours, Image Blur, Background Color Visualization, Mask Visualization, Google Vision OCR, Path Deviation, Corner Visualization, Color Visualization, Clip Comparison, Template Matching, Line Counter Visualization, Icon Visualization, Ellipse Visualization, Velocity, Image Slicer, Detections Stabilizer, Absolute Static Crop, Stability AI Inpainting, SAM 3, Distance Measurement, Relative Static Crop, SIFT, CSV Formatter, Detections Filter, Blur Visualization, Multi-Label Classification Model, Instance Segmentation Model, Florence-2 Model, Google Gemini, LMM, Instance Segmentation Model, Polygon Zone Visualization, Keypoint Visualization, Roboflow Custom Metadata, Camera Focus, Multi-Label Classification Model, Detection Offset, Image Threshold, LMM For Classification, Anthropic Claude, Delta Filter, Email Notification, Gaze Detection, Overlap Filter, Property Definition, Image Slicer, SmolVLM2, OpenAI, Detection Event Log, YOLO-World Model, Google Gemini, Image Preprocessing, Florence-2 Model, VLM as Detector, Image Convert Grayscale, Time in Zone, Byte Tracker, OCR Model, Seg Preview, Path Deviation, Continue If, SAM 3, Detections List Roll-Up, Grid Visualization, Google Gemini, Line Counter, Object Detection Model, Trace Visualization, QR Code Generator, CLIP Embedding Model, Camera Calibration, Webhook Sink, QR Code Detection, VLM as Detector, Data Aggregator, Background Subtraction, Bounding Box Visualization, Label Visualization, OpenAI, Circle Visualization, VLM as Classifier, Dominant Color, Size Measurement, Llama 3.2 Vision, Classification Label Visualization, Single-Label Classification Model, Segment Anything 2 Model, OpenAI, Detections Combine, Detections Classes Replacement, Model Monitoring Inference Aggregator, Line Counter, VLM as Classifier, Polygon Visualization, SIFT Comparison, Keypoint Detection Model, Qwen3-VL, Identify Changes, Text Display
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
CogVLM in version v1 has.
Bindings
- input
- output
    - `parent_id` (`parent_id`): Identifier of parent for step output.
    - `root_parent_id` (`parent_id`): Identifier of parent for step output.
    - `image` (`image_metadata`): Dictionary with image metadata required by supervision.
    - `structured_output` (`dictionary`): Dictionary.
    - `raw_output` (`string`): String value.
    - `*` (`*`): Equivalent of any element.
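The named outputs above can be referenced by other steps or by Workflow outputs. The sketch below assumes step outputs are addressed as `$steps.<step_name>.<output_name>`, by analogy with the `$inputs.image` reference used on this page; that pattern is an assumption here, and `output_selectors` is a hypothetical helper.

```python
from typing import Dict

# Named outputs of the block, as listed in the output bindings above.
COG_VLM_OUTPUTS = (
    "parent_id",
    "root_parent_id",
    "image",
    "structured_output",
    "raw_output",
)


def output_selectors(step_name: str) -> Dict[str, str]:
    """Map each output name to a reference string.

    Assumption: step outputs follow the '$steps.<step>.<output>' pattern.
    """
    return {
        output: f"$steps.{step_name}.{output}"
        for output in COG_VLM_OUTPUTS
    }


if __name__ == "__main__":
    selectors = output_selectors("ask_cog")
    # structured_output is a dictionary; raw_output is a string.
    print(selectors["structured_output"])
    print(selectors["raw_output"])
```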
Example JSON definition of step CogVLM in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {
        "count": "number of cats in the picture"
    }
}
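For context, a step definition like the one above sits inside a complete Workflow definition. The sketch below shows one way that could look, assuming top-level `version`, `inputs`, `steps`, and `outputs` keys, a `WorkflowImage` input type, and a `JsonField` output type; none of these are stated on this page, so they should be checked against the Workflows documentation. The step name `cog_vlm` and output name `cat_count` are illustrative. Given the End Of Life notice, running such a definition is expected to raise an error, so the sketch only builds and serialises it.

```python
import json
from typing import Any, Dict


def build_workflow_definition() -> Dict[str, Any]:
    """Wrap the example CogVLM step in a full Workflow definition (assumed layout)."""
    step = {
        "name": "cog_vlm",  # illustrative step name
        "type": "roboflow_core/cog_vlm@v1",
        "images": "$inputs.image",
        "prompt": "my prompt",
        "json_output_format": {"count": "number of cats in the picture"},
    }
    return {
        "version": "1.0",  # assumed definition version
        # Assumed input declaration matching the "$inputs.image" reference.
        "inputs": [{"type": "WorkflowImage", "name": "image"}],
        "steps": [step],
        # Assumed output declaration exposing the block's structured output.
        "outputs": [
            {
                "type": "JsonField",
                "name": "cat_count",
                "selector": "$steps.cog_vlm.structured_output",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_workflow_definition(), indent=2))
```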