CogVLM
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Due to dependency conflicts with newer models and security vulnerabilities discovered in the transformers
library, which were patched only in library versions incompatible with the model, we announced End Of Life
for CogVLM support in inference, effective since release 0.38.0.
We are leaving this block in the ecosystem until release 0.42.0 so that clients can learn about the change
that was introduced.
As of now, all Workflows using the block are no longer functional (a runtime error will be raised).
After inference release 0.42.0, this block will be removed and the Execution Engine will raise a compilation
error when it encounters the block in a Workflow definition.
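Because Workflows containing this block now fail at runtime, it may help to scan stored definitions for it before upgrading. The following is a minimal sketch, assuming a Workflow definition is a dictionary with a top-level "steps" list whose entries carry "name" and "type" fields. The helper name find_cog_vlm_steps and the second step type are hypothetical and not part of inference.

```python
from typing import Any, Dict, List

# Type identifier documented on this page.
COG_VLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cog_vlm_steps(definition: Dict[str, Any]) -> List[str]:
    """Return the names of steps that use the CogVLM block."""
    # Assumption: steps live in a top-level "steps" list of dictionaries.
    steps = definition.get("steps", [])
    return [
        step.get("name", "<unnamed>")
        for step in steps
        if isinstance(step, dict) and step.get("type") == COG_VLM_TYPE
    ]


workflow = {
    "steps": [
        {
            "name": "describe",
            "type": "roboflow_core/cog_vlm@v1",
            "images": "$inputs.image",
            "prompt": "my prompt",
        },
        # Hypothetical placeholder for any unrelated block.
        {"name": "other", "type": "some_other/block@v1"},
    ]
}

affected = find_cog_vlm_steps(workflow)
print(affected)
```

Any step name returned by such a scan points to a Workflow that needs to be migrated to another block.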
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices, and is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/cog_vlm@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | โ |
| prompt | str | Text prompt to the CogVLM model. | โ |
| json_output_format | Dict[str, str] | Holds dictionary that maps name of requested output field into its description. | โ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
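In the example definition on this page, "images": "$inputs.image" is a dynamic reference, while "prompt": "my prompt" is a static value. The sketch below shows one rough way to tell the two apart. It assumes that dynamic references are strings beginning with "$inputs." or "$steps."; only the "$inputs." form appears on this page, so the "$steps." prefix is an assumption, and is_selector is a hypothetical helper rather than an inference function.

```python
def is_selector(value: object) -> bool:
    """Heuristic check for a dynamic reference such as "$inputs.image"."""
    # Assumption: references are strings prefixed with "$inputs." or "$steps.".
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


step = {
    "name": "<your_step_name_here>",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {"count": "number of cats in the picture"},
}

# Keep only the properties that are bound to runtime values.
dynamic = {key: value for key, value in step.items() if is_selector(value)}
print(dynamic)
```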
Available Connections
Compatible Blocks
Check which blocks you can connect to CogVLM in version v1.
- inputs:
Roboflow Dataset Upload, Line Counter Visualization, Stability AI Outpainting, Object Detection Model, Email Notification, Google Gemma API, Image Slicer, OCR Model, Google Vision OCR, Image Preprocessing, Google Gemini, Instance Segmentation Model, EasyOCR, Color Visualization, OpenAI, Ellipse Visualization, Polygon Visualization, Anthropic Claude, Relative Static Crop, Webhook Sink, Model Comparison Visualization, Trace Visualization, Stitch OCR Detections, Camera Focus, Roboflow Custom Metadata, Qwen 3.5 API, OpenAI, Single-Label Classification Model, VLM As Classifier, Image Threshold, Stitch Images, Heatmap Visualization, Qwen 3.6 API, SIFT Comparison, Morphological Transformation, Florence-2 Model, Halo Visualization, CogVLM, Crop Visualization, Camera Calibration, Florence-2 Model, GLM-OCR, Dot Visualization, S3 Sink, Twilio SMS Notification, Icon Visualization, Model Monitoring Inference Aggregator, Local File Sink, Google Gemini, Roboflow Dataset Upload, Image Contours, Pixelate Visualization, Twilio SMS/MMS Notification, Polygon Zone Visualization, Reference Path Visualization, Blur Visualization, Anthropic Claude, Background Subtraction, Text Display, Clip Comparison, CSV Formatter, VLM As Detector, LMM, Stability AI Image Generation, Perspective Correction, Anthropic Claude, Bounding Box Visualization, Depth Estimation, Classification Label Visualization, Image Slicer, Absolute Static Crop, Image Blur, Stability AI Inpainting, Multi-Label Classification Model, Polygon Visualization, Image Convert Grayscale, SIFT, Roboflow Vision Events, OpenAI, Google Gemini, Label Visualization, Corner Visualization, Grid Visualization, Dynamic Crop, Contrast Equalization, Keypoint Visualization, Triangle Visualization, Qwen3.5-VL, QR Code Generator, Halo Visualization, Circle Visualization, Camera Focus, Mask Visualization, LMM For Classification, Morphological Transformation, OpenAI, Contrast Enhancement, Keypoint Detection Model, MoonshotAI Kimi, Llama 3.2 Vision, Background Color Visualization, Email Notification, Slack Notification, Stitch OCR Detections
- outputs:
Gaze Detection, Image Slicer, Distance Measurement, Bounding Rectangle, Ellipse Visualization, ByteTrack Tracker, Relative Static Crop, Detections Classes Replacement, Barcode Detection, Trace Visualization, Qwen 3.5 API, Camera Focus, Buffer, SAM 3, Image Threshold, SORT Tracker, Florence-2 Model, Detections Transformation, Path Deviation, Semantic Segmentation Model, Twilio SMS Notification, Google Gemini, Roboflow Dataset Upload, Clip Comparison, VLM As Classifier, Line Counter, Background Subtraction, Detections Merge, Perspective Correction, Overlap Filter, Rate Limiter, SmolVLM2, SIFT, Roboflow Vision Events, Label Visualization, Expression, Grid Visualization, Per-Class Confidence Filter, Property Definition, Halo Visualization, Circle Visualization, Segment Anything 2 Model, MoonshotAI Kimi, Slack Notification, Detections Stabilizer, Object Detection Model, Stability AI Outpainting, Google Vision OCR, Image Preprocessing, Object Detection Model, Cosine Similarity, OpenAI, Detection Event Log, Byte Tracker, Anthropic Claude, Time in Zone, YOLO-World Model, Inner Workflow, Perception Encoder Embedding Model, Semantic Segmentation Model, Single-Label Classification Model, Detections List Roll-Up, Mask Area Measurement, Stitch Images, Instance Segmentation Model, CogVLM, Florence-2 Model, Camera Calibration, Multi-Label Classification Model, Time in Zone, SAM 3, Local File Sink, Icon Visualization, First Non Empty Or Default, Image Contours, JSON Parser, Time in Zone, Reference Path Visualization, Dimension Collapse, Anthropic Claude, VLM As Detector, LMM, Identify Changes, Multi-Label Classification Model, Absolute Static Crop, Image Convert Grayscale, OpenAI, Corner Visualization, Dynamic Crop, Keypoint Visualization, QR Code Generator, LMM For Classification, Morphological Transformation, Contrast Enhancement, Background Color Visualization, PTZ Tracking (ONVIF), Roboflow Dataset Upload, Line Counter Visualization, Mask Edge Snap, OCR Model, Qwen2.5-VL, Instance Segmentation Model, Color Visualization, Multi-Label Classification Model, Data Aggregator, Polygon Visualization, Single-Label Classification Model, Byte Tracker, Detections Consensus, Cache Set, Webhook Sink, Continue If, Stitch OCR Detections, Object Detection Model, OpenAI, Size Measurement, Heatmap Visualization, Halo Visualization, Path Deviation, GLM-OCR, Dot Visualization, S3 Sink, Seg Preview, Model Monitoring Inference Aggregator, Dynamic Zone, Pixelate Visualization, Twilio SMS/MMS Notification, Polygon Zone Visualization, Motion Detection, Blur Visualization, Text Display, CSV Formatter, Stability AI Image Generation, Anthropic Claude, Line Counter, Bounding Box Visualization, Velocity, Depth Estimation, Stability AI Inpainting, Polygon Visualization, VLM As Detector, Google Gemini, Qwen3.5-VL, Contrast Equalization, Triangle Visualization, Mask Visualization, Dominant Color, OpenAI, Llama 3.2 Vision, Email Notification, CLIP Embedding Model, Detections Stitch, Email Notification, Google Gemma API, Identify Outliers, Google Gemini, EasyOCR, Detections Combine, SAM2 Video Tracker, Qwen3-VL, Model Comparison Visualization, Roboflow Custom Metadata, Detection Offset, Instance Segmentation Model, VLM As Classifier, Template Matching, Qwen 3.6 API, SIFT Comparison, Morphological Transformation, Crop Visualization, OC-SORT Tracker, QR Code Detection, Detections Filter, Keypoint Detection Model, Clip Comparison, Pixel Color Count, Classification Label Visualization, Image Slicer, Image Blur, Byte Tracker, SAM 3, Single-Label Classification Model, Delta Filter, Keypoint Detection Model, Moondream2, SIFT Comparison, Camera Focus, Keypoint Detection Model, Stitch OCR Detections, Cache Get
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
CogVLM in version v1 has.
Bindings
- input
- output
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - structured_output (dictionary): Dictionary.
    - raw_output (string): String value.
    - * (*): Equivalent of any element.
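The two most commonly consumed outputs are structured_output (a dictionary) and raw_output (a string). The sketch below shows one possible way to read a requested field from a step result. It assumes the result is a plain dictionary keyed by output name and that raw_output may contain JSON text; neither is confirmed by this page. The helper read_field and the sample values are hypothetical.

```python
import json
from typing import Any, Dict


def read_field(step_output: Dict[str, Any], field: str) -> Any:
    """Read a field from structured_output, falling back to raw_output."""
    structured = step_output.get("structured_output")
    if isinstance(structured, dict) and field in structured:
        return structured[field]
    # Assumption: raw_output may hold JSON text describing the same fields.
    try:
        parsed = json.loads(step_output.get("raw_output", ""))
    except (json.JSONDecodeError, TypeError):
        return None
    return parsed.get(field) if isinstance(parsed, dict) else None


# Hypothetical step result, for illustration only.
sample_output = {
    "structured_output": {"count": "2"},
    "raw_output": '{"count": "2"}',
}

count = read_field(sample_output, "count")
print(count)
```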
Example JSON definition of step CogVLM in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {
        "count": "number of cats in the picture"
    }
}
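For context, the step above would normally sit inside a complete Workflow definition. The following sketch builds one in Python and serialises it to JSON. The top-level keys "version", "inputs" and "outputs", the "WorkflowImage" and "JsonField" type names, and the step name cog_vlm_step are assumptions for illustration and are not stated on this page.

```python
import json

# Assumptions (not confirmed by this page): the top-level keys "version",
# "inputs" and "outputs", and the "WorkflowImage" / "JsonField" type names.
definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "name": "cog_vlm_step",  # hypothetical step name
            "type": "roboflow_core/cog_vlm@v1",
            "images": "$inputs.image",
            "prompt": "my prompt",
            "json_output_format": {"count": "number of cats in the picture"},
        }
    ],
    "outputs": [
        {
            "type": "JsonField",
            "name": "structured",
            # Refers to the structured_output binding of the step above.
            "selector": "$steps.cog_vlm_step.structured_output",
        }
    ],
}

serialized = json.dumps(definition, indent=2)
print(serialized)
```

Given the End Of Life notice at the top of this page, a definition like this is useful only as a reference for the block's shape: running it would raise a runtime error.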