CogVLM
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Due to dependency conflicts with newer models and security vulnerabilities discovered in the transformers library, which were patched only in library versions incompatible with the model, we announced End Of Life for CogVLM support in inference, effective since release 0.38.0.
We are leaving this block in the ecosystem until release 0.42.0 so that clients can be informed about the change.
As of now, all Workflows using the block are no longer functional (a runtime error will be raised). After inference release 0.42.0, this block will be removed and the Execution Engine will raise a compilation error when it encounters the block in a Workflow definition.
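Since affected Workflows fail only at runtime until the block is removed, it can help to scan stored definitions ahead of time. The following is a minimal sketch, not part of inference itself: the helper name `find_cog_vlm_steps` is hypothetical, and it assumes a Workflow definition is a dictionary with a top-level `"steps"` list.

```python
from typing import Any, Dict, List

# Identifier taken from the "Type identifier" section of this page.
COG_VLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cog_vlm_steps(definition: Dict[str, Any]) -> List[str]:
    """Return the names of all steps in a definition that use the CogVLM block.

    Assumption: steps are stored as a list of dicts under the "steps" key.
    """
    steps = definition.get("steps", [])
    return [
        step.get("name", "<unnamed>")
        for step in steps
        if isinstance(step, dict) and step.get("type") == COG_VLM_TYPE
    ]


example_definition = {
    "steps": [
        {"name": "vlm", "type": "roboflow_core/cog_vlm@v1", "prompt": "my prompt"},
        {"name": "other", "type": "some/other_block@v1"},  # hypothetical block type
    ]
}

affected = find_cog_vlm_steps(example_definition)
print(affected)
```

Any definition for which the helper returns a non-empty list needs the CogVLM step replaced before upgrading.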
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices, and is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/cog_vlm@v1 to add the block as a step in your workflow.
Properties
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
prompt | str | Text prompt to the CogVLM model. | ✅ |
json_output_format | Dict[str, str] | Dictionary that maps the name of each requested output field to its description. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
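As an illustration of the Refs column, the sketch below flags properties that hold a dynamic reference even though the table marks them as static. It is an assumption-laden example: the helper names are hypothetical, and treating strings that start with `$inputs.` or `$steps.` as selectors is an assumed convention (only `$inputs.image` appears on this page).

```python
from typing import Any, Dict, List

# Refs column from the Properties table: only "prompt" accepts dynamic values.
REF_ALLOWED = {"name": False, "prompt": True, "json_output_format": False}


def is_selector(value: Any) -> bool:
    """Assumed convention: selectors are strings starting with '$inputs.' or '$steps.'."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def misplaced_selectors(step: Dict[str, Any]) -> List[str]:
    """List properties that hold a selector although the table marks them as static."""
    return [
        key
        for key, allowed in REF_ALLOWED.items()
        if not allowed and is_selector(step.get(key))
    ]


# "$inputs.prompt" and "$inputs.step_name" are hypothetical input names.
ok_step = {"name": "vlm", "prompt": "$inputs.prompt"}
bad_step = {"name": "$inputs.step_name", "prompt": "my prompt"}

print(misplaced_selectors(ok_step))
print(misplaced_selectors(bad_step))
```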
Available Connections
Compatible Blocks
Check what blocks you can connect to CogVLM in version v1.
- inputs: Image Contours, Triangle Visualization, Florence-2 Model, CogVLM, LMM, Google Gemini, Background Color Visualization, Dynamic Crop, Roboflow Dataset Upload, Halo Visualization, Classification Label Visualization, Local File Sink, VLM as Classifier, Slack Notification, Twilio SMS Notification, Google Vision OCR, SIFT Comparison, Stitch OCR Detections, OCR Model, Crop Visualization, Single-Label Classification Model, CSV Formatter, OpenAI, VLM as Detector, Instance Segmentation Model, Email Notification, Anthropic Claude, Blur Visualization, Camera Calibration, Color Visualization, Ellipse Visualization, Relative Static Crop, Object Detection Model, Bounding Box Visualization, Webhook Sink, Grid Visualization, Model Comparison Visualization, Corner Visualization, Camera Focus, Stitch Images, Pixelate Visualization, LMM For Classification, Image Slicer, Line Counter Visualization, Multi-Label Classification Model, Reference Path Visualization, Roboflow Custom Metadata, Keypoint Detection Model, Florence-2 Model, OpenAI, Image Slicer, Mask Visualization, Label Visualization, Stability AI Image Generation, SIFT, Image Convert Grayscale, Stability AI Outpainting, Image Threshold, Model Monitoring Inference Aggregator, Polygon Zone Visualization, Stability AI Inpainting, Llama 3.2 Vision, Perspective Correction, Trace Visualization, Image Preprocessing, Roboflow Dataset Upload, Absolute Static Crop, Circle Visualization, Clip Comparison, Dot Visualization, Depth Estimation, OpenAI, Keypoint Visualization, Polygon Visualization, Image Blur
- outputs: Triangle Visualization, Continue If, SIFT Comparison, Detections Classes Replacement, VLM as Classifier, Twilio SMS Notification, Google Vision OCR, OCR Model, Size Measurement, Bounding Rectangle, SmolVLM2, Instance Segmentation Model, Velocity, Cosine Similarity, LMM For Classification, Cache Set, Perception Encoder Embedding Model, Reference Path Visualization, Keypoint Detection Model, Object Detection Model, Model Monitoring Inference Aggregator, Property Definition, SIFT, Image Convert Grayscale, Single-Label Classification Model, Buffer, Llama 3.2 Vision, Clip Comparison, Camera Focus, OpenAI, Keypoint Visualization, Polygon Visualization, Image Blur, Image Slicer, Time in Zone, CogVLM, Google Gemini, CLIP Embedding Model, YOLO-World Model, Roboflow Dataset Upload, Crop Visualization, Single-Label Classification Model, Detections Transformation, VLM as Detector, Anthropic Claude, Segment Anything 2 Model, Bounding Box Visualization, QR Code Detection, Line Counter Visualization, Image Slicer, Multi-Label Classification Model, Florence-2 Model, Cache Get, Identify Changes, Time in Zone, Roboflow Dataset Upload, Absolute Static Crop, Dimension Collapse, Clip Comparison, Local File Sink, Slack Notification, Keypoint Detection Model, Detections Stabilizer, SIFT Comparison, Rate Limiter, Line Counter, Expression, Delta Filter, Email Notification, OpenAI, CSV Formatter, Instance Segmentation Model, Camera Calibration, Ellipse Visualization, Path Deviation, Detections Stitch, Model Comparison Visualization, Corner Visualization, Dominant Color, Byte Tracker, Pixelate Visualization, Overlap Filter, First Non Empty Or Default, Roboflow Custom Metadata, OpenAI, Mask Visualization, Label Visualization, Stability AI Outpainting, Polygon Zone Visualization, Stability AI Inpainting, Perspective Correction, Image Preprocessing, Dot Visualization, Distance Measurement, Detections Filter, Qwen2.5-VL, Depth Estimation, Image Contours, Florence-2 Model, Background Color Visualization, VLM as Detector, Dynamic Crop, Classification Label Visualization, Halo Visualization, Detections Merge, Stitch OCR Detections, PTZ Tracking (ONVIF), JSON Parser, Detections Consensus, Blur Visualization, Color Visualization, Template Matching, Line Counter, Relative Static Crop, Object Detection Model, Pixel Color Count, Webhook Sink, Identify Outliers, Grid Visualization, Barcode Detection, Stitch Images, Byte Tracker, Path Deviation, Dynamic Zone, Multi-Label Classification Model, Byte Tracker, Stability AI Image Generation, Moondream2, Image Threshold, Trace Visualization, Circle Visualization, Data Aggregator, Gaze Detection, Detection Offset, LMM, VLM as Classifier
Input and Output Bindings
The available connections depend on the block's binding kinds. Check what binding kinds CogVLM in version v1 has.
Bindings
- input
- output
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - structured_output (dictionary): Dictionary.
    - raw_output (string): String value.
    - * (*): Equivalent of any element.
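To show how the named outputs might be referenced from other steps, here is a small sketch that maps each output to its kind and builds a selector string. The helper `output_selector` is hypothetical, and the `$steps.<step_name>.<output_name>` form is an assumption about selector syntax rather than something stated on this page; the wildcard `*` output is left out.

```python
from typing import Dict

# Output names and kinds as listed under "output" in the Bindings section.
COG_VLM_OUTPUT_KINDS: Dict[str, str] = {
    "parent_id": "parent_id",
    "root_parent_id": "parent_id",
    "image": "image_metadata",
    "structured_output": "dictionary",
    "raw_output": "string",
}


def output_selector(step_name: str, output_name: str) -> str:
    """Build a selector for one CogVLM output.

    Assumption: step outputs are addressed as "$steps.<step_name>.<output_name>".
    """
    if output_name not in COG_VLM_OUTPUT_KINDS:
        raise KeyError(f"unknown CogVLM output: {output_name!r}")
    return f"$steps.{step_name}.{output_name}"


# "cog_vlm" is a hypothetical step name.
selector = output_selector("cog_vlm", "raw_output")
print(selector, "->", COG_VLM_OUTPUT_KINDS["raw_output"])
```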
Example JSON definition of step CogVLM in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {
        "count": "number of cats in the picture"
    }
}
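For context, the step definition above could be embedded in a complete Workflow definition roughly as sketched below. The step fields come from the example; everything around them is an assumption to verify against your inference version: the `"version"`, `"inputs"`, and `"outputs"` layout, the `WorkflowImage` and `JsonField` type names, and the hypothetical names `cog_vlm` and `cat_count`.

```python
import json

# Step fields copied from the example JSON definition on this page.
cog_vlm_step = {
    "name": "cog_vlm",  # hypothetical step name
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {"count": "number of cats in the picture"},
}

# Assumption: the surrounding layout and the "WorkflowImage" / "JsonField"
# type names follow the usual Workflow definition format.
workflow_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [cog_vlm_step],
    "outputs": [
        {
            "type": "JsonField",
            "name": "cat_count",  # hypothetical output name
            "selector": "$steps.cog_vlm.structured_output",
        }
    ],
}

serialized = json.dumps(workflow_definition, indent=2)
print(serialized)
```

Per the End Of Life notice at the top of this page, a definition like this raises a runtime error on inference 0.38.0 and later, so it is useful mainly for recognising legacy Workflows that need migrating.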