CogVLM
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Due to dependency conflicts with newer models and security vulnerabilities discovered in the transformers library, which were patched only in library versions incompatible with the model, we announced End Of Life for CogVLM support in inference, effective since release 0.38.0.
We are leaving this block in the ecosystem until release 0.42.0 so that clients can be informed about the change.
As of now, all Workflows using the block are no longer functional (a runtime error is raised). After inference release 0.42.0, this block will be removed and the Execution Engine will raise a compilation error when it encounters the block in a Workflow definition.
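To see whether an existing Workflow is affected by the End Of Life notice, one option is to scan its definition for steps that use this block's type identifier. The sketch below is only illustrative: it assumes the definition is already loaded as a Python dictionary with a top-level "steps" list, and the helper name `find_cogvlm_steps` and the second step type in the sample are hypothetical.

```python
from typing import Any, Dict, List

# Type identifier of the block documented on this page.
COGVLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cogvlm_steps(workflow_definition: Dict[str, Any]) -> List[str]:
    """Return the names of all steps that use the CogVLM block.

    Hypothetical helper: assumes steps live under a top-level "steps" list.
    """
    steps = workflow_definition.get("steps", [])
    return [
        step.get("name", "<unnamed>")
        for step in steps
        if step.get("type") == COGVLM_TYPE
    ]


# Sample definition; "some_other/block@v1" is a made-up placeholder type.
example_definition = {
    "steps": [
        {
            "name": "vlm",
            "type": "roboflow_core/cog_vlm@v1",
            "images": "$inputs.image",
            "prompt": "my prompt",
        },
        {"name": "other", "type": "some_other/block@v1"},
    ],
}

affected = find_cogvlm_steps(example_definition)
print(affected)
```

Any step name reported this way belongs to a step that needs to be replaced with another block before the Workflow can run again.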
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices, and is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/cog_vlm@v1 to add the block as a step in your workflow.
Properties
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
prompt | str | Text prompt to the CogVLM model. | ✅ |
json_output_format | Dict[str, str] | Dictionary that maps the name of each requested output field to its description. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
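As a rough illustration of the Refs column, the sketch below checks a step definition and reports any property that holds a dynamic reference although the table marks it with ❌. It is an assumption-laden sketch, not part of inference: the helper names are hypothetical, and a "dynamic reference" is assumed to be a string starting with "$inputs." or "$steps.", following the selector style used in the example step definition.

```python
from typing import Any, Dict, List

# Whether each property accepts dynamic references, per the Refs column.
REF_CAPABLE = {"name": False, "prompt": True, "json_output_format": False}


def is_selector(value: Any) -> bool:
    """Assumed selector shape: a string starting with "$inputs." or "$steps."."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def invalid_bindings(step: Dict[str, Any]) -> List[str]:
    """Return properties that hold a selector but are marked as not ref-capable."""
    return sorted(
        key
        for key, allowed in REF_CAPABLE.items()
        if not allowed and is_selector(step.get(key))
    )


# "prompt" may be bound to a runtime input; "name" must stay a literal string.
ok_step = {"name": "vlm", "prompt": "$inputs.prompt"}
bad_step = {"name": "$inputs.step_name", "prompt": "my prompt"}

print(invalid_bindings(ok_step))
print(invalid_bindings(bad_step))
```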
Available Connections
Compatible Blocks
Check what blocks you can connect to CogVLM in version v1.
- inputs: Image Preprocessing, Instance Segmentation Model, Relative Static Crop, Background Color Visualization, Line Counter Visualization, SIFT, Twilio SMS Notification, OpenAI, Reference Path Visualization, Grid Visualization, Roboflow Dataset Upload, Blur Visualization, Image Contours, Roboflow Custom Metadata, Pixelate Visualization, LMM For Classification, Object Detection Model, Keypoint Detection Model, Llama 3.2 Vision, Camera Focus, Color Visualization, OpenAI, Roboflow Dataset Upload, Ellipse Visualization, Label Visualization, Image Slicer, Crop Visualization, Stitch Images, Polygon Zone Visualization, CSV Formatter, Dot Visualization, Bounding Box Visualization, Stitch OCR Detections, Florence-2 Model, Image Slicer, CogVLM, Webhook Sink, LMM, Single-Label Classification Model, Perspective Correction, Email Notification, Multi-Label Classification Model, Google Vision OCR, OCR Model, Image Blur, Stability AI Image Generation, Dynamic Crop, Image Convert Grayscale, Stability AI Inpainting, Clip Comparison, Model Monitoring Inference Aggregator, Model Comparison Visualization, Triangle Visualization, Classification Label Visualization, Trace Visualization, Anthropic Claude, VLM as Classifier, Depth Estimation, Corner Visualization, SIFT Comparison, Camera Calibration, Absolute Static Crop, Keypoint Visualization, Mask Visualization, Image Threshold, VLM as Detector, Halo Visualization, Florence-2 Model, Circle Visualization, Slack Notification, Google Gemini, Local File Sink, Polygon Visualization
- outputs: Identify Changes, Line Counter Visualization, Background Color Visualization, Relative Static Crop, First Non Empty Or Default, Barcode Detection, Detections Stabilizer, Detections Transformation, Object Detection Model, Cache Set, Camera Focus, Roboflow Dataset Upload, Color Visualization, Expression, Detections Filter, SmolVLM2, CSV Formatter, Object Detection Model, Bounding Box Visualization, OCR Model, Rate Limiter, Path Deviation, JSON Parser, Clip Comparison, Identify Outliers, Model Comparison Visualization, Anthropic Claude, Corner Visualization, Line Counter, Multi-Label Classification Model, Qwen2.5-VL, Keypoint Visualization, Data Aggregator, Mask Visualization, Image Threshold, Florence-2 Model, VLM as Detector, QR Code Detection, Polygon Zone Visualization, Google Gemini, Bounding Rectangle, Image Preprocessing, Twilio SMS Notification, Roboflow Custom Metadata, Reference Path Visualization, Blur Visualization, Detections Classes Replacement, Image Contours, Keypoint Detection Model, Ellipse Visualization, Byte Tracker, Crop Visualization, Stitch Images, Line Counter, Stitch OCR Detections, LMM, Webhook Sink, Email Notification, Size Measurement, Dimension Collapse, Velocity, Image Blur, Image Convert Grayscale, Stability AI Inpainting, Triangle Visualization, Classification Label Visualization, Keypoint Detection Model, Path Deviation, Time in Zone, VLM as Classifier, Detections Merge, Local File Sink, OpenAI, Roboflow Dataset Upload, Grid Visualization, Gaze Detection, LMM For Classification, Llama 3.2 Vision, Instance Segmentation Model, Pixel Color Count, Label Visualization, YOLO-World Model, Moondream2, Dot Visualization, Byte Tracker, Overlap Filter, Detections Consensus, Template Matching, Single-Label Classification Model, Google Vision OCR, Dynamic Crop, Depth Estimation, SIFT Comparison, SIFT Comparison, Byte Tracker, Cosine Similarity, Camera Calibration, Detection Offset, Halo Visualization, Clip Comparison, Polygon Visualization, Instance Segmentation Model, SIFT, Dynamic Zone, CLIP Embedding Model, VLM as Detector, Pixelate Visualization, Dominant Color, Detections Stitch, Property Definition, OpenAI, CogVLM, Florence-2 Model, Image Slicer, Cache Get, Perspective Correction, Delta Filter, Multi-Label Classification Model, Time in Zone, Stability AI Image Generation, Model Monitoring Inference Aggregator, Distance Measurement, Trace Visualization, Single-Label Classification Model, Buffer, VLM as Classifier, Segment Anything 2 Model, Absolute Static Crop, Continue If, Circle Visualization, Slack Notification, Image Slicer
Input and Output Bindings
The available connections depend on the block's binding kinds. Check what binding kinds CogVLM in version v1 has.
Bindings
- input
- output
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - structured_output (dictionary): Dictionary.
    - raw_output (string): String value.
    - * (*): Equivalent of any element.
Example JSON definition of step CogVLM in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {
        "count": "number of cats in the picture"
    }
}
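For readers maintaining older definitions, the sketch below shows how a step like the one above might sit inside a complete Workflow definition and how its structured_output and raw_output bindings could be exposed as workflow outputs. The surrounding envelope (the "version", "inputs", and "outputs" keys, and the WorkflowImage and JsonField types) is an assumption based on common Workflows definitions rather than something stated on this page, and the step name cat_counter is made up. Because the block has reached End Of Life, a definition like this is expected to fail at runtime on inference 0.38.0 and later.

```python
import json

# Step definition mirroring the example above; "cat_counter" is a made-up name.
cogvlm_step = {
    "name": "cat_counter",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {"count": "number of cats in the picture"},
}

# Assumed envelope: keys and type names here are not confirmed by this page.
workflow_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [cogvlm_step],
    "outputs": [
        {
            "type": "JsonField",
            "name": "structured",
            "selector": "$steps.cat_counter.structured_output",
        },
        {
            "type": "JsonField",
            "name": "raw",
            "selector": "$steps.cat_counter.raw_output",
        },
    ],
}

# Serialise to JSON, the format in which Workflow definitions are written.
serialized = json.dumps(workflow_definition, indent=2)
print(serialized)
```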