CogVLM¶
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Due to dependency conflicts with newer models and security vulnerabilities discovered in the transformers library, which were patched only in library versions incompatible with the model, we announced End Of Life for CogVLM support in inference, effective since release 0.38.0.
We are leaving this block in the ecosystem until release 0.42.0 so that clients can be informed about the change.
As of now, all Workflows using the block are no longer functional (a runtime error will be raised). After inference release 0.42.0, this block will be removed and the Execution Engine will raise a compilation error when it encounters the block in a Workflow definition.
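To prepare for the removal, it may help to scan existing Workflow definitions for steps that still use this block. The following is a minimal sketch, not part of inference: the helper name `find_cogvlm_steps` and the sample definition are hypothetical, and the sketch assumes the definition is a dictionary with a top-level "steps" list whose entries carry "type" and "name" keys.

```python
from typing import Any, Dict, List

# Type identifier of the retired block, as given on this page.
COG_VLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cogvlm_steps(workflow_definition: Dict[str, Any]) -> List[str]:
    """Return the names of all steps that use the CogVLM block.

    Assumption: steps live in a top-level "steps" list of dictionaries.
    """
    steps = workflow_definition.get("steps", [])
    return [
        step.get("name", "<unnamed>")
        for step in steps
        if step.get("type") == COG_VLM_TYPE
    ]


# Hypothetical Workflow definition, used only for illustration.
example_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "type": "roboflow_core/cog_vlm@v1",
            "name": "cog",
            "images": "$inputs.image",
            "prompt": "my prompt",
        },
    ],
    "outputs": [],
}

affected = find_cogvlm_steps(example_definition)
print(affected)  # ['cog']
```

Any step reported this way would need to be replaced with another block before upgrading past release 0.42.0.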
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices, and is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier¶
Use the following identifier in the step "type" field: roboflow_core/cog_vlm@v1 to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
prompt | str | Text prompt to the CogVLM model. | ✅ |
json_output_format | Dict[str, str] | Dictionary that maps the name of each requested output field to its description. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
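As a rough illustration of the difference between a static value and a dynamic reference, the sketch below builds a step in which prompt is bound to a runtime input. The `$inputs.image` selector appears in the example on this page; the input name `prompt` and the helper `is_selector` are assumptions made for illustration, and the prefix check is only a heuristic, not how the Execution Engine resolves selectors.

```python
def is_selector(value: object) -> bool:
    """Heuristic: strings starting with '$inputs.' or '$steps.' are references.

    Assumption: this prefix check is a simplification for illustration only.
    """
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


# Hypothetical step definition: "prompt" is marked with a check in the Refs
# column, so it may reference a runtime input instead of a fixed string.
step = {
    "type": "roboflow_core/cog_vlm@v1",
    "name": "cog",
    "images": "$inputs.image",
    "prompt": "$inputs.prompt",  # dynamic: resolved at workflow runtime
    "json_output_format": {"count": "number of cats in the picture"},  # static
}

dynamic = sorted(key for key, value in step.items() if is_selector(value))
print(dynamic)  # ['images', 'prompt']
```

Properties marked with ❌ in the table, such as name and json_output_format, are given literal values in the definition.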
Available Connections¶
Compatible Blocks
Check what blocks you can connect to CogVLM in version v1.
- inputs: Blur Visualization, Camera Focus, CogVLM, Image Threshold, Polygon Zone Visualization, Stability AI Inpainting, VLM as Detector, Relative Static Crop, Image Preprocessing, Slack Notification, Keypoint Visualization, Background Color Visualization, Grid Visualization, Local File Sink, Image Convert Grayscale, Trace Visualization, Instance Segmentation Model, Absolute Static Crop, Roboflow Custom Metadata, Color Visualization, Perspective Correction, OpenAI, Twilio SMS Notification, OCR Model, Multi-Label Classification Model, Classification Label Visualization, Circle Visualization, Google Vision OCR, Camera Calibration, Pixelate Visualization, Image Slicer, Clip Comparison, Stitch OCR Detections, Label Visualization, Halo Visualization, Triangle Visualization, Reference Path Visualization, Image Slicer, OpenAI, Single-Label Classification Model, Webhook Sink, Line Counter Visualization, Google Gemini, Roboflow Dataset Upload, Llama 3.2 Vision, Image Blur, Corner Visualization, Florence-2 Model, SIFT Comparison, Email Notification, Object Detection Model, LMM, LMM For Classification, Anthropic Claude, Image Contours, Roboflow Dataset Upload, Dynamic Crop, Polygon Visualization, CSV Formatter, Depth Estimation, SIFT, Florence-2 Model, Ellipse Visualization, Mask Visualization, Model Monitoring Inference Aggregator, Keypoint Detection Model, Stitch Images, Bounding Box Visualization, Dot Visualization, Stability AI Image Generation, Model Comparison Visualization, VLM as Classifier, Crop Visualization
- outputs: Line Counter, Single-Label Classification Model, Slack Notification, YOLO-World Model, Image Convert Grayscale, Absolute Static Crop, Perspective Correction, OpenAI, Distance Measurement, Image Slicer, OpenAI, Cosine Similarity, Halo Visualization, Byte Tracker, Corner Visualization, Email Notification, Object Detection Model, Detections Classes Replacement, Template Matching, Roboflow Dataset Upload, Overlap Filter, Dynamic Crop, Cache Set, VLM as Classifier, Depth Estimation, Dynamic Zone, Model Monitoring Inference Aggregator, Cache Get, Llama 3.2 Vision, Anthropic Claude, Crop Visualization, Blur Visualization, Dominant Color, Image Threshold, Stability AI Inpainting, Relative Static Crop, Path Deviation, Clip Comparison, Moondream2, Twilio SMS Notification, OCR Model, Data Aggregator, Google Vision OCR, Pixelate Visualization, Property Definition, Stitch OCR Detections, Image Slicer, Time in Zone, Webhook Sink, JSON Parser, Line Counter Visualization, Byte Tracker, Image Blur, Florence-2 Model, Delta Filter, Detections Stabilizer, LMM For Classification, Instance Segmentation Model, Keypoint Detection Model, Detections Stitch, Bounding Rectangle, Stability AI Image Generation, VLM as Classifier, Identify Changes, Detection Offset, Camera Focus, First Non Empty Or Default, Polygon Zone Visualization, Detections Filter, Time in Zone, Local File Sink, Grid Visualization, Instance Segmentation Model, Trace Visualization, Roboflow Custom Metadata, Continue If, Circle Visualization, Dimension Collapse, Clip Comparison, Triangle Visualization, QR Code Detection, Gaze Detection, Line Counter, Size Measurement, Byte Tracker, LMM, Detections Consensus, Velocity, Buffer, Stitch Images, Segment Anything 2 Model, Object Detection Model, Model Comparison Visualization, Keypoint Detection Model, SIFT Comparison, CLIP Embedding Model, CogVLM, Expression, SmolVLM2, VLM as Detector, Image Preprocessing, Keypoint Visualization, Background Color Visualization, Pixel Color Count, Barcode Detection, Color Visualization, Multi-Label Classification Model, Classification Label Visualization, Camera Calibration, Label Visualization, Reference Path Visualization, Single-Label Classification Model, Google Gemini, Roboflow Dataset Upload, Qwen2.5-VL, Identify Outliers, Multi-Label Classification Model, Detections Transformation, SIFT Comparison, Image Contours, Polygon Visualization, CSV Formatter, SIFT, Florence-2 Model, Detections Merge, Mask Visualization, Ellipse Visualization, Rate Limiter, Bounding Box Visualization, Dot Visualization, VLM as Detector, Path Deviation
Input and Output Bindings¶
The available connections depend on its binding kinds. Check what binding kinds CogVLM in version v1 has.
Bindings
- input
- output
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - structured_output (dictionary): Dictionary.
    - raw_output (string): String value.
    - * (*): Equivalent of any element.
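The named outputs listed above can be referenced by later steps or by the workflow outputs. The sketch below shows one way such references might be assembled. It assumes the `$steps.<step_name>.<output_name>` selector form and a "JsonField" output entry; the helper `output_selector`, the step name `cog`, and the output names `answer` and `raw` are hypothetical, and the wildcard `*` output is left out.

```python
import json

# Named outputs of the block, copied from the output bindings list.
OUTPUT_NAMES = (
    "parent_id",
    "root_parent_id",
    "image",
    "structured_output",
    "raw_output",
)


def output_selector(step_name: str, output_name: str) -> str:
    """Build a reference to one named output of a step.

    Assumption: outputs are addressed as "$steps.<step_name>.<output_name>".
    """
    if output_name not in OUTPUT_NAMES:
        raise ValueError(f"Unknown CogVLM output: {output_name!r}")
    return f"$steps.{step_name}.{output_name}"


# Hypothetical "outputs" section of a Workflow definition.
workflow_outputs = [
    {
        "type": "JsonField",
        "name": "answer",
        "selector": output_selector("cog", "structured_output"),
    },
    {
        "type": "JsonField",
        "name": "raw",
        "selector": output_selector("cog", "raw_output"),
    },
]

print(json.dumps(workflow_outputs, indent=2))
```

Here structured_output would carry the dictionary requested through json_output_format, while raw_output would carry the unparsed string response.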
Example JSON definition of step CogVLM in version v1

    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/cog_vlm@v1",
        "images": "$inputs.image",
        "prompt": "my prompt",
        "json_output_format": {
            "count": "number of cats in the picture"
        }
    }