CogVLM¶
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Due to dependency conflicts with newer models and security vulnerabilities discovered in the transformers library, which were patched only in library versions incompatible with the model, we announced End Of Life for CogVLM support in inference, effective since release 0.38.0.
We are leaving this block in the ecosystem until release 0.42.0 so that clients can be informed about the change.
As of now, all Workflows using the block are no longer functional (a runtime error is raised). After inference release 0.42.0, this block will be removed and the Execution Engine will raise a compilation error when it encounters the block in a Workflow definition.
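Because Workflows that still contain this block fail at runtime now and will fail at compilation later, it can help to scan stored definitions for it before upgrading. The following is a minimal sketch, assuming a Workflow definition is available as a plain dictionary with a `steps` list; the helper `find_cogvlm_steps` and the sample definition are hypothetical and not part of inference.

```python
from typing import Any, Dict, List

# Type identifier of the block documented on this page.
COGVLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cogvlm_steps(definition: Dict[str, Any]) -> List[str]:
    """Return the names of all steps that use the CogVLM block.

    Hypothetical helper: it only inspects the dictionary and does not
    call into inference.
    """
    return [
        step.get("name", "<unnamed>")
        for step in definition.get("steps", [])
        if step.get("type") == COGVLM_TYPE
    ]


# Hypothetical workflow definition used only for illustration.
example_definition = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [
        {
            "type": "roboflow_core/cog_vlm@v1",
            "name": "cat_counter",
            "images": "$inputs.image",
            "prompt": "How many cats are in the picture?",
        },
    ],
    "outputs": [],
}

affected = find_cogvlm_steps(example_definition)
if affected:
    print(f"Steps to migrate away from CogVLM: {affected}")
```

Any step reported by such a scan would need to be replaced with another vision-language block before the Workflow can run again.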
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices, and is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier¶
Use the following identifier in the step `type` field: `roboflow_core/cog_vlm@v1` to add the block as a step in your workflow.
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
prompt | str | Text prompt to the CogVLM model. | ✅ |
json_output_format | Dict[str, str] | Dictionary that maps the name of each requested output field to its description. | ❌ |
The Refs column marks whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
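As an illustration of the Refs column, the sketch below parametrises `prompt` with a selector while keeping `name` and `json_output_format` static, as the table requires. It assumes the workflow declares an input named `prompt` (a hypothetical name), and `is_selector` is a hypothetical helper that simply checks for the `$` prefix used by selectors such as `$inputs.image`.

```python
from typing import Any


def is_selector(value: Any) -> bool:
    """Hypothetical check: treat strings starting with '$' as selectors."""
    return isinstance(value, str) and value.startswith("$")


step = {
    "type": "roboflow_core/cog_vlm@v1",
    "name": "cog_vlm",                 # static: Refs is ❌
    "images": "$inputs.image",         # dynamic: bound to a workflow input
    "prompt": "$inputs.prompt",        # dynamic: Refs is ✅ (input name assumed)
    "json_output_format": {            # static: Refs is ❌
        "count": "number of cats in the picture"
    },
}

# Collect the properties whose values are resolved at workflow runtime.
dynamic_properties = sorted(key for key, value in step.items() if is_selector(value))
print(dynamic_properties)
```

With this layout the prompt text can change on every run, while the requested output fields stay fixed in the definition.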
Available Connections¶
Compatible Blocks
Check what blocks you can connect to CogVLM in version v1.
- inputs: Crop Visualization, Keypoint Detection Model, Ellipse Visualization, VLM as Classifier, Stability AI Inpainting, Stability AI Image Generation, Blur Visualization, Circle Visualization, Pixelate Visualization, Model Comparison Visualization, Webhook Sink, Stability AI Outpainting, Bounding Box Visualization, Grid Visualization, Background Color Visualization, Llama 3.2 Vision, Image Contours, LMM, Object Detection Model, Twilio SMS Notification, Label Visualization, Triangle Visualization, Dynamic Crop, Florence-2 Model, Google Vision OCR, Florence-2 Model, Keypoint Visualization, OCR Model, Halo Visualization, Corner Visualization, Line Counter Visualization, LMM For Classification, Dot Visualization, Anthropic Claude, Slack Notification, Roboflow Dataset Upload, Icon Visualization, Roboflow Custom Metadata, Depth Estimation, CogVLM, Polygon Zone Visualization, Stitch Images, Image Slicer, OpenAI, Email Notification, Relative Static Crop, Google Gemini, SIFT Comparison, Image Slicer, Image Threshold, Perspective Correction, Camera Focus, Camera Calibration, Classification Label Visualization, Reference Path Visualization, Color Visualization, Model Monitoring Inference Aggregator, Instance Segmentation Model, Local File Sink, OpenAI, Clip Comparison, Roboflow Dataset Upload, Mask Visualization, QR Code Generator, OpenAI, CSV Formatter, Stitch OCR Detections, SIFT, Polygon Visualization, Image Convert Grayscale, VLM as Detector, Image Blur, Multi-Label Classification Model, Trace Visualization, Absolute Static Crop, Image Preprocessing, Single-Label Classification Model
- outputs: Keypoint Detection Model, Detections Filter, Rate Limiter, VLM as Classifier, Data Aggregator, Blur Visualization, Circle Visualization, Pixelate Visualization, YOLO-World Model, Model Comparison Visualization, Webhook Sink, Detections Transformation, Time in Zone, Time in Zone, Grid Visualization, Dominant Color, Llama 3.2 Vision, Image Contours, Label Visualization, Triangle Visualization, Dynamic Crop, Florence-2 Model, Property Definition, Florence-2 Model, QR Code Detection, Multi-Label Classification Model, Corner Visualization, Line Counter Visualization, LMM For Classification, CLIP Embedding Model, Size Measurement, Anthropic Claude, Roboflow Dataset Upload, Cache Set, Roboflow Custom Metadata, Distance Measurement, First Non Empty Or Default, Polygon Zone Visualization, Stitch Images, Image Slicer, Relative Static Crop, Segment Anything 2 Model, Google Gemini, SIFT Comparison, Dimension Collapse, Buffer, Image Slicer, Image Threshold, Camera Focus, Qwen2.5-VL, Expression, JSON Parser, Overlap Filter, Path Deviation, Color Visualization, OpenAI, Barcode Detection, Clip Comparison, Roboflow Dataset Upload, Mask Visualization, CSV Formatter, Stitch OCR Detections, Delta Filter, Detections Stabilizer, Moondream2, Detection Offset, Multi-Label Classification Model, Line Counter, Detections Merge, Absolute Static Crop, Identify Changes, Single-Label Classification Model, Crop Visualization, Ellipse Visualization, Stability AI Inpainting, Stability AI Image Generation, Keypoint Detection Model, Cache Get, Stability AI Outpainting, Detections Consensus, Single-Label Classification Model, Bounding Box Visualization, Path Deviation, Cosine Similarity, VLM as Classifier, Identify Outliers, Background Color Visualization, LMM, Object Detection Model, Twilio SMS Notification, Detections Stitch, VLM as Detector, Instance Segmentation Model, Keypoint Visualization, OCR Model, Halo Visualization, PTZ Tracking (ONVIF), SmolVLM2, Time in Zone, Perception Encoder Embedding Model, Template Matching, Continue If, Dot Visualization, Clip Comparison, Slack Notification, Icon Visualization, Depth Estimation, CogVLM, Byte Tracker, OpenAI, Email Notification, Detections Classes Replacement, SIFT Comparison, Line Counter, Byte Tracker, Pixel Color Count, Gaze Detection, Perspective Correction, Byte Tracker, Classification Label Visualization, Instance Segmentation Model, Reference Path Visualization, Local File Sink, Model Monitoring Inference Aggregator, Camera Calibration, Velocity, Bounding Rectangle, QR Code Generator, OpenAI, SIFT, Polygon Visualization, VLM as Detector, Image Convert Grayscale, Dynamic Zone, Image Blur, Trace Visualization, Google Vision OCR, Image Preprocessing, Object Detection Model
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds CogVLM in version v1 has.
Bindings
- input
- output
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - structured_output (dictionary): Dictionary.
    - raw_output (string): String value.
    - * (*): Equivalent of any element.
Example JSON definition of step CogVLM in version v1

    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/cog_vlm@v1",
        "images": "$inputs.image",
        "prompt": "my prompt",
        "json_output_format": {
            "count": "number of cats in the picture"
        }
    }
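To show where a step like this sits, the sketch below embeds it in a complete Workflow definition and exposes the `structured_output` and `raw_output` bindings listed above. It assumes the usual definition layout with `version`, `inputs`, `steps` and `outputs` keys; the `WorkflowImage` and `JsonField` type names, the step name `cat_counter` and the output names are assumptions for illustration. Since the block has reached End Of Life, this definition is shown for reference and is not expected to run on current releases.

```python
import json

# The step from the example, with a concrete (hypothetical) name.
cog_vlm_step = {
    "name": "cat_counter",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {"count": "number of cats in the picture"},
}

workflow_definition = {
    "version": "1.0",
    # Assumed input declaration providing the image referenced by the step.
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [cog_vlm_step],
    # Assumed output declarations; selectors follow $steps.<step name>.<output>.
    "outputs": [
        {
            "type": "JsonField",
            "name": "answer",
            "selector": "$steps.cat_counter.structured_output",
        },
        {
            "type": "JsonField",
            "name": "raw_answer",
            "selector": "$steps.cat_counter.raw_output",
        },
    ],
}

# Serialize to JSON, the form in which definitions are typically stored.
serialized = json.dumps(workflow_definition, indent=2)
print(serialized)
```

Here `structured_output` would carry the fields requested in `json_output_format`, while `raw_output` would carry the model's unparsed text response.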