CogVLM¶
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Due to dependency conflicts with newer models, and security vulnerabilities discovered in the transformers library that were patched only in library versions incompatible with the model, we announced End Of Life for CogVLM support in inference, effective since release 0.38.0.
We are leaving this block in the ecosystem until release 0.42.0 so that clients can learn about the change.
As of now, all Workflows using the block are no longer functional (a runtime error is raised). After inference release 0.42.0, the block will be removed and the Execution Engine will raise a compilation error when it encounters the block in a Workflow definition.
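Since workflows that still contain this block fail at runtime, and will fail at compilation once the block is removed, it may help to scan stored workflow definitions before upgrading. The following is a minimal sketch, assuming a workflow definition is a dictionary with a top-level "steps" list whose entries carry "type" and "name" fields. The helper `find_cogvlm_steps`, the sample definition, and the `example/other_block@v1` type are hypothetical and not part of inference.

```python
from typing import Any, Dict, List

# Type identifier of the deprecated block, as documented on this page.
COGVLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cogvlm_steps(definition: Dict[str, Any]) -> List[str]:
    """Return the names of steps that use the CogVLM block.

    Hypothetical helper: assumes a top-level "steps" list of dicts,
    each with "type" and (optionally) "name" keys.
    """
    return [
        step.get("name", "<unnamed>")
        for step in definition.get("steps", [])
        if step.get("type") == COGVLM_TYPE
    ]


# Hypothetical workflow definition, used only for illustration.
workflow = {
    "steps": [
        {"type": COGVLM_TYPE, "name": "cog", "prompt": "my prompt"},
        {"type": "example/other_block@v1", "name": "other"},  # placeholder type
    ],
}

affected = find_cogvlm_steps(workflow)
print(affected)
```

Any step names reported this way point at workflows that need the CogVLM step replaced or removed.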
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices, and is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier¶
To add the block as a step in your workflow, use the following identifier in the step "type" field: roboflow_core/cog_vlm@v1
Properties¶
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
prompt | str | Text prompt to the CogVLM model. | ✅ |
json_output_format | Dict[str, str] | Holds a dictionary that maps the name of each requested output field to its description. | ❌ |
The Refs column marks whether the property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
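To illustrate the Refs column: a property marked ✅ (here, prompt) may hold either a literal value or a reference resolved at runtime, while properties marked ❌ take literals only. The sketch below assumes references are strings starting with "$inputs." or "$steps." (the "$inputs.image" form appears in the example step definition on this page; the "$steps." prefix is an assumption). The helper names `is_selector` and `invalid_references`, and the input name "question", are hypothetical.

```python
from typing import Any, Dict, List

# Prefixes assumed to mark a dynamic reference rather than a literal value.
SELECTOR_PREFIXES = ("$inputs.", "$steps.")

# Refs column from the Properties table: True means the property
# can be parametrised with a dynamic value.
REFS = {"name": False, "prompt": True, "json_output_format": False}


def is_selector(value: Any) -> bool:
    """Return True when the value looks like a runtime reference."""
    return isinstance(value, str) and value.startswith(SELECTOR_PREFIXES)


def invalid_references(step: Dict[str, Any]) -> List[str]:
    """List table properties that hold a reference but are marked ❌."""
    return [
        key
        for key, value in step.items()
        if key in REFS and not REFS[key] and is_selector(value)
    ]


# prompt is marked ✅, so a reference is acceptable here.
dynamic_step = {"name": "cog", "prompt": "$inputs.question"}
# name is marked ❌, so a reference here is flagged.
bad_step = {"name": "$inputs.step_name", "prompt": "my prompt"}

print(invalid_references(dynamic_step))
print(invalid_references(bad_step))
```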
Available Connections¶
Compatible Blocks
Check what blocks you can connect to CogVLM in version v1.
- inputs: Crop Visualization, Grid Visualization, Stability AI Inpainting, LMM For Classification, Google Gemini, OpenAI, Image Blur, SIFT, Florence-2 Model, Google Vision OCR, Keypoint Detection Model, Single-Label Classification Model, Roboflow Dataset Upload, Stability AI Outpainting, Perspective Correction, Line Counter Visualization, Dot Visualization, Dynamic Crop, Local File Sink, Llama 3.2 Vision, Bounding Box Visualization, Depth Estimation, Image Contours, Background Color Visualization, Ellipse Visualization, Anthropic Claude, Camera Focus, Roboflow Custom Metadata, Color Visualization, Roboflow Dataset Upload, Stitch Images, Label Visualization, Relative Static Crop, Object Detection Model, Reference Path Visualization, Multi-Label Classification Model, Stitch OCR Detections, VLM as Classifier, Webhook Sink, Instance Segmentation Model, Keypoint Visualization, Model Monitoring Inference Aggregator, Slack Notification, OpenAI, Email Notification, Blur Visualization, Image Threshold, Image Slicer, SIFT Comparison, Image Preprocessing, Stability AI Image Generation, Trace Visualization, Classification Label Visualization, CogVLM, Mask Visualization, Image Convert Grayscale, Twilio SMS Notification, Polygon Zone Visualization, OCR Model, CSV Formatter, Camera Calibration, Icon Visualization, Model Comparison Visualization, Pixelate Visualization, QR Code Generator, Florence-2 Model, Image Slicer, Halo Visualization, Absolute Static Crop, VLM as Detector, OpenAI, Circle Visualization, Corner Visualization, Triangle Visualization, LMM, Clip Comparison, Polygon Visualization
- outputs: Crop Visualization, Grid Visualization, Object Detection Model, Detections Merge, Byte Tracker, VLM as Classifier, Florence-2 Model, Google Vision OCR, Dimension Collapse, Velocity, Single-Label Classification Model, Barcode Detection, Data Aggregator, Perspective Correction, Line Counter Visualization, Dot Visualization, Local File Sink, Dynamic Crop, Instance Segmentation Model, Bounding Box Visualization, Ellipse Visualization, Anthropic Claude, Roboflow Custom Metadata, Moondream2, Keypoint Detection Model, Label Visualization, Relative Static Crop, Identify Outliers, Property Definition, Object Detection Model, Delta Filter, Multi-Label Classification Model, Distance Measurement, VLM as Classifier, Keypoint Visualization, Gaze Detection, Slack Notification, Time in Zone, OpenAI, Email Notification, Blur Visualization, Image Preprocessing, Path Deviation, Twilio SMS Notification, JSON Parser, Template Matching, OCR Model, Detections Stitch, Camera Calibration, Time in Zone, Multi-Label Classification Model, Florence-2 Model, Image Slicer, Single-Label Classification Model, Halo Visualization, Absolute Static Crop, Corner Visualization, VLM as Detector, Qwen2.5-VL, OpenAI, Stability AI Inpainting, Cache Get, LMM For Classification, Google Gemini, Line Counter, Detections Classes Replacement, Image Blur, YOLO-World Model, Clip Comparison, Detections Stabilizer, SIFT, Detections Transformation, Keypoint Detection Model, First Non Empty Or Default, Roboflow Dataset Upload, Line Counter, Detections Consensus, Perception Encoder Embedding Model, Stability AI Outpainting, Dominant Color, Llama 3.2 Vision, Depth Estimation, Detections Filter, Image Contours, Background Color Visualization, CLIP Embedding Model, Camera Focus, Color Visualization, Cache Set, Roboflow Dataset Upload, Identify Changes, Bounding Rectangle, Stitch Images, SmolVLM2, QR Code Detection, Byte Tracker, SIFT Comparison, Reference Path Visualization, Expression, Stitch OCR Detections, Webhook Sink, Buffer, Instance Segmentation Model, Path Deviation, Model Monitoring Inference Aggregator, Segment Anything 2 Model, Image Threshold, SIFT Comparison, Image Slicer, Dynamic Zone, Trace Visualization, Stability AI Image Generation, CogVLM, Classification Label Visualization, Mask Visualization, PTZ Tracking (ONVIF), Image Convert Grayscale, Overlap Filter, Byte Tracker, Pixel Color Count, Polygon Zone Visualization, Rate Limiter, CSV Formatter, Icon Visualization, Model Comparison Visualization, QR Code Generator, Pixelate Visualization, Continue If, Cosine Similarity, VLM as Detector, OpenAI, Detection Offset, Circle Visualization, Size Measurement, Triangle Visualization, LMM, Clip Comparison, Polygon Visualization
Input and Output Bindings¶
The available connections depend on the block's binding kinds. Check what binding kinds CogVLM in version v1 has.
Bindings
- input
- output
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - structured_output (dictionary): Dictionary.
    - raw_output (string): String value.
    - * (*): Equivalent of any element.
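The output names listed above are what downstream steps or workflow outputs would refer to. A minimal sketch, assuming step outputs are addressed with references of the form "$steps.<step_name>.<output_name>"; that format and the helper `output_selector` are assumptions for illustration and are not confirmed by this page.

```python
# Output names documented for CogVLM v1 on this page (the wildcard "*" is omitted).
COGVLM_OUTPUTS = (
    "parent_id",
    "root_parent_id",
    "image",
    "structured_output",
    "raw_output",
)


def output_selector(step_name: str, output_name: str) -> str:
    """Build a reference to one output of a CogVLM step (hypothetical helper)."""
    if output_name not in COGVLM_OUTPUTS:
        raise ValueError(f"Unknown CogVLM output: {output_name!r}")
    # Assumed reference format: "$steps.<step_name>.<output_name>".
    return f"$steps.{step_name}.{output_name}"


print(output_selector("cog", "structured_output"))
print(output_selector("cog", "raw_output"))
```

Validating the output name up front catches typos before the definition is handed to the Execution Engine.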
Example JSON definition of step CogVLM in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {
        "count": "number of cats in the picture"
    }
}
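The same step definition can be assembled in Python and serialised with the standard json module, which may be convenient when json_output_format is generated from code. This is only a sketch: the helper `build_cogvlm_step` and the step name "cat_counter" are hypothetical, and the field values mirror the example above.

```python
import json
from typing import Dict


def build_cogvlm_step(name: str, prompt: str, fields: Dict[str, str]) -> dict:
    """Assemble a CogVLM v1 step definition (hypothetical helper).

    `fields` maps each requested output field name to its description,
    matching the json_output_format property.
    """
    return {
        "name": name,
        "type": "roboflow_core/cog_vlm@v1",
        "images": "$inputs.image",
        "prompt": prompt,
        "json_output_format": fields,
    }


step = build_cogvlm_step(
    name="cat_counter",
    prompt="my prompt",
    fields={"count": "number of cats in the picture"},
)

# Serialise to the JSON form used in workflow definitions.
print(json.dumps(step, indent=4))
```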