CogVLM
Class: CogVLMBlockV1
Source: inference.core.workflows.core_steps.models.foundation.cog_vlm.v1.CogVLMBlockV1
CogVLM reached End Of Life
Due to dependency conflicts with newer models and security vulnerabilities discovered in the transformers library, which were patched only in library versions incompatible with the model, we announced End Of Life for CogVLM support in inference, effective since release 0.38.0.
We are leaving this block in the ecosystem until release 0.42.0 so that clients are informed about the change.
As of now, all Workflows using the block are no longer functional (a runtime error is raised). After inference release 0.42.0, this block will be removed and the Execution Engine will raise a compilation error when it encounters the block in a Workflow definition.
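Because Workflows containing this block now fail at runtime, it may help to scan stored definitions for it before upgrading. The following is a minimal sketch, assuming a Workflow definition is a JSON object with a top-level `steps` list whose entries carry `name` and `type` fields. `find_cogvlm_steps` is a hypothetical helper and not part of inference, and the second step type in the sample is a placeholder.

```python
from typing import Any, Dict, List

# Type identifier of the retired block, as documented on this page.
COGVLM_TYPE = "roboflow_core/cog_vlm@v1"


def find_cogvlm_steps(definition: Dict[str, Any]) -> List[str]:
    """Return the names of all steps in a definition that use CogVLM.

    Assumption: steps live in a top-level "steps" list of dictionaries.
    """
    return [
        step.get("name", "<unnamed>")
        for step in definition.get("steps", [])
        if step.get("type") == COGVLM_TYPE
    ]


example_definition = {
    "steps": [
        {"type": "roboflow_core/cog_vlm@v1", "name": "cat_counter"},
        # Placeholder type, for illustration only.
        {"type": "example/placeholder_block@v1", "name": "other_step"},
    ],
}

affected = find_cogvlm_steps(example_definition)
print(affected)
```

Any step name reported this way points at a Workflow that needs to migrate to a different block.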
Ask a question to CogVLM, an open source vision-language model.
This model requires a GPU and can only be run on self-hosted devices, and is not available on the Roboflow Hosted API.
This model was previously part of the LMM block.
Type identifier
Use the following identifier in the step `type` field: `roboflow_core/cog_vlm@v1` to add the block as a step in your workflow.
Properties
Name | Type | Description | Refs |
---|---|---|---|
name | str | Enter a unique identifier for this step. | ❌ |
prompt | str | Text prompt to the CogVLM model. | ✅ |
json_output_format | Dict[str, str] | Dictionary mapping the name of each requested output field to its description. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available at workflow runtime. See Bindings for more info.
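To illustrate the Refs column: a property marked ✅ can hold either a literal value or a selector resolved at runtime, while a property marked ❌ should hold a literal. The sketch below assumes that selectors are strings starting with `$` (like `$inputs.image` in the example step definition on this page). The input name `question` and the helpers `is_selector` and `misplaced_selectors` are hypothetical.

```python
from typing import Any, Dict, List

# Refs column of the Properties table: True means a selector is allowed.
REFS_ALLOWED = {"name": False, "prompt": True, "json_output_format": False}


def is_selector(value: Any) -> bool:
    """Heuristic: treat strings beginning with "$" as runtime selectors."""
    return isinstance(value, str) and value.startswith("$")


def misplaced_selectors(step: Dict[str, Any]) -> List[str]:
    """List properties that hold a selector although the table marks them ❌."""
    return [
        prop
        for prop, allowed in REFS_ALLOWED.items()
        if not allowed and is_selector(step.get(prop))
    ]


# Literal prompt: the same question is asked on every run.
static_step = {"name": "cat_counter", "prompt": "How many cats are there?"}

# Parametrised prompt: the question comes from a (hypothetical) workflow input.
dynamic_step = {"name": "cat_counter", "prompt": "$inputs.question"}

print(is_selector(static_step["prompt"]), is_selector(dynamic_step["prompt"]))
print(misplaced_selectors(dynamic_step))
```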
Available Connections
Compatible Blocks
Check which blocks you can connect to CogVLM in version v1.
- inputs: Camera Calibration, Image Preprocessing, Image Contours, Line Counter Visualization, Halo Visualization, Keypoint Detection Model, Florence-2 Model, OpenAI, Corner Visualization, Stability AI Image Generation, Model Comparison Visualization, LMM For Classification, Clip Comparison, Keypoint Visualization, Crop Visualization, Depth Estimation, SIFT, Image Blur, Stitch Images, Blur Visualization, Image Convert Grayscale, Background Color Visualization, Bounding Box Visualization, Image Slicer, Dynamic Crop, Local File Sink, Perspective Correction, Stitch OCR Detections, Circle Visualization, Triangle Visualization, Dot Visualization, Roboflow Dataset Upload, CogVLM, Mask Visualization, Image Slicer, Instance Segmentation Model, Object Detection Model, VLM as Classifier, Ellipse Visualization, LMM, Color Visualization, Email Notification, Stability AI Inpainting, Classification Label Visualization, OpenAI, OpenAI, OCR Model, Absolute Static Crop, Slack Notification, VLM as Detector, Twilio SMS Notification, Image Threshold, Webhook Sink, Llama 3.2 Vision, Pixelate Visualization, Trace Visualization, Camera Focus, Label Visualization, Roboflow Dataset Upload, Grid Visualization, Google Vision OCR, Stability AI Outpainting, CSV Formatter, Google Gemini, Single-Label Classification Model, Polygon Visualization, SIFT Comparison, Multi-Label Classification Model, Relative Static Crop, Model Monitoring Inference Aggregator, Roboflow Custom Metadata, Reference Path Visualization, Florence-2 Model, Polygon Zone Visualization, Anthropic Claude
- outputs: Line Counter, Keypoint Detection Model, Model Comparison Visualization, Image Blur, Continue If, Bounding Box Visualization, Distance Measurement, Roboflow Dataset Upload, Template Matching, Mask Visualization, VLM as Classifier, Detections Classes Replacement, JSON Parser, CLIP Embedding Model, Ellipse Visualization, Email Notification, OpenAI, Slack Notification, Twilio SMS Notification, VLM as Detector, Qwen2.5-VL, Label Visualization, Cache Get, Single-Label Classification Model, Buffer, Velocity, Identify Changes, Model Monitoring Inference Aggregator, Barcode Detection, Roboflow Custom Metadata, Reference Path Visualization, Data Aggregator, Florence-2 Model, OCR Model, Image Preprocessing, Line Counter Visualization, Florence-2 Model, Image Contours, Single-Label Classification Model, Clip Comparison, Detections Merge, Stitch Images, SIFT, Depth Estimation, Image Convert Grayscale, Image Slicer, Perspective Correction, Stitch OCR Detections, Circle Visualization, Triangle Visualization, Byte Tracker, CogVLM, Multi-Label Classification Model, Segment Anything 2 Model, VLM as Classifier, OpenAI, Classification Label Visualization, Detection Offset, Trace Visualization, Detections Stabilizer, Camera Focus, Grid Visualization, Clip Comparison, YOLO-World Model, Google Gemini, PTZ Tracking (ONVIF), Relative Static Crop, Detections Stitch, Object Detection Model, Path Deviation, Identify Outliers, Gaze Detection, Halo Visualization, OpenAI, Stability AI Image Generation, Keypoint Visualization, Rate Limiter, Crop Visualization, Line Counter, Size Measurement, SmolVLM2, Image Slicer, Instance Segmentation Model, Instance Segmentation Model, QR Code Detection, Property Definition, Bounding Rectangle, Dynamic Zone, First Non Empty Or Default, Byte Tracker, Webhook Sink, Google Vision OCR, Stability AI Outpainting, CSV Formatter, Polygon Visualization, Dominant Color, VLM as Detector, Anthropic Claude, Detections Transformation, Camera Calibration, Dimension Collapse, LMM For Classification, Corner Visualization, Cosine Similarity, Pixel Color Count, Time in Zone, Perception Encoder Embedding Model, Background Color Visualization, Blur Visualization, Dynamic Crop, Local File Sink, Dot Visualization, Detections Filter, SIFT Comparison, Delta Filter, Path Deviation, Object Detection Model, LMM, Color Visualization, Stability AI Inpainting, Cache Set, Byte Tracker, Moondream2, Absolute Static Crop, Image Threshold, Detections Consensus, Time in Zone, Keypoint Detection Model, Llama 3.2 Vision, Pixelate Visualization, Roboflow Dataset Upload, Overlap Filter, SIFT Comparison, Multi-Label Classification Model, Expression, Polygon Zone Visualization
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds CogVLM in version v1 has.
Bindings
- input
- output
    - parent_id (parent_id): Identifier of parent for step output.
    - root_parent_id (parent_id): Identifier of parent for step output.
    - image (image_metadata): Dictionary with image metadata required by supervision.
    - structured_output (dictionary): Dictionary.
    - raw_output (string): String value.
    - * (*): Equivalent of any element.
Example JSON definition of step CogVLM in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/cog_vlm@v1",
    "images": "$inputs.image",
    "prompt": "my prompt",
    "json_output_format": {
        "count": "number of cats in the picture"
    }
}
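For context, the step definition above can be embedded in a complete Workflow definition. The sketch below builds one as a Python dictionary and serialises it to JSON. The surrounding structure (`version`, the `WorkflowImage` input type, the `JsonField` output type, and the `$steps.<step_name>.<output_name>` selector form) is an assumption about the Workflows definition format rather than something stated on this page, and the step and output names are hypothetical. Since CogVLM has reached End Of Life, such a definition serves only as a reference for reading or migrating existing Workflows.

```python
import json

STEP_NAME = "cat_counter"  # hypothetical step name

workflow_definition = {
    "version": "1.0",  # assumed definition version
    "inputs": [
        # Assumed input declaration that "$inputs.image" refers to.
        {"type": "WorkflowImage", "name": "image"},
    ],
    "steps": [
        {
            "name": STEP_NAME,
            "type": "roboflow_core/cog_vlm@v1",
            "images": "$inputs.image",
            "prompt": "How many cats are in the picture?",
            "json_output_format": {
                "count": "number of cats in the picture",
            },
        },
    ],
    "outputs": [
        # Output names match the bindings listed on this page.
        {
            "type": "JsonField",
            "name": "structured",
            "selector": f"$steps.{STEP_NAME}.structured_output",
        },
        {
            "type": "JsonField",
            "name": "raw",
            "selector": f"$steps.{STEP_NAME}.raw_output",
        },
    ],
}

serialized = json.dumps(workflow_definition, indent=4)
print(serialized)
```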