Qwen3.5-VL
Class: Qwen35VLBlockV1
Source: inference.core.workflows.core_steps.models.foundation.qwen3_5vl.v1.Qwen35VLBlockV1
This workflow block runs Qwen3.5-VL, a vision language model that accepts an image and an optional text prompt, and returns a text answer based on a conversation template.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/qwen3_5vl@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| prompt | str | Optional text prompt to provide additional context to Qwen3.5-VL. If omitted, a default prompt is used, which may affect the model's behavior. | ❌ |
| model_version | str | The Qwen3.5-VL model to be used for inference. | ✅ |
| system_prompt | str | Optional system prompt to provide additional context to Qwen3.5-VL. | ❌ |
| enable_thinking | bool | If true, enables Qwen3.5-VL's thinking mode, which allows the model to generate reasoning tokens before answering. The thinking output is returned in the 'thinking' field. | ❌ |
| max_new_tokens | int | Maximum number of tokens to generate. If not set, the model's default is used. Consider increasing it for thinking mode. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
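As a rough illustration of the Refs column, the sketch below builds a step definition in Python and checks that selectors appear only on properties that accept them. The helper names `is_selector` and `check_selectors`, the `BINDABLE` set, the step name `qwen`, and the workflow input name `model_version` are hypothetical and not part of the inference package. `images` is treated as bindable on the assumption that input bindings accept selectors, as the example definition on this page suggests.

```python
# Hypothetical helpers for illustration; not part of the inference package.

# Properties assumed to accept dynamic selectors: model_version is marked
# with a check in the Refs column, and images is listed as an input binding.
BINDABLE = {"images", "model_version"}


def is_selector(value):
    """Return True if a value looks like a workflow selector such as "$inputs.image"."""
    return isinstance(value, str) and value.startswith(("$inputs.", "$steps."))


def check_selectors(step):
    """Return the names of properties that use a selector but are not bindable."""
    return sorted(
        key for key, value in step.items()
        if is_selector(value) and key not in BINDABLE
    )


step = {
    "name": "qwen",
    "type": "roboflow_core/qwen3_5vl@v1",
    "images": "$inputs.image",
    "prompt": "What is in this image?",
    # Resolved at workflow runtime from a workflow input (assumed name).
    "model_version": "$inputs.model_version",
}

# prompt is not marked as bindable in the Refs column, so this variant is flagged.
bad_step = dict(step, prompt="$inputs.prompt")

print(check_selectors(step))
print(check_selectors(bad_step))
```

A check like this is only a local sanity test; the workflow engine performs its own validation when the definition is compiled.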
Available Connections
Compatible Blocks
Check what blocks you can connect to Qwen3.5-VL in version v1.
- inputs:
Camera Focus, Label Visualization, Image Threshold, Stitch Images, Grid Visualization, Background Subtraction, Camera Focus, Polygon Zone Visualization, Multi-Label Classification Model, Color Visualization, Keypoint Detection Model, Mask Visualization, Circle Visualization, Instance Segmentation Model, Heatmap Visualization, SIFT, Crop Visualization, Image Slicer, Stability AI Outpainting, QR Code Generator, Bounding Box Visualization, Polygon Visualization, Text Display, Pixelate Visualization, Image Blur, Relative Static Crop, SIFT Comparison, Stability AI Image Generation, Morphological Transformation, Dynamic Crop, Stability AI Inpainting, Background Color Visualization, Image Slicer, Object Detection Model, Line Counter Visualization, Icon Visualization, Image Preprocessing, Blur Visualization, Dot Visualization, Ellipse Visualization, Halo Visualization, Polygon Visualization, Triangle Visualization, Model Comparison Visualization, Trace Visualization, Corner Visualization, Single-Label Classification Model, Image Convert Grayscale, Reference Path Visualization, Absolute Static Crop, Image Contours, Classification Label Visualization, Depth Estimation, Halo Visualization, Camera Calibration, Contrast Equalization, Perspective Correction, Keypoint Visualization
- outputs:
Moondream2, Stitch OCR Detections, Image Threshold, OpenAI, Size Measurement, Mask Visualization, Time in Zone, Instance Segmentation Model, Circle Visualization, Path Deviation, Detections Consensus, Seg Preview, Crop Visualization, SAM 3, Stability AI Outpainting, QR Code Generator, Text Display, Anthropic Claude, Line Counter, Path Deviation, Clip Comparison, OpenAI, Segment Anything 2 Model, Stability AI Image Generation, Local File Sink, Google Gemini, Google Gemini, Slack Notification, Distance Measurement, Florence-2 Model, Ellipse Visualization, Dot Visualization, Halo Visualization, Anthropic Claude, Model Comparison Visualization, OpenAI, Corner Visualization, Email Notification, Classification Label Visualization, Instance Segmentation Model, Roboflow Dataset Upload, Depth Estimation, Contrast Equalization, Cache Get, Detections Stitch, Label Visualization, Stitch OCR Detections, Llama 3.2 Vision, Time in Zone, Polygon Zone Visualization, Color Visualization, SAM 3, Heatmap Visualization, OpenAI, CogVLM, Florence-2 Model, Model Monitoring Inference Aggregator, Line Counter, Cache Set, Roboflow Dataset Upload, Polygon Visualization, Bounding Box Visualization, Roboflow Custom Metadata, Pixel Color Count, Image Blur, SIFT Comparison, Detections Classes Replacement, Webhook Sink, Dynamic Crop, Stability AI Inpainting, Background Color Visualization, Morphological Transformation, Perception Encoder Embedding Model, LMM, Line Counter Visualization, Image Preprocessing, Icon Visualization, PTZ Tracking (ONVIF), Twilio SMS Notification, Triangle Visualization, Google Vision OCR, Polygon Visualization, Google Gemini, Anthropic Claude, Trace Visualization, Email Notification, Twilio SMS/MMS Notification, CLIP Embedding Model, SAM 3, Time in Zone, Reference Path Visualization, YOLO-World Model, Halo Visualization, LMM For Classification, Perspective Correction, Keypoint Visualization
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Qwen3.5-VL in version v1 has.
Bindings
- input
  - images (image): The image to infer on.
  - model_version (roboflow_model_id): The Qwen3.5-VL model to be used for inference.
- output
  - parsed_output (dictionary): Dictionary.
  - thinking (string): String value.
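Downstream steps and workflow outputs refer to these output bindings through `$steps.<step_name>.<output_name>` selectors. The following is a minimal sketch, assuming a step named `qwen` and the `JsonField` output type used in workflow definitions. The helper `output_selector` and the output names `answer` and `reasoning` are hypothetical, chosen only for illustration.

```python
# Output bindings listed for this block on this page.
QWEN_OUTPUTS = ("parsed_output", "thinking")


def output_selector(step_name, output_name):
    """Build a selector for one of the block's outputs (hypothetical helper)."""
    if output_name not in QWEN_OUTPUTS:
        raise ValueError(f"unknown output: {output_name!r}")
    return f"$steps.{step_name}.{output_name}"


# Workflow-level outputs that expose both bindings of a step named "qwen".
# The "JsonField" type is an assumption about the surrounding definition.
workflow_outputs = [
    {
        "type": "JsonField",
        "name": "answer",
        "selector": output_selector("qwen", "parsed_output"),
    },
    {
        "type": "JsonField",
        "name": "reasoning",
        "selector": output_selector("qwen", "thinking"),
    },
]

for output in workflow_outputs:
    print(output["name"], "<-", output["selector"])
```

Restricting the helper to the two known output names catches typos early, before the definition reaches the workflow engine.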
Example JSON definition of step Qwen3.5-VL in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/qwen3_5vl@v1",
    "images": "$inputs.image",
    "prompt": "What is in this image?",
    "model_version": "qwen3_5-0.8b",
    "system_prompt": "You are a helpful assistant.",
    "enable_thinking": "<block_does_not_provide_example>",
    "max_new_tokens": "<block_does_not_provide_example>"
}
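The definition above leaves `enable_thinking` and `max_new_tokens` as placeholders. The sketch below assembles a complete workflow definition in Python with illustrative values for both and serialises it to JSON. The values `True` and `512` are assumptions chosen for the example, not documented defaults. The `WorkflowImage` input declaration, the `JsonField` output type, the `"1.0"` version string, the step name `qwen`, and the function `build_workflow` are likewise assumptions about the surrounding workflow definition, which this page does not describe.

```python
import json


def build_workflow(enable_thinking=True, max_new_tokens=512):
    """Assemble a workflow definition containing one Qwen3.5-VL step (sketch)."""
    if not isinstance(enable_thinking, bool):
        raise TypeError("enable_thinking must be a bool")
    # bool is a subclass of int in Python, so reject it explicitly.
    if (
        isinstance(max_new_tokens, bool)
        or not isinstance(max_new_tokens, int)
        or max_new_tokens <= 0
    ):
        raise ValueError("max_new_tokens must be a positive int")

    step = {
        "name": "qwen",
        "type": "roboflow_core/qwen3_5vl@v1",
        "images": "$inputs.image",
        "prompt": "What is in this image?",
        "model_version": "qwen3_5-0.8b",
        "system_prompt": "You are a helpful assistant.",
        "enable_thinking": enable_thinking,
        "max_new_tokens": max_new_tokens,
    }

    outputs = [
        {"type": "JsonField", "name": "answer", "selector": "$steps.qwen.parsed_output"}
    ]
    # Expose the reasoning text only when thinking mode is requested
    # (a design choice of this sketch, not a requirement of the block).
    if enable_thinking:
        outputs.append(
            {"type": "JsonField", "name": "reasoning", "selector": "$steps.qwen.thinking"}
        )

    return {
        "version": "1.0",
        "inputs": [{"type": "WorkflowImage", "name": "image"}],
        "steps": [step],
        "outputs": outputs,
    }


workflow = build_workflow()
print(json.dumps(workflow, indent=2))
```

Passing a real boolean and integer, rather than strings, matches the `bool` and `int` types given in the Properties table.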