Qwen3-VL
Class: Qwen3VLBlockV1
Source: inference.core.workflows.core_steps.models.foundation.qwen3vl.v1.Qwen3VLBlockV1
This workflow block runs Qwen3-VL—a vision language model that accepts an image and an optional text prompt—and returns a text answer based on a conversation template.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/qwen3vl@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| prompt | str | Optional text prompt to provide additional context to Qwen3-VL. If omitted, a default prompt is used, which may affect model behavior. | ❌ |
| model_version | str | The Qwen3-VL model to be used for inference. | ✅ |
| system_prompt | str | Optional system prompt to provide additional context to Qwen3-VL. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
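As a rough illustration of what the Refs column means in practice, the sketch below encodes the table as a lookup and flags any property that is given a selector (a string such as `$inputs.image`) even though the table marks it as static. The helper names and the `$inputs.model_version` input name are hypothetical and not part of the inference library; the check is only a local sanity test, not the validation the workflow engine performs.

```python
# Properties from the table above; True means the Refs column shows a check mark.
ACCEPTS_DYNAMIC_VALUE = {
    "name": False,
    "prompt": False,
    "model_version": True,
    "system_prompt": False,
}


def is_selector(value):
    """Return True if a value looks like a workflow selector such as "$inputs.image"."""
    return isinstance(value, str) and value.startswith("$")


def invalid_dynamic_properties(step):
    """List properties that use a selector although the table marks them as static."""
    return [
        key
        for key, value in step.items()
        # Keys missing from the table (e.g. "type", "images") are not judged here.
        if is_selector(value) and ACCEPTS_DYNAMIC_VALUE.get(key) is False
    ]


step = {
    "name": "qwen",
    "type": "roboflow_core/qwen3vl@v1",
    "images": "$inputs.image",
    "prompt": "What is in this image?",
    # Hypothetical workflow input; allowed because model_version accepts references.
    "model_version": "$inputs.model_version",
}

print(invalid_dynamic_properties(step))
```

Only `model_version` is marked as referenceable, so a definition that passes a selector for `prompt` or `system_prompt` would be reported by this helper.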
Available Connections
Compatible Blocks
Check what blocks you can connect to Qwen3-VL in version v1.
- inputs: SIFT, Trace Visualization, Perspective Correction, Circle Visualization, Blur Visualization, Dot Visualization, Classification Label Visualization, Stitch Images, Single-Label Classification Model, Image Slicer, Ellipse Visualization, Camera Calibration, Heatmap Visualization, Image Threshold, Crop Visualization, Relative Static Crop, Grid Visualization, Morphological Transformation, Triangle Visualization, Text Display, Reference Path Visualization, Multi-Label Classification Model, Image Slicer, Instance Segmentation Model, Icon Visualization, QR Code Generator, Depth Estimation, Mask Visualization, Stability AI Outpainting, Stability AI Image Generation, Keypoint Detection Model, Camera Focus, Pixelate Visualization, Halo Visualization, Absolute Static Crop, Model Comparison Visualization, Object Detection Model, Label Visualization, Image Preprocessing, Stability AI Inpainting, Corner Visualization, Image Convert Grayscale, Background Color Visualization, Color Visualization, SIFT Comparison, Polygon Visualization, Polygon Zone Visualization, Halo Visualization, Background Subtraction, Keypoint Visualization, Line Counter Visualization, Bounding Box Visualization, Camera Focus, Contrast Equalization, Dynamic Crop, Image Contours, Image Blur, Polygon Visualization
- outputs: Detections Consensus
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Qwen3-VL in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - model_version (roboflow_model_id): The Qwen3-VL model to be used for inference.
- output
    - parsed_output (dictionary): Dictionary.
Example JSON definition of step Qwen3-VL in version v1
{
    "name": "<your_step_name_here>",
    "type": "roboflow_core/qwen3vl@v1",
    "images": "$inputs.image",
    "prompt": "What is in this image?",
    "model_version": "qwen3vl-2b-instruct",
    "system_prompt": "You are a helpful assistant."
}
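To show how the step definition above might sit inside a complete workflow, here is a minimal Python sketch that builds the surrounding definition as a dictionary and serialises it to JSON. Only the step fields come from this page. The outer keys (`version`, `inputs`, `steps`, `outputs`), the `WorkflowImage` and `JsonField` type names, the `$steps.<name>.parsed_output` selector form, and the `build_workflow` helper itself are assumptions about the general workflow definition format, so check them against the workflow definition reference before relying on them.

```python
import json


def build_workflow(step_name="qwen", model_version="qwen3vl-2b-instruct"):
    """Assemble a workflow definition containing a single Qwen3-VL step."""
    step = {
        "name": step_name,
        "type": "roboflow_core/qwen3vl@v1",
        "images": "$inputs.image",
        "prompt": "What is in this image?",
        "model_version": model_version,
        "system_prompt": "You are a helpful assistant.",
    }
    return {
        # Assumed outer structure of a workflow definition.
        "version": "1.0",
        "inputs": [{"type": "WorkflowImage", "name": "image"}],
        "steps": [step],
        "outputs": [
            {
                "type": "JsonField",
                "name": "answer",  # hypothetical output name
                # Exposes the block's parsed_output (a dictionary) as a workflow output.
                "selector": f"$steps.{step_name}.parsed_output",
            }
        ],
    }


workflow = build_workflow()
print(json.dumps(workflow, indent=2))
```

Building the definition in code rather than by hand keeps the step name and the output selector in sync, since both are derived from the same `step_name` argument.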