Qwen3-VL
Class: Qwen3VLBlockV1
Source: inference.core.workflows.core_steps.models.foundation.qwen3vl.v1.Qwen3VLBlockV1
This workflow block runs Qwen3-VL—a vision language model that accepts an image and an optional text prompt—and returns a text answer based on a conversation template.
Type identifier
Use the following identifier in the step "type" field: roboflow_core/qwen3vl@v1 to add the block as
a step in your workflow.
Properties
| Name | Type | Description | Refs |
|---|---|---|---|
| name | str | Enter a unique identifier for this step. | ❌ |
| prompt | str | Optional text prompt that gives Qwen3-VL additional context. If omitted, a default prompt is used, which may affect the model's behavior. | ❌ |
| model_version | str | The Qwen3-VL model to be used for inference. | ✅ |
| system_prompt | str | Optional system prompt that gives Qwen3-VL additional context. | ❌ |
The Refs column indicates whether a property can be parametrised with dynamic values available
at workflow runtime. See Bindings for more info.
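As a rough illustration of what the Refs column means in practice, the sketch below builds a step definition as a plain Python dict and rejects selector-style values on properties that are not listed as parametrisable. The helper names `is_selector` and `build_step`, the `$inputs.model_version` selector, and the rule that a selector is any string starting with `$` are assumptions made for this example, not part of the block's API.

```python
# Fields that this page lists as accepting dynamic values
# (the ✅ in the Refs column plus the input bindings).
SELECTOR_FIELDS = {"images", "model_version"}


def is_selector(value):
    """Assumption: a selector is a string starting with '$', e.g. '$inputs.image'."""
    return isinstance(value, str) and value.startswith("$")


def build_step(name, model_version, prompt=None, system_prompt=None):
    """Hypothetical helper: build a Qwen3-VL step definition as a plain dict."""
    step = {
        "name": name,
        "type": "roboflow_core/qwen3vl@v1",
        "images": "$inputs.image",
        "model_version": model_version,
    }
    # Optional properties are only included when provided.
    if prompt is not None:
        step["prompt"] = prompt
    if system_prompt is not None:
        step["system_prompt"] = system_prompt
    # Reject selectors on properties marked with ❌ in the Refs column.
    for key, value in step.items():
        if key not in SELECTOR_FIELDS and is_selector(value):
            raise ValueError(f"{key!r} does not accept dynamic values")
    return step


# Static model version, fixed when the workflow is defined.
static_step = build_step("qwen", "qwen3vl-2b-instruct", prompt="What is in this image?")

# Dynamic model version, resolved at runtime from a hypothetical workflow input.
dynamic_step = build_step("qwen", "$inputs.model_version")
```

The check is deliberately naive: a literal prompt that happens to start with `$` would also be rejected, so treat it as a demonstration of the Refs distinction rather than real validation.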
Available Connections
Compatible Blocks
Check what blocks you can connect to Qwen3-VL in version v1.
- inputs: Triangle Visualization, Morphological Transformation, Ellipse Visualization, Instance Segmentation Model, SIFT, Blur Visualization, Stitch Images, Halo Visualization, Stability AI Outpainting, Camera Focus, Single-Label Classification Model, Dynamic Crop, Model Comparison Visualization, Circle Visualization, Keypoint Visualization, Pixelate Visualization, Image Slicer, Color Visualization, Trace Visualization, Line Counter Visualization, Keypoint Detection Model, Label Visualization, Icon Visualization, SIFT Comparison, Dot Visualization, QR Code Generator, Object Detection Model, Corner Visualization, Camera Focus, Image Slicer, Depth Estimation, Contrast Equalization, Grid Visualization, Text Display, Reference Path Visualization, Image Threshold, Perspective Correction, Image Contours, Bounding Box Visualization, Polygon Zone Visualization, Polygon Visualization, Background Subtraction, Background Color Visualization, Halo Visualization, Stability AI Inpainting, Image Convert Grayscale, Crop Visualization, Camera Calibration, Polygon Visualization, Image Blur, Relative Static Crop, Heatmap Visualization, Absolute Static Crop, Classification Label Visualization, Multi-Label Classification Model, Mask Visualization, Image Preprocessing, Stability AI Image Generation
- outputs: Detections Consensus
Input and Output Bindings
The available connections depend on the block's binding kinds. Check which binding kinds
Qwen3-VL in version v1 has.
Bindings
- input
    - images (image): The image to infer on.
    - model_version (roboflow_model_id): The Qwen3-VL model to be used for inference.
- output
    - parsed_output (dictionary): Dictionary.
Example JSON definition of step Qwen3-VL in version v1:

    {
        "name": "<your_step_name_here>",
        "type": "roboflow_core/qwen3vl@v1",
        "images": "$inputs.image",
        "prompt": "What is in this image?",
        "model_version": "qwen3vl-2b-instruct",
        "system_prompt": "You are a helpful assistant."
    }
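
To show where a step definition like this might sit, here is a minimal sketch that embeds it in a complete workflow definition and serialises it to JSON. The step name `qwen3vl`, the output name `answer`, the top-level keys (`version`, `inputs`, `steps`, `outputs`), and the `WorkflowImage` and `JsonField` type names are assumptions about the surrounding workflow format and are not stated on this page. The `$steps.<name>.parsed_output` selector is inferred from the `parsed_output` output binding.

```python
import json

# Step definition taken from the example on this page, with a concrete name.
step = {
    "name": "qwen3vl",
    "type": "roboflow_core/qwen3vl@v1",
    "images": "$inputs.image",
    "prompt": "What is in this image?",
    "model_version": "qwen3vl-2b-instruct",
    "system_prompt": "You are a helpful assistant.",
}

# Assumed workflow envelope: one image input, one step, one output.
workflow = {
    "version": "1.0",
    "inputs": [{"type": "WorkflowImage", "name": "image"}],
    "steps": [step],
    "outputs": [
        {
            "type": "JsonField",
            "name": "answer",
            # Points at the step's parsed_output binding.
            "selector": f"$steps.{step['name']}.parsed_output",
        }
    ],
}

# Serialise to JSON, e.g. to save the definition or send it to a server.
workflow_json = json.dumps(workflow, indent=2)
print(workflow_json)
```

Building the definition as a dict and serialising it once keeps the step name and the output selector in sync, because the selector is derived from `step["name"]` rather than typed twice.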